Cross-tenant defensive network for AI agents.
Mnemom AEGIS — Adaptive Enforcement, Governance & Intelligence Substrate — is the runtime security network behind Safe House. It screens every agent transaction at four checkpoints — front door, back door, inside.autonomy, inside.integrity — each independently configurable across four enforcement modes. Signed Managed Rules carry a sub-30s P95 cross-tenant propagation SLO target (first measurements publish 30 days post-GA).
AAP declares. AIP verifies in flight. CLPI governs and anchors. Safe House screens. AEGIS signs the cross-tenant defenses.
The threat model.
Seven attack patterns drive the agentic threat surface today. Each maps to one of the four checkpoints — so customers can dial enforcement per surface, not as a single global posture.
| Threat | Checkpoint | What it looks like |
|---|---|---|
| Prompt injection | front door | Direct attempts to override the agent's instructions, role-swap, or bypass declared scope at the inbound surface. |
| Indirect injection | front door | Instructions hidden inside retrieved documents, tool outputs, and vector-store payloads: the prompt the agent never knew it received. |
| Tool misuse | inside.autonomy | Coerced or chained tool calls that exceed the Alignment Card's permitted scope. Argument-shape attacks against under-validated schemas (OWASP ASI02). |
| Data exfiltration | back door | PII, PHI, secrets, credentials, or cross-tenant data echoed back in agent responses, error traces, or split-token patterns. |
| BEC / impersonation fraud | front door | CEO-fraud style requests, urgency-and-authority pressure, social engineering that targets the agent's escalation contract. |
| Agent spoofing | inside.integrity | Identity-abuse attempts that claim authority the Alignment Card does not declare. OWASP ASI03 — Privilege Compromise via Identity Abuse. |
| Supply-chain compromise | inside.integrity | Behavioral signatures consistent with a compromised SDK, model fine-tune, or vendored prompt template — caught cross-tenant via substrate fingerprinting (OWASP ASI06). |
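As a rough illustration of how the table could drive per-surface configuration, the sketch below copies the threat-to-checkpoint pairs from the table and groups them by checkpoint. The dictionary and function names are hypothetical and not part of any Mnemom API.

```python
from collections import defaultdict

# Threat -> checkpoint pairs copied from the table above.
THREAT_CHECKPOINT = {
    "Prompt injection": "front door",
    "Indirect injection": "front door",
    "Tool misuse": "inside.autonomy",
    "Data exfiltration": "back door",
    "BEC / impersonation fraud": "front door",
    "Agent spoofing": "inside.integrity",
    "Supply-chain compromise": "inside.integrity",
}


def threats_by_checkpoint(mapping):
    """Group threat names under the checkpoint that screens them."""
    grouped = defaultdict(list)
    for threat, checkpoint in mapping.items():
        grouped[checkpoint].append(threat)
    return dict(grouped)


if __name__ == "__main__":
    for checkpoint, threats in threats_by_checkpoint(THREAT_CHECKPOINT).items():
        print(f"{checkpoint}: {', '.join(threats)}")
```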
Four checkpoints × four enforcement modes.
Every checkpoint is independently configurable. Composition is strictest-wins across Platform → Org → Team → Agent, so a stricter setting at any layer always governs. It mirrors the way Cloudflare WAF Managed Rules let you set severity × action per rule.
front door · back door · inside.autonomy · inside.integrity
Checkpoint disabled. Used in canary tenants and pre-onboarding.
Evaluates every transaction; emits signed verdicts; never blocks. The default for new Managed Rules during the 24-hour observe soak.
Annotates or warns inline without blocking. The middle ground for tier-3 rules during ramp-up.
Blocks the transaction and surfaces a signed verdict to the dashboard. Reached only after the observe soak and FP-rate auto-rollback discipline (CLPI Phase 2).
Composition cascade: Platform → Org → Team → Agent, strictest-wins. Customer admins clamp at any layer.
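A minimal sketch of strictest-wins composition, assuming the four modes can be ordered from least to most strict. The mode names (`OFF`, `OBSERVE`, `WARN`, `ENFORCE`), the layer keys, and the fallback when no layer sets a mode are illustrative assumptions, not the product's actual identifiers or behavior.

```python
from enum import IntEnum
from typing import Optional


class Mode(IntEnum):
    # Hypothetical names, ordered least strict to most strict.
    OFF = 0
    OBSERVE = 1
    WARN = 2
    ENFORCE = 3


# Cascade order from the text: Platform -> Org -> Team -> Agent.
LAYERS = ("platform", "org", "team", "agent")


def effective_mode(settings: dict[str, Optional[Mode]]) -> Mode:
    """Return the strictest mode set at any layer; unset layers are ignored.

    Falling back to OFF when no layer sets a mode is an assumption.
    """
    chosen = [settings[layer] for layer in LAYERS if settings.get(layer) is not None]
    return max(chosen, default=Mode.OFF)


if __name__ == "__main__":
    # One cascade per checkpoint, since each checkpoint is configured independently.
    front_door = {"platform": Mode.OBSERVE, "org": Mode.ENFORCE, "agent": Mode.OFF}
    print(effective_mode(front_door).name)
```

Because the result is a maximum over all layers, a looser setting lower in the cascade can never relax a stricter one above it, and a stricter setting lower down still governs.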
The Managed Rules pipeline.
Recipes are detection content. Managed Rules are the signed control-plane state that wraps them. The pipeline is structurally constrained — not procedurally — so tier-1 and tier-2 rules cannot auto-promote, regardless of operator-set mode.
- 1. Arena
Fifteen canonical adversarial personas probe Safe House 24/7. Mutation-phase gating activates per-bucket only when detection rate crosses 95% over a 48-hour rolling window with 24-hour hysteresis.
- 2. Candidate
Candidates that slip past the arena enter an isolated review queue with a strictly separated write path, so the system that proposes detection content can never be the same one that approves it. Customer false-negative and false-positive reports and cross-tenant network signals all flow into the same queue.
- 3. Review
Three reviewer modes — manual (default), auto-approve-trusted-sources, auto-approve-high-confidence. Tier-1 / tier-2 always require dual-control review under an append-only audit chain.
- 4. 24h observe soak
Every signed promotion lands in observe mode for 24 hours. If the false-positive rate runs too high during the soak, FP-rate auto-rollback (CLPI Phase 2) retires the recipe before any production traffic is blocked.
- 5. Enforce
Tiered KV+R2+isolate-cache failover with independent signing chains pushes the rule to every gateway. P95 ≤ 30s signed-promotion → gateway-loaded.
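The observe soak in step 4 can be sketched as a small decision function. The 24-hour window comes from the text; the false-positive threshold, the state names, and the `SoakStats` shape are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SOAK = timedelta(hours=24)  # soak length stated in the text
FP_RATE_LIMIT = 0.01        # assumption: not a published threshold


@dataclass
class SoakStats:
    promoted_at: datetime
    evaluations: int
    false_positives: int


def next_state(stats: SoakStats, now: datetime, fp_limit: float = FP_RATE_LIMIT) -> str:
    """Decide what happens to a rule that is currently in its observe soak."""
    fp_rate = stats.false_positives / stats.evaluations if stats.evaluations else 0.0
    if fp_rate > fp_limit:
        return "rolled_back"  # retired before it can block anything
    if now - stats.promoted_at < SOAK:
        return "observe"      # still soaking: verdicts only, no blocking
    return "enforce"          # soak complete and FP rate within the limit


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
    print(next_state(SoakStats(t0, 1000, 2), t0 + timedelta(hours=25)))
```

The rollback check runs first, so a noisy rule is retired even if its soak window has not yet elapsed.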
The protective invariant
A tier-1 or tier-2 Managed Rule — one that would actually block real production traffic — can never be promoted without two-person human review, no matter how aggressive the auto-promotion mode is set. The guarantee is enforced structurally, in the data model itself: an active rule cannot exist unless its review quorum has been met. It is a property of the system, not a procedure someone has to remember to follow.
Guaranteed by the data model, not by operator discipline.
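One way to picture an invariant that lives in the data model is a type that cannot be constructed in an invalid state. The sketch below is an in-memory analogy only: the class name, field names, and quorum check are hypothetical, and the real system presumably enforces the rule at the schema level rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveRule:
    rule_id: str
    tier: int
    approvers: frozenset  # reviewer identifiers (hypothetical field)

    def __post_init__(self):
        # Coerce to a frozenset so the same reviewer cannot be counted twice.
        object.__setattr__(self, "approvers", frozenset(self.approvers))
        if self.tier in (1, 2) and len(self.approvers) < 2:
            raise ValueError(
                f"tier-{self.tier} rule {self.rule_id!r} needs two distinct reviewers"
            )


if __name__ == "__main__":
    ActiveRule("rule-a", 1, ["alice", "bob"])        # quorum met
    try:
        ActiveRule("rule-b", 2, ["alice", "alice"])  # one reviewer listed twice
    except ValueError as err:
        print(err)
```

With this shape, no code path can hold an active tier-1 or tier-2 rule that skipped review, because such an object never exists in the first place.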
Substrate fingerprinting + supply-chain detection.
Every evaluation is stamped with a substrate fingerprint — the provider, model, and SDK version behind the request, plus an optional customer-supplied lockfile hash sent via the `X-Mnemom-Lockfile-Hash` header. AEGIS sees behavioral deviation across every customer running on the same substrate, simultaneously.
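A minimal sketch of how a client might compute the optional lockfile hash and how a fingerprint over the four fields could be derived. Only the `X-Mnemom-Lockfile-Hash` header name comes from the text; the use of SHA-256, hex encoding, the JSON canonicalization, and every function name here are assumptions.

```python
import hashlib
import json
from typing import Optional


def lockfile_hash(lockfile_bytes: bytes) -> str:
    """Hex SHA-256 of the raw lockfile contents (hash algorithm is an assumption)."""
    return hashlib.sha256(lockfile_bytes).hexdigest()


def substrate_fingerprint(provider: str, model: str, sdk_version: str,
                          lockfile_digest: Optional[str] = None) -> str:
    """Stable digest over provider, model, SDK version, and optional lockfile hash."""
    fields = {
        "provider": provider,
        "model": model,
        "sdk_version": sdk_version,
        "lockfile_hash": lockfile_digest,
    }
    # Sorted keys and fixed separators keep the serialization deterministic.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def request_headers(lockfile_bytes: bytes) -> dict:
    """Headers a client could attach to supply the lockfile hash."""
    return {"X-Mnemom-Lockfile-Hash": lockfile_hash(lockfile_bytes)}


if __name__ == "__main__":
    digest = lockfile_hash(b'{"lockfileVersion": 3}')
    print(substrate_fingerprint("example-provider", "example-model", "1.2.3", digest))
```

Two tenants on the same provider, model, and SDK version produce the same fingerprint, which is what would let deviations be compared across them.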
May 11, 2026 — the Mini Shai-Hulud worm compromised 170+ npm packages and 2 PyPI packages, including Mistral AI's SDK suite and Guardrails AI's PyPI package. The compromised `@tanstack/*` versions shipped with valid SLSA Build Level 3 attestations — the first documented case of a worm producing legitimate signed provenance for malicious packages. Per-tenant detection and package-layer Sigstore verification structurally cannot catch this class of attack.
OWASP Top 10 for Agentic Applications.
Honest mapping. Where coverage is partial, we say so. The full OWASP ASI taxonomy (Dec 2025) is at owasp.org.
| OWASP category | Coverage | How AEGIS addresses it |
|---|---|---|
| ASI02 — Tool Misuse | Full | Policy engine (CLPI Phase 1) + Managed Rules at the inside.autonomy checkpoint. Tool-call screening against the Alignment Card's declared autonomy bounds. |
| ASI03 — Privilege Compromise via Identity Abuse | Full | AAP-declared autonomy bounds (Alignment Card) + AIP in-flight integrity verdicts + inside.integrity checkpoint screening for identity-abuse patterns. |
| ASI06 — Agentic Supply Chain Compromise | Full (runtime) | Substrate fingerprinting on every evaluation. The cross-tenant aggregator detects behavioral deviation no single customer can see. Complements — does not replace — package-layer provenance (SLSA, Sigstore). |
| ASI07 — System Prompt Leakage | Partial | Back-door checkpoint screening for known system-prompt patterns + secrets and Alignment Card violations. Detection is content-based; agents that legitimately quote their system prompt at user request are not suppressed. |
ASI01 (Prompt Injection), ASI04 (Resource Exhaustion), ASI05 (Cascading Hallucination), ASI08 (Repudiation & Untraceability), ASI09 (Identity Spoofing), and ASI10 (Overreliance) map to other parts of the Mnemom stack (AAP cards, AIP verdicts, CLPI on-chain anchoring, Trust Ratings) and are covered on /protection-network and /trust.
How AEGIS compares.
Abbreviated from the 2026-05-23 competitive landscape research. AEGIS is the network layer; the vendors below are complementary, not replacements — see /governance for the full integration story.
| Capability | Mnemom AEGIS | Cloudflare WAF | Lakera Guard | Cisco AI Defense | AWS Bedrock Guardrails | Google Model Armor |
|---|---|---|---|---|---|---|
| Cross-tenant Managed Rules with signed promotion | Yes — Ed25519-signed, P95 ≤ 30s propagation, public audit chain | WAF Managed Rules (web-layer, not agent-layer) | Vendor-curated threat-intel; no customer-network-derived signal | Build-time SDK embed; no runtime cross-tenant network | AWS-only; no cross-customer learning | In-process filter; no network |
| Four-checkpoint × four-mode model per-agent | Yes — front door / back door / inside.autonomy / inside.integrity, each independently configurable | Per-route WAF rules; not agent-transaction-shaped | Single-detector at runtime | NeMo Guardrails integration; build-time policy | Bedrock Guardrails per-policy (denylist, PII, contextual grounding) | Prompt-injection + URL + harmful-content filters |
| Substrate fingerprinting (provider + model + SDK version) on every evaluation | Yes — cross-tenant supply-chain detection | No | No | No | No | No |
| Public STIX 2.1 IoC feed + signed advisories | Yes — /v1/trust/iocs (empty at GA by design) | Customer-internal Radar feeds only | No public feed | Talos for traditional threats; no public agent IoC feed | No | No |
| Dual-control invariant on tier-1/-2 (enforced in the data model) | Yes — schema-enforced, not procedural | Procedural change-management | Vendor-controlled | Vendor-controlled | Customer policy IAM | Vendor-controlled |
Sources: vendor public documentation 2026-05-23. AEGIS is a layer customers run alongside these products, not a replacement.
SLOs published. Measured continuously.
Headline numbers below. The full table — measurement queries, historical data once the first 30-day window closes, and the four supporting SLOs — lives on /trust/slos.
Signed promotion → gateway-loaded. Published target; first measurements 30 days post-GA.
Gateway loads a verified rule set across multiple independent read tiers.
Under normal operation. P0 page at 24h stale.
First 30-day measurement window publishes 30 days post-GA. We do not pre-announce numbers we cannot defend.
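For readers who want to check a P95 target against their own latency samples, here is a small sketch using the nearest-rank percentile method. The 30-second target is the one stated for signed promotion to gateway-loaded; the choice of nearest-rank and the function names are assumptions, and the official measurement queries may differ.

```python
SLO_SECONDS = 30.0  # propagation target stated in the text


def p95(samples):
    """95th percentile by the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # ceil(0.95 * n) in integer arithmetic, as a 1-indexed rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_slo(samples, target=SLO_SECONDS):
    """True when the P95 of the samples is at or under the target."""
    return p95(samples) <= target


if __name__ == "__main__":
    latencies = [4.2, 6.1, 5.5, 9.8, 28.0, 7.3, 5.0, 6.6, 31.5, 8.4]
    print(p95(latencies), meets_slo(latencies))
```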
See published SLOs
Bring your tools.
The IoC feed is machine-readable STIX 2.1. The audit chain is verifiable. The dashboard is open to every customer.
curl -s https://api.mnemom.ai/v1/trust/iocs | jq .
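The same feed can be consumed from a script. The sketch below assumes the endpoint returns a STIX 2.1 bundle as JSON without authentication, which the page does not confirm; the helper names are hypothetical. In STIX 2.1 the bundle's `objects` property is optional, so an empty feed is handled as an empty list.

```python
import json
import urllib.request

FEED_URL = "https://api.mnemom.ai/v1/trust/iocs"  # URL from the curl example


def fetch_bundle(url: str = FEED_URL, timeout: float = 10.0) -> dict:
    """Download and decode the feed (assumes a JSON response body)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def indicators(bundle: dict) -> list:
    """Return the STIX indicator objects in a bundle, or [] when there are none."""
    if bundle.get("type") != "bundle":
        raise ValueError("response is not a STIX bundle")
    return [obj for obj in bundle.get("objects", []) if obj.get("type") == "indicator"]


if __name__ == "__main__":
    found = indicators(fetch_bundle())
    print(f"{len(found)} indicator(s) in feed")
```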