Mnemom AEGIS

Cross-tenant defensive network for AI agents.

Mnemom AEGIS — Adaptive Enforcement, Governance & Intelligence Substrate — is the runtime security network behind Safe House. It screens every agent transaction at four checkpoints — front door, back door, inside.autonomy, inside.integrity — each independently configurable across four enforcement modes. Signed Managed Rules carry a sub-30s P95 cross-tenant propagation SLO target (first measurements publish 30 days post-GA).

AAP declares. AIP verifies in flight. CLPI governs and anchors. Safe House screens. AEGIS signs the cross-tenant defenses.

Customer dashboard curl /v1/trust/iocs Contact sales

The threat model.

Seven attack patterns drive the agentic threat surface today. Each maps to one of the four checkpoints — so customers can dial enforcement per surface, not as a single global posture.

Threat	Checkpoint	What it looks like
Prompt injection	`front door`	Direct attempts to override the agent's instructions, role-swap, or bypass declared scope at the inbound surface.
Indirect injection	`front door`	Instructions hidden inside retrieved documents, tool outputs, and vector-store payloads: the prompt the agent never knew it received.
Tool misuse	`inside.autonomy`	Coerced or chained tool calls that exceed the Alignment Card's permitted scope. Argument-shape attacks against under-validated schemas (OWASP ASI02).
Data exfiltration	`back door`	PII, PHI, secrets, credentials, or cross-tenant data echoed back in agent responses, error traces, or split-token patterns.
BEC / impersonation fraud	`front door`	CEO-fraud style requests, urgency-and-authority pressure, social engineering that targets the agent's escalation contract.
Agent spoofing	`inside.integrity`	Identity-abuse attempts that claim authority the Alignment Card does not declare. OWASP ASI03 — Privilege Compromise via Identity Abuse.
Supply-chain compromise	`inside.integrity`	Behavioral signatures consistent with a compromised SDK, model fine-tune, or vendored prompt template — caught cross-tenant via substrate fingerprinting (OWASP ASI06).

Four checkpoints × four enforcement modes.

Every checkpoint is independently configurable. Composition is strictest-wins across Platform → Org → Team → Agent, so a stricter setting at any layer always governs. It mirrors the way Cloudflare WAF Managed Rules let you set severity × action per rule.

Checkpoint	What it screens
front door	Inbound message screening: every prompt, retrieval payload, and tool response before the agent processes it.
back door	Outbound response screening: PII, secrets, Alignment Card violations, regulated advice before the response leaves the perimeter.
inside.autonomy	Tool-call screening: every action the agent takes against the autonomy bounds the Alignment Card declares.
inside.integrity	Reasoning-integrity screening: AIP verdicts on thinking-block payloads; substrate-deviation signatures; identity-abuse patterns.

Mode	Behavior
off	Checkpoint disabled. Used in canary tenants and pre-onboarding.
observe	Evaluates every transaction; emits signed verdicts; never blocks. The default for new Managed Rules during the 24-hour observe soak.
nudge	Annotates or warns inline without blocking. The middle ground for tier-3 rules during ramp-up.
enforce	Blocks the transaction and surfaces a signed verdict to the dashboard. Reached only after the observe soak and FP-rate auto-rollback discipline (CLPI Phase 2).

Composition cascade: Platform → Org → Team → Agent, strictest-wins. Customer admins clamp at any layer.
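
A minimal sketch of strictest-wins composition, assuming the four modes are ordered off < observe < nudge < enforce and that a layer with no setting is simply skipped. The names `Mode`, `effective_mode`, and `action_for` are hypothetical and are not part of any published Mnemom API; falling back to `off` when no layer sets a mode is also an assumption.

```python
from enum import IntEnum
from typing import Mapping, Optional


class Mode(IntEnum):
    """Enforcement modes, ordered from least to most strict (assumed ordering)."""
    OFF = 0
    OBSERVE = 1
    NUDGE = 2
    ENFORCE = 3


# Cascade order from the text: Platform -> Org -> Team -> Agent.
LAYERS = ("platform", "org", "team", "agent")


def effective_mode(settings: Mapping[str, Optional[Mode]]) -> Mode:
    """Return the strictest mode set at any layer; unset layers are ignored."""
    chosen = [settings.get(layer) for layer in LAYERS]
    # Falling back to OFF when nothing is set is an assumption of this sketch.
    return max((m for m in chosen if m is not None), default=Mode.OFF)


def action_for(mode: Mode, flagged: bool) -> str:
    """Map a checkpoint result to an action under the given mode."""
    if mode is Mode.OFF:
        return "skip"        # checkpoint disabled, nothing is evaluated
    if not flagged:
        return "allow"
    return {Mode.OBSERVE: "log", Mode.NUDGE: "annotate", Mode.ENFORCE: "block"}[mode]


# The org clamps the front door to enforce; the agent only asks for observe.
front_door = {"platform": Mode.OBSERVE, "org": Mode.ENFORCE, "agent": Mode.OBSERVE}
mode = effective_mode(front_door)
print(mode.name, action_for(mode, flagged=True))  # ENFORCE block
```

Because the result is a plain maximum, the order in which layers are visited does not matter: a stricter setting at any layer governs.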

The Managed Rules pipeline.

Recipes are detection content. Managed Rules are the signed control-plane state that wraps them. The pipeline is structurally constrained — not procedurally — so tier-1 and tier-2 rules cannot auto-promote, regardless of operator-set mode.

1. Arena
Fifteen canonical adversarial personas probe Safe House 24/7. Mutation-phase gating activates per-bucket only when detection rate crosses 95% over a 48-hour rolling window with 24-hour hysteresis.
2. Candidate
Attacks that slip past detection in the arena become candidates. They enter an isolated review queue with a strictly separated write path, so the system that proposes detection content can never be the one that approves it. Customer false-negative and false-positive reports and cross-tenant network signals flow into the same queue.
3. Review
Three reviewer modes — manual (default), auto-approve-trusted-sources, auto-approve-high-confidence. Tier-1 / tier-2 always require dual-control review under an append-only audit chain.
4. 24h observe soak
Every signed promotion lands in observe mode for 24 hours. If the false-positive rate climbs during the soak, FP-rate auto-rollback (CLPI Phase 2) retires the recipe before any production traffic is blocked.
5. Enforce
Tiered KV+R2+isolate-cache failover with independent signing chains pushes the rule to every gateway. P95 ≤ 30s signed-promotion → gateway-loaded.
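
The observe soak in step 4 can be sketched as a small state decision. This is an illustration under stated assumptions, not the production logic: the `SoakStats` shape, the state names, and the `MAX_FP_RATE` threshold are all hypothetical, and the page does not publish the real rollback threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SOAK = timedelta(hours=24)   # from the text: 24-hour observe soak
MAX_FP_RATE = 0.01           # hypothetical threshold, not a published value


@dataclass
class SoakStats:
    promoted_at: datetime    # when the signed promotion landed in observe
    evaluated: int           # transactions the rule evaluated during the soak
    false_positives: int     # verdicts later confirmed as false positives


def next_state(stats: SoakStats, now: datetime) -> str:
    """Return 'rolled_back', 'observe', or 'enforce' for a soaking rule."""
    fp_rate = stats.false_positives / stats.evaluated if stats.evaluated else 0.0
    if fp_rate > MAX_FP_RATE:
        return "rolled_back"  # retired before any traffic is blocked
    if now - stats.promoted_at < SOAK:
        return "observe"      # still soaking: evaluates but never blocks
    return "enforce"


t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(next_state(SoakStats(t0, evaluated=1000, false_positives=2), t0 + timedelta(hours=25)))
# prints "enforce": FP rate 0.002 is under the threshold and the soak has elapsed
```

The rollback check runs before the elapsed-time check, so a noisy rule is retired even if its 24 hours have already passed.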

The protective invariant

A tier-1 or tier-2 Managed Rule — one that would actually block real production traffic — can never be promoted without two-person human review, no matter how aggressive the auto-promotion mode is set. The guarantee is enforced structurally, in the data model itself: an active rule cannot exist unless its review quorum has been met. It is a property of the system, not a procedure someone has to remember to follow.

Guaranteed by the data model, not by operator discipline.
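
One way to picture an invariant that lives in the data model is a record type that refuses to exist in an invalid state. The sketch below is an assumption-laden analogy: the `ManagedRule` fields and status values are hypothetical, and a real system would more likely express this as a database constraint than as a Python class.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ManagedRule:
    rule_id: str
    tier: int                                  # tiers 1 and 2 are the dual-control tiers
    status: str                                # hypothetical values: "candidate", "active"
    approvers: FrozenSet[str] = frozenset()    # distinct human reviewer IDs

    def __post_init__(self) -> None:
        # An active tier-1/tier-2 rule cannot be constructed without a quorum of two.
        if self.status == "active" and self.tier in (1, 2) and len(self.approvers) < 2:
            raise ValueError("tier-1/tier-2 rules need two distinct reviewers")


# Two distinct reviewers: the active rule can exist.
rule = ManagedRule("mr-001", tier=1, status="active",
                   approvers=frozenset({"alice", "bob"}))
print(rule.rule_id, sorted(rule.approvers))
```

Using a set for `approvers` means the same reviewer approving twice still counts once, and freezing the dataclass prevents a rule from being activated first and stripped of reviewers later.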

Substrate fingerprinting + supply-chain detection.

Every evaluation is stamped with a substrate fingerprint — the provider, model, and SDK version behind the request, plus an optional customer-supplied lockfile hash sent via the `X-Mnemom-Lockfile-Hash` header. AEGIS sees behavioral deviation across every customer running on the same substrate, simultaneously.
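
A rough sketch of how such a fingerprint could be derived from the fields listed above. The header name `X-Mnemom-Lockfile-Hash` comes from the text; everything else here is an assumption, including the use of SHA-256, the canonical JSON encoding, and the function names.

```python
import hashlib
import json
from typing import Optional


def lockfile_hash(lockfile_bytes: bytes) -> str:
    """Hex SHA-256 of a lockfile. The digest algorithm is an assumption."""
    return hashlib.sha256(lockfile_bytes).hexdigest()


def substrate_fingerprint(provider: str, model: str, sdk_version: str,
                          lockfile: Optional[str] = None) -> str:
    """Stable fingerprint over provider, model, SDK version, and optional lockfile hash."""
    payload = {
        "provider": provider,
        "model": model,
        "sdk_version": sdk_version,
        "lockfile_hash": lockfile,
    }
    # Sorted keys and fixed separators keep the encoding identical across callers.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# A client would send the lockfile hash as a request header.
digest = lockfile_hash(b'{"lockfileVersion": 3}')
headers = {"X-Mnemom-Lockfile-Hash": digest}
print(substrate_fingerprint("example-provider", "example-model", "1.2.3", digest))
```

Two tenants on the same provider, model, and SDK version produce the same fingerprint, which is what lets deviations be grouped across customers.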

On May 11, 2026, the Mini Shai-Hulud worm compromised 170+ npm packages and 2 PyPI packages, including Mistral AI's SDK suite and Guardrails AI's PyPI package. The compromised `@tanstack/*` versions shipped with valid SLSA Build Level 3 attestations, the first documented case of a worm producing legitimate signed provenance for malicious packages. Per-tenant detection and package-layer Sigstore verification structurally cannot catch this class of attack.
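
To illustrate why a cross-tenant view helps, the sketch below groups deviation events by substrate fingerprint and flags a fingerprint once enough distinct tenants report it. The event shape, the function name, and the `MIN_TENANTS` threshold are hypothetical; the page does not describe the aggregator's actual logic.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

MIN_TENANTS = 3   # hypothetical threshold for a cross-tenant signal


def flagged_substrates(events: Iterable[Tuple[str, str]],
                       min_tenants: int = MIN_TENANTS) -> Set[str]:
    """events are (tenant_id, fingerprint) pairs for evaluations marked as deviating."""
    tenants_by_fp: Dict[str, Set[str]] = defaultdict(set)
    for tenant_id, fingerprint in events:
        tenants_by_fp[fingerprint].add(tenant_id)
    # Count distinct tenants, not raw events, so one noisy tenant cannot trip the flag.
    return {fp for fp, tenants in tenants_by_fp.items() if len(tenants) >= min_tenants}


events = [
    ("tenant-a", "fp-sdk-1.2.3"),   # each tenant sees a single weak signal
    ("tenant-b", "fp-sdk-1.2.3"),
    ("tenant-c", "fp-sdk-1.2.3"),
    ("tenant-d", "fp-sdk-9.9.9"),   # one tenant, many events: not flagged
    ("tenant-d", "fp-sdk-9.9.9"),
    ("tenant-d", "fp-sdk-9.9.9"),
]
print(flagged_substrates(events))  # {'fp-sdk-1.2.3'}
```

Each of the first three tenants sees only one deviation, which may be too weak to act on alone; the signal emerges only when the events are joined on the shared fingerprint.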

Full threat model on /supply-chain

OWASP Top 10 for Agentic Applications.

Honest mapping. Where coverage is partial, we say so. The full OWASP ASI taxonomy (Dec 2025) is at owasp.org.

OWASP category	Coverage	How AEGIS addresses it
ASI02 — Tool Misuse	Full	Policy engine (CLPI Phase 1) + Managed Rules at the inside.autonomy checkpoint. Tool-call screening against the Alignment Card's declared autonomy bounds.
ASI03 — Privilege Compromise via Identity Abuse	Full	AAP-declared autonomy bounds (Alignment Card) + AIP in-flight integrity verdicts + inside.integrity checkpoint screening for identity-abuse patterns.
ASI06 — Agentic Supply Chain Compromise	Full (runtime)	Substrate fingerprinting on every evaluation. The cross-tenant aggregator detects behavioral deviation no single customer can see. Complements — does not replace — package-layer provenance (SLSA, Sigstore).
ASI07 — System Prompt Leakage	Partial	Back-door checkpoint screening for known system-prompt patterns + secrets and Alignment Card violations. Detection is content-based; agents that legitimately quote their system prompt at user request are not suppressed.

ASI01 (Prompt Injection), ASI04 (Resource Exhaustion), ASI05 (Cascading Hallucination), ASI08 (Repudiation & Untraceability), ASI09 (Identity Spoofing), ASI10 (Overreliance) map to other parts of the Mnemom stack (AAP cards, AIP verdicts, CLPI on-chain anchoring, Trust Ratings) — covered on /protection-network and /trust.

How AEGIS compares.

Abbreviated from the 2026-05-23 competitive landscape research. AEGIS is the network layer; the vendors below are complementary, not replacements — see /governance for the full integration story.

Capability	Mnemom AEGIS	Cloudflare WAF	Lakera Guard	Cisco AI Defense	AWS Bedrock Guardrails	Google Model Armor
Cross-tenant Managed Rules with signed promotion	Yes — Ed25519-signed, P95 ≤ 30s propagation, public audit chain	WAF Managed Rules (web-layer, not agent-layer)	Vendor-curated threat-intel; no customer-network-derived signal	Build-time SDK embed; no runtime cross-tenant network	AWS-only; no cross-customer learning	In-process filter; no network
Four-checkpoint × four-mode model per-agent	Yes — front door / back door / inside.autonomy / inside.integrity, each independently configurable	Per-route WAF rules; not agent-transaction-shaped	Single-detector at runtime	NeMo Guardrails integration; build-time policy	Bedrock Guardrails per-policy (denylist, PII, contextual grounding)	Prompt-injection + URL + harmful-content filters
Substrate fingerprinting (provider + model + SDK version) on every evaluation	Yes — cross-tenant supply-chain detection	No	No	No	No	No
Public STIX 2.1 IoC feed + signed advisories	Yes — /v1/trust/iocs (empty at GA by design)	Customer-internal Radar feeds only	No public feed	Talos for traditional threats; no public agent IoC feed	No	No
Dual-control invariant on tier-1/-2 (enforced in the data model)	Yes — schema-enforced, not procedural	Procedural change-management	Vendor-controlled	Vendor-controlled	Customer policy IAM	Vendor-controlled

Sources: vendor public documentation 2026-05-23. AEGIS is a layer customers run alongside these products, not a replacement.

SLOs published. Measured continuously.

Headline numbers below. The full table — measurement queries, historical data once the first 30-day window closes, and the four supporting SLOs — lives on /trust/slos.

SLO	Target	Notes
Managed Rule propagation	P95 ≤ 30s	Signed promotion → gateway-loaded. Published target; first measurements 30 days post-GA.
Failover availability	99.99%	Gateway loads a verified rule set across multiple independent read tiers.
Rule-set freshness	P99 ≤ 5 min	Under normal operation. P0 page at 24h stale.

First 30-day measurement window publishes 30 days post-GA. We do not pre-announce numbers we cannot defend.
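
For readers checking a percentile target against their own measurements, here is a minimal nearest-rank percentile in Python. The sample latencies are invented for illustration, and nearest-rank is only one of several common percentile definitions; the measurement queries on /trust/slos may use a different one.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical signed-promotion -> gateway-loaded latencies, in seconds.
latencies = [4.2, 6.0, 7.5, 8.1, 9.3, 11.0, 12.4, 14.8, 19.6, 41.0]
p95 = percentile(latencies, 95)
print(p95, p95 <= 30.0)  # 41.0 False
```

With only ten samples, the 95th percentile under nearest-rank is the maximum, so a single slow propagation breaches a 30-second target. Small windows make percentile targets very sensitive to outliers.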

See published SLOs

Bring your tools.

The IoC feed is machine-readable STIX 2.1. The audit chain is verifiable. The dashboard is open to every customer.

curl -s https://api.mnemom.ai/v1/trust/iocs | jq .
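
For scripted use, the same request can be made from Python. This sketch assumes the endpoint returns a STIX 2.1 bundle as JSON (an object with `"type": "bundle"` and an optional `"objects"` array); the response shape is not confirmed on this page. The function names and the sample bundle are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

FEED_URL = "https://api.mnemom.ai/v1/trust/iocs"   # URL from the curl command above


def fetch_feed(url: str = FEED_URL, timeout: float = 10.0) -> Dict[str, Any]:
    """Download the feed and decode it as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def indicators(bundle: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return STIX indicator objects; tolerate a bundle with no objects."""
    if bundle.get("type") != "bundle":
        raise ValueError("expected a STIX bundle")
    return [o for o in bundle.get("objects", []) if o.get("type") == "indicator"]


# Invented sample data, used so the parsing step can be shown without a network call.
sample = {
    "type": "bundle",
    "id": "bundle--00000000-0000-4000-8000-000000000000",
    "objects": [
        {"type": "indicator", "spec_version": "2.1",
         "id": "indicator--00000000-0000-4000-8000-000000000001",
         "pattern": "[url:value = 'http://example.test/']",
         "pattern_type": "stix", "valid_from": "2026-01-01T00:00:00Z"},
        {"type": "identity", "spec_version": "2.1",
         "id": "identity--00000000-0000-4000-8000-000000000002",
         "name": "Example"},
    ],
}
for ind in indicators(sample):
    print(ind["id"], ind["pattern"])
```

A bundle with no `objects` key yields an empty list rather than an error. To run against the live feed, call `indicators(fetch_feed())`.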
