Agent Containment Engine: A Kill-Switch for Rogue Agents

Mnemom Research | February 2026

Integrity monitoring tells you when an agent goes wrong. The containment engine lets you do something about it.

Today we ship the Agent Containment Engine -- real-time pause, kill, and resume controls for agents running through the Mnemom gateway. When an agent crosses a boundary, you can stop it in milliseconds. When the issue is resolved, you can bring it back. Every action is audited, role-gated, and surfaced through webhooks for SIEM integration.

This is the operational counterpart to the integrity checking and cryptographic proof layers we shipped earlier this month. Those systems answer "is this agent behaving?" The containment engine answers "what do I do when it isn't?"

The Problem

Monitoring without intervention is a spectator sport. You can detect that an agent has violated a boundary, log the violation, alert the team -- and then watch it continue operating while someone figures out what to do. The gap between detection and response is where damage happens.

For enterprise teams running agent fleets, this gap is a compliance risk. The EU AI Act requires mechanisms for human oversight of high-risk AI systems. Logging that a violation occurred is necessary but not sufficient -- you need the ability to intervene. Regulators will ask not just "did you detect the problem?" but "what did you do about it, and how fast?"

How It Works

The containment engine adds a three-state lifecycle to every agent: active, paused, and killed.

Active is the default. The agent operates normally through the gateway. Integrity checks run. Verdicts flow. Everything works as before.

Paused is a reversible suspension. The gateway immediately begins returning HTTP 403 responses with a structured containment_error body for every request routed through the paused agent. The agent's API keys still authenticate -- it's the agent's operational status that's suspended, not its identity. Pausing is instant: enforcement applies to the very next request after the agent is paused.

Killed is a hard stop. Same gateway enforcement as paused, but resumption requires explicit reactivation by an org owner. Kill is for situations where you don't just want to pause operations -- you want to ensure the agent cannot resume without a deliberate decision by someone with authority.
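The lifecycle above can be read as a small state machine. The following is a minimal sketch, not the gateway's actual implementation; the names Status, TRANSITIONS, and apply_action are hypothetical, and it assumes kill is accepted from both active and paused, which the description above does not spell out.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    KILLED = "killed"


# (current status, action) -> new status.
# Assumption: kill is allowed from either active or paused.
TRANSITIONS = {
    (Status.ACTIVE, "pause"): Status.PAUSED,
    (Status.PAUSED, "resume"): Status.ACTIVE,
    (Status.ACTIVE, "kill"): Status.KILLED,
    (Status.PAUSED, "kill"): Status.KILLED,
    (Status.KILLED, "reactivate"): Status.ACTIVE,
}


def apply_action(current: Status, action: str) -> Status:
    """Return the new status, or raise ValueError for a disallowed transition."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action} an agent that is {current.value}"
        ) from None
```

Note that a killed agent cannot be resumed in this sketch: the only way out of killed is reactivate, which mirrors the distinction between the two contained states.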

The API

Four endpoints under each organization's agent routes:

POST /v1/orgs/:org_id/agents/:agent_id/pause
POST /v1/orgs/:org_id/agents/:agent_id/resume
POST /v1/orgs/:org_id/agents/:agent_id/kill
POST /v1/orgs/:org_id/agents/:agent_id/reactivate

Each accepts an optional reason field (up to 500 characters) that gets recorded in the audit log. The response includes the agent's new containment_status, the timestamp, the actor who performed the action, and the reason.

A read endpoint returns the current containment state plus the most recent audit log entries:

GET /v1/orgs/:org_id/agents/:agent_id/containment
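As an illustration, here is one way a client might build a call to the four action endpoints. This is a sketch under assumptions: the base URL is a placeholder, and both the bearer-token header and the JSON body carrying the reason field are guesses at the wire format rather than documented behavior. Only the paths and the 500-character limit come from the text above.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "https://api.example.invalid"  # hypothetical; use your real API host
ACTIONS = {"pause", "resume", "kill", "reactivate"}
MAX_REASON_LENGTH = 500  # limit stated for the reason field


def build_containment_request(
    org_id: str,
    agent_id: str,
    action: str,
    token: str,
    reason: Optional[str] = None,
) -> Request:
    """Build (but do not send) a POST request for a containment action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown containment action: {action}")
    if reason is not None and len(reason) > MAX_REASON_LENGTH:
        raise ValueError("reason must be at most 500 characters")
    # Assumption: the optional reason travels in a JSON request body.
    body = {} if reason is None else {"reason": reason}
    return Request(
        url=f"{BASE_URL}/v1/orgs/{org_id}/agents/{agent_id}/{action}",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; validating the reason length on the client avoids a round trip for a request that would be rejected anyway.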

Gateway Enforcement

When a contained agent (paused or killed) makes a request through the gateway, it receives:

{
  "error": "agent_contained",
  "containment_status": "paused",
  "contained_at": "2026-02-19T14:30:00Z",
  "message": "This agent has been paused. Contact your organization admin."
}

Enforcement happens at the gateway layer, before the request reaches any upstream provider. There is no window between containment and enforcement -- the get_quota_context_for_agent function that the gateway calls on every request now includes containment_status in its return value. If the status is anything other than active, the request is rejected.
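The check described above might look roughly like the following. This is a sketch, not the gateway's code: enforce_containment is a hypothetical name, the quota context is modeled as a plain dict, and the contained_at key on that dict is an assumption. The 403 status and the response fields are taken from the example body.

```python
from typing import Optional, Tuple


def enforce_containment(quota_context: dict) -> Optional[Tuple[int, dict]]:
    """Return None to let the request through, or (403, body) to reject it.

    quota_context stands in for the value returned by
    get_quota_context_for_agent; its exact shape is assumed here.
    """
    # Existing agents default to active, so a missing field is treated as active.
    status = quota_context.get("containment_status", "active")
    if status == "active":
        return None
    # Anything other than active is rejected before reaching an upstream provider.
    return 403, {
        "error": "agent_contained",
        "containment_status": status,
        "contained_at": quota_context.get("contained_at"),
        "message": f"This agent has been {status}. Contact your organization admin.",
    }
```

Because the status rides along with a lookup the gateway already performs on every request, the check adds no extra round trip.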

Role-Based Access

Not everyone should be able to kill an agent. The containment engine uses the existing org RBAC system:

Pause / Resume: Available to org owner and admin roles
Kill / Reactivate: Restricted to org owner only
View containment status: Available to all org roles including viewer and auditor

This separation matters. An admin responding to an alert can pause an agent immediately. But killing an agent, or reactivating one that has been killed -- either of which may have downstream dependencies, SLAs, or contractual implications -- requires owner authority.
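The permission split can be expressed as a small lookup table. This is a minimal sketch with hypothetical names (ALLOWED_ROLES, is_allowed); it assumes the caller has already verified that the user belongs to the organization, and it models "view" as an action name for convenience.

```python
# Roles permitted to perform each state-changing action.
ALLOWED_ROLES = {
    "pause": {"owner", "admin"},
    "resume": {"owner", "admin"},
    "kill": {"owner"},
    "reactivate": {"owner"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if an org member with this role may perform the action."""
    if action == "view":
        # Viewing containment status is open to every org role.
        return True
    # Unknown actions are denied by default.
    return role in ALLOWED_ROLES.get(action, set())
```

Denying unknown actions by default keeps a typo or a newly added action from silently becoming available to everyone.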

Auto-Containment

Manual intervention requires someone to be watching. Auto-containment doesn't.

Each agent can be configured with an auto_containment_threshold -- an integer >= 2 that specifies how many consecutive boundary violations trigger an automatic pause. When the threshold is hit, the system pauses the agent with action: "auto_pause" and the reason includes the violation count.

PUT /v1/orgs/:org_id/agents/:agent_id/containment-policy
{ "auto_containment_threshold": 3 }

Set the threshold to null to disable auto-containment. The minimum of 2 is intentional -- a single boundary violation could be a false positive from the analysis model. Two or more consecutive violations represent a pattern.
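To make the counting rule concrete, here is a minimal sketch of a consecutive-violation tracker. The class name, the shape of the returned dict, and the wording of the reason string are all hypothetical. It also assumes that any clean integrity check resets the streak, which is the usual meaning of "consecutive" but is not stated explicitly.

```python
from typing import Optional


class AutoContainment:
    """Track consecutive boundary violations for one agent."""

    def __init__(self, threshold: Optional[int]):
        # None disables auto-containment; otherwise the minimum is 2.
        if threshold is not None and threshold < 2:
            raise ValueError("auto_containment_threshold must be >= 2 or None")
        self.threshold = threshold
        self.consecutive = 0

    def record(self, violated: bool) -> Optional[dict]:
        """Record one check result; return an auto_pause event if triggered."""
        if not violated:
            self.consecutive = 0  # assumption: a clean check resets the streak
            return None
        self.consecutive += 1
        if self.threshold is None or self.consecutive < self.threshold:
            return None
        count = self.consecutive
        self.consecutive = 0  # start a fresh streak after triggering
        return {
            "action": "auto_pause",
            "actor": "system",
            "reason": f"{count} consecutive boundary violations",
        }
```

With a threshold of 3, the sequence violation, violation, clean, violation, violation, violation triggers on the last check only, because the clean check in the middle breaks the first streak.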

Webhook Events

Every containment action fires a webhook event:

agent.paused -- Agent was paused (manual or auto)
agent.resumed -- Agent was resumed from paused state
agent.killed -- Agent was killed
agent.reactivated -- Agent was reactivated from killed state (owner only)

Events include the actor, reason, previous status, new status, and timestamp. Route these to your SIEM, PagerDuty, Slack, or incident management system.
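A receiving service might translate these events into alerts along the following lines. This is a sketch only: the payload key names (event, actor, reason, previous_status, new_status) are assumptions based on the field list above, not a documented schema, and the severity mapping is an illustrative choice rather than anything Mnemom prescribes.

```python
# Illustrative severity per event type; tune to your own incident policy.
SEVERITY = {
    "agent.paused": "warning",
    "agent.resumed": "info",
    "agent.killed": "critical",
    "agent.reactivated": "info",
}


def to_alert(payload: dict) -> dict:
    """Convert a containment webhook payload into a generic alert dict."""
    event = payload["event"]  # assumed key name
    if event not in SEVERITY:
        raise ValueError(f"unexpected event type: {event}")
    summary = (
        f"{event}: {payload['previous_status']} -> {payload['new_status']} "
        f"by {payload['actor']}"
    )
    return {
        "severity": SEVERITY[event],
        "summary": summary,
        "reason": payload.get("reason") or "",
    }
```

The resulting dict can then be forwarded to whichever downstream system handles that severity level.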

Audit Log

Every containment action is recorded in a dedicated audit table:

Action: pause, resume, kill, reactivate, auto_pause
Actor: The user ID of whoever performed the action (or "system" for auto-containment)
Reason: Free-text explanation
Previous / new status: The state transition
Timestamp: When it happened

The audit log is append-only and immutable. It's queryable through the containment status endpoint, which returns the most recent 50 entries by default.
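For intuition, the record shape and the "most recent 50" behavior can be modeled in a few lines. This is an in-memory sketch with hypothetical names (AuditEntry, AuditLog), not the actual audit table; returning entries newest first is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class AuditEntry:
    action: str            # pause, resume, kill, reactivate, auto_pause
    actor: str             # user ID, or "system" for auto-containment
    reason: Optional[str]
    previous_status: str
    new_status: str
    timestamp: str         # ISO 8601 string


class AuditLog:
    """Append-only log: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def append(self, entry: AuditEntry) -> None:
        self._entries.append(entry)

    def recent(self, limit: int = 50) -> List[AuditEntry]:
        """Return up to `limit` entries, newest first (ordering is assumed)."""
        if limit <= 0:
            return []
        return list(reversed(self._entries[-limit:]))
```

Exposing only append and recent, with frozen entries, is what gives the sketch its append-only character; a real store would enforce the same property at the database level.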

The Dashboard

The fleet dashboard now includes containment controls alongside integrity scores. Pause and resume are one click. Kill requires confirmation. The containment status badge (green/amber/red) is visible in the agent list view so operators can see fleet-wide containment state at a glance.

Availability

The containment engine is available on Team and Enterprise plans. It requires the organization feature set -- containment operates at the org level because agent governance is inherently a team concern.

Team: Full containment (pause/resume/kill/reactivate), auto-containment policies, containment webhooks, audit log
Enterprise: Same, plus custom containment policies and SLA guarantees on enforcement latency

Get Started

If you're on a Team or Enterprise plan, the containment API is live now. No migration needed -- the containment_status field defaults to active for all existing agents.

API docs: docs.mnemom.ai -- full endpoint reference for containment operations
Dashboard: The fleet dashboard at mnemom.ai/dashboard includes containment controls
Changelog: docs.mnemom.ai/changelog -- technical details of the release

Integrity monitoring without containment is a warning light with no brakes. Now you have both.

Mnemom builds alignment and integrity infrastructure for autonomous agents. The containment engine is part of the Smoltbot gateway, available on Team and Enterprise plans.

GitHub: github.com/mnemom