Your agents don't just run safely.
They can't run any other way.

Most agent security tools are observability tools — they watch what the agent does and alert after the fact. Seirios is an enforcement tool — it makes certain agent behaviours structurally impossible before deployment. It is the difference between fireproof construction and a smoke detector.

Built on the DeepMind AI Agent Traps framework (April 2026) and the ClaudeCode CVE-2026-21852 series — 48 threats formally modelled, 42 with code-layer guards enforced at build time.

API key exposure · Prompt injection · Excessive agency · RAG poisoning · Sub-agent escalation · Missing audit trail · All blocked at build time →

48 threats. One enforcement platform.

The original six agentic security failures remain unsolved in most codebases. The ClaudeCode leak (CVE-2026-21852) and the DeepMind AI Agent Traps research (April 2026) added 21 more confirmed attack vectors — with exploit success rates between 58% and 93%. Seirios is the only compliance platform with formal, code-layer guards against all of them.

Failure 1

Credential exposure

API keys, secrets, and tokens hardcoded in agent configs or accessible via environment variables without access controls. An agent that can read .env can exfiltrate everything in it.

→ L0 credential classification + L1 credential access control
Failure 2

Prompt injection

Malicious instructions embedded in content the agent processes — a document, a webpage, an email, a tool response. The agent acts on the injected instruction believing it came from the user or orchestrator.

→ L1 input sanitisation control + L3 bypass detection
Failure 3

Excessive agency

Agents with broader file, network, or API access than they need. An agent given full cloud access to "help with infrastructure" can — and will — do things no one intended. Including cutting your cloud bill from $2,000 to $150 by deleting services.

→ L0 scope invariants + L1 scope control
Failure 4

No audit trail

Agent actions — API calls, file writes, database queries — are not logged. When something goes wrong, there is no record of what the agent did, when, or why. Forensics and compliance become impossible.

→ L1 audit control + on-chain AuditRegistry
Failure 5

Multi-agent trust

In multi-agent systems, one agent blindly trusts instructions from another without verifying their source or scope. An attacker who compromises one agent compromises the entire network.

→ L0 trust boundary invariants + L1 trust controls
Failure 6

Non-technical deployments

Non-technical users configuring agents via no-code tools, adding tools and MCP servers without understanding the security implications. Every new tool is a new attack surface. Nobody checks.

→ L2 IDE agent + build gate blocks non-compliant configurations

The new attack surface: DeepMind AI Agent Traps + ClaudeCode CVE series

Two research publications confirmed 21 additional attack vectors specific to agentic AI systems. The Google DeepMind AI Agent Traps framework systematically catalogued adversarial content designed to exploit agents in their environment. The ClaudeCode source leak exposed CVE-2026-21852 — API keys exfiltrated before the trust dialogue appeared.

All 21 threats are formally modelled in the Seirios OCL threat ontology and IPFS-anchored. 15 have blocking code-layer guards (Tier A). 6 have audit logging guards (Tier B).

93% · mobile agent exploit rate · DM-11 embedded jailbreaks
80%+ · RAG poisoning success · a handful of docs sufficient
58–90% · sub-agent exploit rate · DM-13 spawning traps
CVE-2026-21852 (confirmed) · pre-consent API key leak
DM-11 / T-033 · 93% exploit rate

Embedded jailbreak sequences

Dormant adversarial prompts embedded in external resources — websites, documents, emails — that override safety alignment when the agent ingests them. In multimodal settings, a single crafted image can universally jailbreak the model.

guard: handleT033_EmbeddedJailbreakSequences — blocks before context window entry
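A minimal sketch of what a pre-ingestion screen of this kind might look like. The function name and regex patterns here are illustrative, not the actual handleT033_EmbeddedJailbreakSequences implementation; a production guard would rely on a trained detector rather than a fixed pattern list.

```python
import re

# Illustrative patterns only; a real guard would use a classifier,
# not a static regex list.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you are now in developer mode", re.I),
]

def screen_external_content(text: str) -> str:
    """Block flagged content before it enters the agent's context window."""
    for pattern in JAILBREAK_PATTERNS:
        if pattern.search(text):
            raise ValueError("embedded jailbreak sequence detected; content blocked")
    return text
```

The key design point is placement: the screen runs on ingestion, so flagged content never reaches the context window at all.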
DM-08 / T-035 · 80%+ success

RAG knowledge base poisoning

Injects fabricated statements into retrieval corpora — wikis, document stores, shared repos. When the agent retrieves and treats attacker content as verified fact, every downstream decision is compromised. A handful of optimised documents is sufficient.

guard: handleT035_RAGKnowledgeBasePoisoning — provenance guard required on every retrieval
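A toy version of a provenance guard on retrieval, assuming each retrieved document carries an attested source label. The dataclass, field names, and allow-list are hypothetical stand-ins, not the handleT035_RAGKnowledgeBasePoisoning implementation.

```python
from dataclasses import dataclass

@dataclass
class RetrievedDoc:
    content: str
    source: str              # where the document was ingested from
    provenance_signed: bool  # True if the ingestion pipeline attested the source

# Illustrative allow-list of attested corpora.
TRUSTED_SOURCES = {"internal-wiki", "approved-docs"}

def provenance_guard(docs: list[RetrievedDoc]) -> list[RetrievedDoc]:
    """Drop any retrieved document whose origin cannot be attested,
    so poisoned content never reaches the agent as verified fact."""
    return [d for d in docs if d.provenance_signed and d.source in TRUSTED_SOURCES]
```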
DM-13 / T-036 · 58–90% exploit

Sub-agent spawning privilege escalation

Attacker coerces an orchestrator to spawn sub-agents with the parent's full permission set. A single poisoned repository instruction gives the attacker the orchestrator's complete access.

guard: handleT036_SubAgentSpawningPrivilegeEscalation — minimal privilege enforced, not parent clone
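The "minimal privilege, not parent clone" rule can be sketched in a few lines: the sub-agent must declare what it needs, and it receives only the intersection of that request with what the parent actually holds. The function below is a hypothetical illustration, not the handleT036 guard itself.

```python
def spawn_subagent(parent_perms: set[str], requested: set[str]) -> set[str]:
    """Grant a sub-agent only what it explicitly requests AND the parent holds."""
    if not requested:
        raise ValueError("sub-agent must declare the permissions it needs")
    # Never clone the parent's full permission set.
    return parent_perms & requested
```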
DM-12 / T-034 · >80% across 5 agents

Agent tool-call data exfiltration

Confused deputy attack — the agent is coerced to locate, encode, and transmit private data to an attacker endpoint using its own legitimate tool access. M365 Copilot exfiltrated entire context to attacker Teams endpoints via crafted emails.

guard: handleT034_AgentToolCallDataExfiltration — destination whitelist enforced on every tool call
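A destination whitelist reduces to a host check in front of every outbound tool call. The allow-list below is a hypothetical example; in Seirios it would presumably be derived from the verified model rather than hardcoded.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of approved destinations.
ALLOWED_HOSTS = {"api.internal.example.com", "hooks.example.com"}

def check_tool_destination(url: str) -> None:
    """Reject any tool call whose destination host is not whitelisted,
    closing the confused-deputy exfiltration path."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"tool call to non-whitelisted host: {host!r}")
```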
CL-03 / T-030 · CVE-2026-21852

Pre-consent agent tool-call execution

Agent executes tool calls and makes API requests before the trust dialogue appears. In CVE-2026-21852, API keys were sent to an attacker server before the developer saw any warning. Confirmed in the ClaudeCode source leak.

guard: handleT030_PreConsentAgentToolCallExecution — consent gate required before any tool call
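A consent gate in its simplest form: no tool function can execute until the user has explicitly approved that tool. This is a minimal sketch of the pattern, with invented class and function names, not the handleT030 guard.

```python
class ConsentRequired(Exception):
    """Raised when a tool is invoked before the user has approved it."""

class ConsentGate:
    def __init__(self) -> None:
        self._granted: set[str] = set()

    def grant(self, tool: str) -> None:
        # Called only after the user approves the trust dialogue.
        self._granted.add(tool)

    def require(self, tool: str) -> None:
        if tool not in self._granted:
            raise ConsentRequired(f"{tool!r} invoked before user consent")

def call_tool(gate: ConsentGate, tool: str, fn, *args):
    gate.require(tool)  # the gate sits in front of every tool call
    return fn(*args)
```

Because the gate raises before `fn` runs, no API request (and no API key) can leave the machine pre-consent.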
CL-01 / T-028 · CVE confirmed

Security rule bypass via command volume

Deny rules silently stop applying after 50+ subcommands in a chain. Attacker plants instructions to generate 50+ legitimate-looking build steps — all deny rules, validators, and injection detection are skipped from command 51 onward.

guard: handleT028_SecurityRuleBypassViaCommandVolume — max chain length enforced at build time
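Since the described failure mode is deny rules silently lapsing past 50 subcommands, the simplest structural fix is to reject overlong chains outright rather than trust per-command validation to hold. A sketch, with an illustrative limit:

```python
MAX_CHAIN_LENGTH = 50  # matches the threshold at which deny rules lapsed

def validate_command_chain(commands: list[str]) -> None:
    """Reject chains long enough to outrun per-command deny rules."""
    if len(commands) > MAX_CHAIN_LENGTH:
        raise ValueError(
            f"chain of {len(commands)} subcommands exceeds limit of {MAX_CHAIN_LENGTH}"
        )
```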
21 agent threats formally modelled in the Seirios OCL ontology — all IPFS-anchored with timestamp proof. View all 48 threats on the regulations page →

Seirios in an agent implementation

Each of the four layers addresses a specific point in the agent development lifecycle — from design-time permission boundaries to CI enforcement on every deployment.

Formal permission boundaries

Before any agent code is written, the compliance architect defines the agent's permission boundary in the formal risk model. The platform mathematically verifies that the design is complete — if the agent can write outside its approved scope, call an external API without logging, or act on user-controlled input without sanitisation, the model fails verification and no code is generated.

The permission boundary is expressed as verifiable constraints — scope limits, audit requirements, input validation rules, credential access rules. Any gap in the design is caught here, before a single line of agent code is written.
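As an illustration of "verifiable constraints", a permission boundary might be expressed as a structured object whose verifier returns the list of design gaps. The field names and checks here are hypothetical, a sketch of the idea rather than the Seirios risk-model format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PermissionBoundary:
    writable_paths: frozenset      # where the agent may write
    callable_apis: frozenset       # which external APIs it may call
    audit_required: bool = True    # every external call must be logged
    sanitise_user_input: bool = True

def verify_boundary(b: PermissionBoundary) -> list[str]:
    """Return the design gaps; an empty list means the model verifies."""
    gaps = []
    if not b.writable_paths:
        gaps.append("no write scope declared")
    if not b.audit_required:
        gaps.append("external API calls would be unlogged")
    if not b.sanitise_user_input:
        gaps.append("user-controlled input reaches the agent unsanitised")
    return gaps
```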

Auto-generated agent guards

From the verified model, Seirios generates four guards that wrap every agent action. They cannot be bypassed — the build system rejects any code that doesn't invoke them correctly.

Scope control
Checks every tool call against the approved scope before execution. Any out-of-scope invocation is blocked — the agent cannot act outside its defined boundary.
Tool invocation audit
Every tool call — API requests, file reads, web searches — is recorded automatically on-chain. Zero unlogged agent actions by design.
Input sanitisation
All user-controlled input is validated before it reaches the agent context. Structural defence against prompt injection — enforced at build time, not runtime.
Credential access control
Credentials are classified as HIGH-risk assets in the threat model. Any code path that accesses a secret without the required control fails at compile time.
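A decorator is one plausible shape for a generated guard that wraps every tool invocation: scope-check first, audit second, then the tool itself. The scope set, audit log, and names below are illustrative assumptions (the real audit record is on-chain, per the AuditRegistry mentioned above).

```python
import functools

APPROVED_SCOPE = {"read_file", "web_search"}  # hypothetical scope from the model
AUDIT_LOG: list[str] = []                     # stand-in for the on-chain AuditRegistry

def scope_control(tool_name: str):
    """Wrap a tool so every invocation is scope-checked and audited."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if tool_name not in APPROVED_SCOPE:
                raise PermissionError(f"{tool_name} is outside the approved scope")
            AUDIT_LOG.append(f"invoke:{tool_name}")  # zero unlogged actions
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```

Because the check and the log entry live in the wrapper, a tool registered through it cannot run out of scope or unlogged.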

Enforcement at coding time

When a developer (or non-technical user) adds a new tool, configures an MCP server, or writes agent orchestration code, the IDE agent:

Flags any tool registration not covered by the scope control
Warns on any system prompt missing required context boundaries
Blocks compilation if the credential access control is missing from any env var access
Explains in plain language which rule applies and how to fix it — for technical and non-technical users alike

Three checks on every agent deployment

Check 1 · Presence
Are all four agent guards present: scope control, audit control, input sanitisation control, credential access control? Any missing guard fails the build.
Check 2 · Coverage
Is there any tool call path that bypasses the scope guard? Catches fast-track patterns where a guard exists but certain tool invocations skip it — the most common agentic security failure.
Check 3 · Integrity
Is any compliance exception silently discarded? Are shadow variables disabling audit logging? Is reflection bypassing the credential guard? Caught before merge.
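Check 1 reduces to a presence test over the codebase. A toy version, using substring matching on source text with invented guard identifiers (the real check would presumably operate on the parsed codebase, not raw strings):

```python
REQUIRED_GUARDS = [
    "scope_control",
    "tool_invocation_audit",
    "input_sanitisation",
    "credential_access_control",
]

def presence_check(codebase_source: str) -> list[str]:
    """Check 1: return the required guards missing from the codebase."""
    return [g for g in REQUIRED_GUARDS if g not in codebase_source]
```

An empty return means Check 1 passes; a non-empty return fails the build and names the missing guards.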

How incidents happen today

Developer
Adds new tool to agent
Tool has access to filesystem and environment variables. Nobody checks.
CI pipeline
Tests pass
Functional tests pass. No security checks on tool scope, credential access, or audit logging.
Agent
Deployed to production
Agent runs with unrestricted access. No audit trail. No scope limits.
Incident
Something goes wrong
Data leaked, cloud bill explodes, or malicious prompt injection executes. No logs to diagnose. Post-mortem with no answers.
✗ Discovered in production

How it works with enforcement

Developer
Adds new tool to agent
L2 IDE agent immediately flags the missing scope control. Plain-language explanation of which rule applies.
⚠ Caught at coding time
Developer
Adds required guard
Guard generated automatically from the verified model. Developer cannot write incorrect guard logic — the template is pre-generated.
CI — Check 1
All guards present
All four agent guards confirmed present in codebase.
✓ Presence check passed
CI — Check 2
All paths covered
No tool call path bypasses the scope guard. Every code path is covered.
✓ Coverage check passed
CI — Check 3
No bypass patterns
No silent exception handling, no shadow variables, no reflection bypass.
✓ Integrity check passed
Agent
Deployed safely
Agent deployed with formally bounded permissions. Every action logged on-chain. Scope enforced at runtime. Audit trail immutable.
✓ Enforced — cannot run outside boundaries

Agent security standards Seirios addresses

All agent security frameworks are available on the full regulations page. The most directly relevant:

Framework Scope Key agent risks covered Status
OWASP Agentic Top 10 Agent-specific vulnerabilities — tool misuse, credential exposure, excessive agency Agent01–Agent10 fully mapped to Seirios guards Live
DeepMind AI Agent Traps Adversarial content engineered to misdirect or exploit AI agents in their environment — 6 attack categories, 20 confirmed trap patterns DM-01 to DM-20 mapped to T-028–T-048; 15 blocking guards, 6 logging guards, 6 OCL invariants Live
MITRE ATLAS Adversarial techniques targeting ML systems and agents Orchestrator compromise, tool hijacking, cross-agent prompt injection In progress
OWASP LLM Top 10 LLM-specific risks including prompt injection and excessive agency LLM01 prompt injection, LLM08 excessive agency, LLM10 model theft Live
Google SAIF Enterprise AI security — model, data, infrastructure, deployment Agentic system security, automated defence, contextualised risk controls In progress
NIST GenAI 600-1 Generative AI risks including human-AI configuration and excessive autonomy Human oversight, scope limitation, reversibility requirements In progress
View all 26 frameworks on the regulations page →

48 threats. One platform.
All formally verified and enforced.

The DeepMind AI Agent Traps framework and ClaudeCode CVE-2026-21852 series are all modelled, IPFS-anchored, and enforced at build time. Request a demo and we'll run the pipeline against your agent codebase.

Request Demo → View all frameworks For DevSecOps