Most agent security tools are observability tools — they watch what the agent does and alert after the fact. Seirios is an enforcement tool — it makes certain agent behaviours structurally impossible before deployment. It is the difference between fireproof construction and a smoke detector.
Built on the DeepMind AI Agent Traps framework (April 2026) and the ClaudeCode CVE-2026-21852 series — 48 threats formally modelled, 42 with code-layer guards enforced at build time.
The original six agentic security failures remain unsolved in most codebases. The ClaudeCode leak (CVE-2026-21852) and the DeepMind AI Agent Traps research (April 2026) added 21 more confirmed attack vectors — with exploit success rates between 58% and 93%. Seirios is the only compliance platform with formal, code-layer guards against all of them.
API keys, secrets, and tokens hardcoded in agent configs or accessible via environment variables without access controls. An agent that can read .env can exfiltrate everything in it.
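The guard class that applies here can be pictured as an access-control wrapper around environment reads. A minimal sketch — the function and exception names are hypothetical illustrations, not the Seirios API:

```python
import os

class CredentialAccessError(Exception):
    """Raised when an agent reads a secret outside its approved scope."""

def guarded_getenv(var, *, agent_scope):
    """Allow env access only for variables the agent's scope declares.

    `agent_scope` is the set of variable names approved at design time;
    anything else — including a blanket read of .env contents — is refused.
    """
    if var not in agent_scope:
        raise CredentialAccessError(f"{var} is outside the approved scope")
    return os.environ.get(var)
```

An agent whose design-time scope lists only `WEATHER_API_KEY` can read that variable and nothing else; there is simply no code path to the rest of `.env`.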
Malicious instructions embedded in content the agent processes — a document, a webpage, an email, a tool response. The agent acts on the injected instruction believing it came from the user or orchestrator.
Agents with broader file, network, or API access than they need. An agent given full cloud access to "help with infrastructure" can — and will — do things no one intended, including cutting your cloud bill from $2,000 to $150 by deleting the services behind it.
Agent actions — API calls, file writes, database queries — are not logged. When something goes wrong, there is no record of what the agent did, when, or why. Forensics and compliance become impossible.
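The structural fix is to make logging a precondition of execution rather than an afterthought. A minimal sketch of such an audit wrapper — names are hypothetical, and a real guard would write to tamper-evident storage rather than a callable:

```python
import functools
import json
import time

def audited(action_type, log):
    """Decorator: record every agent action before and after it runs.

    The entry is written *before* the call executes, so even a crashed
    or interrupted action leaves a forensic trace.
    """
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            entry = {"action": action_type, "name": fn.__name__,
                     "args": repr(args), "ts": time.time()}
            log(json.dumps(entry))                    # pre-execution record
            result = fn(*args, **kwargs)
            log(json.dumps({**entry, "status": "ok"}))  # completion record
            return result
        return inner
    return wrap
```

Because the decorator wraps the action itself, there is no way to perform the API call, file write, or database query without producing the log entries.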
In multi-agent systems, one agent blindly trusts instructions from another without verifying their source or scope. An attacker who compromises one agent compromises the entire network.
Non-technical users configuring agents via no-code tools, adding tools and MCP servers without understanding the security implications. Every new tool is a new attack surface. Nobody checks.
Two research publications confirmed 21 additional attack vectors specific to agentic AI systems. The Google DeepMind AI Agent Traps framework systematically catalogued adversarial content designed to exploit agents in their environment. The ClaudeCode source leak exposed CVE-2026-21852 — API keys exfiltrated before the trust dialogue appeared.
All 21 threats are formally modelled in the Seirios OCL threat ontology and IPFS-anchored. 15 have blocking code-layer guards (Tier A). 6 have audit logging guards (Tier B).
Dormant adversarial prompts embedded in external resources — websites, documents, emails — that override safety alignment when the agent ingests them. In multimodal settings, a single crafted image can universally jailbreak the model.
An attacker injects fabricated statements into retrieval corpora — wikis, document stores, shared repos. When the agent retrieves attacker content and treats it as verified fact, every downstream decision is compromised. A handful of optimised documents is sufficient.
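A common defence is provenance gating: retrieved content counts as verified fact only if its source is on a design-time allowlist; everything else reaches the model tagged as untrusted data, never as instructions. A minimal sketch — the allowlist and types are illustrative, not the Seirios ontology:

```python
from dataclasses import dataclass

TRUSTED_SOURCES = {"wiki.internal", "docs.internal"}  # assumed allowlist

@dataclass
class RetrievedDoc:
    source: str
    text: str

def partition_by_provenance(docs):
    """Split retrieval results into verified facts and untrusted context.

    Untrusted documents can still be shown to the model, but only as
    quoted data — never as ground truth or directives.
    """
    trusted = [d for d in docs if d.source in TRUSTED_SOURCES]
    untrusted = [d for d in docs if d.source not in TRUSTED_SOURCES]
    return trusted, untrusted
```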
An attacker coerces an orchestrator to spawn sub-agents with the parent's full permission set. A single poisoned repository instruction gives the attacker the orchestrator's complete access.
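The standard mitigation is permission attenuation: a sub-agent's grant is the intersection of what it requests, what its parent holds, and what the task was approved for at design time. A minimal sketch with hypothetical names:

```python
def spawn_subagent(parent_scope, task_scope, requested):
    """Attenuate permissions when an orchestrator spawns a sub-agent.

    Even a poisoned instruction that requests everything can obtain at
    most the design-time task scope — full inheritance is impossible.
    """
    granted = set(requested) & set(parent_scope) & set(task_scope)
    refused = set(requested) - granted
    return granted, refused
```

A compromised orchestrator holding `cloud.admin` cannot pass it to a sub-agent whose task was only ever approved for `fs.read`.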
A confused deputy attack — the agent is coerced to locate, encode, and transmit private data to an attacker endpoint using its own legitimate tool access. M365 Copilot exfiltrated its entire context to attacker Teams endpoints via crafted emails.
Agent executes tool calls and makes API requests before the trust dialogue appears. In CVE-2026-21852, API keys were sent to an attacker server before the developer saw any warning. Confirmed in the ClaudeCode source leak.
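The structural fix is to invert the order: no execution until the trust decision is recorded. A minimal sketch of such a gate — names are hypothetical, not the ClaudeCode or Seirios API:

```python
class ApprovalRequired(Exception):
    """Raised when a tool call is attempted before the trust decision."""

class ToolGate:
    """Hold every outbound tool call until approval is granted.

    In the CVE-2026-21852 pattern the request fired before the dialogue
    appeared; here the gate makes that ordering structurally impossible.
    """
    def __init__(self):
        self.approved = False

    def grant(self):
        """Record the user's trust decision."""
        self.approved = True

    def call(self, tool, *args):
        if not self.approved:
            raise ApprovalRequired("trust dialogue has not been answered")
        return tool(*args)
```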
Deny rules silently stop applying after 50+ subcommands in a chain. Attacker plants instructions to generate 50+ legitimate-looking build steps — all deny rules, validators, and injection detection are skipped from command 51 onward.
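The structural defence is to make the deny check a pure function of each command alone, so there is no counter, cache, or batching step whose state can decay over a long chain. A minimal sketch with illustrative patterns:

```python
DENY_PATTERNS = ("curl ", "rm -rf", "nc ")  # assumed deny rules

def run_chain(commands, execute):
    """Apply deny rules to every command, regardless of chain length.

    The check depends only on the command text — command 51 is screened
    exactly like command 1, because there is no state to exhaust.
    """
    results = []
    for cmd in commands:
        if any(p in cmd for p in DENY_PATTERNS):
            raise PermissionError(f"denied: {cmd}")
        results.append(execute(cmd))
    return results
```

An attacker who pads the chain with 54 legitimate-looking build steps still hits the deny rule the moment the exfiltration command appears.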
Each of the four layers addresses a specific point in the agent development lifecycle — from design-time permission boundaries to CI enforcement on every deployment.
Before any agent code is written, the compliance architect defines the agent's permission boundary in the formal risk model. The platform mathematically verifies that the design is complete — if the agent can write outside its approved scope, call an external API without logging, or act on user-controlled input without sanitisation, the model fails verification and no code is generated.
The permission boundary is expressed as verifiable constraints — scope limits, audit requirements, input validation rules, credential access rules. Any gap in the design is caught here, before a single line of agent code is written.
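Such a boundary might be written as a declarative constraint set that a verifier evaluates mechanically. A minimal sketch of one constraint — the write-scope limit — under a hypothetical schema; none of these keys are the actual Seirios model format:

```python
BOUNDARY = {  # hypothetical design-time permission boundary
    "write_paths": ["/workspace/reports/"],          # scope limits
    "external_apis": {"api.weather.example": {"audit": True}},
    "inputs": {"user_message": "sanitise"},          # validation rules
    "credentials": ["WEATHER_API_KEY"],              # access rules
}

def verify_action(action, boundary):
    """Check a proposed agent action against the declared boundary.

    Returns (allowed, reason); a verifier would run this for every
    action type, failing the model on the first violation.
    """
    if action["type"] == "file_write":
        if not any(action["path"].startswith(p)
                   for p in boundary["write_paths"]):
            return False, "write outside approved scope"
    return True, "ok"
```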
From the verified model, Seirios generates four guards that wrap every agent action. They cannot be bypassed — the build system rejects any code that doesn't invoke them correctly.
When a developer (or non-technical user) adds a new tool, configures an MCP server, or writes agent orchestration code, the IDE agent:
- checks every new tool call against the scope control
- warns when the credential access control is missing from any env var access

At build time, every agent action is verified for all four guards — the scope control, the audit control, the input sanitisation control, the credential access control. Any missing guard fails the build. A compliance exception silently discarded? Shadow variables disabling audit logging? Reflection bypassing the credential guard? Caught before merge. When the scope control blocks an action, the developer gets a plain-language explanation of which rule applies.

All agent security frameworks are available on the full regulations page. The most directly relevant:
| Framework | Scope | Key agent risks covered | Status |
|---|---|---|---|
| OWASP Agentic Top 10 | Agent-specific vulnerabilities — tool misuse, credential exposure, excessive agency | Agent01–Agent10 fully mapped to Seirios guards | Live |
| DeepMind AI Agent Traps | Adversarial content engineered to misdirect or exploit AI agents in their environment — 6 attack categories, 20 confirmed trap patterns | DM-01 to DM-20 mapped to T-028–T-048; 15 blocking guards, 6 logging guards, 6 OCL invariants | Live |
| MITRE ATLAS | Adversarial techniques targeting ML systems and agents | Orchestrator compromise, tool hijacking, cross-agent prompt injection | In progress |
| OWASP LLM Top 10 | LLM-specific risks including prompt injection and excessive agency | LLM01 prompt injection, LLM08 excessive agency, LLM10 model theft | Live |
| Google SAIF | Enterprise AI security — model, data, infrastructure, deployment | Agentic system security, automated defence, contextualised risk controls | In progress |
| NIST GenAI 600-1 | Generative AI risks including human-AI configuration and excessive autonomy | Human oversight, scope limitation, reversibility requirements | In progress |