Microsoft Open-Sources Runtime Security for AI Agents

The gap between shipping an AI agent and governing what it does in production has been growing for months. On April 2, Microsoft put out its answer: the Agent Governance Toolkit, a seven-package open-source system that drops a runtime policy layer between your agent framework and every action the agent takes.

TL;DR

Seven independently installable packages for Python, TypeScript, Rust, Go, and .NET
Policy enforcement in under 0.1 ms at p99, roughly 10,000x faster than an LLM API call (which puts the implied LLM call at about one second)
Covers all ten risks in the OWASP Agentic Top 10 through deterministic controls
MIT license, public preview, framework-agnostic (LangChain, CrewAI, OpenAI Agents SDK, and more)
Plans to donate the project to a foundation for community governance

The timing is deliberate. NIST published its AI agent standards framework in February, OWASP finalized its Agentic Top 10 in December 2025, and enterprises are now launching agents to book flights, execute trades, and write and ship code. Governance tooling hasn't kept pace.

What the Toolkit Actually Does

This isn't a content filter. It does not sit between your prompt and the model. The Agent Governance Toolkit intercepts actions - the things an agent tries to do after the LLM has responded - and evaluates them against policies before they run. That distinction matters: the check happens at the point where the agent would touch an external system, not at the prompt.

The core component is Agent OS, a stateless policy engine that runs every action through configured rules and returns allow or deny in under 0.1 milliseconds. You write policies in YAML, OPA Rego, or Cedar - whichever you already use. No new DSL to learn.

from agent_os.policies import PolicyEvaluator

evaluator = PolicyEvaluator()
evaluator.load_policies("policies/")

decision = evaluator.evaluate({
    "agent_id": "analyst-1",
    "action": "tool_call",
    "tool_name": "web_search",
})

if decision.allowed:
    # proceed
    pass
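To make "stateless policy engine" concrete, here is a minimal plain-Python sketch of the idea. It does not use the toolkit: the rule format, the `RULES` table, and the `evaluate` function are hypothetical, and deny-by-default is an assumption of the sketch rather than documented toolkit behavior.

```python
from typing import Any, Mapping, Sequence

# Hypothetical rule format for illustration only; not the toolkit's schema.
RULES: Sequence[Mapping[str, str]] = (
    {"tool_name": "web_search", "effect": "allow"},
    {"tool_name": "shell_exec", "effect": "deny"},
)


def evaluate(
    request: Mapping[str, Any],
    rules: Sequence[Mapping[str, str]] = RULES,
) -> bool:
    """Return True if the action is allowed.

    Pure function: the decision depends only on the request and the rules,
    so nothing is remembered between calls (that is the "stateless" part).
    """
    for rule in rules:
        if rule["tool_name"] == request.get("tool_name"):
            return rule["effect"] == "allow"  # first matching rule wins
    return False  # assumption: no matching rule means deny


base = {"agent_id": "analyst-1", "action": "tool_call"}
print(evaluate({**base, "tool_name": "web_search"}))  # True
print(evaluate({**base, "tool_name": "shell_exec"}))  # False
print(evaluate({**base, "tool_name": "send_email"}))  # False (no rule matched)
```

Because the function keeps no state, the same request always produces the same decision, which is what makes this kind of check deterministic and cheap to run on every action.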

For more complex scenarios, you can use Rego directly:

evaluator.load_rego(rego_content="""
package agentos
default allow = false
allow { input.tool_name == "web_search" }
allow { input.role == "admin" }
""")

The Seven Packages

The toolkit ships as a monorepo with independently installable components. You don't have to use all seven.

Agent OS - the policy engine described above. This is the core. Everything else builds on top of it.

Agent Mesh - handles cryptographic identity for agent-to-agent communication using decentralized identifiers (DIDs) with Ed25519 signing, plus SPIFFE/SVID support. Each agent gets a trust score on a 0-1000 scale that decays over time and degrades when the agent's behavior drifts.
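How a score like that might behave is easier to see in code. The sketch below is not Agent Mesh's algorithm: the exponential half-life, the proportional drift penalty, and every name in it are assumptions chosen for illustration. Only the 0-1000 scale comes from the description above.

```python
from dataclasses import dataclass

MAX_SCORE = 1000.0  # the 0-1000 scale described above


@dataclass
class TrustScore:
    """Hypothetical trust score with time decay and drift penalties."""

    score: float = MAX_SCORE
    half_life_hours: float = 24.0  # assumed: score halves every 24 idle hours

    def decay(self, hours_elapsed: float) -> None:
        """Exponential decay: trust fades unless it is re-earned."""
        self.score *= 0.5 ** (hours_elapsed / self.half_life_hours)

    def penalize(self, drift: float) -> None:
        """Remove a fraction of the score; drift is clamped to [0, 1]."""
        drift = min(max(drift, 0.0), 1.0)
        self.score *= 1.0 - drift


trust = TrustScore()
trust.decay(hours_elapsed=24.0)
print(round(trust.score))  # 500
trust.penalize(drift=0.2)
print(round(trust.score))  # 400
```

A real system would also need a way to raise the score again after good behavior; that is left out here to keep the sketch short.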

Agent Runtime - implements execution rings (inspired by CPU privilege levels) and saga orchestration for multi-step agent tasks. It also includes an emergency termination kill switch.
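The ring idea borrows from CPU privilege levels, where a lower ring number means more privilege. A minimal sketch of that check follows; the ring names, the action-to-ring table, and `may_execute` are all hypothetical and say nothing about how Agent Runtime defines its rings.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Lower number = more privilege, as with CPU protection rings."""

    KERNEL = 0
    TRUSTED = 1
    STANDARD = 2
    SANDBOX = 3


# Hypothetical table: the least-privileged ring still allowed to run each action.
REQUIRED_RING = {
    "shell_exec": Ring.KERNEL,
    "file_write": Ring.TRUSTED,
    "web_search": Ring.STANDARD,
    "read_docs": Ring.SANDBOX,
}


def may_execute(agent_ring: Ring, action: str) -> bool:
    """An agent may run an action only from the required ring or a more privileged one."""
    required = REQUIRED_RING.get(action)
    if required is None:
        return False  # assumption: unknown actions are denied
    return agent_ring <= required


print(may_execute(Ring.STANDARD, "web_search"))  # True
print(may_execute(Ring.STANDARD, "shell_exec"))  # False
print(may_execute(Ring.KERNEL, "read_docs"))     # True
```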

Agent SRE - brings standard reliability engineering to agents: SLOs, error budgets, circuit breakers, and chaos engineering mode for testing agent behavior under failure conditions.
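Of the patterns in that list, the circuit breaker is the simplest to show. The sketch below is a generic, simplified version of the pattern (no half-open state), not Agent SRE's implementation; the class, its thresholds, and the injectable clock are assumptions made for the example.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """Reject calls after repeated failures, then retry after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        """True while calls are being rejected. Closes itself after the cool-down."""
        if self._opened_at is None:
            return False
        if self._clock() - self._opened_at >= self.reset_after_s:
            self._opened_at = None  # cool-down elapsed: close the circuit
            self._failures = 0
            return False
        return True

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Run fn unless the circuit is open; count consecutive failures."""
        if self.is_open:
            raise RuntimeError("circuit open: call rejected")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip the breaker
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

Wrapping an agent's tool call as `breaker.call(tool, arg)` means that once the tool has failed `failure_threshold` times in a row, further attempts fail fast instead of piling more load onto a struggling dependency.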

Agent Compliance - maps your policies to specific regulatory frameworks and produces a compliance grade. Currently covers EU AI Act, HIPAA, and SOC2.

Agent Marketplace - manages plugin lifecycle with Ed25519-signed packages, addressing supply chain risk in agentic systems.

Agent Lightning - governs reinforcement learning training workflows by shaping rewards based on policy constraints. This is the most experimental of the seven.

Figure: Microsoft Agent Governance Toolkit architecture overview. The toolkit intercepts agent actions before execution, not after, which means policy decisions happen before any external system is touched. Source: socket.dev

Installation and Framework Support

The quickest way to install the full Python package:

pip install agent-governance-toolkit[full]

For other languages:

Language	Package	Install Command
Python 3.10+	agent-governance-toolkit	`pip install agent-governance-toolkit[full]`
TypeScript/Node	@agentmesh/sdk	`npm install @agentmesh/sdk`
.NET	Microsoft.AgentGovernance	`dotnet add package Microsoft.AgentGovernance`
Rust	agentmesh	`cargo add agentmesh`
Go 1.21+	agent-governance-toolkit	`go get github.com/microsoft/agent-governance-toolkit/sdks/go`

The framework integrations hook into native extension points rather than requiring code rewrites. LangChain support uses callback handlers, CrewAI uses task decorators, Google ADK uses the plugin system, and Microsoft Agent Framework uses middleware. The OpenAI Agents SDK and LangGraph integrations are published to PyPI. Haystack and PydanticAI integrations are also shipped.

import { PolicyEngine } from "@agentmesh/sdk";

const engine = new PolicyEngine([
    { action: "web_search", effect: "allow" },
    { action: "shell_exec", effect: "deny" },
]);

const decision = engine.evaluate("web_search");
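The "hook into native extension points" approach described in this section can be pictured with a plain Python decorator. This is a generic sketch of the pattern, not CrewAI's or the toolkit's actual API: `governed`, `ActionDenied`, and the `DENIED_TOOLS` set are hypothetical names standing in for a real policy lookup.

```python
import functools
from typing import Any, Callable


class ActionDenied(Exception):
    """Raised when policy blocks a tool call before it runs."""


# Hypothetical stand-in for a real policy engine lookup.
DENIED_TOOLS = {"shell_exec"}


def governed(tool_name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Wrap a tool so the policy check runs before the tool body does."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if tool_name in DENIED_TOOLS:
                raise ActionDenied(f"policy denied tool call: {tool_name}")
            return fn(*args, **kwargs)

        return wrapper

    return decorator


@governed("web_search")
def web_search(query: str) -> str:
    return f"results for {query!r}"


@governed("shell_exec")
def shell_exec(cmd: str) -> str:
    return "this body never runs while the tool is denied"


print(web_search("agent governance"))  # results for 'agent governance'
```

The tool functions themselves are unchanged, which is the point of the pattern: governance is added at the call boundary rather than by rewriting agent code.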

How It Maps to the OWASP Agentic Top 10

OWASP published the Agentic Top 10 for 2026 in December 2025, after more than 100 security researchers reviewed the framework. The toolkit claims to address all ten.

OWASP Risk	Toolkit Mitigation
ASI01 - Goal Hijacking	Semantic intent classifier detecting DESTRUCTIVE_DATA, DATA_EXFILTRATION, PRIVILEGE_ESCALATION
ASI02 - Tool Misuse	Capability sandboxing and MCP security gateway
ASI03 - Identity Abuse	DID-based identity with behavioral trust scoring
ASI04 - Supply Chain	Plugin signing with Ed25519
ASI05 - Code Execution	Execution rings with resource limits
ASI06 - Memory Poisoning	Cross-model verification kernel with majority voting
ASI07 - Insecure Communications	Inter-agent trust protocol encryption
ASI08 - Cascading Failures	Circuit breakers and SLO enforcement
ASI09 - Human-Agent Trust Exploitation	Approval workflows with quorum logic
ASI10 - Rogue Agents	Ring isolation, trust decay, and automated kill switch

Covering all ten rather than cherry-picking is a deliberate design choice: a deployed agent is exposed to the whole list, not to whichever subset a vendor chose to address.
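One of the table's mitigations, the quorum-based approval workflow listed for ASI09, is easy to picture in code. The function below is a hypothetical illustration of quorum logic in general; the name, the vote format, and the rule that only authorized approvers count are assumptions, not the toolkit's design.

```python
def quorum_reached(votes: dict[str, bool], approvers: set[str], quorum: int) -> bool:
    """True when at least `quorum` distinct authorized approvers said yes.

    votes maps a reviewer name to their decision; names that are not in
    `approvers` are ignored, so an agent cannot approve its own request.
    """
    approvals = {who for who, ok in votes.items() if ok and who in approvers}
    return len(approvals) >= quorum


approvers = {"alice", "bob", "carol"}
print(quorum_reached({"alice": True, "bob": True}, approvers, quorum=2))      # True
print(quorum_reached({"alice": True, "bob": False}, approvers, quorum=2))     # False
print(quorum_reached({"alice": True, "agent-7": True}, approvers, quorum=2))  # False
```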

Imran Siddique, principal group engineering manager at Microsoft, described the toolkit as adding "a runtime security layer that enforces policies to mitigate" prompt injection and improve visibility into agent behavior.

Figure: Microsoft Agent Governance Toolkit component breakdown. The seven packages can be installed independently - you do not need the full stack to get value from Agent OS alone. Source: socket.dev

Where It Falls Short

The toolkit has real gaps worth flagging before you assess it for production.

No Independent Validation of the Intent Classifier

The semantic intent classifier is the mechanism Agent OS uses to catch goal hijacking. It classifies actions as DESTRUCTIVE_DATA, DATA_EXFILTRATION, or PRIVILEGE_ESCALATION. But the classifier's performance against actual adversarial inputs hasn't been independently confirmed. Microsoft ships benchmarks against internal test suites. That isn't the same as third-party red-team results.

Sample Configs Are Not Production-Ready

The documentation says it explicitly: "All policy rules, detection patterns, and sensitivity thresholds are externalized to YAML configuration files. Organizations must customize sample configurations before production." The toolkit ships with example configs, not hardened defaults. If you deploy it as-is, you'll have gaps.

Still Public Preview

Breaking changes are possible before general availability. For teams launching in regulated environments such as healthcare and finance, this matters. No roadmap to a stable v1.0 has been published.

No Track Record Yet

The 9,500 tests and continuous fuzzing in the repo are meaningful signals of engineering quality. But no production deployment that has stress-tested the system at scale has been disclosed publicly. Cisco's DefenseClaw takes a similar runtime interception approach and has at least some disclosed production deployments to reference. The Agent Governance Toolkit doesn't.

Despite those caveats, this is the most complete open-source attempt at the agentic governance problem to date. The OWASP alignment gives security teams a framework for requirements. The multi-language support removes the "Python-only" constraint that kills adoption in polyglot shops. And the framework-agnostic design means you can slot it in next to whatever agent orchestration you already run.

Anthropic's Managed Agents and similar production agent platforms are pushing the capability envelope. The governance layer to match them has been overdue. Microsoft's toolkit isn't a finished answer, but it's a credible starting point and it's open-source, which means the missing pieces can be built in public.
