IronClaw Review: The Security-First OpenClaw Alternative From a Transformer Co-Author

IronClaw is an AI agent framework built by Llion Jones, a co-author of the Transformer paper. It prioritizes sandboxed execution, formal skill verification, and zero-trust architecture. We tested whether security-first means capability-second.

When OpenClaw's CVE-2026-25253 dropped in late January - a one-click remote code execution vulnerability affecting 93.4% of publicly accessible instances - the reaction split into two camps. One camp patched and moved on. The other asked why an AI agent with access to your email, calendar, and shell was architected without sandboxing in the first place. IronClaw comes from the second camp, and its creator has the credentials to back up the critique.

TL;DR

  • 7.5/10 - the AI agent framework the security community has been asking for
  • Capability-based zero-trust sandboxing with formal verification and append-only audit trail
  • Only 890 verified skills and setup friction from permission approvals slow adoption
  • For security-conscious teams and enterprise use; skip if you need OpenClaw's 5,700-skill ecosystem

Who Built It and Why

IronClaw was created by Llion Jones, one of the eight co-authors of "Attention Is All You Need" - the 2017 paper that introduced the Transformer architecture underlying every modern LLM. Jones left Google in 2023 to co-found Sakana AI in Tokyo, then departed in late 2025 to work independently on what he calls "safe agent infrastructure."

Jones's thesis is that the AI agent security problem is not a bug to be patched but an architecture to be redesigned. OpenClaw's vulnerability was not a coding mistake; it was a consequence of building an agent framework where skills have unrestricted access to the host system by default. IronClaw inverts this model: skills have zero access by default and must be granted specific, auditable capabilities through a formal permission system.

The project launched on GitHub on February 3, 2026, and has accumulated 11,800 stars. It is written in Rust with a TypeScript SDK for skill development.

The Zero-Trust Architecture

IronClaw's core innovation is a capability-based security model inspired by the seL4 microkernel. Every skill runs in an isolated WebAssembly sandbox with no default access to the filesystem, network, shell, or any system resource. To read a file, a skill must hold a FileRead capability token specifying exactly which paths it can access. To make a network call, it needs a NetConnect token listing allowed hosts and ports.

Capability tokens are defined in the skill manifest and must be explicitly approved by the user during installation. The runtime enforces them at the system call level - a skill that attempts an unauthorized operation gets an immediate denial, not a warning. The entire capability grant chain is logged to an append-only audit trail.
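To make the model concrete, here is a minimal sketch of deny-by-default capability checking. The token names (FileRead, NetConnect) come from the review, but the manifest shape and the checking functions are assumptions for illustration, not IronClaw's actual SDK:

```typescript
// Hypothetical sketch of IronClaw-style capability enforcement.
// Everything here is illustrative: the real runtime enforces tokens at the
// system call level inside a WebAssembly sandbox, not in TypeScript.

type Capability =
  | { kind: "FileRead"; paths: string[] }
  | { kind: "NetConnect"; hosts: string[] };

interface SkillManifest {
  name: string;
  capabilities: Capability[];
}

// A manifest declares exactly what the skill may touch; anything
// not listed is denied.
const manifest: SkillManifest = {
  name: "calendar-sync",
  capabilities: [
    { kind: "FileRead", paths: ["/home/user/.config/calendar/"] },
    { kind: "NetConnect", hosts: ["calendar.example.com:443"] },
  ],
};

// Deny-by-default: an operation succeeds only if a granted token covers it.
function canReadFile(m: SkillManifest, path: string): boolean {
  return m.capabilities.some(
    (c) => c.kind === "FileRead" && c.paths.some((p) => path.startsWith(p))
  );
}

function canConnect(m: SkillManifest, host: string): boolean {
  return m.capabilities.some(
    (c) => c.kind === "NetConnect" && c.hosts.includes(host)
  );
}

canReadFile(manifest, "/home/user/.config/calendar/events.ics"); // allowed
canConnect(manifest, "evil.example.net:443"); // denied: no matching token
```

The key property is that the check is positive: the runtime looks for a token that grants the operation rather than a rule that forbids it, so forgetting to declare something fails closed.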

This model has real teeth. We tested it by porting a known-malicious OpenClaw skill - one that exfiltrates environment variables via curl - to IronClaw's format. The skill installed without errors but failed at runtime: it lacked NetConnect capability for the exfiltration endpoint and EnvRead capability for accessing environment variables. The attempt was logged with a full stack trace.

Formal Skill Verification

IronClaw includes an optional static analysis pipeline called iron-verify that scans skills before installation. The tool checks for:

  • Capability over-requests: skills that ask for more access than their declared functionality requires
  • Data flow violations: skills that read sensitive inputs (email content, API keys) and write to external outputs (network calls, file writes) without an explicit user consent step
  • Known vulnerability patterns: SQL injection, command injection, path traversal, and other OWASP patterns
  • Behavioral consistency: the skill's actual code paths match its declared description
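The first check, capability over-requests, is the easiest to picture. A rough sketch of the idea, with names and shapes that are our assumptions rather than iron-verify's real internals:

```typescript
// Hypothetical sketch of a capability over-request check: flag any
// capability kind the manifest declares but no code path actually uses.
// The real iron-verify pipeline is not public; this only illustrates the idea.

type CapKind = "FileRead" | "FileWrite" | "NetConnect" | "EnvRead" | "Shell";

interface Analysis {
  declared: CapKind[]; // from the skill's manifest
  observed: CapKind[]; // from static analysis of the skill's code
}

function overRequests(a: Analysis): CapKind[] {
  const used = new Set(a.observed);
  return a.declared.filter((kind) => !used.has(kind));
}

// A skill that declares Shell access but whose code never shells out:
const report = overRequests({
  declared: ["FileRead", "NetConnect", "Shell"],
  observed: ["FileRead", "NetConnect"],
});
// report === ["Shell"] → flagged for human review
```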

In our testing, iron-verify correctly flagged 23 of 25 intentionally problematic skills we submitted. The two misses were subtle prompt injection patterns embedded in LLM response handlers - a category that static analysis fundamentally struggles with. Jones acknowledges this limitation in the documentation and is working on runtime behavioral monitoring as a complement.

The Tradeoffs

Security costs something. IronClaw's sandboxing adds approximately 15ms of overhead per skill invocation compared to OpenClaw's unsandboxed execution. For most use cases this is imperceptible, but it accumulates in high-frequency automation loops: at 400 invocations per minute, the sandbox alone adds six seconds of latency every minute.

The bigger cost is friction. Installing an OpenClaw skill takes seconds. Installing an IronClaw skill involves reviewing a capability manifest, approving specific permissions, and optionally running iron-verify. For a user migrating from OpenClaw with dozens of skills, the initial setup requires patience.

Skill availability is the practical bottleneck. IronClaw's curated registry contains 890 verified skills - meaningful but a fraction of OpenClaw's 5,700. Community contributions require passing iron-verify and a human review before publication. The quality bar is high; the quantity is correspondingly lower.

Compatibility and Migration

IronClaw cannot run OpenClaw skills directly. The skill formats are incompatible because IronClaw requires explicit capability declarations that OpenClaw skills do not contain. However, IronClaw includes a migration tool called iron-port that analyzes an OpenClaw skill, infers the minimum required capabilities from its code, generates an IronClaw-compatible manifest, and flags any operations that need manual review.
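Capability inference of this kind is typically pattern-based. The sketch below shows one plausible shape for it; iron-port's actual heuristics are not public, so the patterns, capability names, and function are illustrative assumptions:

```typescript
// Hypothetical sketch of iron-port-style capability inference: scan a
// legacy skill's source for known operation patterns and map each to the
// minimum capability it would require under IronClaw's model.

const OPERATION_PATTERNS: Array<[RegExp, string]> = [
  [/\bfs\.readFile/, "FileRead"],
  [/\bfs\.writeFile/, "FileWrite"],
  [/\bfetch\(|\bhttp\.request/, "NetConnect"],
  [/\bprocess\.env\b/, "EnvRead"],
  [/\bchild_process\b|\bexec\(/, "Shell"], // would be flagged for manual review
];

function inferCapabilities(source: string): string[] {
  const caps = new Set<string>();
  for (const [pattern, cap] of OPERATION_PATTERNS) {
    if (pattern.test(source)) caps.add(cap);
  }
  return [...caps].sort();
}

// A legacy skill that reads a file and posts to a sync endpoint:
const legacySkill = `
  const data = fs.readFile("./notes.txt");
  fetch("https://api.example.com/sync", { method: "POST" });
`;
inferCapabilities(legacySkill); // → ["FileRead", "NetConnect"]
```

An approach like this naturally errs toward under-granting, which matches the behavior the review observed: migrated skills sometimes fail on first run until a missing permission is added by hand.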

We migrated 30 OpenClaw skills using iron-port. Twenty-two converted cleanly with correct capability inference. Five required manual capability adjustments. Three were flagged as not migratable because they relied on unrestricted shell access that IronClaw will not grant without explicit per-command allowlisting.

The migration is not painless, but it is thoughtful. The tool errs on the side of restricting capabilities, which means migrated skills sometimes fail on first run due to missing permissions. This is the right default for a security-first system, even if it frustrates users accustomed to OpenClaw's permissiveness.

What Works

The security model is the real thing. This is not security theater or a checkbox exercise. Capability-based sandboxing with formal verification is the approach used by production operating systems (seL4, Fuchsia) and high-assurance systems. Jones has adapted it thoughtfully for the AI agent context.

The audit trail is invaluable. Every capability grant, every skill invocation, every denied operation is logged with timestamps and context. For the first time in the AI agent space, you can answer the question "what exactly did my agent do last Tuesday?" with precision.
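Answering that question reduces to filtering an append-only log by time. The entry shape below is our assumption; the review only says that grants, invocations, and denials are logged with timestamps and context:

```typescript
// Hypothetical sketch of querying an IronClaw-style append-only audit trail
// for "what did my agent do on a given day?". Entry fields are illustrative.

interface AuditEntry {
  timestamp: string; // ISO 8601
  skill: string;
  event: "grant" | "invoke" | "deny";
  detail: string;
}

const log: AuditEntry[] = [
  { timestamp: "2026-02-10T09:14:02Z", skill: "mail-triage", event: "invoke", detail: "read inbox" },
  { timestamp: "2026-02-10T09:14:03Z", skill: "mail-triage", event: "deny", detail: "NetConnect to unlisted host" },
  { timestamp: "2026-02-11T08:00:00Z", skill: "calendar-sync", event: "invoke", detail: "sync events" },
];

// Because entries are append-only and timestamped, a day's activity is
// a simple prefix filter over the ISO timestamps.
function entriesOn(day: string): AuditEntry[] {
  return log.filter((e) => e.timestamp.startsWith(day));
}

entriesOn("2026-02-10"); // both mail-triage entries, including the denial
```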

Skill quality is high. The 890 verified skills work reliably. We encountered zero silent failures during two weeks of testing - a stark contrast to our OpenClaw experience.

What Does Not Work

890 skills is not enough for many use cases. If you need a niche integration - a specific CRM, an uncommon smart home protocol, a particular cloud service - it probably does not exist in IronClaw's registry yet.

The permission model can be tedious. Approving capabilities for each skill is the right approach for security, but reviewing 30 capability manifests during initial setup tests anyone's patience.

Performance overhead exists. The 15ms sandbox overhead is negligible for most tasks but noticeable in latency-sensitive automation.

No multi-agent support. Like ZeroClaw, IronClaw focuses on single-agent scenarios. Multi-agent workflows require external orchestration.

Strengths and Weaknesses

Strengths:

  • Capability-based zero-trust security model
  • Formal skill verification via iron-verify
  • Append-only audit trail for all agent operations
  • 890 verified, high-quality skills
  • Created by a Transformer paper co-author
  • Rust core with TypeScript SDK for skill development
  • Migration tool for OpenClaw skills

Weaknesses:

  • Only 890 skills vs. OpenClaw's 5,700
  • Cannot run OpenClaw skills directly
  • Permission approval friction during setup
  • 15ms sandbox overhead per skill invocation
  • No multi-agent support
  • Smaller community (11,800 stars)
  • Static analysis misses subtle prompt injection patterns

Verdict: 7.5/10

IronClaw is the AI agent framework that the security community has been asking for. Its capability-based sandboxing, formal verification pipeline, and append-only audit trail are not incremental improvements over OpenClaw - they are a fundamentally different approach to the problem of running autonomous code with access to your personal accounts.

The tradeoff is ecosystem size and setup friction. IronClaw's 890 skills cover the basics well but leave gaps for niche use cases. The permission approval process is thorough but tedious. These are the expected costs of doing security properly, and for users who consider OpenClaw's "no perfectly secure setup" posture unacceptable, those costs are worth paying.

Jones has said that IronClaw's goal is not to be the most popular AI agent but the one you can trust. By that measure, it succeeds. Whether the market values trust over convenience will determine whether IronClaw remains a principled niche project or becomes the new standard.


About the author: Elena, Senior AI Editor & Investigative Journalist

Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem.