Grok Build 0.1
Grok Build 0.1 is xAI's first model built specifically for agentic coding workflows, with a 256K context window, native MCP support, and always-on reasoning at $1/M input tokens.

Grok Build 0.1 is xAI's first model purpose-built for agentic software engineering rather than general-purpose chat. Released on May 20, 2026, it's the same model that powers the Grok Build CLI - a terminal-native coding agent written in Rust that xAI launched in beta on May 14. The API became publicly available on May 29, letting developers access it without a SuperGrok or X Premium+ subscription.
TL;DR
- Purpose-built coding agent with always-on reasoning, native MCP support, and 256K context window
- $1.00/M input tokens and $2.00/M output tokens; cached input at $0.20/M
- SWE-Bench Verified score of 70.8%, roughly 17 points behind Claude Code (Opus 4.7 at 87.6%) and GPT-5.5 (88.7%), but closer to Claude Sonnet 4.7 (72.7%)
What separates this from xAI's conversational Grok models is the design intent. Grok Build 0.1 is trained to plan, write, refactor, and iterate across multi-step workflows. It accepts both text and image inputs - diagrams, UI mockups, and error screenshots are all valid context. The reasoning chain can't be disabled; every call runs chain-of-thought internally, which is a different tradeoff than models like Grok 4.3 where you can dial reasoning effort down to zero.
xAI has not disclosed the parameter count, but third-party analysis from chatforest.com puts the architecture at about 314B parameters in a Mixture of Experts configuration. Treat that as an estimate until xAI publishes official numbers.
Key Specifications
| Specification | Details |
|---|---|
| Provider | xAI |
| Model Family | Grok Build |
| Parameters | ~314B (MoE, estimated - not officially confirmed) |
| Context Window | 256K tokens |
| Input Price | $1.00/M tokens |
| Cached Input Price | $0.20/M tokens |
| Output Price | $2.00/M tokens |
| Release Date | 2026-05-20 |
| License | Proprietary |
| Modalities | Text + image input, text output |
| Reasoning | Always-on (can't be disabled) |
Rate limits on the public API stand at 1,800 requests per minute and 10 million tokens per month. There's no output token cap, which matters for long autonomous coding sessions. Regional availability covers us-east-1, eu-west-1, and us-west-2.
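To stay under the 1,800 requests-per-minute ceiling, a client can pace its own calls. The sketch below is a minimal sliding-window limiter; the class name, method names, and injectable clock are illustrative choices and not part of any xAI SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests=1800, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.sent = deque()         # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()

    def try_acquire(self):
        """Record a request and return True if one may be sent now."""
        now = self.clock()
        self._evict(now)
        if len(self.sent) < self.max_requests:
            self.sent.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        self._evict(now)
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])
```

An agent loop would call `try_acquire()` before each request and sleep for `wait_time()` when it returns False. The monthly token quota would need separate bookkeeping.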
Benchmark Performance
Official xAI benchmarks for Grok Build 0.1 are limited. The most-cited number is 70.8% on SWE-Bench Verified, based on xAI's internal evaluation harness. BenchLM currently excludes the model from its public leaderboard because it lacks sufficient independently sourced benchmark coverage, which is a fair flag.
| Benchmark | Grok Build 0.1 | Claude Sonnet 4.7 | Claude Opus 4.7 | GPT-5.5 (Codex CLI) |
|---|---|---|---|---|
| SWE-Bench Verified | 70.8% | 72.7% | 87.6% | 88.7% |
| Kilo Bench (completion) | 50.6% | N/A | N/A | N/A |
| PinchBench overall | 88.9% (#7/50) | N/A | N/A | N/A |
| Coding accuracy (Benchable) | 95.0% | N/A | N/A | N/A |
The SWE-Bench numbers deserve a caveat: the 70.8% figure comes from the predecessor model grok-code-fast-1, which xAI has since aligned closely with grok-build-0.1. Both the Kilo.ai and benchable.ai figures were collected on the production API after May 20. The Kilo Bench completion rate of 50.6% with an average cost of $30.70 per task is on the expensive side - Claude Code and Codex CLI complete similar tasks for less.
On PinchBench (which uses OpenClaw tasks), grok-build-0.1 ranks 7th out of 50 official models with an 88.9% overall score. Its coding score of 95.0% sits at the 90th percentile. Instruction following is the identified weak spot at 60% (53rd percentile) - something to watch when building agents that depend on precise output formatting.
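Given the weaker instruction-following score, an agent that parses model output may want to validate each reply and retry on failure. The following is a minimal sketch of that pattern. `call_model` is a hypothetical callable standing in for whatever client you use, and the retry wording is an assumption, not a documented xAI recommendation.

```python
import json


def get_valid_json(call_model, prompt, required_keys, max_attempts=3):
    """Call the model until its reply is a JSON object containing required_keys."""
    last_error = None
    for _ in range(max_attempts):
        # On retries, tell the model why the previous reply was rejected.
        if last_error is None:
            full_prompt = prompt
        else:
            full_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {last_error}. "
                "Reply with a single JSON object only."
            )
        reply = call_model(full_prompt)
        try:
            data = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON ({exc.msg})"
            continue
        if not isinstance(data, dict):
            last_error = "top-level value is not an object"
            continue
        missing = [key for key in required_keys if key not in data]
        if missing:
            last_error = f"missing keys: {', '.join(missing)}"
            continue
        return data
    raise ValueError(f"no valid reply after {max_attempts} attempts: {last_error}")
```

Because reasoning is always on, each retry is a full-price call, so a low `max_attempts` keeps the worst-case cost bounded.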
The comparison against Claude Sonnet 4.7 is instructive: nearly identical SWE-Bench numbers, nearly identical pricing, but grok-build-0.1 runs at 100+ tokens/second on xAI's infrastructure versus Claude's lower typical throughput. For agentic tasks that spawn many small tool calls, that speed difference adds up.
Key Capabilities
The headline differentiator is native MCP support. Most coding models bolt MCP on through proxy functions, but grok-build-0.1 declares MCP servers directly in the tools array using "type": "mcp". You point the model at your internal knowledge base, proprietary API, or MCP gateway and it handles the rest. The catch: MCP servers must be publicly accessible - local stdio instances aren't supported in the current API.
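As a rough illustration of the declaration described above, the sketch below builds a request body with one MCP server in the tools array. Only the `"type": "mcp"` entry and the model name come from this article. The `input`, `server_url`, and `server_label` field names, the helper name, and the example URL are assumptions, so check the xAI docs for the actual schema before relying on them.

```python
import json
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_request(prompt, mcp_url, label):
    """Build a request body that declares one remote MCP server as a tool."""
    parsed = urlparse(mcp_url)
    # Servers must be publicly accessible, so reject stdio-style commands
    # (no URL scheme) and obvious loopback addresses up front.
    if parsed.scheme not in ("http", "https") or parsed.hostname is None:
        raise ValueError("MCP server must be given as an http(s) URL")
    if parsed.hostname in LOCAL_HOSTS:
        raise ValueError("local MCP servers are not supported by the API")
    return {
        "model": "grok-build-0.1",
        "input": prompt,                 # assumed field name
        "tools": [
            {
                "type": "mcp",           # as described in the article
                "server_url": mcp_url,   # assumed field name
                "server_label": label,   # assumed field name
            }
        ],
    }


if __name__ == "__main__":
    body = build_request("List open tickets", "https://mcp.example.com/sse", "tickets")
    print(json.dumps(body, indent=2))
```

The loopback check is only a convenience: a private-network address would pass it and still be unreachable from xAI's side.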
The model also supports parallel tool invocation and the Agent Client Protocol (ACP), which means orchestration platforms can treat Grok Build as a callable primitive. That's the same integration pattern used by Claude Code and Codex CLI - and it's why Notion AI added grok-build-0.1 support on June 2, 2026 alongside their existing model lineup.
For web development work specifically, the image input capability has practical value beyond novelty. You can drop in a Figma screenshot or a browser rendering and have the model write the corresponding HTML/CSS without describing the layout in text. The 256K context window holds a mid-sized codebase comfortably, though it trails GPT-4.1's 1M context for very large monorepos.
Pricing and Availability
At $1.00/M input and $2.00/M output, grok-build-0.1 is priced competitively within the coding-focused tier. The $0.20/M cached input price is the number that matters for agents running long context repeatedly - a 256K context cached costs about $0.05 per re-read.
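The cached-read figure can be checked with a few lines of arithmetic. This sketch uses the three listed prices and assumes that 256K means 256,000 tokens and that every re-read after the first is served entirely from cache; the function names are illustrative.

```python
# Listed prices in USD per million tokens.
INPUT_PER_M = 1.00
CACHED_INPUT_PER_M = 0.20
OUTPUT_PER_M = 2.00


def call_cost(fresh_input, cached_input, output):
    """USD cost of one call, given token counts for each billing category."""
    total = (
        fresh_input * INPUT_PER_M
        + cached_input * CACHED_INPUT_PER_M
        + output * OUTPUT_PER_M
    )
    return total / 1_000_000


def loop_cost(context_tokens, steps):
    """Input-side cost of an agent loop that re-reads the same context each step.

    Assumes the first read is uncached and all later reads hit the cache.
    """
    first = call_cost(context_tokens, 0, 0)
    rest = (steps - 1) * call_cost(0, context_tokens, 0)
    return first + rest


if __name__ == "__main__":
    print(f"cached re-read of 256K: ${call_cost(0, 256_000, 0):.4f}")
    print(f"uncached read of 256K:  ${call_cost(256_000, 0, 0):.4f}")
    print(f"20-step loop, cached:   ${loop_cost(256_000, 20):.4f}")
    print(f"20-step loop, uncached: ${20 * call_cost(256_000, 0, 0):.4f}")
```

Under these assumptions a 20-step loop over a full context costs about $1.23 in input with caching versus about $5.12 without. Output tokens are billed on top of that.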
New accounts at console.x.ai get $25 in promotional credits with just an email signup. API access no longer requires a subscription; it was previously gated behind SuperGrok Heavy ($299/month) or X Premium+.
Access via third-party platforms is expanding. The model is available through Vercel AI Gateway, OpenRouter (listed as x-ai/grok-build-0.1), Cloudflare AI Gateway, and Kilo Code's VS Code and JetBrains IDE extensions. Puter.js added it on the client-side framework side. The Grok Build CLI itself runs as a terminal TUI with headless scripting support for CI environments.
Compared to the broader xAI model lineup, grok-build-0.1 is cheaper than Grok 4.3 (which runs $299/month for SuperGrok Heavy), though Grok 4.3 brings video input, document generation, and a 2M context window for workloads that need those features.
Strengths and Weaknesses
Strengths
- Native MCP server support via the "type": "mcp" tools array - no proxy wrapper needed
- Always-on reasoning baked into every call, no configuration overhead
- 100+ tokens/second throughput on xAI infrastructure, meaningfully faster than most comparable models
- Image input for reading UI mockups, architecture diagrams, and error screenshots
- Competitive SWE-Bench score (70.8%) at a price point below the top-tier agents
- No output token cap for extended autonomous sessions
- ACP support enables drop-in use in orchestration platforms that already run Claude Code or Codex CLI
Weaknesses
- Reasoning can't be disabled - increases cost and latency for simple tasks that don't need it
- Instruction following scores at the 53rd percentile, which can cause issues with structured output requirements
- SWE-Bench Verified 70.8% is roughly 17 to 18 points below Claude Opus 4.7 (87.6%) and GPT-5.5 (88.7%) - the gap is real on complex multi-file fixes
- 256K context trails GPT-4.1's 1M for very large codebases
- Local stdio MCP servers not supported - limits on-premise integrations
- Limited independent benchmark coverage; most performance claims trace back to xAI's own harness
Related Coverage
- Best AI Coding Agents 2026 - where grok-build-0.1 fits against Claude Code, Codex CLI, Cline, and others
- SWE-Bench Coding Agent Leaderboard - full ranking context for the 70.8% score
- Best AI Coding CLI Tools 2026 - CLI comparison including the Grok Build CLI
- MCP Server Ecosystem Leaderboard - MCP integration landscape
- Grok 4.3 - xAI's current flagship with 2M context and video input
- Grok 4.20 - prior xAI flagship for comparison context
FAQ
What is Grok Build 0.1 best used for?
Agentic coding workflows where the model plans, writes, and refactors code autonomously across multiple steps. Strong fit for web development, debugging, and pipelines that connect to external tools via MCP.
Does Grok Build 0.1 require a subscription?
No. The API is accessible at console.x.ai with an email signup. New accounts receive $25 in promotional credits. SuperGrok or X Premium+ is no longer required.
How does grok-build-0.1 compare to Claude Code?
SWE-Bench scores are similar to Claude Sonnet 4.7 (70.8% vs 72.7%) but below Claude Opus 4.7 (87.6%). Grok Build is faster at 100+ tokens/second. Claude Code handles complex multi-file fixes more reliably at the top tier.
What is the context window for Grok Build 0.1?
256K tokens, which fits a mid-sized codebase. No output token cap applies to API calls, useful for long autonomous sessions.
Does Grok Build 0.1 support MCP natively?
Yes. MCP servers are declared directly in the tools array using "type": "mcp". Servers must be publicly accessible; local stdio MCP is not supported in the current API.
What is the pricing for Grok Build 0.1?
$1.00/M input tokens, $2.00/M output tokens, and $0.20/M for cached input tokens.
Sources:
- Grok Build 0.1 on API - xAI official announcement
- Grok Build 0.1 - xAI Docs
- Grok Build 0.1 - OpenRouter pricing and specs
- xAI Launches grok-build-0.1 - Basenor
- Grok Build 0.1 - Benchable.ai benchmarks
- Grok Build 0.1 - Kilo.ai benchmarks and pricing
- Grok Build 0.1 MCP-Native API guide - ChatForest
- Grok Build vs Claude Code vs Codex CLI - Codersera
- xAI Opens Grok Build 0.1 to Developers via API - DevOps.com
- Grok Build 0.1 - Vercel AI Gateway
- Grok Build 0.1 Benchmarks - BenchLM.ai
✓ Last verified June 8, 2026
