Name: Grok Build 0.1
Author: xAI

Grok Build 0.1 is xAI's first model purpose-built for agentic software engineering rather than general-purpose chat. Released on May 20, 2026, it's the same model that powers the Grok Build CLI - a terminal-native coding agent written in Rust that xAI launched in beta on May 14. The API became publicly available on May 29, letting developers access it without a SuperGrok or X Premium+ subscription.

TL;DR

Purpose-built coding agent with always-on reasoning, native MCP support, and 256K context window
$1.00/M input tokens and $2.00/M output tokens; cached input at $0.20/M
SWE-Bench Verified score of 70.8%, roughly 17 points behind Claude Code (Opus 4.7 at 87.6%) and GPT-5.5 (88.7%), but closer to Claude Sonnet 4.7 (72.7%)

What separates this from xAI's conversational Grok models is the design intent. Grok Build 0.1 is trained to plan, write, refactor, and iterate across multi-step workflows. It accepts both text and image inputs - diagrams, UI mockups, and error screenshots are all valid context. The reasoning chain can't be disabled; every call runs chain-of-thought internally, which is a different tradeoff than models like Grok 4.3 where you can dial reasoning effort down to zero.

xAI describes the parameter count as undisclosed, but third-party analysis from chatforest.com puts the architecture at about 314B parameters in a Mixture of Experts configuration. Treat that as an estimate until xAI publishes official numbers.

Key Specifications

Specification	Details
Provider	xAI
Model Family	Grok Build
Parameters	~314B (MoE, estimated - not officially confirmed)
Context Window	256K tokens
Input Price	$1.00/M tokens
Cached Input Price	$0.20/M tokens
Output Price	$2.00/M tokens
Release Date	2026-05-20
License	Proprietary
Modalities	Text + image input, text output
Reasoning	Always-on (can't be disabled)

Rate limits on the public API stand at 1,800 requests per minute and 10 million tokens per month. There's no output token cap, which matters for long autonomous coding sessions. Regional availability covers us-east-1, eu-west-1, and us-west-2.

Code running in a terminal interface - the kind of multi-step workflow Grok Build 0.1 is designed for Agentic coding workflows - plan, execute, iterate - are the primary design target for grok-build-0.1. Source: unsplash.com

Benchmark Performance

Official xAI benchmarks for Grok Build 0.1 are limited. The most-cited number is 70.8% on SWE-Bench Verified, based on xAI's internal evaluation harness. BenchLM currently excludes the model from its public leaderboard for lacking sufficient independently-sourced benchmark coverage, which is a fair flag.

Benchmark	Grok Build 0.1	Claude Sonnet 4.7	Claude Opus 4.7	GPT-5.5 (Codex CLI)
SWE-Bench Verified	70.8%	72.7%	87.6%	88.7%
Kilo Bench (completion)	50.6%	N/A	N/A	N/A
PinchBench overall	88.9% (#7/50)	N/A	N/A	N/A
Coding accuracy (Benchable)	95.0%	N/A	N/A	N/A

The SWE-Bench numbers deserve a caveat: the 70.8% figure comes from the predecessor model grok-code-fast-1, which xAI has since aligned closely with grok-build-0.1. Both the Kilo.ai and benchable.ai figures were collected on the production API after May 20. The Kilo Bench completion rate of 50.6% with an average cost of $30.70 per task is on the expensive side - Claude Code and Codex CLI complete similar tasks for less.

On PinchBench (which uses OpenClaw tasks), grok-build-0.1 ranks 7th out of 50 official models with a 88.9% overall score. Coding scores 95.0% at the 90th percentile. Instruction following is the identified weak spot at 60% (53rd percentile) - something to watch when building agents that depend on precise output formatting.

The comparison against Claude Sonnet 4.6 is instructive: nearly identical SWE-Bench numbers, nearly identical pricing, but grok-build-0.1 runs at 100+ tokens/second on xAI's infrastructure versus Claude's lower typical throughput. For agentic tasks that spawn many small tool calls, that speed difference adds up.

Key Capabilities

The headline differentiator is native MCP support. Most coding models bolt MCP on through proxy functions, but grok-build-0.1 declares MCP servers directly in the tools array using "type": "mcp". You point the model at your internal knowledge base, proprietary API, or MCP gateway and it handles the rest. The catch: MCP servers must be publicly accessible - local stdio instances aren't supported in the current API.

The model also supports parallel tool invocation and the Agent Client Protocol (ACP), which means orchestration platforms can treat Grok Build as a callable primitive. That's the same integration pattern used by Claude Code and Codex CLI - and it's why Notion AI added grok-build-0.1 support on June 2, 2026 with their existing model lineup.

For web development work specifically, the image input capability has practical value beyond novelty. You can drop in a Figma screenshot or a browser rendering and have the model write the corresponding HTML/CSS without describing the layout in text. The 256K context window holds a mid-sized codebase comfortably, though it trails GPT-4.1's 1M context for very large monorepos.

Close-up of circuit board traces and chips - representing the hardware powering high-speed model inference xAI claims 100+ tokens/second throughput on its own infrastructure, which meaningfully reduces wall-clock time for multi-step coding agents. Source: unsplash.com

Pricing and Availability

At $1.00/M input and $2.00/M output, grok-build-0.1 is priced competitively within the coding-focused tier. The $0.20/M cached input price is the number that matters for agents running long context repeatedly - a 256K context cached costs about $0.05 per re-read.

New accounts at console.x.ai get $25 in promotional credits with just an email signup. No subscription is required for API access, which removed the previous requirement for SuperGrok Heavy ($299/month) or X Premium+.

Access via third-party platforms is expanding. The model is available through Vercel AI Gateway, OpenRouter (listed as x-ai/grok-build-0.1), Cloudflare AI Gateway, and Kilo Code's VS Code and JetBrains IDE extensions. Puter.js added it on the client-side framework side. The Grok Build CLI itself runs as a terminal TUI with headless scripting support for CI environments.

Compared to the broader xAI model lineup, grok-build-0.1 is cheaper than Grok 4.3 (which runs $300/month for SuperGrok Heavy), though Grok 4.3 brings video input, document generation, and a 2M context window for workloads that need those features.

Strengths and Weaknesses

Strengths

Native MCP server support via "type": "mcp" tools array - no proxy wrapper needed
Always-on reasoning baked into every call, no configuration overhead
100+ tokens/second throughput on xAI infrastructure, meaningfully faster than most comparable models
Image input for reading UI mockups, architecture diagrams, and error screenshots
Competitive SWE-Bench score (70.8%) at a price point below the top-tier agents
No output token cap for extended autonomous sessions
ACP support enables drop-in use in orchestration platforms that already run Claude Code or Codex CLI

Weaknesses

Reasoning can't be disabled - increases cost and latency for simple tasks that don't need it
Instruction following scores at the 53rd percentile, which can cause issues with structured output requirements
SWE-Bench Verified 70.8% is 17 points below Claude Opus 4.7 and GPT-5.5 - the gap is real on complex multi-file fixes
256K context trails GPT-4.1's 1M for very large codebases
Local stdio MCP servers not supported - limits on-premise integrations
Limited independent benchmark coverage; most performance claims trace back to xAI's own harness

Best AI Coding Agents 2026 - where grok-build-0.1 fits against Claude Code, Codex CLI, Cline, and others
SWE-Bench Coding Agent Leaderboard - full ranking context for the 70.8% score
Best AI Coding CLI Tools 2026 - CLI comparison including the Grok Build CLI
MCP Server Ecosystem Leaderboard - MCP integration landscape
Grok 4.3 - xAI's current flagship with 2M context and video input
Grok 4.20 - prior xAI flagship for comparison context

FAQ

What is Grok Build 0.1 best used for?

Agentic coding workflows where the model plans, writes, and refactors code autonomously across multiple steps. Strong fit for web development, debugging, and pipelines that connect to external tools via MCP.

Does Grok Build 0.1 require a subscription?

No. The API is accessible at console.x.ai with an email signup. New accounts receive $25 in promotional credits. SuperGrok or X Premium+ is no longer required.

How does grok-build-0.1 compare to Claude Code?

SWE-Bench scores are similar to Claude Sonnet 4.7 (70.8% vs 72.7%) but below Claude Opus 4.7 (87.6%). Grok Build is faster at 100+ tokens/second. Claude Code handles complex multi-file fixes more reliably at the top tier.

What is the context window for Grok Build 0.1?

256K tokens, which fits a mid-sized codebase. No output token cap applies to API calls, useful for long autonomous sessions.

Does Grok Build 0.1 support MCP natively?

Yes. MCP servers are declared directly in the tools array using "type": "mcp". Servers must be publicly accessible; local stdio MCP is not supported in the current API.

What is the pricing for Grok Build 0.1?

$1.00/M input tokens, $2.00/M output tokens, and $0.20/M for cached input tokens.

Sources: