Xiaomi MiMo-V2-Pro - Agentic 1T MoE Model
Xiaomi's MiMo-V2-Pro is a 1-trillion-parameter MoE model with 42B active params, 1M context, and agentic coding performance that rivals Claude Sonnet 4.6 at a fraction of the cost.

Xiaomi entered the frontier AI race on March 18, 2026, not with a press conference, but with a reveal. For roughly a week before the official announcement, MiMo-V2-Pro had been running on OpenRouter under the codename Hunter Alpha - a mystery model that topped usage charts, consumed hundreds of billions of tokens, and had the AI community convinced it was DeepSeek V4.
TL;DR
- Agentic 1T-parameter MoE model with 42B active params, built specifically for coding and agent workflows
- 1M token context window; $1/$3 per million input/output tokens at standard context - roughly 5x cheaper than Claude Sonnet 4.6
- SWE-bench Verified 78.0% vs. Claude Sonnet 4.6's 79.6% - near-identical coding performance at a fraction of the price
Overview
MiMo-V2-Pro is the flagship of Xiaomi's second-generation MiMo family, a series of models the company says is designed explicitly for agentic and coding workloads. The lead researcher, Luo Fuli, previously worked at DeepSeek - which explains why so many people jumped to the wrong conclusion when Hunter Alpha appeared.
The model uses a Mixture-of-Experts architecture with over 1 trillion total parameters, activating 42 billion per token during inference. That's a 2.8x increase in active capacity over MiMo-V2-Flash's 15B active parameters. Architecturally, the model adds a 7:1 hybrid attention ratio (local sliding window to global attention, up from 5:1 in Flash) and a lightweight Multi-Token Prediction layer that cuts latency on agentic workflows by generating multiple tokens in parallel.
The weights aren't public - MiMo-V2-Pro is API-only, accessible through Xiaomi's own platform at platform.xiaomimimo.com and through OpenRouter as xiaomi/mimo-v2-pro. The companion MiMo-V2-Flash model (310B total, 15B active) does carry an MIT license on HuggingFace.
Key Specifications
| Specification | Details |
|---|---|
| Provider | Xiaomi |
| Model Family | MiMo V2 |
| Total Parameters | Over 1 trillion (exact count not disclosed) |
| Active Parameters | 42B per token |
| Architecture | Mixture-of-Experts, Hybrid Attention 7:1, MTP layer |
| Context Window | 256K standard, 1M extended |
| Max Completion | 131,072 tokens |
| Input Price | $1.00/M (0-256K), $2.00/M (256K-1M) |
| Output Price | $3.00/M (0-256K), $6.00/M (256K-1M) |
| Release Date | March 18, 2026 |
| License | Proprietary (API-only) |
Benchmark Performance
Xiaomi's published benchmarks focus on agentic and coding tasks, which is where the model's training emphasis shows.
| Benchmark | MiMo-V2-Pro | Claude Sonnet 4.6 | Claude Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 78.0% | 79.6% | 80.8% |
| ClawEval (agentic) | 61.5 (#3 globally) | ~58 | 66.3 |
| Terminal-Bench 2.0 | 86.7 | - | - |
| GPQA Diamond | 87% | - | - |
| GDPval-AA Elo | 1426 | - | - |
| AI Intelligence Index | #8 global, #2 Chinese | - | - |
On SWE-bench Verified - the coding benchmark that has become the de facto standard for assessing coding models - MiMo-V2-Pro sits 1.6 points below Sonnet 4.6 and 2.8 points below Opus 4.6. Given that Pro costs roughly a fifth of Sonnet's price and a twenty-fifth of Opus's at comparable context lengths, that gap matters less than the pricing math.
The ClawEval result is worth watching. ClawEval assesses complex agent scaffold performance - multi-turn tool use, long-horizon planning, agentic recovery from errors - and MiMo-V2-Pro's 61.5 puts it well above GPT-5.2's 50.0 on the same benchmark. Opus 4.6 still leads at 66.3, but the gap to Sonnet is now effectively closed.
VentureBeat ran their full benchmark index and found the total cost was $348 for MiMo-V2-Pro versus $2,304 for GPT-5.2 and $2,486 for Claude Opus 4.6. That's the kind of number that moves procurement decisions.
The Hunter Alpha Story
Before the official launch, MiMo-V2-Pro ran anonymously on OpenRouter as "Hunter Alpha" starting around March 11, 2026. The model built up roughly 500 billion tokens per week of usage - extraordinary for an unannounced model. Developers noted the Chinese language preference when probed, the agentic optimization, and the performance profile that sat between Sonnet and Opus on coding tasks.
Most speculation pointed to DeepSeek V4. That made sense: Luo Fuli, who led development of MiMo-V2-Pro, had previously worked at DeepSeek, and the DeepSeek V4 speculation had been building for months. The reveal on March 18 surprised most people who had been following the story.
Xiaomi confirmed the identity and simultaneously announced that developers partnering with OpenClaw, OpenCode, KiloCode, Blackbox, and Cline would get free API access during the launch week. The full Hunter Alpha mystery story has additional context on the community speculation that preceded the announcement.
"I am a Chinese AI model primarily trained in Chinese," Hunter Alpha told a developer who asked who built it. It wouldn't say more.
Key Capabilities
Agentic Coding
MiMo-V2-Pro's training was explicitly tuned for agentic workloads via supervised fine-tuning and reinforcement learning across complex agent scaffolds. The Multi-Token Prediction layer reduces latency in multi-step agentic loops - each completion produces multiple tokens in parallel, which adds up over hundreds of tool calls.
SWE-bench Verified at 78.0% puts it in the top tier for autonomous code repair on real-world GitHub issues. Xiaomi's internal coding accuracy benchmark showed 92.5%, which they report as surpassing Claude Sonnet 4.6 - though the exact methodology isn't published, so treat that as a directional signal rather than a comparable number to third-party evaluations.
Long-Context Reasoning
The 1M token extended context window puts MiMo-V2-Pro alongside Claude Opus 4.6 and Claude Sonnet 4.6 in the small group of frontier models that can ingest a full codebase, a long legal document, or an extended conversation history without chunking. The extended range (256K-1M) does carry a 2x price premium on input tokens, so the cost advantage narrows for very long context workloads.
Extended Reasoning
The model supports configurable thinking via `<think>` / `</think>` tags, similar to DeepSeek R1 and Mistral Small 4's reasoning mode. This allows developers to trade latency for accuracy on tasks where extended deliberation improves output quality.
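On the client side, a consumer of this mode typically needs to separate the reasoning trace from the final answer. A minimal sketch, assuming completions wrap deliberation in literal `<think>...</think>` tags (the exact response format isn't documented here, so the tag placement is an assumption):

```python
import re

def split_thinking(response_text: str) -> tuple[str, str]:
    """Split a completion into (reasoning trace, final answer).

    Assumes the model wraps its deliberation in a single
    <think>...</think> block, as described for MiMo-V2-Pro's
    extended reasoning mode. Returns an empty trace if no
    thinking block is present.
    """
    match = re.search(r"<think>(.*?)</think>", response_text, re.DOTALL)
    if match is None:
        return "", response_text.strip()
    thinking = match.group(1).strip()
    # Everything outside the think block is the user-facing answer.
    answer = (response_text[:match.start()] + response_text[match.end():]).strip()
    return thinking, answer

thinking, answer = split_thinking(
    "<think>2 and 3 are both prime; 2 * 3 = 6.</think>The answer is 6."
)
```

In an agent loop, only `answer` would be fed back to the user or the next tool call; the trace can be logged for debugging.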
Pricing and Availability
Access is available through two channels:
- Xiaomi MiMo Platform: platform.xiaomimimo.com with OpenAI-compatible endpoints (base URL https://api.xiaomimimo.com/v1, model ID xiaomi/mimo-v2-pro)
- OpenRouter: listed as xiaomi/mimo-v2-pro, with third-party providers offering the model
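Because the endpoints are OpenAI-compatible, a request is just a standard chat-completions payload pointed at Xiaomi's base URL. A minimal sketch that builds one - the URL and model ID come from the listing above, while `max_tokens` and the message schema are standard chat-completions fields rather than anything Xiaomi-specific:

```python
import json

# OpenAI-compatible chat-completions endpoint, per Xiaomi's platform listing.
API_URL = "https://api.xiaomimimo.com/v1/chat/completions"
MODEL_ID = "xiaomi/mimo-v2-pro"

def build_request(prompt: str, max_tokens: int = 1024) -> str:
    """Serialize a chat-completions request body for MiMo-V2-Pro."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_request("Refactor this function to be iterative.")
```

Any OpenAI-compatible client (the `openai` SDK with `base_url` overridden, or a plain HTTP POST with an `Authorization: Bearer` header) should be able to send this body as-is.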
Pricing tiers:
| Context Range | Input | Output | Cache Read |
|---|---|---|---|
| 0 - 256K tokens | $1.00/M | $3.00/M | $0.20/M |
| 256K - 1M tokens | $2.00/M | $6.00/M | $0.40/M |
For context on where this sits relative to the competition: Claude Sonnet 4.6 is $3.00/$15.00 and Opus 4.6 is $5.00/$25.00 at standard context. The cost comparison on the cost efficiency leaderboard will be updated once the model's performance stabilizes.
Cache write is free (temporary caching). Xiaomi hasn't published enterprise or volume pricing.
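The tiered rates above translate into a quick cost estimator. One caveat: Xiaomi hasn't documented the exact tiering mechanics, so this sketch assumes the entire request is billed at the rate of the tier its input length falls into (the way Gemini-style tiered pricing typically works), rather than marginally per tier:

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD under MiMo-V2-Pro's published tiers.

    Assumption: the whole request is billed at the rate of the tier
    its input length falls into - standard up to 256K tokens,
    extended beyond that. Cache pricing is ignored.
    """
    if input_tokens <= 256_000:
        in_rate, out_rate = 1.00, 3.00   # $/M tokens, standard tier
    else:
        in_rate, out_rate = 2.00, 6.00   # $/M tokens, extended tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

standard = estimate_cost(100_000, 8_000)   # typical coding-agent turn
extended = estimate_cost(400_000, 10_000)  # full-codebase request
```

A 100K-input / 8K-output request lands around $0.12; the same shape at 400K of input jumps to roughly $0.86, which is where the extended-tier premium starts to matter.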
Strengths
- Near-Sonnet-level SWE-bench performance at roughly 5x lower cost per token
- Top-tier agentic performance: ClawEval 61.5, #3 globally
- 1M token context window with tiered pricing
- Configurable extended reasoning via thinking tags
- MTP layer for reduced latency in agentic loops
- Available on OpenRouter day-one with third-party support
Weaknesses
- Weights not public - no self-hosting option for the Pro tier
- Exact total parameter count not disclosed, which complicates infrastructure planning
- No multimodal input for the Pro tier (MiMo-V2-Omni handles image/video/audio but costs less and scores lower on benchmarks)
- MMLU-Pro standalone score not published, limiting apples-to-apples comparison on knowledge benchmarks
- Xiaomi's long-term API reliability and geographic availability are unproven at frontier scale
The MiMo-V2 Family
MiMo-V2-Pro is one of three models released simultaneously:
MiMo-V2-Pro - 1T+ total / 42B active, text-only, $1/$3 per 1M tokens. The flagship for coding and agentic workflows. API-only.
MiMo-V2-Omni - Multimodal (text, image, video, audio). Processes 10+ hours of continuous audio in a single request. ClawEval 54.8. $0.40 input / $2.00 output per 1M tokens. Leads on MM-BrowserComp web navigation benchmarks.
MiMo-V2-Flash - 310B total / 15B active. MIT license, open weights on HuggingFace. The self-hostable option for teams running local inference.
Related Coverage
- Hunter Alpha on OpenRouter - Is This DeepSeek V4? - Elena Marchetti's coverage of the mystery model before the Xiaomi reveal
- Coding Benchmarks Leaderboard - Full SWE-bench rankings including MiMo-V2-Pro
- Agentic AI Benchmarks Leaderboard - ClawEval and Terminal-Bench rankings
- Cost Efficiency Leaderboard - Price-performance comparisons across frontier models
- Claude Sonnet 4.6 - Closest benchmark competitor
- DeepSeek V4 - The model everyone thought Hunter Alpha was
FAQ
Is MiMo-V2-Pro open source?
No. The Pro tier is API-only with no public weights. The smaller MiMo-V2-Flash (310B total, 15B active) is MIT-licensed and available on HuggingFace.
How does MiMo-V2-Pro compare to Claude Sonnet 4.6 on coding?
SWE-bench Verified: MiMo-V2-Pro 78.0% vs. Sonnet 4.6 79.6%. The gap is 1.6 points. At standard context, MiMo costs $1/$3 vs. Sonnet's $3/$15 per million tokens.
What was Hunter Alpha?
Hunter Alpha was the anonymous OpenRouter deployment of MiMo-V2-Pro that ran from about March 11 to March 18, 2026. It consumed 500B tokens per week and was widely mistaken for DeepSeek V4.
Does MiMo-V2-Pro support multimodal input?
No. The Pro model handles text only. MiMo-V2-Omni handles text, image, video, and audio at $0.40/$2.00 per million tokens.
What is the maximum context window?
Standard context is 256K tokens; extended context is 1M tokens. Maximum completion is 131,072 tokens. Extended context carries a 2x price premium ($2.00/$6.00 per million tokens).
Where can I access the API?
Directly at platform.xiaomimimo.com (model ID: xiaomi/mimo-v2-pro, endpoint: https://api.xiaomimimo.com/v1). Also available on OpenRouter.
Sources:
- Xiaomi MiMo-V2-Pro Official Page
- OpenRouter - MiMo-V2-Pro
- Artificial Analysis - Model Rankings
- VentureBeat - Xiaomi stuns with MiMo-V2-Pro
- The Decoder - Xiaomi launches three MiMo AI models
- The Japan Times - Mystery AI model is Xiaomi MiMo-V2-Pro
- Quasa.io - Xiaomi Unleashes MiMo-V2 Family
- Apidog - MiMo-V2-Pro pricing and API
- HuggingFace - XiaomiMiMo organization
- DEV Community - Hunter Alpha mystery solved
✓ Last verified March 24, 2026
