Ministral 3 14B
Mistral AI's largest Ministral 3 model - 14B parameters, 256K context, Apache 2.0 license, multimodal, built for local deployment and agentic workflows.

Ministral 3 14B is the largest model in Mistral AI's Ministral 3 family, released on December 2, 2025 with the 3B and 8B variants. It combines a 13.5B-parameter language core with a 0.4B vision encoder for a total of 14 billion parameters - multimodal, Apache 2.0 licensed, and targeting local deployments where the weight class matters.
TL;DR
- Best-in-class reasoning at 14B: 85.0% on AIME 2025 and 71.2% on GPQA Diamond using the reasoning variant
- 256K context window, native function calling, vision support, 40+ languages, Apache 2.0 license
- Beats Qwen3-14B (73.7% AIME 2025) and Gemma 3 12B on core benchmarks despite having fewer nominal parameters
Mistral released the Ministral 3 series as its answer to edge and private deployment demand. The 14B sits at the top of a three-model stack (3B, 8B, 14B), sharing the same dense architecture, base training curriculum, and Apache 2.0 license. Each size comes in base, instruct, and reasoning variants. Mistral's claim is that the 14B delivers "performance comparable to its larger Mistral Small 3.2 24B counterpart" - a comparison that holds on instruction-following benchmarks but breaks down on some coding tasks where the 24B holds an edge.
The model is 68.6% smaller than Mistral Small 3.2 by parameter count but pushes past it on MMLU Redux (82.0% base vs lower), Arena Hard (55.1%), and AIME 2025 reasoning - while fitting in 24 GB VRAM at FP8, a threshold reachable on a single consumer GPU tier like the RTX 4090. That tradeoff is Ministral 3 14B's core pitch.
Key Specifications
| Specification | Details |
|---|---|
| Provider | Mistral AI |
| Model Family | Ministral |
| Parameters | 14B (13.5B language model + 0.4B vision encoder) |
| Architecture | Dense transformer, GQA (32Q / 8KV heads), SwiGLU, RMSNorm, RoPE + YaRN |
| Layers | 40 transformer layers, hidden dim 5120, FFN dim 16384 |
| Context Window | 256K tokens (262,144 exact) |
| Input Price | $0.20/M tokens |
| Output Price | $0.20/M tokens |
| Release Date | December 2, 2025 |
| License | Apache 2.0 |
| Modalities | Text + image input |
| Languages | 40+ (including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic) |
Benchmark Performance
All numbers below come from Mistral's official HuggingFace model card for Ministral-3-14B-Instruct-2512 and the arXiv technical report (2601.08584). Reasoning variant scores use the Ministral-3-14B-Reasoning-2512 checkpoint.
Instruct Variant
| Benchmark | Ministral 3 14B | Qwen3 14B | Gemma 3 12B |
|---|---|---|---|
| Arena Hard | 55.1% | 42.7% | 43.6% |
| WildBench | 68.5 | 65.1 | 63.2 |
| MATH Maj@1 | 90.4% | 87.0% | 85.4% |
| MM MTBench | 8.49 | N/A | 6.70 |
Reasoning Variant
| Benchmark | Ministral 3 14B (Reasoning) | Qwen3-14B (Thinking) |
|---|---|---|
| AIME 2025 | 85.0% | 73.7% |
| AIME 2024 | 89.8% | 83.7% |
| GPQA Diamond | 71.2% | 66.3% |
| LiveCodeBench v6 | 64.6% | 59.3% |
Base Model
| Benchmark | Ministral 3 14B Base | Qwen3 14B Base |
|---|---|---|
| MMLU Redux | 82.0% | 83.7% |
| Multilingual MMLU | 74.2% | 75.4% |
| MATH CoT | 67.6% | 62.0% |
| ARC-Challenge | 89.9% | N/A |
| TriviaQA | 74.9% | 70.3% |
| MBPP | 71.6% | N/A |
| GPQA Diamond (base) | 39.9% | N/A |
The instruct model leads Qwen3-14B on every disclosed benchmark. The base model comparison is closer - Qwen3-14B Base edges it on MMLU Redux and Multilingual MMLU but trails on math and trivia. Where Ministral 3 14B separates cleanly from the field is the reasoning variant: 85.0% on AIME 2025 is the headline number, beating Qwen3-14B Thinking's 73.7% by 11 points. That gap matters for users who need chain-of-thought depth rather than raw chat quality.
Artificial Analysis places the model at intelligence rank #15 of 73 assessed models in its class (score: 16, vs median 12 for non-reasoning open-weight models of similar size). Output speed at 82.8 tokens/second is below the class median of 97.8 t/s - not a concern for offline batch or agentic workflows, but worth noting for latency-sensitive chat applications. For a broader view of 14B-class reasoning models, see our reasoning benchmarks leaderboard.
Key Capabilities
Vision and Multimodal
The model's 0.4B vision encoder handles image captioning, document OCR with bounding box extraction, chart analysis, and visual question answering. The architecture is the same vision stack used in Mistral Small 3.1 and shares components across the Ministral 3 family. Input resolution is optimized for roughly square aspect ratios (1:1), and the recommended approach for non-square inputs is to maintain aspect ratio rather than force a crop.
Agentic Workflows and Function Calling
Native function calling and structured JSON output are first-class features. Mistral designed the Ministral family with tool use in mind - the instruct checkpoint is tuned for dialogue, tool invocation, and structured output rather than raw pretraining perplexity. The 256K context window means multi-step agentic sessions with long tool result chains don't require manual truncation. That's a different posture than most 14B models, which cap at 32K or 128K tokens. See our function calling benchmarks leaderboard for how the field compares on structured output.
Local and Private Deployment
At 24 GB VRAM in FP8, the model fits a single RTX 4090 or a Mac with 32 GB unified memory. Further quantization with GGUF brings requirements down to 12-16 GB for INT4/INT8 variants available via Ollama (ollama pull ministral-3:14b) and LM Studio. The Apache 2.0 license puts no commercial restrictions on local deployment - no royalty requirements, no usage reports required.
Token Efficiency
Mistral's technical report notes that the Ministral 3 family "often produces an order of magnitude fewer tokens" than competing models while matching performance. The 14B instruct variant's WildBench score of 68.5 with a lower average output length than Qwen or Gemma equivalents matters directly for API cost in production workloads.
Pricing and Availability
At $0.20/M tokens for both input and output, Ministral 3 14B isn't the cheapest option in its size class. Artificial Analysis ranks it #60 of 73 models on input pricing, meaning most 14B-class models are cheaper via third-party providers. The official Mistral endpoint price is $0.20/M; the model is also available at the same price via OpenRouter.
| Provider | Input Price | Output Price | Notes |
|---|---|---|---|
| Mistral AI (La Plateforme) | $0.20/M | $0.20/M | Official API |
| OpenRouter | $0.20/M | $0.20/M | Multi-provider routing |
| Amazon Bedrock | Varies | Varies | Available since Dec 2025 |
| IBM WatsonX | Varies | Varies | Enterprise tier |
| Together AI | Varies | Varies | Available |
| Fireworks | Varies | Varies | Available |
| Ollama / LM Studio | Free | Free | Self-hosted GGUF |
For cost-sensitive applications, the Ministral 3B at $0.04/M or the 8B variant at lower pricing are the better picks. The 14B earns its price when reasoning depth, vision, or long-context (>128K tokens) is required. Our cost efficiency leaderboard tracks per-provider pricing comparisons across the full Mistral lineup.
Strengths and Weaknesses
Strengths
- Reasoning variant hits 85.0% on AIME 2025, best-in-class at 14B
- 256K context window - 2x larger than Mistral Small 3.2's 128K
- Apache 2.0 with no commercial restrictions
- Multimodal (vision + text) in the base release
- 40+ language support including strong multilingual MMLU
- Fits in 24 GB VRAM (FP8) - single consumer GPU tier
- Native function calling and JSON output for agentic use
- Three variants (base, instruct, reasoning) in one model family
Weaknesses
- $0.20/M pricing is expensive relative to comparable 14B-class open-weight options
- Output speed (82.8 t/s) is below average for the class - slower than Qwen3-14B at comparable quality
- Arena Hard score of 55.1% trails Mistral Large 3 notably
- Vision capability is limited relative to dedicated multimodal models
- Requires 24 GB VRAM at FP8 - rules out mid-tier consumer GPUs (8-16 GB)
- HuggingFace model card recommends temperature 0.1 for production use, limiting creative output quality
Related Coverage
- Ministral 3B - The 3B sibling in the same family, at $0.04/M tokens
- Mistral Small 3.2 - The 24B model Mistral positions as the comparison target
- Mistral Large 3 - Mistral's flagship for full enterprise workloads
- Reasoning Benchmarks Leaderboard - Full AIME/GPQA/LiveCodeBench rankings
- Function Calling Benchmarks Leaderboard - Structured output and tool use comparisons
- Cost Efficiency Leaderboard - Per-provider pricing across all Mistral tiers
- Edge and Mobile LLM Leaderboard - Sub-20B models ranked by hardware fit and performance
FAQ
Is Ministral 3 14B open source?
Yes. The weights are released under Apache 2.0, which permits commercial use, redistribution, and modification without royalties. Available at mistralai/Ministral-3-14B-Instruct-2512 on HuggingFace.
What hardware does Ministral 3 14B need?
24 GB VRAM at FP8. With INT8 or INT4 GGUF quantization, requirements drop to 12-16 GB, covering the RTX 3090 and similar. Self-hosting via Ollama or LM Studio is practical on a single high-end consumer GPU.
How does it compare to Mistral Small 3.2 24B?
Ministral 3 14B matches or beats Small 3.2 on instruction following and math benchmarks (per llm-stats.com), while offering 2x the context window (256K vs 128K). Small 3.2 holds an edge on coding tasks with a higher coding index.
What is the difference between the instruct and reasoning variants?
The instruct variant (Ministral-3-14B-Instruct-2512) is tuned for dialogue, tool use, and structured outputs. The reasoning variant (Ministral-3-14B-Reasoning-2512) adds chain-of-thought post-training, reaching 85.0% on AIME 2025 at the cost of longer, more verbose outputs.
Does it support images?
Yes. The model includes a 0.4B vision encoder for image captioning, document OCR with bounding boxes, and visual question answering. Use square-ish input images for best results per Mistral's documentation.
Sources:
- Introducing Mistral 3 - Mistral AI (December 2025)
- Ministral 3 14B Model Card - Mistral AI Docs
- Ministral-3-14B-Instruct-2512 - HuggingFace
- Ministral 3 Technical Report - arXiv:2601.08584
- Ministral 3 14B - Artificial Analysis
- Ministral 3 14B 2512 - OpenRouter
- Ministral 3 14B Architecture - APXML
- Mistral Large 3 and Ministral 3 family on Amazon Bedrock - AWS
✓ Last verified June 8, 2026
