Qwen3.6-Max-Preview
Alibaba's first closed-weights flagship Qwen ships with a 256K context window, tops six agentic coding benchmarks, and ranks third on the Artificial Analysis Intelligence Index.

Overview
Qwen3.6-Max-Preview is Alibaba's first flagship Qwen model released without weights. Announced on April 20, 2026, it ships as a hosted-only API through Qwen Studio and Alibaba Cloud Model Studio, reachable at the endpoint qwen3.6-max-preview. There's no Hugging Face upload, no self-hosting path, and no quantized release. For a team that spent years building its reputation as the open challenger to Western proprietary labs, that's a pivot worth taking seriously.
TL;DR
- Alibaba's best model for agentic coding, topping six benchmarks on release day including SWE-bench Pro and Terminal-Bench 2.0
- 256K context, text-only, API-only with OpenAI and Anthropic-compatible endpoints, no weights release
- Ranks #3 of 203 models on the Artificial Analysis Intelligence Index at a composite score of 52, behind only GPT-5.4 and Claude Opus 4.7
The open-weights sibling from the same family, Qwen3.6-35B-A3B, still ships freely on Hugging Face under Apache 2.0 and runs on a single RTX 4090 at Q4 quantization. Max is the tier above. Alibaba appears to be running the same playbook Meta adopted with Muse Spark: open at the mid tier, closed at the top. On the same day Max-Preview launched, the free tier of Qwen Code shut down.
Alibaba's Xixi campus in Hangzhou houses the Qwen research team, which shifted to a tiered open/closed release strategy on April 20, 2026.
Source: commons.wikimedia.org
Key Specifications
| Specification | Details |
|---|---|
| Provider | Alibaba (Qwen Team) |
| Model Family | Qwen 3.6 |
| Parameters | Not disclosed |
| Context Window | 256K tokens |
| Input Modalities | Text only (no images, no video at launch) |
| Output Modality | Text |
| Input Price | Preview (undisclosed; reporting cites $6/M input) |
| Output Price | Preview (undisclosed; reporting cites $24/M output) |
| Release Date | April 20, 2026 |
| License | Proprietary (API only) |
| API Endpoint | qwen3.6-max-preview |
| API Compatibility | OpenAI chat-completions + Anthropic messages |
| Max Output Tokens | 8,192 (per current reporting) |
| Reasoning | Extended thinking with preserve_thinking parameter |
Pricing numbers are preview-tier and indicative only. The Artificial Analysis provider page lists evaluation cost at $0.00 during the preview period. Independent coverage from Lushbinary cites the expected production rate at $6 per million input tokens and $24 per million output tokens. Alibaba hasn't published an official rate card.
Benchmark Performance
Alibaba's first-party evaluation places Qwen3.6-Max-Preview first across six coding benchmarks: SWE-bench Pro, Terminal-Bench 2.0, SkillsBench, QwenClawBench, QwenWebBench, and SciCode. The numerical gains over Qwen3.6-Plus, its predecessor, look like this:
| Benchmark | Gain vs Qwen3.6-Plus |
|---|---|
| SciCode | +10.8 |
| SkillsBench | +9.9 |
| QwenChineseBench | +5.3 |
| NL2Repo | +5.0 |
| Terminal-Bench 2.0 | +3.8 |
| ToolcallFormatIFBench | +2.8 |
| SuperGPQA | +2.3 |
On the Artificial Analysis Intelligence Index v4.0, which blends ten evaluations covering reasoning, knowledge, math, and coding, the model scores 52. That ranks it third out of 203 assessed models, behind only GPT-5.4 and Claude Opus 4.7 and ahead of every open-weight model on the board.
Head-to-head with the competition
Absolute scores are harder to pin down than ranking claims because Alibaba hasn't published full benchmark tables for Max itself, only deltas from Plus. Lilting Channel's independent read, derived from Artificial Analysis data and the first-party release notes, puts the picture like this:
| Benchmark | Qwen3.6-Max | Claude Opus 4.7 | GPT-5.4 | Kimi K2.6 |
|---|---|---|---|---|
| SWE-Bench Pro | 57.3 | 58.6 | - | 58.6 |
| Terminal-Bench 2.0 | 65.4 | 65.4 | 75.1 | 66.7 |
| AA Intelligence Index | 52 | 56 | 58 | - |
| GDPval-AA | 51.0 | - | 83.0 | - |
| QwenWebBench (ELO) | 1558 | 1182 (v4.5) | - | - |
On the two shared coding axes, Qwen3.6-Max and Kimi K2.6 are basically tied, with Kimi slightly ahead on SWE-Bench Pro and Terminal-Bench. The Max model pulls decisively ahead on Alibaba's in-house front-end benchmark QwenWebBench, posting an ELO of 1,558 against Claude Opus 4.5's 1,182 on web development tasks covering apps, games, SVG generation, and data visualization in both English and Chinese.
One flag worth noting: Qwen3.6-Max-Preview produced 74M output tokens during Artificial Analysis evaluation, against a field median of roughly 24M. Three times the verbosity at equivalent quality is a latency and cost problem at scale. High verbosity in preview models often gets dialed back before general availability, but if it doesn't, that's a material competitive drawback against Claude and GPT-5.
Access to Qwen3.6-Max-Preview routes through Alibaba Cloud Model Studio, a compliance wrinkle for US and EU teams that previously self-hosted open-weight Qwen releases.
Source: commons.wikimedia.org
Key Capabilities
Agentic coding. The six benchmark wins are consistent across the agentic coding category: terminal execution (Terminal-Bench 2.0), software engineering (SWE-bench Pro), skill composition (SkillsBench), tool use (QwenClawBench), web development (QwenWebBench), and scientific programming (SciCode). This isn't cherry-picking one dimension. The model is targeting full-stack agent workflows, and the scores line up behind that positioning.
Preserve thinking across turns. The preserve_thinking parameter keeps chain-of-thought traces intact between conversation turns rather than regenerating them. For multi-step agent runs, that cuts overhead on iterative planning and keeps context coherent across tool calls. The same flag exists on Qwen3.6-35B-A3B and Qwen3.6-Plus, so it isn't Max-specific, but it's part of what makes this family worth looking at for production agents.
Dual-API compatibility. The qwen3.6-max-preview endpoint accepts both OpenAI chat-completions format and Anthropic messages format on the same URL (dashscope-intl.aliyuncs.com/compatible-mode/v1). Teams already wired to either SDK can switch providers with nothing more than a base URL change. This is becoming standard among Chinese labs - GLM-5.1 and Kimi K2.6 both ship dual-spec endpoints - but it still meaningfully lowers migration cost.
Pricing and Availability
Access is through two surfaces: Qwen Studio, the browser-based chat interface, and Alibaba Cloud Model Studio, the production API. There's no Hugging Face mirror, no third-party hosting on OpenRouter or Fireworks as of launch day, and no announced timeline for wider availability.
The preview period currently shows $0 per token on the Artificial Analysis evaluation page. Production pricing hasn't been published by Alibaba. Independent reporting cites expected rates of $6/$24 per million tokens, which would place Max roughly between Claude Opus 4.7 and GPT-5.4 on cost. Until the rate card goes live, treat those numbers as unverified.
For reference, Qwen3.6-35B-A3B is free under Apache 2.0, and Qwen3.6-Plus lists at roughly $0.50/$3.00 per million tokens on Alibaba Cloud. That's a 12x input price jump if the $6 estimate holds.
Strengths
- Tops six coding benchmarks on release day, including SWE-bench Pro and Terminal-Bench 2.0
- Third on the Artificial Analysis Intelligence Index at 52, behind only GPT-5.4 and Claude Opus 4.7
- OpenAI and Anthropic-compatible endpoints mean near-zero migration cost from either competitor
preserve_thinkingparameter keeps chain-of-thought context across multi-turn agent runs- QwenWebBench ELO of 1558 leads the field on front-end code generation by a wide margin
- Pricing, once published, is expected to undercut Claude Opus 4.7 on both input and output
Weaknesses
- No weights release means no self-hosting, no fine-tuning, and no air-gapped deployment
- Text-only at launch; no image input, no video, no tool-native multimodality
- 256K context is a step down from Qwen3.6-Plus's 1M token window
- 74M output tokens per evaluation run vs a field median of 24M signals high verbosity that hurts latency and cost at scale
- Pricing not yet published, making TCO modeling impossible against Claude Opus 4.7 or GPT-5.4
- Routing customer data through Alibaba Cloud creates GDPR and compliance questions that open-weight Qwen releases didn't have
- No third-party hosting on OpenRouter or Fireworks at launch limits reach beyond Alibaba Cloud customers
Related Coverage
- Alibaba's Qwen3.6-Max Ships Closed - Tops Six Coding Evals - Our launch-day news coverage
- Qwen3.6-35B-A3B Model Card - The open-weight sibling in the same family
- Kimi K2.6 Ships with Agent Swarm - The open-weight coding flagship that dropped a day earlier
- Claude Opus 4.7 Model Card - Intelligence Index peer
- GPT-5.4 Model Card - Intelligence Index peer
- Coding Benchmarks Leaderboard - Full rankings where Qwen3.6-Max competes
- Overall LLM Rankings April 2026 - Where this model sits across categories
Sources
- Qwen3.6-Max-Preview: Benchmarks, API & Review - buildfastwithai.com
- Qwen3.6-Max Preview: Coding SOTA + Closed-Weights Pivot - digitalapplied.com
- Alibaba Drops Qwen 3.6 Max Preview, Its Most Powerful Model Yet - Decrypt
- Qwen3.6 Max Preview - Artificial Analysis
- Qwen3.6-Max-Preview and Kimi K2.6 lined up - lilting.ch
- Qwen3.6-Max-Preview vs Plus vs Kimi K2.6 - lushbinary.com
- Alibaba releases Qwen3.6-Max preview - CnTechPost
- Alibaba Unveils Qwen3.6-Max-Preview - BigGo Finance
- Qwen3.6 Max Preview - Dataconomy
- Alibaba Cloud Model Studio documentation
✓ Last verified April 21, 2026
