Open source

Kimi K2.5 vs Qwen3.5-35B-A3B: Frontier Powerhouse Meets the Tiny Giant Killer

A detailed comparison of Kimi K2.5 and Qwen3.5-35B-A3B - a 1T parameter frontier model with agent swarms versus a 35B model that runs on a single consumer GPU.

DeepSeek V3.2

DeepSeek V3.2 is a 671B-parameter MoE model activating 37B per token that delivers frontier-class reasoning and coding at the lowest API price in the industry - $0.14/$0.28 input, $0.42 output per million tokens.

GLM-4.7-Flash

Zhipu's GLM-4.7-Flash is a 30B-A3B MoE model that posts 59.2% on SWE-bench Verified and 79.5% on tau2-Bench while running on a single RTX 4090 - MIT licensed and free via the Z.AI API.

Google Gemma 3 27B

Google Gemma 3 27B is a 27B dense multimodal model supporting text and vision with a 128K context window, 140+ languages, and single-GPU deployment - the most capable open model at its size class.

Llama 4 Maverick

Meta's Llama 4 Maverick packs 400B total parameters into a 128-expert MoE architecture with only 17B active per token, beating GPT-4o on Chatbot Arena while matching DeepSeek V3 on reasoning at half the active parameters.

Llama 4 Scout

Meta's Llama 4 Scout is a 109B-total, 17B-active MoE model with 16 experts and a 10M-token context window - the longest of any open-weight model - with native multimodal support for text and images.

Microsoft Phi-4

Microsoft's 14B dense transformer that consistently beats models 5x its size on MATH and GPQA, available under the MIT license for unrestricted commercial use.

Mistral Large 3

Mistral Large 3 is a 675B-parameter MoE model activating 41B per token with native multimodal support, a 256K context window, and Apache 2.0 licensing - Europe's first frontier-class open-weight model.

Mistral Small 3.2

Mistral Small 3.2 is a 24B dense model with strong function calling, multimodal vision, and 128K context under Apache 2.0 - optimized for production tool-use pipelines and EU-compliant deployments.

Nemotron 3 Nano 30B-A3B

NVIDIA's hybrid Mamba2+MoE model packs 31.6B total parameters but activates only 3.2B per token, delivering frontier-class reasoning with 3.3x the throughput of comparable models on a single H200 GPU.

Qwen3.5-122B-A10B vs DeepSeek V3.2: Efficiency vs Raw Power in Open-Weight AI

A benchmark-by-benchmark comparison of Qwen3.5-122B-A10B and DeepSeek V3.2 - the efficiency-optimized underdog versus the brute-force open-source heavyweight.

Qwen3.5-122B-A10B vs Llama 4 Maverick: The Efficiency Gap Nobody Expected

A data-driven comparison of Alibaba's Qwen3.5-122B-A10B and Meta's Llama 4 Maverick - two open-weight MoE models with radically different approaches to parameter efficiency and benchmark performance.

← Previous

Open source

Google Analytics