Articles Tagged "MoE"

Qwen3.5-122B-A10B

Qwen3.5-122B-A10B is a 122B-parameter MoE model that activates 10B parameters per token. It narrows the gap between mid-sized and frontier models, with top scores on GPQA Diamond (86.6), MMMU (83.9), and OCRBench (92.1). Apache 2.0 licensed.

Qwen3.5-35B-A3B

Qwen3.5-35B-A3B is a 35B-parameter MoE model that activates just 3B parameters per token and surpasses the previous Qwen3-235B flagship across language, vision, and agent benchmarks. Apache 2.0 licensed.

Qwen3.5-Flash

Qwen3.5-Flash is Alibaba's hosted production model, offering a 1M-token context window, built-in tools, and multimodal support at $0.10/M input tokens, making it one of the cheapest frontier-tier APIs available.

Gemini 3.1 Pro

Google DeepMind's Gemini 3.1 Pro leads on 13 of 16 benchmarks, scoring 77.1% on ARC-AGI-2 and 94.3% on GPQA Diamond, and offers a 1M-token context window at $2/M input tokens.

GPT-OSS 20B

GPT-OSS 20B is OpenAI's open-weight 21B-parameter MoE reasoning model. It offers a 131K-token context window, an Apache 2.0 license, and o3-mini-level benchmark performance while running in 16 GB of memory.