Articles Tagged "Long Context"

Qwen3.6-Max-Preview

Alibaba's first closed-weights flagship Qwen ships with a 256K context window, tops six agentic coding benchmarks, and ranks third on the Artificial Analysis Intelligence Index.

Claude Sonnet 4.6

Anthropic's mid-tier model matches Opus 4.6 on computer use, leads all models on office productivity tasks, and costs one-fifth as much as the flagship at $3/$15 per million tokens.

GPT-5.4

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

Gemini 2.5 Flash-Lite

Google's cheapest Gemini model pairs a 1M-token context window with $0.10/$0.40 per-million-token pricing, multimodal input, and 359 tokens/second throughput for high-volume production workloads.

GPT-4o mini

OpenAI's budget API workhorse pairs 128K context with $0.15/$0.60 per-million-token pricing, solid coding benchmarks, and the broadest third-party ecosystem of any small model.

Llama 4 Maverick

Meta's Llama 4 Maverick packs 400B total parameters into a 128-expert MoE architecture with only 17B active per token, beating GPT-4o on Chatbot Arena while matching DeepSeek V3 on reasoning at half the active parameters.

Llama 4 Scout

Meta's Llama 4 Scout is a 109B-total, 17B-active MoE model with 16 experts, a 10M-token context window (the longest of any open-weight model), and native multimodal support for text and images.