Articles Tagged "Long Context"

Qwen3.6-Max-Preview

Alibaba's first closed-weights flagship Qwen ships with a 256K context window, tops six agentic coding benchmarks, and ranks third on the Artificial Analysis Intelligence Index.

Claude Sonnet 4.6

Anthropic's mid-tier model matches Opus 4.6 on computer use, leads all models on office productivity tasks, and at $3/$15 per million tokens costs one-fifth as much as the flagship.

Claude's 1M Context Window Now GA - No Premium Pricing

Anthropic made the 1M-token context window generally available for Claude Opus 4.6 and Sonnet 4.6, dropping the long-context pricing premium entirely - a 900K-token request now costs the same per token as a 9K one.

GPT-5.4

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

GPT-5.4 Lands with Computer Use and 1M Token Context

OpenAI ships GPT-5.4 with built-in computer use that beats human desktop performance, a 1 million token context window, and native Excel and Google Sheets integrations.

What Is an AI Context Window? A Plain-English Guide

A beginner-friendly guide to AI context windows: what they are, why they matter, and how to use them to get better results from any AI chatbot.

DeepSeek V3.2 vs V4 - What Changes With a Trillion Parameters

A pre-release comparison of DeepSeek V3.2 and V4 - examining the generational leap from 671B text-only to a trillion-parameter natively multimodal model with 1M context.

Kimi K2.5 vs Llama 4 Scout: Benchmark King Meets Context King

Comparing Kimi K2.5 and Llama 4 Scout - Moonshot AI's benchmark-crushing trillion-parameter model versus Meta's 10-million-token context window specialist.

Gemini 2.5 Flash-Lite

Google's cheapest Gemini model pairs a 1M-token context window with pricing of $0.10/$0.40 per million tokens, multimodal input, and 359 tokens/second throughput for high-volume production workloads.

GPT-4o mini

OpenAI's budget API workhorse pairs 128K context with pricing of $0.15/$0.60 per million tokens, solid coding benchmarks, and the broadest third-party ecosystem of any small model.

Llama 4 Maverick

Meta's Llama 4 Maverick packs 400B total parameters into a 128-expert MoE architecture with only 17B active per token, beating GPT-4o on Chatbot Arena while matching DeepSeek V3 on reasoning at half the active parameters.

Llama 4 Scout

Meta's Llama 4 Scout is a 109B-total, 17B-active MoE model with 16 experts and a 10M-token context window - the longest of any open-weight model - with native multimodal support for text and images.
