Articles Tagged "Long Context"

Best AI Models for Text Summarization - June 2026

Gemini 2.5 Flash Lite still leads the Vectara hallucination leaderboard at 3.3%, while two new entries - Gemini 3.5 Flash and Mistral Large 3 at $0.50/M - have shifted the value picture considerably since March.

Qwen3.7-Max

Alibaba's agent-first flagship model with a 1M-token context window, topping Terminal-Bench 2.0 and SWE-Bench Pro at roughly one-sixth the cost of Claude Opus 4.7.

Best LLMs with 1M+ Context Window in 2026

A practical comparison of every production LLM with a 1M+ token context window - verified pricing, real retrieval notes, and clear picks for different workloads.

Best Models for Long-Context Retrieval - May 2026

Claude Opus 4.6 leads MRCR v2 8-needle at 78% across 1M tokens, while Opus 4.7 regressed sharply. GPT-5.5 and DeepSeek V4 Pro are the key new entrants in May 2026.

SubQ Review: 52x Faster, but Show Your Work

Subquadratic's SubQ claims the first linear-scaling LLM with a 12M-token window - but private beta access, self-reported benchmarks, and a 17-point MRCR gap make independent verification the only test that matters.

SubQ

SubQ is the first LLM built on a fully subquadratic attention architecture, achieving a 12M-token research context and 52x faster inference than FlashAttention at 1M tokens.

SubQ Launches: 12M-Token Context on Sub-Quadratic AI

Subquadratic exits stealth with SubQ, the first frontier model built on a sparse-attention architecture, along with a $29M seed round and a 12M-token context window at a fraction of the cost of Opus.

Qwen 3.6 Max Review: Alibaba's Coding Contender

Qwen3.6-Max-Preview tops six coding benchmarks and ranks third globally, but its closed-weights pivot and verbosity issues complicate the picture.

DeepSeek V4

DeepSeek V4 ships in two open-weight MoE variants - V4-Pro at 1.6T/49B active and V4-Flash at 284B/13B active - both with 1M-token context and MIT license, released April 24, 2026.

MIT's Recursive Language Models Bypass the Context Ceiling

MIT researchers show that treating long documents as a Python environment - and letting models recursively spawn sub-models to explore them - beats RAG and extended context windows on every benchmark tested.

EXAONE 4.5

LG AI Research's first open-weight vision-language model packs 33B parameters, 262K context, and STEM scores above GPT-5-mini - but ships under a non-commercial license.

Qwen3.5-Omni

Alibaba's Qwen3.5-Omni takes text, images, audio, and video as input and streams both text and speech output in a single end-to-end model with a 256K context window.
