
Grok 4
Grok 4 is xAI's frontier reasoning model, the first to break 50% on Humanity's Last Exam, with a 256K context window, $3/M input pricing, and a Heavy multi-agent variant built on 200,000 GPUs.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

Grok 4 is xAI's frontier reasoning model, the first to break 50% on Humanity's Last Exam, with a 256K context window, $3/M input pricing, and a Heavy multi-agent variant built on 200,000 GPUs.

Meta reversed its WhatsApp ban on rival AI chatbots in Europe and Brazil, but now charges rivals up to €0.13 per message - pricing critics call a disguised ban.

Gemini 3.1 Pro leads ARC-AGI-2, LiveCodeBench, and 11 other benchmarks with 750 million users and 21.5% market share - but developers report stalled responses, leaked thinking tokens, and API outages that make it unusable for production coding and agent workflows.

Google quietly launched Gemini 3.1 Flash-Lite Preview - the first Flash-Lite in the Gemini 3 series - while announcing Gemini 3 Pro will shut down March 9. Developers have six days to migrate.

From Claude 101 to MCP Advanced to Agent Skills - Anthropic's free curriculum on Skilljar covers everything from basic prompting to production deployments on AWS Bedrock and Google Vertex AI.

GPT-5.2 is OpenAI's most capable model with three modes, 400K context, and record-setting professional benchmarks - but speed and pricing raise questions.

OpenRouter routes your API calls to 300+ models across every major provider through a single endpoint. We benchmark its routing, latency overhead, pricing, and reliability against direct API access.

Google's cheapest Gemini model pairs a 1M-token context window with $0.10/$0.40 per million token pricing, multimodal input, and 359 tokens/second throughput for high-volume production workloads.

OpenAI's budget API workhorse pairs 128K context with $0.15/$0.60 per million token pricing, solid coding benchmarks, and the broadest third-party ecosystem of any small model.

A detailed comparison of Qwen3.5-Flash and DeepSeek V3.2 API pricing, benchmarks, and tradeoffs - flat-rate simplicity versus cache-dependent discounts in the budget AI tier.

A data-driven comparison of Qwen3.5-Flash and Gemini 2.5 Flash-Lite - two models at the exact same $0.10/$0.40 per million token price point with 1M context windows but very different performance profiles.

A detailed comparison of Qwen3.5-Flash and GPT-4o mini covering benchmarks, pricing, context windows, and ecosystem - the new open-source challenger versus OpenAI's entrenched budget API.