
OpenAI o3
OpenAI's most advanced reasoning model, built for math, science, coding, and visual tasks, with 200K context and adaptive chain-of-thought at $2/$8 per million tokens.


Updated May 2026: DeepSeek V4-Flash reasoning now $0.28/MTok output (8x cheaper than R1), o3-pro launched at $20/$80, Grok 4 retires May 15 - verified pricing across 11 models.

xAI opened Grok 4.3 to all API developers on May 6 with an 83% output price cut, 1M-token context, native video input, and document generation - plus five legacy models retiring May 15.

Moonshot AI closed a $2B round at a $20B valuation, four times its end-2025 valuation, on the strength of its Kimi open-weight models and $200M ARR.

OpenAI's new default ChatGPT model cuts hallucinations by 52.5% and adds Gmail-backed personalization while maintaining the low latency of its predecessor.

Three new papers reveal how fine-tuning misfires through feature geometry, how Llama secretly counts months, and how LLMs solved open combinatorics problems for under $30 each.

Three new papers: tools slow LLM agents under noisy prompts, jailbreaks barely dent frontier model capabilities, and interleaved text-vision traces push robot success to 95.5%.

Nebius agrees to acquire 20-person MIT inference startup Eigen AI for $643M, betting that optimizing every token per Nvidia chip is the real moat in the AI infrastructure race.

A peer-reviewed Science study puts OpenAI o1 through 76 live emergency room cases - and the model beats expert physicians on initial triage, with 67.1% accuracy against the physicians' 55% and 50%.

Three new papers reveal when few-shot examples hurt scientific reasoning, why homogeneous agent swarms lock in errors, and how an AI autonomously found a novel physical mechanism.

Rankings of AI models by cost efficiency in May 2026, comparing performance per dollar across frontier and budget models. Updated with DeepSeek V4, GPT-5.5, and Kimi K2.6.

Three papers: a 2-4x async RL training speedup, an alarming 54.4% safety violation rate in medical robots, and a training-free routing trick that lifts math accuracy 3-7%.