
AI2 Fires Up $152M Blackwell Cluster for Open Science
AI2's federally backed OMAI compute cluster is now running on NVIDIA Blackwell Ultra hardware and has already shipped OLMo, Molmo 2, and MolmoAct models fully open to researchers.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

AI2's federally backed OMAI compute cluster is now running on NVIDIA Blackwell Ultra hardware and has already shipped OLMo, Molmo 2, and MolmoAct models fully open to researchers.

Three new papers show that more agent components backfire, reasoning models hide unsafe thinking, and vision-language models waste most of their attention.

Three new papers deliver a runtime safety firewall for agent tools, challenge how we measure AI alignment, and introduce elastic context management for long-horizon search agents.

Three new papers reveal how agent memory silently breaks, how a tiered architecture recovers it, and how models can self-improve without human labels.

Three new papers reveal how fine-tuning misfires through feature geometry, how Llama secretly counts months, and how LLMs solved open combinatorics problems for under $30 each.

REDMOD, Mayo Clinic's radiomics AI, detects 73% of pancreatic cancers in CT scans that look normal to radiologists - nearly double the rate specialists achieve.

Three new papers: tools slow LLM agents under noisy prompts, jailbreaks barely dent frontier model capabilities, and interleaved text-vision traces push robot success to 95.5%.

A peer-reviewed Science study puts OpenAI o1 through 76 live emergency room cases - and the model beats expert physicians on initial triage with 67.1% accuracy against 55% and 50%.

Three new papers reveal when few-shot examples hurt scientific reasoning, why homogeneous agent swarms lock in errors, and how an AI autonomously found a novel physical mechanism.

Three papers: 2-4x async RL training speedup, alarming 54.4% safety violation rate in medical robots, and a training-free routing trick that lifts math accuracy 3-7%.

Three papers show LLM self-correction hurts above a key threshold, map AI deception with 14%-72% detection gaps, and prove million-agent societies fail without interaction depth.

A data-driven comparison of six AI tools journalists are actually using in 2026, covering transcription, research, fact-checking, and writing.