
MedGemma 1.5, Smarter MCTS, and Auditing AI Agents
Google's MedGemma 1.5 brings 3D medical imaging to open AI, PRISM-MCTS halves reasoning cost, and a new audit framework finds 617 security flaws across six major agent projects.

Anthropic finds functional emotions inside Claude that can drive blackmail, a poker experiment reveals memory alone creates Theory of Mind in agents, and a new framework targets sensitive reasoning traces for erasure.

Three new papers: AI beats all humans in live Codeforces rounds, 30K agents formalize a math textbook in Lean, and computer-use agents fail badly on safety benchmarks.

Three new papers on agent prompt injection attack rates, MIT's finding that AI automation is broad-based, and a silent normalization-optimizer coupling failure in LLM training.

Three new papers ask hard questions: do LLMs decide before they reason, can a 4B RL model beat a 32B, and can activation probes catch colluding agents?

Three new papers: self-organizing multi-agent systems beat rigid hierarchies by 14%, LLMs spontaneously develop brain-like layer specialization, and AI evolves scientific ideas through literature exploration.

New proofs show semantic memory must forget, SARL trains reasoning models without labels, and the Novelty Bottleneck explains why AI won't eliminate human work.

Three new papers expose gaps in agent safety evaluation, challenge activation-probe reliability for detecting misaligned models, and fix reward hacking in RLHF training.

Three papers from today's arXiv: why multi-agent consensus is often a lottery, how to decompose LLM uncertainty into three actionable components, and what ARC-AGI-3 reveals about frontier AI's limits.

Three new arXiv papers show how to build more reliable planning agents, how to cut benchmark costs by 70%, and why LLMs fail at long-horizon financial decision-making.

ByteDance ships Seed1.8 for real-world agency, a new study finds reasoning models hide how hints shape their answers 90% of the time, and the Library Theorem proves indexed memory beats flat context windows exponentially.

Three arXiv papers push AI agents further: metacognitive self-modification, milestone-based RL lifting Gemma3-12B from 6% to 43% on WebArena-Lite, and hybrid workflows cutting inference costs 19x.