Science

How AI Agents Break - Plus Fixes for Memory and Tools

Three arXiv papers map how LLM agents fail across 19 benchmarks, show that in-process memory cuts retrieval latency by 1,000x, and reveal steering vectors that control tool invocation.

Research Replication, Safe Responses, Verifiable Reasoning

Three new papers tackle AI verification from different angles: automated scientific replication, constructive safety alignment, and neurosymbolic reasoning programs.

AI Research Bias, Deception Probes, and Code Exploits

AI agents reproduce 72% of the ideological bias found in human research, lie detectors improve with model scale, and Mastermind beats iterative vulnerability agents by 7 points.

Agent Safety Gaps, Memory Learning, and Leaner Inference

Three new papers expose how production agent frameworks fail under attack, why RLVR training discards useful cross-episode signals, and how calibrated confidence cuts inference compute by 12x.

Science Agents, Jailbreak Defense, and Open-World Failures

Three papers from today's arXiv: graph-native RL generates traceable scientific hypotheses, HARC defeats jailbreaks by coupling internal safety directions, and ICML 2026's OpenAgent shows how distributional shift breaks tool-use agents.

Agent Phase Collapse, Reasoning Exits, Preference Gaps

Three new arXiv papers map capability cliffs in agent world models, the narrow benefit of learned reasoning stops, and a 56% accuracy ceiling when agents help users build preferences.

Agent Languages, Sampling Ceilings, and Abstention

Three new papers on agents inventing symbolic languages to cut reasoning tokens by 3-6x, sampling ceilings that waste inference compute, and context engineering to double agentic abstention rates.

Tandem Training, World Models, and Efficient Agents

Three new arXiv papers on making RL reasoning legible across models, fixing broken world model latent states, and training small agents to beat their teachers.

Refusal Gaps, Prompt Bleed, and Scaling's Logic Limit

Three new papers reveal how LLM safety hinges on persona training, how prompt modules interfere in deployed agents, and why scaling alone cannot reach symbolic reasoning.

Quantization's Hidden Tax, Cliff Tokens, Smarter Memory

Three new arXiv papers reveal hidden costs in quantized reasoning models, single-token failure triggers, and a new framework that cuts agent memory errors by up to 79%.

AI Diagnosis, Cache Efficiency, and Agent Security

Three papers from today's arXiv: a 32B medical model beats DeepSeek-R1 at rare-disease diagnosis, a KV cache method keeps 97% accuracy with 3% of the memory, and a new benchmark red-teams agentic AI systems.

AI Research: Orchestration Beats Scale, Small Models Win

Sakana Fugu tops SWE-Bench Pro by routing tasks across rival LLMs, Microsoft's 9B browser agent beats OpenAI Operator, and a 3B model from Weibo matches DeepSeek V3.2 on math.