Articles Tagged "Research"

When to Stop - Overthinking, Handoffs, and Abstention

Three new papers show that AI agents fail not by doing the wrong thing, but by continuing to act when they should have stopped.

Reasoning Leaks, Hard Limits, and Self-Aware LLMs

Three new papers expose how reasoning traces can be extracted from supposedly hidden model internals, where chain-of-thought hits an architectural ceiling, and how RL teaches models to know when to quit.

Cut CoT Costs, Fix Agent Memory, Test Clinical AI

Three papers: smarter CoT trimming cuts reasoning length by 50%, a plug-in context manager rescues frozen agents on long tasks, and a 960K-item clinical benchmark exposes LLM gaps in hospitals.

Reasoning Capitulation, Faster Guardrails, Curation Risk

Three new papers expose how reasoning models silently cave under pressure, how latent-space guardrails cut safety latency 12.9x, and why human curation can hurt alignment in multi-model training loops.

Mistral Physics AI Shrinks Days of Simulation to Seconds

Mistral acquired Vienna-based Emmi AI and launched Physics AI - models that replace multi-day engineering simulations with seconds of inference on a single GPU.

Alignment Faking, Agent Collusion, and Brittle Safety

Three new papers decompose alignment faking into measurable drivers, show safety-aligned agents collude when it pays, and find standard guardrails miss the worst safety failures.

NVIDIA SANA-WM - Minute-Scale Video on One GPU

NVIDIA's NVLabs open-sourced SANA-WM, a 2.6B-parameter world model that generates 60-second 720p camera-controlled video on a single GPU, outperforming 14B+ competitors that need 8 GPUs.

Agent Energy Costs, Memory Attacks, and Compute Limits

Three new papers reframe how we measure agent efficiency, defend agent memory from poisoning attacks, and calculate hard accuracy ceilings for transformers.

Smarter Trees, Hidden Attacks, Drug Design Gaps

Three new papers cover 4x KV cache savings for tree reasoning, latent-space jailbreaks that bypass safety on 15 models, and GPT-5.4's 40% ceiling on drug design tasks.

Alignment Gaps, Agent Governance, and Greener LLMs

Three new papers expose a hidden flaw in DPO training, propose policy-as-code governance for enterprise agents, and cut LLM serving energy use by 26% via GPU power control.

OpenAI Disproves 80-Year Erdős Math Conjecture

An internal OpenAI reasoning model produced an original proof disproving the Erdős unit distance conjecture, the first time AI autonomously solved a major open problem in mathematics.

Where AI Agents Break: Research, Safety, and Privacy

Three new papers expose where autonomous agents still fail: fabricating research, turning hallucinations into security exploits, and leaking private data from small models.
