Articles Tagged "Research"

Fix 8% of Tokens, Dodge Memory Attacks, Cut Agent Costs

New research pinpoints the 8% of tokens driving reasoning failures, exposes memory laundering in agent systems, and cuts web agent inference costs by 1.9x.

Best Perplexity Alternatives in 2026: 7 Tools Compared

Seven Perplexity alternatives compared on citation quality, pricing, and research depth - from ChatGPT Search and Kagi to Grok DeepSearch and developer-focused Exa.

Self-Correcting Models, Smarter Monitors, AI Designs Itself

Three new papers tackle critique dependency in LLMs, ensemble monitoring for AI control, and agents that autonomously discover better neural architectures.

Open Agent Leaderboard: Model Beats Architecture

IBM Research tests 25 agent configurations across 6 real-world benchmarks and finds backbone model choice matters 58x more than agent framework design.

arXiv Hits Researchers With 1-Year Ban for AI Slop

arXiv is issuing one-year submission bans to authors whose papers contain verifiable unvetted AI output, as fabricated academic citations have increased tenfold since 2023.

Physics Predicts AI Risk, Math Still Hard, Tokens Saved

A physics formula predicts AI behavioral shifts before they happen, a benchmark shows LLMs fail at 90% of graduate math formalization, and a training-free method cuts synthetic data costs by up to 78%.

SU-01

SU-01 is a 30B-A3B MoE reasoning model from Shanghai AI Lab that achieves gold-medal performance on IMO 2025, USAMO 2026, and IPhO 2024/2025 using a three-stage training recipe and test-time scaling.

Olympiad Gold, Broken Memories, and Attention Loss

A 30B model earns IMO gold, memory consolidation silently corrupts agents, and a new metric predicts when LLMs lose track of their instructions.

Reasoning Bias, Behavior Cues, and Tool Interpretability

New research shows reasoning length amplifies position bias, behavior cues cut wasted tokens by 50% while boosting safety, and sparse autoencoders can predict tool failures from model internals.

NVIDIA Ising Review: AI Models for Quantum Hardware

NVIDIA Ising is the first open AI model family for quantum computing - a 35B VLM for processor calibration and CNN decoders for real-time error correction, already deployed at 20+ research institutions.

AI2 Fires Up $152M Blackwell Cluster for Open Science

AI2's federally backed OMAI compute cluster is now running on NVIDIA Blackwell Ultra hardware and has already shipped OLMo, Molmo 2, and MolmoAct models fully open to researchers.

Agent Overload, Blind Attention, Unsafe Traces

Three new papers show that adding more agent components backfires, reasoning models hide unsafe thinking, and vision-language models waste most of their attention.