Articles Tagged "AI Agents"

Meta Demos Neural Computers - But They Can't Do Math

A 19-person Meta AI and KAUST team including Jürgen Schmidhuber proposes Neural Computers - systems where the neural network itself is the running computer, trained solely on screen recordings.

Arcee's Trinity-Large: 398B Open Reasoning at $0.90

Arcee AI ships Trinity-Large-Thinking, a 398B sparse MoE reasoning model under Apache 2.0 that hits 91.9% on PinchBench for $0.85 per million output tokens on OpenRouter.

Shopify AI Toolkit Lets Claude Code Run Your Store

Shopify's free, MIT-licensed AI Toolkit connects Claude Code, Cursor, and Gemini CLI directly to live store data, giving agents real-time access to API schemas, documentation, and store execution.

Microsoft Open-Sources Runtime Security for AI Agents

Microsoft released the Agent Governance Toolkit, a seven-package open-source system that enforces policies on autonomous AI agents at sub-millisecond latency and covers all 10 OWASP agentic risks.

AI Agent Failures Need Escrow, Not Just Safety Training

Researchers from Google DeepMind, Microsoft, and Columbia propose financial guardrails for AI agents, with simulations showing up to 61% reduction in user losses.

Perplexity Hits $450M ARR After Agents Pivot

Perplexity's annual recurring revenue crossed $450M in March after a 50% single-month jump driven by its Computer agent platform - not search.

Muse Spark

Meta's first closed-source frontier model scores 52 on the Artificial Analysis Intelligence Index, leads on HealthBench Hard, and ships free at meta.ai - but has no public API yet.

Anthropic Launches Managed Agents - Runs Your AI for You

Anthropic released Claude Managed Agents in public beta today, a fully managed platform that handles sandboxing, state, and tool execution so developers can skip building agent infrastructure from scratch.

Best AI Models for Agentic Tool Use - April 2026

Claude Opus 4.6 leads SWE-bench Verified at 80.8% and OSWorld at 72.7% for agentic tasks, while GPT-5.4 ties for computer use; no single model dominates every workflow type.

MedGemma 1.5, Smarter MCTS, and Auditing AI Agents

Google's MedGemma 1.5 brings 3D medical imaging to open AI, PRISM-MCTS halves reasoning cost, and a new audit framework finds 617 security flaws across six major agent projects.

Best AI Chatbot Builders 2026: 6 Platforms Tested

A hands-on comparison of the top AI chatbot builder platforms in 2026, covering pricing, features, integrations, and which type of team each tool fits.

Utah Clears AI to Renew Psychiatric Meds Autonomously

Utah becomes the first government in the world to approve an AI system to autonomously renew psychiatric medication prescriptions, limiting it to 15 lower-risk drugs under a tightly supervised pilot.

← Previous