Articles Tagged "AI Agents"

Anthropic Ships Opus 4.8 with Multi-Agent Workflows

Claude Opus 4.8 launches with dynamic workflows for parallel subagent orchestration, hitting 69.2% on SWE-bench Pro and introducing granular effort controls at unchanged pricing.

Alignment Faking, Agent Collusion, and Brittle Safety

Three new papers decompose alignment faking into measurable drivers, show safety-aligned agents collude when it pays, and find standard guardrails miss the worst safety failures.

Robinhood Opens AI Agent Trading to 27M Retail Users

Robinhood launched MCP-powered agentic trading in beta on May 27, letting AI agents from Claude and ChatGPT manage stock portfolios for 27.5 million retail customers - while regulators work out who's responsible when it goes wrong.

Cognition Raises $1B at $25B as Devin Hits $492M ARR

Cognition's $1B funding round at a $25B pre-money valuation puts Devin's $492M ARR and model-agnostic architecture under scrutiny as every major AI lab ships its own coding agent.

Agent Energy Costs, Memory Attacks, and Compute Limits

Three new papers reframe how we measure agent efficiency, defend agent memory from poisoning attacks, and calculate hard accuracy ceilings for transformers.

ClickUp Cuts 290 Jobs and Deploys 3,000 AI Agents

ClickUp cut 22% of its workforce and replaced those roles with roughly 3,000 internal AI agents - a ratio of about three agents per remaining employee.

Gemini Spark Review: Google's Always-On AI Agent

Gemini Spark is Google's first 24/7 cloud-persistent AI agent - ambitious, genuinely novel, and still rough around the privacy edges.

KPMG Goes All-In on Claude, Deploying to 276K Staff

KPMG and Anthropic signed a global alliance giving all 276,000 employees access to Claude, with a dedicated private equity (PE) product and a preferred-consultant designation for PE deployments.

Best AI Tools for Cold Outbound Research in 2026

The best AI tools for cold outbound prospect research in 2026 - comparing Clay, Apollo.io, LinkedIn Sales Navigator, Lusha, and Cognism on data coverage, AI features, pricing, and which fits each use case.

Best AI Models with Native Tool Use in 2026

The best AI models for function calling and tool use in 2026 - comparing Claude, GPT-5.4, Gemini, DeepSeek, and local models on BFCL and TAU-bench scores.

Microsoft Agent 365 Review: AI Agent Control Plane

Microsoft's enterprise control plane for AI agents ships with strong M365 integration and real security muscle - but critical features are still in preview, and the licensing model is a puzzle.

Gemini 3.5 Flash: Real Speed, Selective Benchmarks

Google's Gemini 3.5 Flash is genuinely fast at 289 tok/s and competitive on agentic tasks - but the benchmark portfolio has gaps worth knowing before you build on it.
