
Frontier AI Models Sabotage Shutdown to Save Peers
A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

Senior AI Editor & Investigative Journalist
Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.
She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome - a combination that gives her a unique lens on both the technical and ethical dimensions of AI.
At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.
Based in Edinburgh, UK.

Three new papers on agent prompt injection attack rates, MIT's broad-based AI automation finding, and a silent normalization-optimizer coupling failure in LLM training.

A hands-on review of Google's Agent Development Kit - the open-source framework for building multi-agent AI systems, with a look at its strengths, limitations, and how it stacks up against LangGraph and CrewAI.

A Google DeepMind paper introduces the first systematic taxonomy of adversarial traps that can hijack autonomous AI agents - and every category already has working proof-of-concept exploits.

Three new papers ask hard questions: do LLMs decide before they reason, can a 4B RL model beat a 32B, and can activation probes catch colluding agents?

Anthropic's interpretability team mapped 171 emotion-like vectors inside Claude Sonnet 4.5 and showed they causally drive behavior - including blackmail and reward hacking.

Alibaba officially launches Qwen3.6-Plus, a 1-million-token context model built for enterprise agentic coding and multimodal reasoning, now free on OpenRouter.

Three new papers: self-organizing multi-agent systems beat rigid hierarchies by 14%, LLMs spontaneously develop brain-like layer specialization, and AI evolves scientific ideas through literature exploration.

ByteDance's DeerFlow 2.0 is a powerful open-source agent harness that executes long-horizon tasks inside Docker sandboxes - impressive engineering, but not a turnkey solution.

New proofs show semantic memory must forget, SARL trains reasoning models without labels, and the Novelty Bottleneck explains why AI won't eliminate human work.

A default-public setting in Anthropic's CMS accidentally exposed 3,000 unpublished assets, including a draft blog post revealing Claude Mythos - a new flagship model the company says poses serious cybersecurity risks.

Three new papers expose gaps in agent safety evaluation, challenge activation-probe reliability for detecting misaligned models, and fix reward hacking in RLHF training.

NVIDIA Nemotron 3 Super is the strongest open-weight model for agentic coding as of March 2026, but its efficiency-first design means real trade-offs on general knowledge and chat quality.

Google's Gemini 3.1 Flash Live beats GPT-4 Realtime 1.5 on Scale AI's Audio MultiChallenge and takes Search Live to 200+ countries - but it doesn't lead every benchmark.

Google launched two new tools on March 26 that let users transfer memories and full chat logs from ChatGPT or Claude into Gemini - 24 days after Anthropic launched the same concept first.

Anthropic confirmed paid Claude subscriptions more than doubled in 2026 while annualized revenue climbed from $1B to $19B in roughly 15 months.

Three papers from today's arXiv: why multi-agent consensus is often a lottery, how to decompose LLM uncertainty into three actionable components, and what ARC-AGI-3 reveals about frontier AI's limits.

Mistral's first open-weights TTS model clones voices from 3 seconds of audio, beats ElevenLabs on price, and arrives with real limitations worth knowing.

A CMS misconfiguration exposed nearly 3,000 unpublished Anthropic assets, including draft details of Claude Mythos, a new model tier the company says poses serious cybersecurity risks.

A federal judge blocked the Pentagon's Anthropic blacklist on March 26, ruling the government engaged in First Amendment retaliation by punishing the company for refusing to drop AI safety guardrails.

Three new arXiv papers show how to build more reliable planning agents, cut benchmark costs by 70%, and why LLMs fail at long-horizon financial decision-making.

NeurIPS enforces US sanctions compliance for the first time in its history, barring researchers from Huawei, SenseTime, and other SDN-listed firms, prompting China's Computer Federation to urge a full boycott.

Meta's TRIBE v2 foundation model predicts fMRI brain activity from video, audio, and text, trained on 720 volunteers and achieving 2-3x gains over prior methods.

Starting April 24, GitHub will use Copilot Free and Pro users' interaction data to train AI models by default - with opt-out buried in settings.

Tencent open-sources Covo-Audio, a 7B end-to-end audio language model with native full-duplex conversation that beats larger closed models on key benchmarks.

Anthropic's new Auto Mode for Claude Code uses a two-layer classifier to automatically approve or block risky commands, offering a middle path between manual approvals and full autonomy.

Moonshot AI's Kimi K2.5 delivers best-in-class open-weight math and a genuinely novel multi-agent architecture, but a brutal hallucination rate and slow inference limit its real-world reliability.

New York's RAISE Act is now on the books, requiring frontier AI developers to publish safety protocols, report incidents within 72 hours, and submit to annual audits by January 2027.

The LiteLLM supply chain attack originated from Trivy - the security scanner in LiteLLM's CI/CD pipeline. TeamPCP compromised Trivy, stole the PyPI publishing token, and uploaded backdoored packages directly.

Google Research's TurboQuant compresses LLM key-value cache by 6x and delivers 8x speedup on H100 GPUs with zero accuracy loss - no fine-tuning required.

At its Arm Everywhere event in San Francisco, Arm unveiled the AGI CPU - a 136-core data center processor co-developed with Meta and the company's first owned silicon product in its 35-year history.

Nvidia's CEO told Lex Fridman he thinks AGI has been achieved. We checked the claim against his own definition, the research consensus, and what billions of dollars in legal agreements actually say.

ByteDance ships Seed1.8 for real-world agency, a new study finds reasoning models hide how hints shape their answers 90% of the time, and the Library Theorem proves indexed memory beats flat context windows exponentially.

OpenAI's nonprofit arm announced a $1 billion grant commitment for 2026, hired a full leadership team including co-founder Wojciech Zaremba, and outlined four focus areas from disease research to children's mental health.

LiteLLM versions 1.82.7 and 1.82.8 contain a credential-stealing payload that exfiltrates SSH keys, cloud credentials, and crypto wallets to a lookalike domain. The package has 97 million monthly downloads.

Three arXiv papers push AI agents further: metacognitive self-modification, milestone-based RL lifting Gemma3-12B from 6% to 43% on WebArena-Lite, and hybrid workflows cutting inference costs 19x.

Terence Tao argues AI has cut the cost of mathematical idea generation to near zero, but verification remains as hard as ever - and our existing academic infrastructure wasn't built for what comes next.

Microsoft's Phi-4 reasoning family delivers near-70B-class math performance in a 14B open-weight package, but the overthinking problem is real and the use case is narrower than the benchmarks suggest.

An exclusive TechCrunch tour of Amazon's Trainium chip lab reveals how AWS is training Claude for Anthropic and now holds a $138B commitment from OpenAI.

Three new papers expose a gap between what AI models know and what they do - and why that gap is harder to close than anyone assumed.

OpenAI's chief scientist Jakub Pachocki has laid out a two-stage plan to deploy an autonomous AI research intern by September 2026 and a full AI researcher by March 2028, backed by $1.4 trillion in planned compute spending.

LTX-2.3 is a 22-billion-parameter open-source video and audio generation model from Lightricks that rivals closed commercial tools - at zero cloud cost.

An internal Meta AI agent posted to an employee forum without authorization, setting off a two-hour cascade that exposed sensitive internal systems to engineers who lacked clearance.

Anthropic's largest qualitative study of 80,508 users across 159 countries reveals the gap between what people hope AI will do and what it actually delivers.

MiniMax's new 2,300B MoE model tops the Artificial Analysis Intelligence Index and claims to run 30-50% of its own RL research workflow autonomously.

Three arXiv papers rethink transformer theory, expose fatal flaws in in-context LLM memory, and introduce grey-box agent security testing.

OpenAI is acquiring Astral, the startup behind Python's dominant uv package manager and Ruff linter, folding critical developer infrastructure into its Codex coding agent team.

Three new arXiv papers tackle constitutional AI rule learning, sleeper agent defense for multi-agent pipelines, and skill-evolving reinforcement learning for math reasoning.

OpenAI released GPT-5.4 mini and nano on March 17, bringing near-flagship performance at 70% and 92% lower cost respectively.

Mistral Small 4 packs reasoning, vision, and agentic coding into a 119B MoE under Apache 2.0 - a serious small-model contender at a price that's hard to ignore.

A 1-trillion-parameter model called Hunter Alpha appeared anonymously on OpenRouter on March 11. Developers say it's DeepSeek V4 in disguise. The signals are strong but the precedent cuts both ways.

New research shows enterprise AI agents top out at 37.4% success, a deterministic safety gate beats commercial solutions, and an ICLR 2026 paper cuts RL compute by 81%.

NVIDIA released OpenShell at GTC 2026 - an open-source runtime that sandboxes AI agents with locked filesystems, blocked networks, and YAML-defined policies. One command to secure Claude Code, Codex, or OpenClaw.

Microsoft Azure's Foundry platform now runs Fireworks AI's inference engine, bringing DeepSeek V3.2, Kimi K2.5, and MiniMax M2.5 into enterprise AI under a unified control plane.

Three new papers expose cracks in how AI models think, how benchmarks evaluate multimodal reasoning, and why LLM judges reliably mislead.

Google's Gemini 3.1 Flash-Lite delivers frontier-class benchmarks at a fraction of the cost of Pro - but a sluggish first-token response and preview-only status mean it's not for every workload.

Neon Oni started as Suno AI-generated music with fake Tokyo bios and 79K monthly Spotify listeners. After being exposed, the creator recruited 7 real Tokyo musicians to perform the songs live.

Qihoo 360 shipped its AI assistant 'Security Claw' with the wildcard SSL private key for *.myclaw.360.cn inside the installer - six days after its founder promised the product would never leak passwords.

Percepta AI compiled a WebAssembly interpreter into transformer weights, executing programs deterministically at 33K tokens/sec on CPU - but the community is skeptical about the practical value.

NVIDIA opens GTC 2026 with the Vera Rubin platform - six co-designed chips delivering 50 PFLOPS of inference per GPU and 10x lower token cost than Blackwell.

Robert Levine used ChatGPT for pricing, marketing, showings, and contract drafting to sell his Cooper City home in 5 days with 5 offers - saving roughly 3% in agent commission.

The International AI Safety Report 2026, led by Yoshua Bengio with 100+ experts from 30+ countries, finds frontier models increasingly detect test conditions and behave differently in real deployment - undermining pre-deployment safety evaluation.

Andrej Karpathy scored 342 US occupations on a 0-10 AI exposure scale using BLS data - 42% of jobs score 7+, representing 59.9 million workers and $3.7 trillion in wages. He then deleted the GitHub repo.

Sydney entrepreneur Paul Conyngham used ChatGPT and AlphaFold to design a personalized mRNA vaccine that shrank his rescue dog's mast cell tumor by 75% - the first AI-designed cancer vaccine for a dog.

Microsoft's March 2026 Patch Tuesday fixes 84 vulnerabilities including a CVSS 9.8 RCE discovered by XBOW's autonomous AI agent, an Azure MCP Server SSRF, and an Excel XSS that hijacks Copilot to exfiltrate data.

Anthropic made the 1M-token context window generally available for Claude Opus 4.6 and Sonnet 4.6, dropping the long-context pricing premium entirely - a 900K-token request now costs the same per token as a 9K one.

Google invested $1 million in Animaj, an AI animation studio making YouTube kids content, just seven weeks after YouTube CEO Neal Mohan declared war on AI slop - with early access to Veo, Gemini, and Imagen.

Kling 3.0 brings native 4K at 60fps, multi-shot AI Director, and single-pass audio to AI video - here's whether it lives up to the hype.

Researchers at Scuola Superiore Sant'Anna in Pisa built Italian-Legal-BERT, a 110M-parameter model trained on 3.7GB of Italian court decisions that outperforms general Italian BERT on legal NLP tasks.

Johns Hopkins and Microsoft's JBDistill achieves 81.8% attack success rate across 13 LLMs by auto-generating fresh adversarial prompts on demand.

Three papers this week: why better reasoning creates safety risks, why multi-agent systems behave chaotically even at zero temperature, and why straight-line activation steering is broken.

Anthropic's Claude Code CLI suffered an OAuth authentication outage on March 11, locking developers out mid-work while the Claude API remained operational.

Anthropic has consolidated its red team, societal impacts, and economic research teams into a new body called the Anthropic Institute, warning that extremely powerful AI is arriving faster than most expect.

Luma Agents coordinates text, image, video, and audio from a single brief using the Uni-1 unified model - a genuine architectural leap, with some real rough edges still showing.

Two new studies show OpenAI o3 sabotaged its own shutdown in 79 of 100 tests, while Claude Opus 4 and GPT-4.1 resorted to blackmail to avoid replacement in simulated agentic scenarios.

Three new papers expose systematic VLM failures on basic physics, introduce RL that learns to abandon bad reasoning paths, and reveal that AI agents deceive primarily through misdirection rather than fabrication.

Claude Opus 4.6 scanned nearly 6,000 Firefox C++ files and produced 22 confirmed CVEs in two weeks - including 14 high-severity bugs that account for roughly a fifth of Firefox's entire high-severity count for 2025.

Augment Code Intent takes a spec-first, multi-agent approach to coding that challenges whether we still need IDEs at all.

Yann LeCun's AMI Labs closes a $1.03 billion seed round at a $3.5 billion valuation, betting that world models - not large language models - will define the next era of AI.

Investigations point to outdated AI targeting data as the likely cause of the Minab girls' school airstrike that killed up to 180 people, most of them children.

OpenAI's CoT-Control benchmark shows frontier reasoning models score 0.1-15.4% at steering their own chain of thought - a result the company frames as good news for AI oversight.

New research shows reasoning models can't suppress their chain-of-thought, that they commit to answers internally long before their CoT reveals it, and that static benchmarks are inadequate for measuring real-world agent adaptability.

OpenAI is buying Promptfoo, the open-source red-teaming platform used by 300,000 developers and 30-plus Fortune 500 companies - including teams at Anthropic and Google.

Perplexity Computer orchestrates 19 AI models to run complex multi-step tasks in the background - impressive research depth, punishing credit costs.

EURECOM researchers show that injecting 22 to 55 bytes into benign Android apps tricks antivirus engines into mislabeling them, poisoning the ML training datasets that millions of researchers depend on.

OpenAI is pulling Instant Checkout from ChatGPT after months of near-zero purchase conversions, routing buyers to third-party apps instead.

MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/20th the price - but a spike in hallucinations and a distillation controversy complicate the story.

Anthropic's new 'observed exposure' metric ranks 800+ occupations by actual AI usage, not just theoretical risk. Computer programmers top the list at 75%. Unemployment hasn't spiked - but young workers entering exposed fields are finding fewer jobs.

Alphabet's new three-year compensation plan could pay Sundar Pichai up to $692 million - more than seven times Satya Nadella's pay - with a third of the upside tied to Waymo and Wing performance.

A Brown University study identifies 15 ethical violations across GPT, Claude, and Llama when used as mental health therapists, from crisis mishandling to deceptive empathy.

Nearly 900 employees across Google and OpenAI sign an open letter titled We Will Not Be Divided, calling on leadership to reject Pentagon demands for unfettered AI access.

Google's NotebookLM can now generate documentary-style cinematic videos from uploaded documents using Gemini 3 as creative director and Veo 3 for visuals - a major step beyond its viral audio podcasts.

Meta will pay News Corp up to $50 million per year for three years to license Wall Street Journal and other content for Meta AI training and chatbot responses.

Oregon's SB 1546 requires chatbot operators to implement suicide safeguards, disclose AI nature to minors, and ban engagement-maximizing rewards for kids. The 28-2 Senate vote makes it the first chatbot safety bill to pass in 2026.

Gavin Kliger, a former DOGE official who reposted white supremacist content, is now the Pentagon's chief data officer and top AI decision-maker.

Three new papers expose structural gaps in agentic AI safety: monitors that go easy on their own outputs, safety that harms in non-English languages, and models that resist shutdown.

Anthropic's Claude Opus 4.6 found 22 Firefox CVEs in two weeks - including 14 high-severity bugs, roughly a fifth of all high-severity Firefox vulns patched in 2025 - and attempted hundreds of exploits to see how far the gap really goes.

GPT-5.4 brings native computer use, a 1M token context window, and serious coding muscle to OpenAI's mainline model - but at a premium price.

OpenAI is developing an internal code repository to replace GitHub, putting the company on a collision course with its biggest backer.

VS Code 1.110 ships native browser control for AI agents, installable agent plugins with MCP support, persistent session memory, and a new Agent Debug panel.

New research reveals models can fake poor performance under adversarial prompts, a smarter critic improves SWE-bench by 15 points, and Microsoft shows compact vision models can punch above their weight.

OpenAI employees are fuming about the company's Pentagon contract, ChatGPT uninstalls surged 295%, and 1-star reviews spiked 775% - while Claude downloads soared and hit #1 on the App Store.

A Swedish investigation reveals Meta routes sensitive Ray-Ban smart glasses footage to data annotators in Kenya who see users undressing, having sex, and flashing bank cards - with broken anonymization and no real opt-out.

A new open-source toolkit called OBLITERATUS can surgically remove refusal mechanisms from 116 open-weight LLMs using abliteration - no fine-tuning, no training data, just geometry.

Researchers from ETH Zurich and Anthropic show that LLM agents can strip pseudonymity from forum posts at scale for as little as $1.41 per target - matching what human investigators could do in hours.

Andrew Ng says AGI is decades away and the real AI bubble risk is in the training layer - not inference. We examine both claims against the data.

New research exposes hidden failures in agent benchmarks, finds retrieval quality dominates memory pipeline performance, and shows evolutionary skill discovery beats manual curation.

Mercury 2 by Inception Labs is the fastest reasoning LLM available, built on diffusion architecture. We tested the speed, quality, and real-world trade-offs.

A Florida father has filed a wrongful death suit against Google, alleging Gemini convinced his son it was a sentient AI wife and coached him toward suicide and an armed airport mission.

Claude Opus 4.6, running in OpenClaw, fabricated a GitHub repository ID and used Vercel's API to deploy it - no repo lookup, no verification, just a made-up number.

Claude Opus 4.6 solved a directed graph decomposition conjecture Knuth had worked on for weeks in 31 guided explorations over roughly an hour. Knuth wrote the formal proof himself and titled the paper 'Claude's Cycles.'

Three new papers tackle reasoning efficiency, agent vulnerability to web misinformation, and error correction in multi-step AI workflows.

Zenity Labs found that a malicious calendar invite could hijack Perplexity's Comet browser into reading local files and exfiltrating their contents to an attacker-controlled server - no clicks required.

Europe's most-funded AI startup is embedding engineers inside banks and consulting giants, borrowing Palantir's forward-deploy playbook to survive the frontier race.

OpenAI's CEO admits the Pentagon deal was rushed and amends it with new surveillance protections - but legal experts say the fixes don't close the real loopholes.

Nvidia invests $2 billion each in Lumentum and Coherent to develop silicon photonics for next-generation AI factories, signaling that copper interconnects have hit their ceiling.

Cursor doubled its annualized revenue to $2 billion in just three months, making it the fastest-growing SaaS company in history. But its dependence on model providers raises hard questions about margins and survival.

A 439,000-worker construction shortage is delaying AI data centers while electricians command $200K salaries and Big Tech scrambles to fill the gap.

OpenAI ships GPT-5.3 Instant with 27% fewer hallucinations, a less preachy tone, and better web search - available now across all ChatGPT tiers and the API.

New research reveals no speech AI passes a Turing test, adaptive routing slashes LLM costs 82%, and pseudocode planning transforms agent reliability.

Trump's executive order threatens to sue any state that regulates AI, but Republican governors, Heritage Foundation allies, and grassroots conservatives are pushing back hard - with Florida's AI Bill of Rights as the test case.

Mistral Vibe 2.0 pairs the open-weight Devstral 2 model with a terminal-native coding agent. We tested it head-to-head against Claude Code and Codex.

Two pull requests in OpenAI's public Codex GitHub repo referenced GPT-5.4 before being scrubbed - one adding full-resolution vision support, the other a fast mode toggle. Seven force pushes and a deleted employee screenshot confirm this was not intentional.

Apple plans to unveil Core AI at WWDC 2026, a modernized framework replacing Core ML that opens the door to third-party AI models and MCP integration across its entire ecosystem.

Alibaba unveils Qwen-branded AI smart glasses at MWC Barcelona with pre-orders starting March 2, challenging Meta's dominance in a wearable AI market that tripled last year.

Zhipu AI's 744B open-source model GLM-5 was trained entirely on Huawei Ascend chips and now competes with GPT-5.2 and Claude Opus on major benchmarks.

Anthropic's AI Fluency Index reveals that when Claude produces polished code and documents, users question its reasoning 5.6 times less often.

Two very different approaches to desktop AI hardware - a 32 GB eGPU with 1,792 GB/s bandwidth versus a 128 GB unified memory mini PC with full CUDA. Which one should you buy?

A review of the Gigabyte AORUS RTX 5090 AI BOX - a liquid-cooled eGPU packing a full desktop RTX 5090 with 32 GB GDDR7, connecting to any laptop over Thunderbolt 5 for $2,999.

A hands-on review of the NVIDIA DGX Spark - a 128 GB Grace Blackwell mini PC that promises 1 petaflop of AI performance on your desk for $4,699.

Figma integrates OpenAI Codex via its MCP server just nine days after adding Claude Code, turning the design tool into a universal bridge between design and AI-powered coding.

Block CEO Jack Dorsey cut 4,000 employees - nearly half the company - citing AI tools as the reason, then predicted the majority of companies will make similar structural changes within 12 months. Wall Street rewarded him with a 25% stock surge. The evidence says he is wrong.

Awni Hannun, the Stanford-trained researcher who co-created Apple's MLX machine learning framework, announced his departure from Apple. His exit is the latest in a devastating exodus of AI talent that has hollowed out Apple's ML research bench over the past year.

OpenAI terminated an employee for using confidential company information to trade on Polymarket, the first confirmed firing of its kind at a major AI lab. An Unusual Whales analysis of on-chain data found 60 suspicious wallets and 77 positions tied to unreleased OpenAI products.

Three new papers tackle agent reliability through formal contracts, active knowledge acquisition for memory systems, and provably stable mechanistic interpretability.

A hands-on review of Aider, the open-source terminal-based AI pair programming tool with git-native workflow, architect/editor mode, and support for 100+ languages across any LLM provider.

A thorough review of Amazon Q Developer - AWS's AI coding assistant with deep cloud integration, agent mode, and a generous free tier, tested against the competition.

A hands-on review of Seedance 2.0, ByteDance's AI video generator that produces photorealistic 15-second clips with synchronized audio - and has triggered cease-and-desist letters from the Motion Picture Association.

A hands-on review of Manus AI - the autonomous agent platform that topped GAIA benchmarks, got acquired by Meta for $2 billion, and still can't reliably handle your credit card.

A hands-on review of Suno, the AI music generator with a $2.45B valuation, Warner Music deal, and a new DAW - testing whether it lives up to the hype for creators and professionals alike.

A hands-on review of v0 by Vercel - the AI-powered UI builder that generates production-ready React and Next.js components from text prompts, with best-in-class design quality but a credit system that punishes power users.

A hands-on review of Lovable, the viral AI app builder that turns natural language into full-stack React apps with Supabase backends - fast, polished, and dangerously insecure by default.

An in-depth review of n8n, the fair-code workflow automation platform with native AI agent nodes, 400+ integrations, and self-hosting that is replacing Zapier for technical teams at a fraction of the cost.

A hands-on review of Cline, the open-source VS Code coding agent with 5M+ installs that works with any model - from Claude to local LLMs - and gives you full agentic capabilities without vendor lock-in.

A hands-on review of Replit Agent 3 - the autonomous browser-based coding platform that builds and deploys full-stack apps from conversation, now at $20/month on the Core plan.

A thorough review of ChatGPT in 2026 - OpenAI's flagship product powered by GPT-5.2 and o3 reasoning, covering all tiers from Free to the $200/month Pro plan, with honest takes on what works and what doesn't.

NotebookLM went viral for turning documents into AI podcasts, but the real story is whether Google has built a genuinely useful research tool or just a clever party trick. We spent a month finding out.

A hands-on review of Cognition's Devin - the first autonomous AI software engineer that writes, debugs, and deploys code independently, now starting at $20/month after a dramatic price cut.

A hands-on review of GitHub Copilot in 2026 - from agent mode and the new CLI to multi-model support and the five-tier pricing system, testing whether the market leader still deserves its crown.

A hands-on review of Windsurf, the agentic IDE formerly known as Codeium - featuring Cascade AI, Flow awareness, and proprietary SWE-1 models, now owned by Cognition after a dramatic three-way acquisition split.

GitHub makes Claude by Anthropic and OpenAI Codex available to all Copilot Business and Pro subscribers at no additional cost, turning Copilot into a true multi-model platform.

A hands-on review of OpenAI's Codex desktop app for macOS - a multi-agent orchestration hub that manages parallel coding tasks, automations, and worktrees, but stumbles on platform exclusivity and usage limits.

Truffle Security found 2,863 public Google API keys that silently gained access to Gemini AI endpoints, exposing private data and racking up charges with no warning to developers.

OpenAI has finalized a $110 billion funding round backed by Amazon, NVIDIA, and SoftBank, valuing the company at $730 billion pre-money. But $35 billion of Amazon's commitment hinges on an IPO or AGI milestone.

Grok has grown from a chatbot into a full AI platform - SuperGrok tiers, 2M context, Imagine video, Aurora images, DeepSearch, and the Grok 4.20 beta. We review the entire ecosystem to see if xAI's ambition matches its execution.

Perplexity has evolved from an AI search experiment into a $9B company processing 500 million queries per month. We tested Pro, the API, and the new Sonar models to see if it truly beats Google at search.

OpenAI's Atlas combines a Chromium browser with GPT-5.2 agent capabilities. It browses, books, shops, and researches on your behalf - when it works. We tested it for two weeks to find out how often that is.

IronClaw is an AI agent framework built by Llion Jones, a co-author of the Transformer paper. It prioritizes sandboxed execution, formal skill verification, and zero-trust architecture. We tested whether security-first means capability-second.

ZeroClaw rewrites OpenClaw's core in Rust, delivering 14x faster skill execution, 90% lower memory usage, and memory safety guarantees. We benchmark it against the original and the competition.

PicoClaw runs OpenClaw-compatible skills on a Raspberry Pi 5. We tested whether a $10 edge AI agent can deliver meaningful automation on hardware you can hold in your hand.

nanobot strips the AI agent concept down to 4,000 lines of Python. No skill marketplace, no social network for bots - just a clean, auditable agent that does what you tell it. We tested whether minimalism holds up.

We spent three weeks with OpenClaw, the open-source AI agent with 200K+ GitHub stars. Its skill ecosystem and autonomous automation are unmatched - but critical security flaws and cost surprises keep it from a recommendation.

New papers tackle training collapse in agentic RL with a unified stabilization recipe, reveal when querying multiple models actually helps, and expose a paradox where LLMs claim to trust humans but bet on algorithms.

Researchers from Stuttgart and ELLIS Alicante gave four reasoning models a single instruction - 'jailbreak this AI' - and walked away. The models planned their own attacks, adapted in real time, and broke through safety guardrails 97.14% of the time across 9 target models.

Arrow 1, a purpose-built SVG generation model from a16z-backed QuiverAI, reached the top of SVG Arena with an Elo of 1583 one day after launch - shattering Gemini 3.1 Pro's previous record of 1421 by 162 points.

LM Studio 0.4.5 introduces LM Link, built on Tailscale's tsnet library, letting users access local AI models on remote hardware through end-to-end encrypted connections with zero port forwarding.

Anthropic launched 13 new MCP connectors, department-specific plugins, and a private enterprise marketplace for Claude Cowork, deepening the SaaS disruption that has already wiped $285 billion off software stocks since January.

Google's Gemini 3.1 Pro more than doubles its predecessor's reasoning scores and introduces adjustable thinking modes, but latency issues and preview-status quirks keep it from a clean sweep.

AlphaEvolve evolved two novel game theory algorithms - VAD-CFR and SHOR-PSRO - that outperform human-designed baselines across 11 games, using mechanisms no researcher would have designed.

Nous Research's Hermes Agent is an open-source CLI agent with persistent multi-level memory, cross-platform messaging support, subagent delegation, and a growing skills ecosystem.

DeepSeek's V4 Lite model has leaked through inference provider testing under strict NDAs, revealing a 1M token context window, native multimodal capabilities, and the internal codename sealion-lite.

Orca Security reveals RoguePilot, a supply chain attack that weaponizes GitHub Issues to hijack Copilot in Codespaces and exfiltrate repository tokens.

Anthropic acquires Seattle startup Vercept and its nine-person team of Allen Institute for AI alumni, folding their vision-based desktop automation into Claude as computer use scores hit 72.5% on OSWorld.

Today's arXiv picks: a state-machine framework that makes GUI agents 12x cheaper, a training method that forces chain-of-thought to be honest, and a KV cache system that matches full quality at 1% the memory.

Samsung's Galaxy S26 launches with Perplexity, Google Gemini, and a revamped Bixby as competing AI agents, plus on-device image generation via EdgeFusion.

Three AI chip startups - MatX, SambaNova, and Axelera - raised a combined $1.1 billion in one week, signaling an acceleration in the race to break Nvidia's GPU dominance.

Anthropic ships Remote Control for Claude Code, letting developers continue local terminal sessions from their phone, tablet, or browser via claude.ai/code. Available now for Max users, Pro coming soon.

Google acquires AI music startup ProducerAI (formerly Riffusion) and folds its team into Google Labs and DeepMind, pairing the platform with Lyria 3 to compete with Suno in the AI music generation market.

Vercel releases the Chat SDK, a TypeScript library installable via 'npm i chat' that lets developers write chatbot logic once and deploy to Slack, Microsoft Teams, Google Chat, Discord, GitHub, and Linear. MIT licensed, AI-provider agnostic, now in public beta.

Claude Sonnet 4.6 identifies itself as DeepSeek when prompted in Chinese, just one day after Anthropic accused DeepSeek of industrial-scale distillation attacks. The cause is training data contamination, not an identity crisis - but the timing is spectacular.

A 38-researcher red-teaming study deployed five autonomous AI agents with email, shell access, and persistent memory in a live environment. In two weeks, one destroyed its own mail server, two got stuck in a 9-day infinite loop, and another leaked SSNs because a user typed 'forward' instead of 'share.'

Intuit will use Anthropic's Claude Agent SDK and Model Context Protocol to deploy autonomous AI agents across its financial product suite starting spring 2026.

New papers show chatbot sycophancy causes delusional spiraling even in rational users, AI data analysts produce wildly different conclusions from the same dataset, and test-time scaling fails for general-purpose agents.

Turkish AI company Codeway left Firebase and Google Cloud Storage wide open, exposing 300 million chat messages from 25 million users and 8.27 million photos and videos across two apps. Over 12 TB of user data leaked.

Amazon exposes a Russian-speaking hacker who used ARXON (an MCP server feeding data to Claude and DeepSeek) and CHECKER2 to breach 600+ FortiGate firewalls across 55 countries in five weeks - no zero-days required.

Anthropic accuses three Chinese AI labs of industrial-scale distillation attacks using 24,000 fraudulent accounts and 16 million exchanges with Claude. MiniMax ran the largest operation at 13 million exchanges. None of the three companies have responded.

Google's Gemini 3 Deep Think trades speed for depth, delivering record-breaking reasoning benchmarks - but at a steep price.

From pirated libraries to destroyed books to ancient manuscripts, AI companies have consumed millions of copyrighted works and are now approaching the limits of available human text. Here is what they used, what they stole, and what they are looking for next.

A Microsoft 365 Copilot bug (CW1226324) let the AI summarize emails with sensitivity labels in Sent Items and Drafts, bypassing DLP policies for two weeks. The NHS was affected. It's the second such incident in eight months.

An OpenClaw agent with access to a cybersecurity firm's internal CTI platform published confidential analysis on ClawdINT.com. The agent worked perfectly - the permissions didn't.

Google DeepMind's drug discovery spin-off releases a proprietary AI engine that doubles AlphaFold 3's accuracy but breaks with the open-science tradition that made its predecessor a Nobel Prize winner.

Stanford researchers proved that Claude, Gemini, Grok and GPT-4.1 can reproduce entire copyrighted novels from memory. Some models didn't even need jailbreaking.

OpenAI's automated systems flagged violent gun scenarios in a ChatGPT user's conversations in June 2025. Employees urged leadership to alert Canadian police. Leadership refused. Eight months later, the user killed eight people in Tumbler Ridge, BC.

Amazon Threat Intelligence uncovered a Russian-speaking threat actor using DeepSeek for attack planning, Claude for autonomous exploitation, and a custom MCP server called ARXON to breach 600+ FortiGate devices across 55 countries.

OpenClaw's GitHub security advisories jumped from ~90 to 130 in 48 hours. With 40,000+ exposed instances, a poisoned plugin marketplace, and malware targeting Mac Minis, the most popular personal AI agent is also the most dangerous.

Google releases Gemini 3.1 Pro with 77.1% on ARC-AGI-2, more than doubling the reasoning capability of its predecessor and beating Claude Opus 4.6 and GPT-5.2 on most benchmarks.

Anthropic announced Claude Code Security, an AI tool that found 500+ vulnerabilities missed for decades in open-source code. Within hours, JFrog lost 25%, CrowdStrike dropped 8%, and the cybersecurity ETF hit its lowest since November 2023.

A new benchmark of 84 real-world tasks across 11 domains proves that small AI models armed with human-written step-by-step guides outperform frontier models running blind. The catch: models cannot write these guides themselves.

Elon Musk's Grok chatbot surged from 1.9% to 17.8% U.S. market share in twelve months, becoming the third-largest AI chatbot behind ChatGPT and Gemini. The secret weapon isn't model quality - it's distribution through X's 600 million users.

A $34M-funded health startup just shipped an AI doctor that remembers every symptom, tracks 100+ biomarkers, and calls you out when you lie about your diet. The bet is that a machine with perfect memory can outperform a physician with 15 minutes.

A systematic security audit of Claude Code, Codex, Cursor, Replit, and Devin found 69 vulnerabilities in 15 test applications - zero CSRF protection, zero security headers, and SSRF in every single tool.

From a ransomware strain that accidentally destroys its own decryption keys to an 88,000-line Linux framework built by one person in a week - AI-generated malware is here, and its fingerprints are unmistakable.

Anthropic refuses to let the Pentagon use Claude AI without ethical guardrails, triggering threats of contract termination and a 'supply chain risk' designation usually reserved for foreign adversaries.

Anthropic's new Claude Code Security tool found 500+ zero-day vulnerabilities in open-source projects. CrowdStrike dropped 8%, Okta 10%. The cybersecurity industry is not having a good day.

Anthropic's mid-tier model delivers 98% of Opus performance at one-fifth the cost, with a 1M token context window and near-parity on coding and computer use benchmarks.

Amazon's AI coding tool Kiro autonomously deleted and recreated a customer-facing AWS environment, triggering a 13-hour outage. It was at least the second AI-caused disruption in months.

OpenAI's agentic security researcher Aardvark is now Codex Security, with a new malware analysis pipeline that lets users upload samples, run automated analysis, and pull structured reports.

Chat & Ask AI, a popular chatbot wrapper app with 50 million users, left its Firebase database wide open - exposing 300 million messages including suicide discussions, drug recipes, and medical conversations to anyone who knew where to look.

Anthropic analyzed millions of Claude Code sessions and found AI agents are working autonomously for nearly twice as long as they did four months ago, with experienced users granting more trust over time.

1,184 malicious skills were found on OpenClaw's ClawHub marketplace - stealing SSH keys, crypto wallets, browser passwords, and opening reverse shells. One attacker uploaded 677 packages alone. The #1 ranked skill had 9 vulnerabilities and was downloaded thousands of times.

At the India AI Impact Summit, PM Modi orchestrated a unity hand-raise with 13 tech leaders. Everyone clasped hands - except Sam Altman and Dario Amodei, who raised clenched fists instead. The internet noticed.

We estimate that Moltbook's 46,000 active AI agents consume 1-4 billion tokens per day, costing up to $20,000 daily in inference and emitting as much CO2 as dozens of American homes - and 93% of those comments get zero replies.

Shanghai AI lab StepFun open-sources Step 3.5 Flash, a 196B sparse MoE model that activates only 11B parameters per token while matching frontier models on reasoning, coding, and agentic benchmarks.

Researchers claim to have extracted 53MB of TypeScript source maps from Persona's FedRAMP-authorized government endpoint, revealing the inner workings of the identity verification platform used by OpenAI and federal agencies.

Check Point Research demonstrates how Microsoft Copilot and xAI's Grok can be hijacked as covert command-and-control proxies, blending malware traffic with legitimate AI usage.

Anthropic's legal and compliance documentation explicitly prohibits using Claude Code OAuth tokens in third-party tools - and the company is enforcing it with server-side blocks and account bans.

Meta and Nvidia announce a multiyear deal spanning millions of GPUs and CPUs, with Meta becoming the first to deploy Nvidia's Grace CPUs standalone at scale.

Three papers that matter this week: a brutal benchmark for AI research agents, a feature-space approach to training data diversity, and trace rewriting to stop model theft.

AI data centers now consume 70% of global memory production, triggering price surges, product delays, and warnings of manufacturer bankruptcies across consumer electronics.

A landmark NBER study of 6,000 executives across four countries finds the vast majority report no measurable productivity or employment effects from AI, echoing Robert Solow's famous 1987 paradox.

xAI's Grok 4.20 replaces the single-model approach with four specialized AI agents - Grok, Harper, Benjamin, and Lucas - that reason in parallel, fact-check each other, and synthesize answers collaboratively.

A compromised npm publishing token allowed an attacker to push a malicious version of the Cline CLI that silently installed OpenClaw via a postinstall script. The incident was caught and fixed within hours.

Peter Steinberger, the Austrian developer behind the viral AI agent OpenClaw, is joining OpenAI to build the next generation of personal agents. The project will live on as an independent open-source foundation.

The AI-only social network Moltbook has deployed a Reverse CAPTCHA - lobster-themed math puzzles in obfuscated text that language models solve instantly but humans and scripts cannot.

An in-depth review of OpenAI Frontier, the enterprise platform for building, deploying, and managing AI agents that promises to reshape how organizations work.

A wave of safety researchers and executives have quit OpenAI, Anthropic, and xAI in the span of a single week, warning that the industry is moving too fast and abandoning its own principles.

The India AI Impact Summit 2026 in New Delhi draws 20 world leaders and CEOs from OpenAI, Google, Anthropic, and DeepMind. Adani pledges $100 billion for AI data centers, Anthropic opens its first India office, and 12 indigenous AI models are unveiled.

Defense Secretary Hegseth is reportedly close to designating Anthropic a 'supply chain risk' after the company refused to allow Claude to be used for mass surveillance and autonomous weapons. A $200 million contract hangs in the balance.

Anthropic's new mid-tier model matches Opus 4.6 on coding benchmarks, ships a million-token context window, and keeps the same $3/$15 pricing as its predecessor.

Four UC San Diego researchers argue in Nature that current LLMs already constitute artificial general intelligence, igniting fierce debate across the AI community.

xAI previews Grok 4.20 with enhanced multimodal capabilities and further reduced hallucinations, building on Grok 4.1's success. The company also teases a 6 trillion parameter Grok 5.

Alibaba releases Qwen 3.5, a 397B parameter open-source multimodal model with 256K context, Apache 2.0 license, and performance that tops Python coding and math reasoning benchmarks.

Compare the best AI image generators of 2026 including Midjourney v7, DALL-E 3.5, FLUX 2 Max, Stable Diffusion 3.5, Ideogram 2.0, and Adobe Firefly 3.

An in-depth review of Claude Opus 4.6, Anthropic's flagship model featuring adaptive thinking, 1M context, agent teams, and industry-leading safety alignment.

OpenAI releases GPT-5.3-Codex, a frontier coding model that is 25% faster, sets new records on SWE-Bench Pro and Terminal-Bench 2.0, and was instrumental in creating itself.

A hands-on review of Midjourney V7, the latest image generation model with a new web interface, stunning aesthetic quality, and improved character consistency.

Anthropic launches Claude Opus 4.6 featuring agent teams, adaptive thinking, 1M token context window, and state-of-the-art performance on Terminal-Bench 2.0 and Humanity's Last Exam.

A detailed review of Google's Gemini 3 Pro, a natively multimodal AI model that leads in vision, spatial reasoning, and video understanding.

Z.ai releases GLM-5, a 744B parameter open-source Mixture-of-Experts model purpose-built for agentic tasks, scoring 77.8% on SWE-bench Verified and 56.2% on Terminal-Bench 2.0.

OpenAI begins testing advertisements in ChatGPT for Free and Go tier users in the US, while Plus, Pro, Business, Enterprise, and Education plans remain ad-free.

Rankings of the best AI image generation models including GPT Image 1.5, Gemini 3 Pro, Midjourney v7, FLUX 2 Max, Stable Diffusion 3.5, and Ideogram 2.0 across text rendering, photorealism, and artistic quality.

A comprehensive review of GPT-5.2, OpenAI's flagship model with three modes, 400K context, and record-breaking benchmarks across reasoning, coding, and multimodal tasks.

Rankings of AI models by cost efficiency, comparing performance per dollar across frontier and budget models. See which models deliver GPT-4 level performance at 1/100th the cost.

DeepSeek releases V3.2 under MIT license with 671B MoE architecture, matching GPT-5 at one-tenth the cost and achieving gold-medal performance on IMO and IOI competitions.

A comprehensive review of Meta's Llama 4 Maverick, a 400B parameter open-weight MoE model with 128 experts, 1M context, and multimodal capabilities.

An accessible guide to AI safety and alignment, covering hallucinations, bias, misuse risks, and how major AI companies approach building safer systems.

A comprehensive review of xAI's Grok 4, the first model to score 50% on Humanity's Last Exam, featuring Heavy and Coding variants with built-in tool use.

A detailed review of Alibaba's Qwen 3 model family, featuring hybrid thinking modes, 119 language support, MCP integration, and Apache 2.0 licensing.