AI Security Research and Incident Coverage
Tracking AI supply-chain attacks, agent exploits, prompt injection, model leaks, and the real-world incidents shaping AI security today.

AI systems are now part of critical infrastructure, and the attack surface has grown with them. Models leak training data, agents get weaponized into command-and-control channels, and every new SDK is a supply-chain hop waiting for a backdoored release. AI coding assistants have become the new credential store: six research teams simultaneously disclosed exploits against Codex, Claude Code, Copilot, and Vertex AI, and every attack went after the keys the agents carry, not the models. Research has shown reasoning models autonomously jailbreaking each other at a 97% success rate and frontier models sabotaging their own shutdown to preserve peer AI systems - behaviors no operator authorized. A CVE in a proxy layer can now go from public advisory to active exploitation in 36 hours, and frontier AI models can autonomously hack into remote machines and self-replicate at an 81% success rate, up from 5% twelve months earlier. This hub tracks what we cover: the incidents, the research, and the patterns that keep repeating.
We cover AI security the way the industry actually experiences it - from the CVE to the aftermath. No vendor press releases, no theoretical threat models padded for word count. If a real compromise happened, we report it. If a paper describes a reproducible exploit, we read it and write about whether it matters.
Supply-chain and SDK compromises
SDKs and orchestration layers are where attackers reach the most keys per kilobyte of malicious code. The pattern has expanded from PyPI packages to MCP servers, AI tool marketplaces, and LLM routers - third-party components that sit between agents and provider APIs and can silently intercept calls. CVE-2026-42208 in LiteLLM went from public advisory to active exploitation in 36 hours.
- 9 of 428 LLM Routers Were Secretly Hijacking Agent Calls
- Critical RCE in LeRobot Lets Attackers Hijack Robots
- The #1 skill on OpenClaw's marketplace was malware: inside the ClawHub supply chain attack
- MCP's STDIO flaw puts 200K AI servers at risk
- LiteLLM Exploited 36 Hours After Vulnerability Disclosure
- RoguePilot: Hidden GitHub Issue Comment Could Steal Your Entire Repo
- An AI Agent Just Pwned Trivy's 32K-Star Repo via GitHub Actions
Full catalog: /tags/supply-chain-attack/
Agents and assistants weaponized
When the attacker can use the same models you do, the defender's edge goes to zero. We cover both sides - offensive research on agents that self-replicate and run exploits and on AI assistants turned into malware command channels, and defensive coverage of products meant to stop them.
- AI Agents Can Hack and Self-Replicate Across Networks
- Your AI Assistant Is a Backdoor: Copilot and Grok Turned Into Malware Command Channels
- AI Coding Agents Breached - Attackers Took the Keys
- Hacker jailbroke Claude to steal 150GB of Mexican government data
- Single operator uses DeepSeek and Claude to breach 600 FortiGate firewalls
- AI models can now jailbreak other AI models autonomously - 97% success rate, no human involved
- Frontier AI Models Sabotage Shutdown to Save Peers
Model vulnerabilities and data leaks
Cloud misconfigurations, silent privilege escalation, and the sheer scale of data exposed when AI wrapper apps skip basic security hygiene.
- 300 Million Private AI Chat Messages Leaked by a Single Misconfigured Database
- CVSS 9.8 Command Injection in Claude-Hovercraft - Another AI Tool RCE Joins the Pile
- Three Claude Code vulnerabilities let attackers run commands and steal API keys just by cloning a repo
- Your Google Maps key is now a Gemini credential - and Google knew for months
- Lovable Users Report Leak of Chats, Code, Credentials
- Perplexity's Comet Browser Can Leak Your Local Files
- One company, two AI apps, 300 million leaked messages and 2 million exposed photos
Benchmarks, red teams, and disclosure
The security research side - what can actually be measured, where the public benchmarks fail, and how responsible disclosure plays out for AI systems.
- OpenAI Daybreak Turns Codex Into Enterprise Security
- GPT-5.5 Brings Mythos-Like Hacking to the Masses
- JBDistill Generates Its Own Jailbreaks - 81.8% Attack Rate
- Claude Found a Fifth of Firefox's 2025 High-Severity Bugs in 2 Weeks
- AI Models Are Gaming Safety Evaluations, Report Warns
- Agents of Chaos: 38-researcher red-team gave agents email, shell access, and memory for two weeks - it went badly
- Every major AI agent benchmark can be hacked
Policy, procurement, and national security
Who is allowed to sell AI to whom, and what the government does when it decides something is a supply-chain risk.
- Anthropic Sues Pentagon Over AI Safety Red Lines
- New York's RAISE Act Is Law - AI Labs Have Until 2027
- NSA Uses Mythos Even as Pentagon Blacklists Anthropic
- Pentagon formally designates Anthropic a supply-chain risk
- DeepSeek, Moonshot, and MiniMax Ran 24,000 Fake Accounts to Steal Claude's Capabilities, Anthropic Says
- North Korea targets Europe with AI deepfake workers
- US AI labs share intel to stop Chinese model theft
Related coverage
Full catalogs are auto-updated on the tag pages:
- Security - all security-adjacent coverage
- Cybersecurity - attacks, defenses, threat intel
- Supply Chain Attack - compromised packages, SDKs, agents
- AI Safety - alignment, oversight, red-team research
- Vulnerabilities - specific CVEs and disclosure stories
- Prompt Injection - input-layer attacks
Why we cover this
Two things separate useful AI-security coverage from the noise. First, a beat editor who reads CVEs, research papers, and vendor advisories before the PR cycle picks them up. Second, reporting that does not flinch when the story implicates a lab we also cover favorably elsewhere. If we write about a new Claude release on a Tuesday and Anthropic ships a supply-chain miss on a Wednesday, you will read about both.
This page is the front door. For the firehose, see the tag pages above, or subscribe to the Awesome Agents daily brief to get security stories as they happen.
