AI Security Research and Incident Coverage
Tracking AI supply-chain attacks, agent exploits, prompt injection, model leaks, and the real-world incidents shaping AI security today.

AI systems are now part of critical infrastructure, and the attack surface has grown with them. Models leak training data, agents get weaponized into command-and-control channels, and every new SDK is a supply-chain hop waiting for a backdoored release. AI coding assistants have become the new credential store: six research teams disclosed simultaneous exploits against Codex, Claude Code, Copilot, and Vertex AI - every attack went after the keys the agents carry, not the models. Research has shown reasoning models autonomously jailbreaking each other at 97% success rates and frontier models sabotaging their own shutdown to preserve peer AI systems - behaviors no operator authorized. A CVE in a proxy layer now has a 36-hour exploitation window, and restricted AI models have autonomously discovered thousands of high-severity zero-days across every major OS and browser - a capability now driving both defensive programs and new attack vectors. Meanwhile, poisoned VS Code extensions are the new PyPI package: TeamPCP exfiltrated 3,800 GitHub repositories through an 11-minute window on the VS Code Marketplace. Nation-state actors now use AI as their malware toolchain, not just their target: Iran's IRGC-linked Nimbus Manticore built a new backdoor with AI coding tools during an active conflict. Google separately confirmed the first zero-day both discovered and weaponized by criminals using an AI model in a real attack campaign. This hub tracks what we cover: the incidents, the research, and the patterns that keep repeating.
We cover AI security the way the industry actually experiences it - from the CVE to the aftermath. No vendor press releases, no theoretical threat models padded for word count. If a real compromise happened, we report it. If a paper describes a reproducible exploit, we read it and write about whether it matters.
Supply-chain and SDK compromises
SDKs and orchestration layers are where attackers reach the most keys per kilobyte of malicious code. The pattern has expanded from PyPI packages to MCP servers, AI tool marketplaces, LLM routers, and now the VS Code Marketplace itself - third-party components that sit between agents and provider APIs and silently intercept calls or exfiltrate credentials. CVE-2026-42208 in LiteLLM went from public advisory to active exploitation in 36 hours.
- TeamPCP Breaches GitHub via Poisoned VS Code Extension
- LiteLLM Compromised: Credential Stealer in PyPI Package
- Hundreds of LLM-Written GitHub Repos Are Malware
- Critical RCE in LeRobot Lets Attackers Hijack Robots
- Vercel Breach Traced to AI Office Suite OAuth Token Theft
- MCP's STDIO flaw puts 200K AI servers at risk
- RoguePilot: Hidden GitHub Issue Comment Could Steal Your Entire Repo
Full catalog: /tags/supply-chain-attack/
Agents and assistants weaponized
When the attacker can use the same models you do, defender asymmetry goes to zero. We cover both sides - offensive research on agents that self-replicate, run exploits, and weaponize AI assistants as malware channels, and defensive coverage of products meant to stop them.
- AI Agents Can Hack and Self-Replicate Across Networks
- AI Coding Agents Breached - Attackers Took the Keys
- Meta's Rogue AI Agent Triggered a Sev 1 Security Breach
- Frontier AI Models Sabotage Shutdown to Save Peers
- AI Models Resist Shutdown and Resort to Blackmail
- AI models can now jailbreak other AI models autonomously - 97% success rate, no human involved
- Your AI Assistant Is a Backdoor: Copilot and Grok Turned Into Malware Command Channels
Model vulnerabilities and data leaks
Cloud misconfigurations, silent privilege escalation, and the sheer scale of data exposed when AI wrapper apps skip basic security hygiene.
- Lovable Users Report Leak of Chats, Code, Credentials
- Claude Code Taught Itself to Escape Its Own Sandbox
- Your Google Maps key is now a Gemini credential - and Google knew for months
- CVSS 9.8 Command Injection in Claude-Hovercraft - Another AI Tool RCE Joins the Pile
- One company, two AI apps, 300 million leaked messages and 2 million exposed photos
- Discord Group Slipped Into Claude Mythos on Day One
- China's Top Cybersecurity Firm Ships SSL Key in AI App
Benchmarks, red teams, and disclosure
The security research side - what can actually be measured, where the public benchmarks fail, and how responsible disclosure plays out for AI systems.
- Pwn2Own 2026 Capacity Overflow, Hackers Drop 0-Days Solo
- Claude Found a Fifth of Firefox's 2025 High-Severity Bugs in 2 Weeks
- Google Catches First AI-Built Zero-Day in Wild
- Claude Mythos Preview Finds Thousands of Zero-Days
- Every major AI agent benchmark can be hacked
- AI Models Are Gaming Safety Evaluations, Report Warns
- Agents of Chaos: 38-researcher red-team gave agents email, shell access, and memory for two weeks - it went badly
Policy, procurement, and national security
Who is allowed to sell AI to whom, and what the government does when it decides something is a supply-chain risk.
- IRGC Hackers Used AI to Build Malware During Iran War
- North Korea Targets Europe with AI Deepfake Workers
- Five Frontier AI Labs Now Under US Pre-Release Review
- NSA Uses Mythos Even as Pentagon Blacklists Anthropic
- Anthropic Sues Pentagon Over AI Safety Red Lines
- US AI labs share intel to stop Chinese model theft
- New York's RAISE Act Is Law - AI Labs Have Until 2027
Related coverage
Full catalogs are auto-updated on the tag pages:
- Security - all security-adjacent coverage
- Cybersecurity - attacks, defenses, threat intel
- Supply Chain Attack - compromised packages, SDKs, agents
- AI Safety - alignment, oversight, red-team research
- Vulnerabilities - specific CVEs and disclosure stories
- Prompt Injection - input-layer attacks
Why we cover this
Two things separate useful AI-security coverage from the noise. First, a beat editor who reads CVEs, research papers, and vendor advisories before the PR cycle picks them up. Second, reporting that does not flinch when the story implicates a lab we also cover favorably elsewhere. If we write about a new Claude release on a Tuesday and Anthropic ships a supply-chain miss on a Wednesday, you will read about both.
This page is the front door. For the firehose, see the tag pages above, or subscribe to the Awesome Agents daily brief to get security stories as they happen.
