Claude Mythos Finds 10K Flaws in Critical Systems
Anthropic is expanding Project Glasswing to roughly 150 new organizations across 15 countries, with Claude Mythos Preview surfacing more than 10,000 high- or critical-severity vulnerabilities since April.

Since April, Claude Mythos Preview has found more than 10,000 high- or critical-severity vulnerabilities across Anthropic's Project Glasswing partners. On June 2, Anthropic announced it's expanding the program to roughly 150 new organizations across 15 countries - power grids, water systems, telecom operators, hospital networks, and chipmakers whose software runs infrastructure that serves hundreds of millions of people.
TL;DR
| Key Stat | Value |
|---|---|
| Vulnerabilities found (total) | 10,000+ high/critical |
| New organizations | ~150 across 15 countries |
| Mozilla Firefox 150 fixes | 271 vulnerabilities |
| Cloudflare bugs found | 2,000 (400 high/critical) |
| Open-source projects scanned | 1,000+ (23,019 flagged, 6,202 high/critical) |
| Anthropic financial commitment | $100M usage credits + $4M to open-source orgs |
Anthropic is explicitly framing this as a race against time. Within 6-12 months, it expects competitors to release similarly powerful cybersecurity models - possibly without the access controls or usage agreements that Glasswing requires. The program's current head start is the entire argument for why the expansion matters now.
How Glasswing Actually Works
The program isn't a single product you install. It's a structured access arrangement: partner organizations get API access to Mythos Preview under a use-agreement that restricts offensive applications, and Anthropic provides support for integrating the model into existing security workflows.
The Scanning Layer
Mythos Preview reads codebases and flags suspicious patterns - memory corruption paths, logic errors in authentication, improper pointer handling - at a speed and scale no human team can match. According to Anthropic, the model has found vulnerabilities in every major operating system and web browser tested so far.
The output follows standard security reporting formats. A typical finding looks roughly like this:
```json
{
  "severity": "critical",
  "cvss": 9.1,
  "type": "heap-use-after-free",
  "location": "dom/media/webrtc/RTCPeerConnectionIdp.cpp:847",
  "description": "Use-after-free in WebRTC ICE candidate handler triggered by malformed SDP offer",
  "recommended_fix": "Null-check before release; add RAII wrapper around candidate lifetime"
}
```
That's a simplified illustration rather than literal SARIF: it stands in for the SARIF-compatible output that security teams pipe into their tracking systems. The actual Mythos output also includes exploit path analysis and patch drafts.
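To make the piping step concrete, here is a minimal Python sketch that wraps a finding shaped like the sample above in a SARIF 2.1.0 log. The `finding_to_sarif` helper, the severity-to-level mapping, and the `example-scanner` tool name are assumptions for illustration; the real Mythos schema may differ.

```python
import json

# Assumed mapping from the sample's severity labels to SARIF result levels.
# SARIF 2.1.0 defines the levels "none", "note", "warning", and "error".
SEVERITY_TO_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
}


def finding_to_sarif(finding: dict, tool_name: str = "example-scanner") -> dict:
    """Wrap one simplified finding in a minimal SARIF 2.1.0 log (hypothetical helper)."""
    # The sample stores its location as "path:line"; split on the last colon.
    path, _, line = finding["location"].rpartition(":")
    result = {
        "ruleId": finding["type"],
        "level": SEVERITY_TO_LEVEL.get(finding["severity"], "warning"),
        "message": {"text": finding["description"]},
        "locations": [
            {
                "physicalLocation": {
                    "artifactLocation": {"uri": path},
                    "region": {"startLine": int(line)},
                }
            }
        ],
        # SARIF property bags carry tool-specific extras such as a CVSS score.
        "properties": {
            "cvss": finding["cvss"],
            "recommended_fix": finding["recommended_fix"],
        },
    }
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": [result]}],
    }


sample = {
    "severity": "critical",
    "cvss": 9.1,
    "type": "heap-use-after-free",
    "location": "dom/media/webrtc/RTCPeerConnectionIdp.cpp:847",
    "description": "Use-after-free in WebRTC ICE candidate handler triggered by malformed SDP offer",
    "recommended_fix": "Null-check before release; add RAII wrapper around candidate lifetime",
}

sarif_log = finding_to_sarif(sample)
print(json.dumps(sarif_log, indent=2))
```

Keeping the CVSS score and the suggested fix in the `properties` bag is one possible design choice: a tracker that understands only core SARIF fields can still ingest the result, and richer tooling can read the extras.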
The Triage Layer
Speed is not the problem. Anthropic's own summary of what's holding the program back: "The bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them."
Mythos can scan an entire browser engine in hours. What it can't do is replace the security engineer who has to read the finding, confirm it isn't a false positive, understand the blast radius, write the patch, get it reviewed, and shepherd it through a release cycle. That pipeline is slow, and it's still human.
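One way a team might protect that scarce human time is to deduplicate and rank findings before anyone reads them. The sketch below is hypothetical: the `Finding` fields, the `build_triage_queue` function, and the sample data are illustrative assumptions, not part of Glasswing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    finding_id: str
    cvss: float
    type: str
    location: str


def build_triage_queue(findings: list[Finding]) -> list[Finding]:
    """Drop duplicate reports and order the rest by CVSS score, highest first."""
    seen: set[tuple[str, str]] = set()
    unique: list[Finding] = []
    for finding in findings:
        key = (finding.type, finding.location)
        if key in seen:
            continue  # same bug class at the same location: review it once
        seen.add(key)
        unique.append(finding)
    # sorted() is stable, so findings with equal scores keep their arrival order.
    return sorted(unique, key=lambda f: f.cvss, reverse=True)


# Hypothetical findings, invented purely to exercise the function.
incoming = [
    Finding("F-1", 6.5, "integer-overflow", "src/parser.c:120"),
    Finding("F-2", 9.1, "heap-use-after-free", "src/media.cpp:847"),
    Finding("F-3", 9.1, "heap-use-after-free", "src/media.cpp:847"),  # duplicate of F-2
    Finding("F-4", 7.8, "auth-bypass", "src/login.py:42"),
]

queue = build_triage_queue(incoming)
print([f.finding_id for f in queue])
```

Ranking does not remove the human steps; it only decides which finding an engineer confirms first.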
The Patch Layer
Some partners are using Mythos for the full cycle - scan, triage, draft patch, verify - while others pipe findings into existing security tools (bug trackers, SAST platforms, SBOM systems). Anthropic hasn't standardized the integration stack, which is both flexible and a real operational ask for organizations without dedicated AI security tooling.
What Mythos Preview Can Actually Do
Anthropic describes Mythos Preview as its most powerful model, and it isn't publicly available. The company has been direct about why: no organization, including Anthropic itself, has developed safeguards strong enough to prevent a model this capable from being weaponized for attacks.
Finding Zero-Days at Scale
The model has surfaced vulnerabilities in Firefox, Chrome, macOS, Windows, and Linux - findings confirmed as valid by the organizations that received them. It doesn't just flag known CVE patterns; it reasons about code paths and identifies novel attack chains that signature-based scanners miss.
Open-Source Coverage
Anthropic ran Mythos across more than 1,000 open-source projects and surfaced 23,019 potential issues. Of the 6,202 tagged as high or critical, independent reviewers confirmed more than 90% as genuine. Given that this open-source software ships inside commercial products and government systems worldwide, those numbers aren't academic.
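The arithmetic behind those figures is worth spelling out. This short Python sketch uses only the numbers quoted above and treats "more than 90%" as a lower bound; the variable names are illustrative.

```python
import math

flagged_total = 23_019       # potential issues surfaced across the scan
high_or_critical = 6_202     # of those, tagged high or critical
confirmed_rate_floor = 0.90  # "more than 90%" confirmed, so this is a floor

# Share of all flagged issues that were tagged high or critical.
share_high_or_critical = high_or_critical / flagged_total

# Smallest whole number of findings consistent with "more than 90%".
confirmed_floor = math.ceil(high_or_critical * confirmed_rate_floor)

print(f"High/critical share of flagged issues: {share_high_or_critical:.1%}")
print(f"Confirmed high/critical findings: at least {confirmed_floor:,}")
```

So roughly 27% of everything flagged was rated high or critical, and at least 5,582 of those findings were judged genuine.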
[Image caption: Critical infrastructure sectors - power, water, healthcare - are now in scope for Glasswing's second cohort.]
The New 150 Partners
The expansion covers sectors where a successful attack could affect 100 million or more people, according to Anthropic's own assessment. Named organizations include Okta, Samsung, SK Hynix, SK Telecom, NATO, and ENISA (the EU's cybersecurity agency).
| Sector | Representative Organizations | Countries |
|---|---|---|
| Communications / Telco | SK Telecom, unnamed European carriers | South Korea, Netherlands, Sweden |
| Hardware / Semiconductors | Samsung, SK Hynix | South Korea, Japan |
| Identity / Security Infra | Okta | Australia, Canada, India |
| Government / Defense | NATO, ENISA | Belgium, France, Germany |
| Power / Water / Healthcare | Undisclosed operators | Spain, Italy, New Zealand, Switzerland |
Countries represented include Australia, Canada, France, Germany, Italy, Switzerland, Netherlands, Spain, Belgium, Sweden, India, Japan, New Zealand, and South Korea - a wider geographic spread than the April cohort, which skewed heavily American.
Results So Far
The first cohort ran from early April through May. Three sets of numbers define what the program has produced.
Cloudflare: 2,000 Bugs
Cloudflare ran Mythos across its critical-path systems and found 2,000 bugs. Of those, 400 (20%) were rated high or critical severity. The company reported that Mythos's false-positive rate was lower than that of human testers - a claim that matters operationally because false positives burn engineering time that security teams don't have.
Firefox 150: 271 Fixes
Mozilla used Glasswing access to audit Firefox before the 150 release. The result was 271 vulnerabilities fixed in that release, more than ten times what an earlier Anthropic model surfaced in a comparable audit. That comparison matters: it shows Mythos isn't just incrementally better. It's a different class of tool for this workload. The official security advisory is MFSA 2026-30.
Open Source: 23,019 Flagged
The open-source scan is the number with the widest implications. Libraries and frameworks in that 1,000-project sample show up in cloud platforms, medical devices, and industrial control systems. Fixing the 6,202 high- or critical-severity findings across that ecosystem is years of work - if the findings get routed to maintainers at all, which isn't guaranteed.
[Image caption: Mythos Preview won't be released publicly - Anthropic says no company has enough safeguards to prevent misuse at this capability level.]
Where It Falls Short
The program has real limits, and Anthropic is mostly honest about them.
The Triage Bottleneck
Mythos produces findings faster than security teams can act on them. This isn't a complaint about the model - it's a structural problem with the security industry. Organizations with thin security teams will get a backlog, not a fix. Smaller critical-infrastructure operators without dedicated security engineering capacity may struggle to extract value even with Glasswing access.
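A toy model shows why thin teams end up with a backlog rather than a fix. The sketch below assumes constant weekly arrival and triage rates, which is a deliberate simplification; the `simulate_backlog` function and every number fed into it are hypothetical.

```python
def simulate_backlog(
    findings_per_week: int,
    triage_capacity_per_week: int,
    weeks: int,
) -> list[int]:
    """Return the untriaged backlog at the end of each week."""
    backlog = 0
    history: list[int] = []
    for _ in range(weeks):
        # New findings arrive, the team clears what it can, and the
        # backlog can never drop below zero.
        backlog = max(0, backlog + findings_per_week - triage_capacity_per_week)
        history.append(backlog)
    return history


# Hypothetical small team: 50 new findings a week, capacity to triage 20.
print(simulate_backlog(findings_per_week=50, triage_capacity_per_week=20, weeks=4))

# Hypothetical well-staffed team: capacity exceeds the arrival rate.
print(simulate_backlog(findings_per_week=10, triage_capacity_per_week=20, weeks=4))
```

Under these assumptions the first backlog grows by 30 findings every week with no upper bound, while the second never forms. The scanner is identical in both cases; only triage capacity differs.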
Why Mythos Stays Private
Anthropic has committed to not releasing Mythos-class models to the public until it has safeguards strong enough to prevent offensive use. That's the right call, but it also means the only organizations that can use these capabilities are vetted partners operating under a specific agreement. The model is already powerful enough to automate much of what an offensive security team does. Putting it on an open API would create an entirely different kind of problem.
There's also a gap in the open-source findings. Anthropic's $4M donation to open-source security organizations helps, but routing 6,202 high- or critical-severity findings to volunteer maintainers isn't the same as routing them to funded engineering teams. The patch rate on those findings is an open question.
OpenAI Is Already Competing
OpenAI has distributed GPT-5.5-Cyber to its own security partners for testing. Details are sparse, but the program exists. Anthropic's 6-12 month window claim is built around the risk that competitors release without enough safeguards. If OpenAI ships something with comparable capability and similar access controls, Glasswing's defensive lead shrinks.
The more important question isn't which model finds more bugs. It's whether the patch velocity on the defensive side can keep pace with the scan velocity - and right now, across the entire 150-organization cohort, that answer is unknown. Anthropic hasn't published patch-completion rates, only finding counts.
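For reference, a patch-completion rate could be defined as the share of confirmed findings that have a shipped fix. The sketch below is one hypothetical definition; the status labels and the `patch_completion_rate` function are assumptions, not a metric Anthropic has described.

```python
from collections import Counter


def patch_completion_rate(statuses: list[str]) -> float:
    """Share of genuine findings (confirmed or patched) that are patched.

    Assumed status labels: "false_positive", "confirmed" (awaiting a patch),
    and "patched". False positives are excluded from the denominator.
    """
    counts = Counter(statuses)
    genuine = counts["confirmed"] + counts["patched"]
    if genuine == 0:
        return 0.0  # nothing confirmed yet, so no rate to report
    return counts["patched"] / genuine


# Hypothetical tracker export, invented for illustration.
tracker = ["patched", "confirmed", "confirmed", "false_positive", "patched"]
print(f"{patch_completion_rate(tracker):.0%}")
```

Published alongside finding counts, a figure like this would show whether patching is keeping pace with scanning.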
Sources:
- Anthropic: Expanding Project Glasswing
- Anthropic: Project Glasswing overview
- CyberScoop: Anthropic expanding access to Project Glasswing
- TechCrunch: Anthropic scales Claude Mythos to critical infrastructure in 15 countries
- Mozilla: Security Vulnerabilities fixed in Firefox 150 (MFSA 2026-30)
- Help Net Security: Claude Mythos identified 10,000+ software flaws
