Anthropic said Claude Mythos was too dangerous to release publicly. They shared it with 40 handpicked organizations. They committed $100 million to help patch the vulnerabilities it found.
It leaked anyway. And now Google has confirmed the first known case of an AI-developed zero-day exploit being used by actual cybercriminals in the wild.
The containment era lasted about six weeks.
The Model That Finds What Humans Can’t
Quick recap for anyone not tracking this saga: Claude Mythos Preview is Anthropic’s unreleased frontier model that autonomously discovers zero-day vulnerabilities — unknown flaws in software that developers haven’t patched because they didn’t know they existed.
We’re not talking about finding obvious bugs. Mythos identifies weaknesses that survived decades of human security review and millions of automated tests. It chains individually minor flaws into critical attack paths. It develops working exploits without human guidance. The UK’s AI Security Institute confirmed it completed a 32-step simulated cyberattack — a first for any AI system.
Daniel Stenberg, creator of curl (a foundational piece of internet infrastructure), confirmed Mythos found a genuine vulnerability in his code. “Maybe Anthropic/Mythos is just hype or marketing — I personally don’t think so,” he wrote.
How It Escaped
In late April, Anthropic confirmed a “handful” of users on a private forum gained unauthorized access. The breach vector: a contractor at one of Anthropic’s vendor partners who abused their legitimate access to reach Mythos.
This is the oldest story in security. You can build the most secure vault in the world, but if 40 organizations have keys, the attack surface isn’t the vault — it’s the people.
Anthropic did everything the “responsible AI” playbook says you should do. They withheld the model. They vetted partners. They built a defensive coalition called Project Glasswing with AWS, Apple, Google, Microsoft, NVIDIA, JPMorganChase, and others. None of it prevented the inevitable.
First Blood: AI-Crafted Exploit Goes Live
In mid-May, Google reported what appears to be the first confirmed instance of an AI-developed zero-day exploit deployed by real criminals. The direct connection to the Mythos leak isn’t fully confirmed, but the timing is… convenient.
This crosses a threshold. AI vulnerability research has gone from theoretical concern to operational weapon in a matter of weeks. The defenders-first window that Anthropic tried to create has already closed, or is at minimum closing far faster than anyone planned for.
Banks Are Patching at Machine Speed (Barely)
US banks with Mythos access are discovering hundreds to thousands of vulnerabilities and scrambling to patch them in days instead of weeks. Legacy systems are getting shredded — aging code that limped along on obscurity is now fully exposed.
The speed mismatch is brutal. Cyber risk now moves at machine speed. Bank defenses still largely operate at human speed. Rapid-fire patching increases outage risk, meaning your bank might go down more often in the coming months. Not from attacks — from frantic repairs.
Smaller banks face a different problem entirely: at $25 per million input tokens and $125 per million output tokens (five times Claude’s normal pricing), they can’t afford access. Larger institutions are sharing findings, but the information asymmetry creates a two-tier banking security landscape.
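For a sense of scale, the quoted rates make the affordability gap easy to quantify. Here is a minimal back-of-envelope sketch; the per-million prices come from the figures above, while the token counts are purely hypothetical assumptions for illustration:

```python
# Back-of-envelope cost estimate at the quoted Mythos rates.
# Prices are the article's figures; token counts below are
# hypothetical, chosen only to illustrate the math.

INPUT_PRICE_PER_M = 25.0    # USD per million input tokens (quoted rate)
OUTPUT_PRICE_PER_M = 125.0  # USD per million output tokens (quoted rate)

def audit_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one audit pass at the quoted per-token rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical example: feeding a 10M-token legacy codebase through the
# model and getting back 1M tokens of findings costs $250 + $125 = $375
# per full pass.
print(audit_cost(10_000_000, 1_000_000))  # 375.0
```

A few hundred dollars per pass sounds cheap until you multiply it by continuous re-scanning across an entire estate of legacy systems, which is where the two-tier gap between large and small institutions opens up.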
The Skeptics Have a Point (Sort Of)
Not everyone is buying the apocalypse framing. Aisle, an AI cybersecurity firm, found that cheaper models can identify some of the same vulnerabilities. Security experts note that most real-world breaches still come from boring stuff: weak passwords, known unpatched flaws, phishing.
There’s also a legitimate question about incentives. Anthropic — now valued around $800 billion — benefits enormously from positioning itself as the responsible steward of dangerous capabilities. Announce a terrifying model, form a defensive coalition, and you’ve established yourself as both the alarm-ringer and the solution-provider. It’s brilliant marketing wrapped in a genuine breakthrough.
But here’s the thing: even if Mythos is partially overhyped, the trajectory is undeniable. If not this model, then the next one. Or an open-source equivalent 12-18 months from now. The capability is out of the bottle regardless of what you think about Anthropic’s PR strategy.
What This Actually Means
The cybersecurity arms race just went exponential. A few implications:
Obscurity is dead. If your security posture relies on “nobody knows about this bug,” Mythos-class models will find it. Assume every vulnerability in your stack is knowable.
Legacy systems are now critical liabilities. End-of-life software running on banking infrastructure, government systems, hospitals — these are sitting targets for AI-powered discovery.
Containment doesn’t scale. Anthropic tried everything right and the model still leaked within weeks. As more labs develop comparable systems, the proliferation problem compounds.
The defender’s window is shrinking. The UK government warned last month that AI capabilities would “rapidly increase” over the next year. The gap between “breakthrough” and “weaponized” is compressing toward zero.
The Arms Race Nobody Wins
Global cybercrime costs around $500 billion annually. If AI-powered vulnerability discovery becomes widely accessible — which appears inevitable — that number could explode before defenses adapt.
The US and China opened AI safety talks in Beijing this week, which suddenly feels less like diplomatic theater and more like mutual survival planning. When AI can autonomously find and exploit critical infrastructure vulnerabilities, AI governance stops being abstract policy and becomes existential necessity.
Anthropic made what looks like the responsible call. They still couldn’t contain it. That’s not an indictment of their approach — it’s a statement about the nature of powerful capabilities in a networked world.
May 2026 will likely be remembered as the month AI cybersecurity went from “interesting research” to “operational reality.” The only question left: can defenders stay ahead of the curve, or are we watching the opening moves of a race nobody wins?