Anthropic Built a Model That Hacks OpenBSD for $2,000. You Can't Use It.

On April 7, Anthropic did something it has never done before. It announced a new frontier model and told the world the public can't have it.

The model is called Claude Mythos. It scores 93.9% on SWE-bench Verified and 97.6% on USAMO 2026. Those numbers alone would make it the most capable model the company has ever shipped. But that's not why it's locked behind a wall. It's locked because of what it can do to a computer that isn't yours.

Mythos finds zero-day vulnerabilities in operating systems, browsers, and infrastructure software. Then it writes working exploits for them. Without help. For about $2,000 per chain.

The numbers that broke the release plan

Anthropic ran Mythos against the kind of code most security researchers consider a graveyard. OpenBSD. FFmpeg. FreeBSD. Linux. The Firefox JavaScript engine. Codebases that have been audited for decades by people who do nothing else.

It found bugs ranging from 16 to 27 years old. In one case, it took a 17-year-old FreeBSD remote code execution flaw and turned it into a working exploit on its own. In another, it chained two to four separate Linux bugs into a privilege escalation that lands at root. Browser exploits with JIT heap sprays. The full menu.

The Firefox number is the one that should make you sit down. Claude Opus 4.6, the current public flagship, lands 2 successful JavaScript exploits in Anthropic's evaluation. Mythos lands 181. That isn't an improvement curve. That's a different category of tool.

And here's the line from the announcement that reframes everything: "Over 99% of the vulnerabilities we've found have not yet been patched."

Project Glasswing and the staged release bet

Instead of shipping Mythos to the API, Anthropic launched Project Glasswing. The idea is to put the model in the hands of about fifty organizations responsible for the software the rest of us depend on. Amazon, Apple, Microsoft, Cisco, CrowdStrike, Broadcom, Palo Alto Networks, the Linux Foundation, and roughly forty more.

The bet is that defenders can use Mythos to harden the ecosystem before a model with similar capabilities leaks, gets reproduced by an open source effort, or gets matched by a competitor with looser release norms. Glasswing is an experiment in whether you can run the dangerous part of an AI release as a coordinated patching campaign instead of a launch event.

It's also a tacit admission. The standard playbook of "ship broadly, iterate on safety, watch what users do" stops working when one of the things users can do is autonomously root half the internet.

Why this is different from every previous AI security panic

For years, the conversation about AI and cybersecurity has been mostly hypothetical. Researchers warned that language models would eventually accelerate offensive work. Red teams found jailbreaks. Vendors patched them. Nobody could point to a specific model and say "this one changed the math."

Mythos is the first one you can point to. Three things make it qualitatively new.

Speed collapses to nothing. Pentesting firms estimated some of the exploits Mythos produced as multi-week jobs for senior researchers. Mythos finished them in hours. A bug that would have taken a skilled human a month now costs the price of a flight.

Cost collapses with it. Two thousand dollars in API calls is not a meaningful barrier to anyone who wanted to do this in the first place. Nation states already spend that on coffee. Criminal groups already spend it on infrastructure. The economic moat around offensive capability just evaporated.

The "audited code is safe" assumption is dead. OpenBSD is the project that built its reputation on being the most carefully reviewed open source code in existence. FFmpeg powers a huge fraction of the world's video infrastructure. If Mythos is finding 20-year-old bugs in those, the question isn't whether your codebase has exploitable vulnerabilities. It's how many, and how long it takes for someone with a Mythos-class model to find them.

Patch velocity is now the bottleneck

Here's the uncomfortable thing about the Glasswing approach. Even if the partner companies do everything right, the rest of the supply chain still has to ship the patches.

The old assumption behind responsible disclosure was that bugs are scarce and exploits are hard. Both halves of that just stopped being true. A 90-day disclosure window made sense when one researcher was working one bug at a time. It makes a lot less sense when an automated system can find a class of bugs in an afternoon and chain them by dinner.

Industrial control systems, medical devices, embedded firmware, and the long tail of unmaintained dependencies that runs underneath modern software were already the soft underbelly of computing. They don't get patched on a 90-day cycle. Some of them don't get patched at all. Mythos didn't create that problem. It just put a clock on it.

If you ship software, the practical takeaways are not subtle. Use the models you already have access to, today, to scan your own code for the kinds of bugs Mythos is finding. Compress your patch cycle until it hurts. Automate the parts of incident response that humans currently do by hand. Audit the dependencies you can't patch and figure out what you'd actually do if one of them turned up in a Mythos-class disclosure next month.

The shape of the next year

Mythos is the first model where Anthropic is acting as if the capability itself is the hazard, not the misuse of it. That's a real shift. Every previous safety conversation in the industry has been about guardrails on a model anyone can call. This one is about whether a model should exist in a callable form at all, given what the world looks like underneath it.

The honest read is that Glasswing buys time. Maybe a lot of time. Maybe a little. Capability of this kind doesn't stay in one lab forever. Either a competitor matches it, an open source project approximates it, or weights leak. The coordinated patching window Anthropic is trying to create is a head start, not a permanent solution.

What you do with that head start is the part that actually matters. The companies in Glasswing get to find and fix their own bugs first. Everyone else gets to find out, in public, what their software looks like under a microscope they don't control.

The barrier to attacking systems just dropped by an order of magnitude. The barrier to defending them didn't move. That gap is the entire story of AI cybersecurity risk in 2026, and Mythos is the first time you can measure it in dollars and hours instead of slide decks.

Anthropic Built a Model That Hacks OpenBSD for $2,000. You Can't Use It.

The numbers that broke the release plan

Project Glasswing and the staged release bet

Why this is different from every previous AI security panic

Patch velocity is now the bottleneck

The shape of the next year

Talvez goste de

Axios Got Hacked and Most Vibe Coders Won't Even Notice

Anthropic Is Subsidizing Every Claude Code Session. The Math Doesn't Work.