Claude Mythos Found 10,000+ Flaws in a Month — And That’s Not the Scary Part

Anthropic’s Project Glasswing says Claude Mythos helped partners surface 10,000+ vulnerability candidates in 30 days. The real shift isn’t finding bugs; it’s that patching and verification are now the bottleneck.

Marty Bostick

06 Jun 2026 • 4 min read

10,000+ vulnerability candidates in 30 days. That’s what Anthropic’s Project Glasswing says a restricted partner group surfaced in open-source code using Claude Mythos Preview.[1][3][4] If you’re running a SaaS, shipping an app, or even just stacking a bunch of third-party libraries like LEGO bricks, this isn’t “security news.” It’s your new operating environment.

Here’s the thing… the headline isn’t “AI finds bugs.” The headline is: AI can now find bugs faster than the world can fix them. And that changes what “being secure” actually means.

So what happened with Project Glasswing + Mythos?

[Figure: Bug-finding just went from manual to mass production.]

Anthropic describes Project Glasswing as a 30-day push where a restricted set of about 50 partner organizations used Claude Mythos Preview to hunt vulnerabilities in open-source software.[3][4]

The results (as reported across multiple outlets) are kind of wild:

10,000+ total vulnerability candidates found across 1,000+ open-source projects in ~30 days.[1][3][4]
6,202 were classified as high/critical-severity candidates.[1][3][4]
In an independently reviewed sample of ~1,752 findings, Anthropic reports a 90.6% true positive rate.[1][4]
1,094 of the validated findings were confirmed as high or critical in that sample.[1][3]
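
To put the sample figures in proportion, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes the 90.6% rate applies to the whole ~1,752-finding sample and that the 1,094 high/critical findings are a subset of the validated ones, as the reporting suggests. The variable names are my own.

```python
# Back-of-the-envelope math on the reviewed sample (figures as reported).
sample_size = 1752          # approximate number of independently reviewed findings
true_positive_rate = 0.906  # reported true positive rate for that sample
high_or_critical = 1094     # validated findings confirmed as high or critical

# Assumption: the reported rate applies to the whole sample.
validated = round(sample_size * true_positive_rate)
share_high_or_critical = high_or_critical / validated

print(f"Validated findings in sample: ~{validated}")
print(f"Share of validated findings rated high/critical: {share_high_or_critical:.1%}")
```

That works out to roughly 1,587 validated findings, with about 69% of them rated high or critical. It says nothing certain about the candidates outside the reviewed sample.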

[Figure: Five moves. One week. Less panic. Infographic of five security actions: inventory, SLA, alerts, rollback, verification.]

And yes, the model found real, consequential issues. One widely reported example: a wolfSSL vulnerability, CVE-2026-5194, that could allow certificate forgery and impersonation, with coverage citing billions of affected devices.[1][3]

The new reality: the bottleneck moved

[Figure: Fast discovery next to slower patching. This gap is where the risk lives.]

Look, I’ll be honest… most people still think security is about finding vulnerabilities. That was true when skilled humans were the limiting factor.

Now? The limiting factors are:

Triage (sorting signal from noise at speed)
Ownership (who maintains the dependency?)
Remediation (patching + shipping without breaking production)
Verification (proving the fix actually fixed it)

Anthropic’s own numbers underline the gap. In earlier reporting, only 97 findings had been patched and 88 advisories issued, despite thousands of high/critical candidates surfaced.[1][3] In other words: discovery is accelerating, and remediation isn’t keeping up.
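
A rough ratio makes the gap concrete. This sketch compares the 97 patched findings and 88 advisories against the 6,202 high/critical candidates. It assumes those counts are comparable, which may not hold: the patch and advisory counts come from earlier reporting, and patched findings are not necessarily all high or critical.

```python
# Rough remediation ratios from the reported figures.
patched = 97                     # findings patched, per earlier reporting
advisories = 88                  # advisories issued, per earlier reporting
high_critical_candidates = 6202  # high/critical-severity candidates

# Assumption: these counts are comparable snapshots (they may not be).
patched_share = patched / high_critical_candidates
advisory_share = advisories / high_critical_candidates

print(f"Patched relative to high/critical candidates: {patched_share:.1%}")
print(f"Advisories relative to high/critical candidates: {advisory_share:.1%}")
```

Under those assumptions, both ratios land below 2%.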

What this means for entrepreneurs (not security teams with 200 people)

Here’s what most people miss… you don’t need to become a vulnerability researcher. You need a company that can respond like one.

Because whether AI is used by defenders, attackers, or both (spoiler: it’s both), the “win condition” shifts to:

Shorter time-to-patch
Better dependency hygiene
Clear incident playbooks
Smarter access controls around powerful tooling
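
Dependency hygiene starts with knowing what is actually installed. As a minimal sketch, assuming a Python codebase, the standard library’s importlib.metadata can list every installed distribution and its version. The function name list_installed_packages is my own, and other ecosystems have their own equivalents.

```python
from importlib.metadata import distributions


def list_installed_packages() -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs for the current Python environment."""
    packages = set()
    for dist in distributions():
        meta = dist.metadata
        if meta is None:
            continue  # skip distributions with unreadable metadata
        name = meta.get("Name")
        version = meta.get("Version")
        if name and version:  # skip entries with incomplete metadata
            packages.add((name, version))
    return sorted(packages, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in list_installed_packages():
        print(f"{name}=={version}")
```

Running it prints one name==version line per package, which is enough to start asking who owns each one.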

Case study snippet (real-world, as reported)

One partner bank reportedly prevented a $1.5M fraudulent wire transfer using Mythos findings to detect the attack pattern.[1][3][4] That’s not “cool AI.” That’s “AI just paid for itself before lunch.”

Quick Wins: 5 moves you can make this week

If you’re a founder or marketer thinking, “Okay Marty, but what do I do with this?” — here are five practical moves that don’t require a PhD:

Inventory your dependencies. If you can’t list your top open-source libraries, you can’t protect them.
Set a patch SLA. High/critical issues need a clock (like 48–72 hours), not a “someday.”
Automate alerts to a real owner. Alerts that land in a shared inbox are basically therapy, not security.
Practice one rollback. The fastest patch is sometimes “revert and regroup.” Know you can do it.
Add a verification step. Don’t just merge the fix—confirm the exploit path is dead.
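
Moves 2 and 3 can be sketched in a few lines. This is an illustration under my own assumptions, not a standard tool: the Finding fields, the SLA_HOURS values, and the needs_attention function are all hypothetical, with the SLA windows taken from the 48–72 hour example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical SLA windows, based on the article's 48-72 hour example.
SLA_HOURS = {"critical": 48, "high": 72}


@dataclass
class Finding:
    identifier: str
    severity: str            # e.g. "critical", "high", "medium"
    reported_at: datetime    # timezone-aware timestamp
    owner: Optional[str] = None
    patched: bool = False


def needs_attention(findings, now):
    """Return (finding, reason) pairs for unpatched findings that lack an owner or are past SLA."""
    flagged = []
    for f in findings:
        if f.patched or f.severity not in SLA_HOURS:
            continue  # only unpatched high/critical findings carry a clock here
        deadline = f.reported_at + timedelta(hours=SLA_HOURS[f.severity])
        if f.owner is None:
            flagged.append((f, "no owner"))
        elif now > deadline:
            flagged.append((f, "past SLA"))
    return flagged


# Made-up example data for illustration only.
now = datetime(2026, 6, 6, 12, 0, tzinfo=timezone.utc)
findings = [
    Finding("FIND-1", "critical", now - timedelta(hours=60), owner="alice"),
    Finding("FIND-2", "high", now - timedelta(hours=10)),
    Finding("FIND-3", "high", now - timedelta(hours=10), owner="bob"),
]
for finding, reason in needs_attention(findings, now):
    print(finding.identifier, reason)
```

With this sample data, FIND-1 is flagged as past SLA and FIND-2 as having no owner, while FIND-3 is owned and still inside its window. The point is that an unowned finding gets flagged immediately instead of waiting for its deadline to pass.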

FAQ (because I know you’re thinking it)

Is “10,000+ flaws” the same as “10,000 confirmed vulnerabilities”?

No, those are candidates. But the independently reviewed sample of ~1,752 findings showed a 90.6% true positive rate, which is… not trivial.[1][4]

Why keep Mythos limited to ~50 organizations?

Because capability cuts both ways. Anthropic is restricting access while the industry debates when (and how) to broaden it safely.[3][4]

Should I be worried about open-source now?

Open-source is still amazing. The risk is unmaintained or under-maintained open-source used everywhere. When AI scales discovery, “everyone uses it” becomes “everyone is exposed faster.”[1][3][4]

What about AI solving full attack simulations?

The UK AI Security Institute reportedly found Mythos was the first model to solve their multi-stage cyberattack simulations end-to-end.[4] That’s a signal that these models aren’t just finding isolated bugs; they can chain steps into a complete attack sequence.

Common mistakes I’d avoid

Waiting for perfect certainty. Treat high-confidence reports as “act now, verify in parallel.”
Thinking scanners = security. Scanning without patching is like buying a treadmill and never plugging it in.
No one owns the fix. Security tasks without an owner are just well-written wishes.

What’s next?

The bottom line is… we’re moving into a world where the best companies aren’t the ones who “never have vulnerabilities.” They’re the ones who can identify, prioritize, patch, and prove fixes at speed.

So here’s my question for you: if an AI system dropped 20 high-severity findings on your team tomorrow… would you have a clean way to ship fixes this week, or would you drown in your own inbox?

Sources

[1] Reporting on Anthropic Project Glasswing results and validated findings (10,000+ candidates; sample validation; wolfSSL CVE coverage), as summarized in provided research data.
[3] Coverage noting restricted partner group (~50 orgs), open-source scanning scale, remediation lag, and access-control debate, as summarized in provided research data.
[4] Coverage citing UK AI Security Institute simulation results and additional validation details, as summarized in provided research data.