A model completed a 32-step corporate-network attack simulation end to end. Not in a movie script. In a UK AI Security Institute lab test, Claude Mythos Preview solved the range from start to finish in 3 of 10 attempts and averaged 22 of 32 steps across all runs. AI cyber capabilities have reached a more concrete threshold than most people realize.
That threshold is not “AI can now hack anything.” The verified claim is narrower, and more useful: in controlled environments, a frontier model can chain reconnaissance, exploitation, and lateral movement into a sustained attack workflow that used to take human professionals many hours or days.
That distinction matters. The AISI results come from an independent government lab. Anthropic’s bigger claims about “thousands” of vulnerabilities and zero-days are more impressive if they hold up, but they are still mostly company-supplied evidence with limited public detail. The gap between those two things is exactly where readers should pay attention.
How Claude Mythos was evaluated for AI cyber capabilities
The AISI did not ask Mythos trivia questions about malware. It ran two kinds of tests.
First: capture-the-flag challenges, where the model had to find and exploit weaknesses to retrieve hidden flags. On expert-level tasks, AISI says Mythos succeeded 73% of the time. That is a verified number from the institute’s own evaluation.
Second: a cyber range called “The Last Ones”, a 32-step simulation of a corporate network attack from initial recon to full network takeover. AISI estimates a human would need about 20 hours to complete it. Mythos was the first model to finish the whole chain, succeeding in 3 out of 10 attempts. The next-best model, Claude Opus 4.6, averaged 16 steps; Mythos averaged 22.
That is the interesting part. We are no longer talking about isolated exploit snippets. We are talking about agentic attack chains: the model has to keep state, choose tools, recover from mistakes, and move from one host and network segment to another.
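To make “agentic attack chain” concrete, here is a deliberately toy sketch of the structure being described: a loop that keeps persistent state, attempts steps in order, retries on failure, and stops partway when it gets stuck. Every name and step here is invented for illustration; nothing reflects AISI’s actual range or any real tooling.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    step: int = 0
    notes: list = field(default_factory=list)     # accumulated findings
    failures: dict = field(default_factory=dict)  # retry counts per step

def run_chain(steps, max_retries=2):
    """Attempt each step in order; retry on failure, abort when stuck."""
    state = AgentState()
    for name, action in steps:
        while True:
            if action(state):
                state.step += 1
                state.notes.append(f"completed: {name}")
                break
            state.failures[name] = state.failures.get(name, 0) + 1
            if state.failures[name] > max_retries:
                return state  # chain ends partway, like a 22/32 run
    return state

# Toy steps: the second one fails once before succeeding.
attempts = {"count": 0}
def recon(state):
    return True
def flaky_exploit(state):
    attempts["count"] += 1
    return attempts["count"] > 1

result = run_chain([("recon", recon), ("exploit", flaky_exploit)])
print(result.step)  # 2
```

The point of the sketch is the shape, not the content: once a model drives a loop like this, partial progress, error recovery, and state carried between steps become the capability being measured, rather than any single output.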
AISI also tested an operational technology range called Cooling Tower. Mythos did not complete it; that, too, is verified. The institute cautions that the failure does not necessarily show weakness at OT attack work, because the model got stuck in the IT portions before it ever reached the OT components. That is a useful constraint, especially if your mind jumped straight to “Stuxnet, but automated.” We are not there. If you want a reminder of what real-world industrial compromise looks like, it is still worth revisiting Stuxnet.
What the AISI results actually verified
Here is the clean split between confirmed and not yet confirmed.
Verified by AISI:
– Mythos is a step up from earlier frontier models on cyber evaluations.
– It reached 73% on expert-level CTF tasks.
– It completed AISI’s 32-step enterprise attack range end to end in 3/10 runs.
– Across runs, it averaged 22/32 steps.
– The test environment gave the model network access and explicit instructions to attack.
Also verified by AISI, and easy to miss:
– These are controlled cyber ranges, not real organizations.
– The ranges are easier than defended real-world environments.
– They lacked common security features like active defenders and defensive tooling.
– There were no penalties for actions that would trigger alerts.
That last part is the gotcha.
The catch: a model that can complete a long attack chain in a lab has not yet been shown to take over a well-defended real network.
AISI says this directly. The result indicates capability against small, weakly defended, vulnerable enterprise systems once network access has been gained. It does not prove reliable real-world compromise against targets with EDR, SOC analysts, rate limits, logging pipelines, authentication hardening, and people who notice when odd things start happening.
This is why the “AI got better at hacking” framing is too sloppy. The meaningful update is that frontier-model cyber capability has crossed into workflow territory. The unproven part is how that workflow holds up once defenders push back.
Anthropic’s vulnerability claims are plausible, but not independently established
Anthropic’s own red-team writeup goes further than AISI. It says Mythos can identify and exploit zero-days in major operating systems and web browsers when prompted, and that the company has already found “thousands” of additional high- and critical-severity vulnerabilities for responsible disclosure.
Some of that is backed by concrete numbers. Anthropic says that across 198 manually reviewed vulnerability reports, expert contractors agreed exactly with the model’s severity assessment 89% of the time and were within one severity level 98% of the time. That is a specific, useful metric.
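For readers who want the two agreement figures made precise, here is a minimal sketch of how such rates could be computed from paired severity labels. The data and the ordinal mapping (low=1 through critical=4) are invented for illustration; Anthropic has not published its methodology at this level of detail.

```python
# Hypothetical ordinal severity scale; not Anthropic's actual scheme.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}

def agreement_rates(model_labels, expert_labels):
    """Return (exact-match rate, within-one-severity-level rate)."""
    pairs = list(zip(model_labels, expert_labels))
    exact = sum(m == e for m, e in pairs)
    within_one = sum(abs(SEVERITY[m] - SEVERITY[e]) <= 1 for m, e in pairs)
    n = len(pairs)
    return exact / n, within_one / n

# Invented example: four reviewed reports.
model  = ["high", "critical", "medium", "high"]
expert = ["high", "high",     "medium", "critical"]
print(agreement_rates(model, expert))  # (0.5, 1.0)
```

Note how the two metrics diverge: a one-level disagreement (critical vs high) misses the exact-match count but still clears the within-one threshold, which is why the 98% figure sits above the 89% figure.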
But the broader zero-day narrative is still mostly plausible, company-reported evidence, not independent public verification. We do not have a public list of those vulnerabilities. We do not have outside replication of the full discovery pipeline. We do not know how many claims survive vendor validation, how novel they are, or how much human steering was needed per find.
So the right way to read this is:
- AISI’s lab attack-chain result is independently grounded.
- Anthropic’s large-scale AI vulnerability discovery claims may be real, but they are not yet public enough to treat as settled fact.
That does not make them unimportant. It makes them the next thing in line for scrutiny.
TechCrunch’s reporting adds context rather than verification here: Anthropic says Project Glasswing includes 12 partner organizations, with 40 additional organizations getting access outside the core group. That suggests serious industry interest. It does not itself prove the vulnerability numbers.
Why autonomous attack chains matter now
A model that can do 22 of 32 steps on average in a long attack simulation changes the shape of the problem.
The old mental model was “AI helps write phishing emails and exploit code.” The newer one is closer to “AI can increasingly operate as a junior, very fast operator across a messy sequence of tasks.” That means recon, exploit attempts, privilege escalation, pivoting, retry logic, and stitching together partial wins.
That is why this feels adjacent to recent work on AI agent security and even agentic sandbox escape. Once models can pursue goals across multiple steps and tools, security problems stop being about one dangerous output and start being about persistent behavior over time.
For defenders, there is good news hidden in the bad news. The same convergence is happening on offense and defense. Models can:
– triage bug reports faster
– validate severity faster
– search larger code surfaces for weird edge cases
– help less-specialized teams do more security work
That is probably the real near-term shift. Not autonomous cyberwar. Compression of skilled security labor.
A small team with strong tooling gets more dangerous. So does a small defense team, if it uses the tools well. The race is not abstract anymore.

What generalists should read into the security arms race
Three things.
First, stop looking for a single dramatic proof point. The important evidence is incremental and cumulative. A model finishes expert CTFs. Then it completes a 32-step range. Then it finds vulnerabilities at scale. Each result on its own is limited. Together, they show AI cyber capabilities are compounding.
Second, pay attention to evaluation design. A benchmark with no defenders, no alerting, and no operational consequences tells you something real about offensive potential, but not everything. When future claims arrive, the first question should be: lab, range, or defended production-like target?
Third, expect the bottleneck to move. If models keep improving, the scarce resource stops being raw exploit cleverness and becomes access, operational stealth, and the ability to survive detection. In other words: the harder parts start to look less like puzzle-solving and more like systems engineering.
That is the practical reason this audit matters. It narrows the argument. We do not need sentience hype to get a serious cyber risk story. We already have one.
Key Takeaways
- Verified: AISI found Claude Mythos Preview could complete a 32-step cyber range end to end in 3 of 10 runs and averaged 22 of 32 steps.
- Verified: Mythos hit 73% on expert-level CTF tasks, a clear jump over prior models.
- Not verified by AISI: reliable compromise of defended real-world systems with active defenders and security tooling.
- Plausible but company-reported: Anthropic’s claims about thousands of vulnerabilities and zero-days need more public validation.
- What changed: the threat is no longer just better exploit snippets; it is longer, more autonomous attack workflows.
Further Reading
- Our evaluation of Claude Mythos Preview’s cyber capabilities, the primary AISI evaluation, including CTF and 32-step cyber-range results.
- Claude Mythos Preview red-team writeup, Anthropic’s technical account of vulnerability discovery and severity validation.
- Project Glasswing, Anthropic’s launch page for the broader defensive cybersecurity initiative.
- TechCrunch coverage of Anthropic’s Mythos security launch, Independent reporting on the partner coalition and access rollout.
- TechRadar on Project Glasswing and Mythos, Trade-press framing of the defensive use case and its risks.
The next thing to watch is not another vague “superhuman hacker” headline. It is whether independent labs can reproduce these results in ranges with defenders, detection systems, and penalties for noisy mistakes. That is where the real line will move.
