Mozilla said this week that its Firefox zero-day hardening work with an early version of Claude Mythos Preview helped identify and fix 271 vulnerabilities shipped in Firefox 150. In Mozilla’s account, the model-assisted effort followed earlier scans with Opus 4.6 that had already led to fixes for 22 security-sensitive bugs in Firefox 148.
The company also named three Firefox CVEs it explicitly credited to Claude Mythos Preview: CVE-2026-6746, CVE-2026-6757, and CVE-2026-6758. Mozilla’s public posts said the broader set of 271 findings came from the initial Mythos evaluation, while most of those fixes did not receive individual public CVE listings.
AI model helps Mozilla fix 271 Firefox vulnerabilities
In Mozilla’s security blog, the company said Firefox 150 includes fixes for 271 vulnerabilities identified during its first evaluation of Claude Mythos Preview. Mozilla said the work started in February, when the Firefox team began using frontier AI models to look for latent security bugs in the browser.
That figure is a sharp jump from Mozilla’s previous public AI-assisted result. The same post said earlier work with Claude Opus 4.6 produced fixes for 22 security-sensitive bugs in Firefox 148.
Ars Technica reported that Mozilla engineers described the latest results as having “almost no false positives.” That is the part worth noting. AI bug reports have usually had the opposite reputation: lots of plausible text, then a human spends the afternoon discovering the bug does not exist.
Mozilla’s own engineering post says exactly that. It described earlier AI-generated security reports as “unwanted slop” and said the dynamic changed because both the models improved and Mozilla got better at steering and filtering them.
For related coverage of model performance on offensive and defensive security tasks, see NovaKnown’s earlier reporting on AI cyber capabilities.
Mozilla says Claude Mythos Preview found three Firefox zero-days
Mozilla’s advisories for Firefox 150 publicly credit three named vulnerabilities to Claude Mythos Preview: CVE-2026-6746, CVE-2026-6757, and CVE-2026-6758. Those are the clearest line items connecting the model to specific disclosed bugs.
Mozilla’s engineering post added an important detail about the mix of findings: some of the reports were sandbox escapes. In browser security, a sandbox escape is a bug that lets code break out of the restricted rendering process into a more privileged one.
Mozilla said those sandbox escapes would need to be combined with other exploits to produce a full-chain Firefox compromise. The company also said the model was allowed to patch Firefox source code during these investigations, as long as the modified code only ran in the sandboxed process.
That matters because Mozilla framed these as hardening results across multiple browser subsystems, not just a list of one-shot critical remote code execution bugs. Several findings were defense-in-depth issues or bugs that improved exploitability boundaries rather than standalone takeover chains.
For background on the model itself, NovaKnown previously covered Anthropic’s Mythos.
The harness and workflow behind the Firefox scans
Mozilla said the jump in useful findings came from a custom agent harness wrapped around the model. The harness gave the LLM instructions, access to project tools, and a loop that kept it working until it either produced a verifiable result or ran out of road.
Ars Technica quoted Mozilla Distinguished Engineer Brian Grinstead describing the harness as code that tells the model to find a bug in a file, gives it tools to read and write files and run test cases, and then keeps iterating until completion. Mozilla said the harness plugged the model into the same testing pipeline and special Firefox builds its human developers already use.
One concrete example was memory-safety work with sanitizer builds. Grinstead said the team could point the agent at a source file, tell it there was an issue to find, and let it generate test cases until it produced a crash under the sanitizer build. That is a much clearer success condition than “read this code and tell me if anything looks bad,” which is how you get slop.
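That success condition, a crash under a sanitizer build, is mechanically checkable. A minimal sketch of such a check might look like the following; the command being run here is a stand-in process that simulates a crashing AddressSanitizer binary so the sketch is self-contained, where a real harness would invoke the instrumented Firefox build instead.

```python
# Sketch of a deterministic crash check: run a generated test case and
# treat a nonzero exit plus an AddressSanitizer report on stderr as the
# success signal. The fake_crasher command below simulates a crashing
# ASan-instrumented binary; it is a placeholder, not Mozilla's tooling.
import subprocess
import sys

def crashes_under_asan(cmd: list[str]) -> bool:
    """True iff the process dies and its stderr carries an ASan report."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode != 0 and "AddressSanitizer" in proc.stderr

fake_crasher = [sys.executable, "-c",
                "import sys; sys.stderr.write('ERROR: AddressSanitizer: "
                "heap-use-after-free'); sys.exit(1)"]

print(crashes_under_asan(fake_crasher))  # → True
```

A binary answer like this is what separates a verified memory-safety finding from a plausible-sounding report, which is presumably why Mozilla anchored the workflow to it.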
Mozilla also said the model could use existing fuzzing infrastructure and other internal tools. The workflow was not a chatbot staring at source code. It was an LLM inside a project-specific loop with deterministic checks.
The company’s post says this setup improved both signal generation and noise filtering. That speaks directly to the core LLM failure mode: the model still needs a surrounding workflow that can verify its outputs against reality.
How Mozilla framed the findings and the remaining caveats
Mozilla’s public framing was blunt. In its security post, the company wrote that “the zero-days are numbered” and said defenders now have a chance to win “decisively.” The reporting underneath that claim is narrower and more concrete: Firefox 150 shipped with 271 fixes tied to model-assisted hardening work, and Mozilla published extra technical detail because of the level of interest.
Mozilla also said it intentionally released only a small sample of the underlying reports. The company normally keeps detailed bug reports private for several months after fixes ship, and said it made a calculated decision to unhide some examples earlier than usual.
The engineering post also describes what the models did not find. Mozilla said some hardened surfaces and layered defenses held up against the model’s attempts, including areas where previous human researchers had found clever routes. That is a useful detail because it puts the Firefox zero-day discussion on actual terrain: a browser with layered mitigations, not a generic claim that AI now solves security.
A separate government evaluation from the UK AI Security Institute puts Claude Mythos Preview’s cyber performance in a broader comparison set. The institute said an early checkpoint of GPT-5.5 now reaches a similar level on its cyber evaluations, after Mythos Preview had previously been the first model to complete its end-to-end corporate network attack simulation. On the institute’s expert-level cyber tasks, GPT-5.5 posted a 71.4% average pass rate versus 68.6% for Mythos Preview.
That evaluation does not measure Firefox directly. It does, however, place Mozilla’s Firefox zero-day work next to an external benchmark showing Mythos Preview is no longer alone at that performance tier.
Key Takeaways
- Mozilla said Firefox 150 includes fixes for 271 vulnerabilities found during an initial evaluation of Claude Mythos Preview.
- Mozilla explicitly credited three CVEs to Claude Mythos Preview: CVE-2026-6746, CVE-2026-6757, and CVE-2026-6758.
- Mozilla said a custom agent harness was central to the results, giving the model tools, test infrastructure, and deterministic verification loops.
- Some findings were sandbox escapes and defense-in-depth issues, not all standalone full-chain compromises.
- The UK AI Security Institute said GPT-5.5 now performs at a similar level to Mythos Preview on its cyber evaluations.
Further Reading
- The zero-days are numbered, Mozilla’s security post on Firefox 150 and the 271 vulnerabilities tied to Mythos-assisted hardening.
- Behind the Scenes Hardening Firefox with Claude Mythos Preview, Mozilla’s engineering write-up on the harness, sample reports, and workflow details.
- Our evaluation of OpenAI’s GPT-5.5 cyber capabilities, The UK AI Security Institute’s benchmark comparing GPT-5.5 with Mythos Preview.
- Mozilla says 271 vulnerabilities found by Mythos have ‘almost no false positives’, Ars Technica’s report with additional quotes from Mozilla engineers on the harness and verification loop.
