Firefox showed what AI can find.
The fixing gap is real.

Mozilla used Claude Mythos Preview to find 271 bugs in Firefox. We ran Kodah on two of them: real open bugs, public repo. One had been open for three months.

Mozilla published on May 7 how they used Claude Mythos Preview, a pre-release model, to find 271 bugs across the Firefox codebase. The chart below tells the story better than any headline.

Firefox security bug fixes by month. January 2025 through March 2026 averaged roughly 20–30 fixes/month. April 2026: 423, driven by the Claude Mythos bug-finding effort. Source: Mozilla, May 7, 2026.

That spike is what AI-assisted discovery looks like at scale. And Claude Mythos is still a preview. As frontier models mature, more teams will run experiments like this, and the lists of discovered bugs will get longer.

The fixing problem is a different story. That's where Kodah comes in. Every bug on that list represents 30 to 45 minutes of an engineer's time: investigating the root cause, writing a fix, making sure nothing else breaks.

Multiply that by 271.

That's not a tooling problem. That's a capacity problem.
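To put a number on it, here is a back-of-the-envelope sketch of that multiplication. The 30 to 45 minute range and the 271 count come from above; the 8-hour working day is our own assumption, used only to convert hours into days.

```javascript
// Rough fixing cost for the 271 bugs, using the 30-45 minute estimate above.
const BUG_COUNT = 271;
const MINUTES_PER_BUG = { low: 30, high: 45 };
const HOURS_PER_DAY = 8; // assumption: one fully focused engineer-day

function engineerHours(bugCount, minutesPerBug) {
  return (bugCount * minutesPerBug) / 60;
}

const lowHours = engineerHours(BUG_COUNT, MINUTES_PER_BUG.low);
const highHours = engineerHours(BUG_COUNT, MINUTES_PER_BUG.high);

console.log(`${lowHours} to ${highHours} engineer-hours`);
// prints "135.5 to 203.25 engineer-hours"
console.log(
  `${(lowHours / HOURS_PER_DAY).toFixed(1)} to ` +
  `${(highHours / HOURS_PER_DAY).toFixed(1)} engineer-days`
);
// prints "16.9 to 25.4 engineer-days"
```

That is roughly three to five working weeks for a single engineer, before any review or testing time is counted.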

Kodah automates the investigation and the first patch. Your engineers review the diff and merge.

How long can AI work alone?

Before showing what fixing looks like, it helps to understand where AI autonomy actually stands today. METR, an AI safety research organization, independently benchmarks how long AI agents can operate on complex, real software tasks before needing human input.

METR Time Horizon Benchmark
METR's time-horizon benchmark tracks how long an AI can sustain useful autonomous work on software tasks. GPT-2 managed seconds. GPT-4 reached roughly 6 minutes. Claude Mythos Preview exceeds the 16-hour ceiling: beyond what METR's current task suite can even measure. Source: METR, last updated May 8, 2026.

The practical implication: the same class of models that found Mozilla's 271 bugs can now execute multi-hour autonomous fix attempts. The tooling to go from "found a bug" to "here's a patch" is maturing fast. Teams that build a review pipeline now will be structurally ahead when the discovery rate keeps climbing.

We ran Kodah on Firefox

To show what the fix side looks like, we ran Kodah against the public Firefox repository on two real open bugs, both with fixes from Mozilla engineers available for comparison. One had been sitting open for three months. Kodah produced a patch for each, in about eleven minutes combined (4m 21s and 6m 50s).


Bug 2014226 — Guard against redundant actor registration

Opened Feb 3, 2026  ·  Fix pushed May 9, 2026

A logic bug that had been sitting open for over three months. When an enterprise policy set TranslateEnabled: true, Firefox tried to register browser actors that were already registered via the default preference value. ChromeUtils.register* threw NotSupportedError every time, flooding the console with errors. No user-visible breakage, just a silent, ongoing error condition in production.

Kodah resolved it in 4 minutes and 21 seconds.

ActorManagerParent.sys.mjs
@@ registerActor @@
 const registerActor = () => {
   if (!actorRegistered) {
-    register(actorName, actor);
-    actorRegistered = true;
+    try {
+      register(actorName, actor);
+      actorRegistered = true;
+    } catch (e) {
+      if (e && e.name === "NotSupportedError") {
+        actorRegistered = true;
+      } else { throw e; }
+    }
   }
 };

Mozilla engineers went further, introducing a stateful actorRegistered flag with closures applied consistently across every registration path, not just the direct call. Cleaner, more future-proof, and the right kind of decision to make in review.
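For readers who want to see the general shape of a closure-based registration guard, here is a minimal sketch. It is neither Mozilla's patch nor Kodah's: FakeRegistry and makeGuardedRegister are hypothetical names, and the fake registry only imitates the duplicate-registration NotSupportedError described above.

```javascript
// Hypothetical stand-in for a registry that rejects duplicate names,
// loosely imitating the NotSupportedError behavior described in the bug.
class FakeRegistry {
  constructor() {
    this.names = new Set();
    this.calls = 0;
  }

  register(name) {
    this.calls += 1;
    if (this.names.has(name)) {
      const err = new Error(`${name} is already registered`);
      err.name = "NotSupportedError";
      throw err;
    }
    this.names.add(name);
  }
}

// Wrap a registration call in a closure that remembers whether it already ran,
// so repeated calls become no-ops instead of reaching the registry again.
function makeGuardedRegister(registerFn) {
  let registered = false;
  return () => {
    if (registered) {
      return false; // already done, nothing to do
    }
    registerFn();
    registered = true;
    return true;
  };
}

const registry = new FakeRegistry();
const registerTranslations = makeGuardedRegister(() =>
  registry.register("Translations")
);

const first = registerTranslations();  // reaches the registry
const second = registerTranslations(); // skipped by the guard
console.log(first, second, registry.calls); // prints "true false 1"
```

The design point is that the guard lives next to the registration call itself, so any code path that reuses the wrapped function gets the same protection.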

Time to fix: 4m 21s  ·  Bug opened: Feb 3  ·  Fix pushed: May 9

Bug 2038139 — Bookmark autofill broken when history is disabled

Opened May 8, 2026  ·  Fix pushed May 9, 2026

A behavioral bug. In Firefox Nightly, typing a bookmarked domain in the address bar stopped autofilling when browsing history was disabled, or in private browsing mode. Type "moz" with places.history.enabled = false and a mozilla.org bookmark, and nothing autofills. The only workaround was disabling adaptive history entirely, which degraded the experience for everyone.

The root cause was in _getAdaptiveHistoryQuery: the SQL used an INNER JOIN with moz_inputhistory. With no history rows (because the user disabled history, cleared it, or was in a private window), the join matches nothing, so there is no autofill result, even for sites the user had explicitly bookmarked.

Kodah's fix targeted the query directly:

UrlbarProviderAutofill.sys.mjs
@@ _getAdaptiveHistoryQuery @@
+ bookmarksOnly: queryContext.sources.includes(
+   UrlbarUtils.RESULT_SOURCE.BOOKMARKS
+ ) && !queryContext.sources.includes(UrlbarUtils.RESULT_SOURCE.HISTORY)
+   ? 1 : 0,
- i.input AS input,
+ COALESCE(i.input, :fullSearchString) AS input,
- JOIN moz_inputhistory i ON i.place_id = h.id
+ LEFT JOIN moz_inputhistory i ON i.place_id = h.id
- WHERE LENGTH(i.input) != 0
+ WHERE (:bookmarksOnly = 1)
+   OR (LENGTH(i.input) != 0
+   AND i.use_count >= :useCountThreshold)
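The join behavior is easy to reproduce outside of SQL. The sketch below models the two tables as plain arrays and compares an inner join with a left join that falls back to the typed string, in the spirit of COALESCE. The row shapes and the function names innerJoin and leftJoin are our own simplifications, not the Places schema or Firefox code.

```javascript
// Toy stand-ins for the places and input-history tables (hypothetical shapes).
const places = [{ id: 1, url: "mozilla.org" }];
const inputHistory = []; // empty: history disabled, cleared, or private window

// INNER JOIN: a place survives only if a matching history row exists.
function innerJoin(placeRows, historyRows) {
  return placeRows.flatMap((p) =>
    historyRows
      .filter((i) => i.place_id === p.id)
      .map((i) => ({ url: p.url, input: i.input }))
  );
}

// LEFT JOIN with a COALESCE-style fallback: every place survives, and the
// typed search string is used when no history row supplies an input.
function leftJoin(placeRows, historyRows, searchString) {
  return placeRows.flatMap((p) => {
    const matches = historyRows.filter((i) => i.place_id === p.id);
    if (matches.length === 0) {
      return [{ url: p.url, input: searchString }];
    }
    return matches.map((i) => ({ url: p.url, input: i.input ?? searchString }));
  });
}

const innerRows = innerJoin(places, inputHistory);
const leftRows = leftJoin(places, inputHistory, "moz");

console.log(innerRows.length); // prints 0: the bookmarked site disappears
console.log(leftRows);         // one row: url "mozilla.org", input "moz"
```

With an empty history table the inner join returns nothing, which is the symptom in the bug report; the left join keeps the bookmarked row so a later filter can decide whether to autofill it.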

Mozilla engineers fixed it one layer up, introducing an effectiveSources() helper that treats history as unavailable at the routing level, covering all three query generators, plus tests. The user-visible result is identical. The architectural difference is real, and it's exactly the kind of improvement that belongs in a code review, not in the initial investigation.

Time to fix: 6m 50s  ·  Bug opened: May 8  ·  Fix pushed: May 9

Discussion

Both fixes are slightly narrower in scope than what Mozilla's engineers produced, and that's expected. Kodah generates a first viable patch. The engineer's job shifts from writing to reviewing, which is faster, less draining, and far more scalable as the volume of AI-discovered bugs grows.

What the Firefox experiment makes clear is that the discovery side of this problem is largely solved. Mozilla found 271 bugs in a single effort using a model that's still in preview. More experiments like this are coming, from more organizations, with more mature models. The queue will get longer before it gets shorter.

Bug                            Opened         Fixed in
2014226: Actor registration    Feb 3, 2026    4m 21s
2038139: Bookmark autofill     May 8, 2026    6m 50s
Total                                         ~11 min

The teams that invest in fix review infrastructure now (tooling, workflows, the habit of treating a diff as the starting point rather than the finish line) will be structurally ahead when that happens.