The Public Record Is Thin, but Real
The AI vulnerability-discovery record is still small, but direct credits now span browsers, kernels, bootloaders, crypto libraries, and OSS tooling.
AI vulnerability discovery is easy to overstate. It is also easy to dismiss. Both reactions miss the useful middle.
As of May 8, 2026, the public record has a finite but growing set of high-confidence entries:
- Chrome release notes credit Google Big Sleep for CVE-2025-9132 and CVE-2025-9478.
- Apple security advisories credit Google Big Sleep for WebKit CVEs.
- Google and Project Zero directly describe Big Sleep’s SQLite findings.
- Google’s OSS-Fuzz team says LLM-generated fuzz targets found CVE-2024-9143 in OpenSSL and produced 25 other vulnerability reports.
- Microsoft says Security Copilot accelerated a 20-CVE bootloader campaign across GRUB2, U-Boot, and Barebox.
- OpenAI’s Codex Security page lists assigned CVE examples across open-source projects.
- Xint/Theori directly describes Xint Code’s role in finding CopyFail, CVE-2026-31431.
- Theori’s public Xint bug tracker lists 50 findings as of May 5, 2026, including eight CVE-backed entries. Bugflation indexes CopyFail separately, groups the seven other CVE-backed entries together, and keeps embargoed rows as tracker-only context.
- Bynario now has three public AI credits: Apple’s advisory credits BynarIO AI for CVE-2025-43377, and Linux fixes for CVE-2026-31532 and CVE-2026-31694 include Assisted-by: Bynario AI.
- Microsoft CVE records confirm critical RCE vulnerabilities that XBOW says were found autonomously.
The 2026 record broadened again. Anthropic and Mozilla document
Claude-assisted Firefox findings in Firefox 148 and later Mythos-identified
fixes in Firefox 150. FreeBSD advisories directly credit Nicholas Carlini using
Claude on multiple kernel issues. AISLE’s OpenSSL work shows a purpose-built
autonomous analysis system repeatedly producing accepted CVEs and fixes in one
of the world’s most scrutinized cryptographic libraries. Bynario adds another
kernel-focused signal: public Linux commits with Assisted-by: Bynario AI,
including a detailed write-up for CVE-2026-31532 where its LLM-driven pipeline
is described as discovering, validating, and patching a CAN raw socket
use-after-free. The FUSE entry, CVE-2026-31694, is counted with a narrower
boundary because the upstream commit also carries separate reporter credits.
That is not a flood of fully transparent exploit write-ups. It is enough to show that AI-assisted discovery has crossed from demonstration into accepted disclosure workflows.
Why attribution is hard
Vulnerability credits are social artifacts. A finder might use an AI assistant without naming it. A vendor might accept a report but avoid crediting the tool. An AI company might publicize its role while the upstream CVE only names the vulnerability. Confidentiality terms may suppress the most interesting details.
The absence of attribution is not proof of human-only discovery. The presence of attribution is not proof that a model did everything by itself.
Bugflation therefore treats attribution as a field with strength:
- Direct: the upstream vendor, release note, advisory, or primary research post names the AI system.
- Self-reported: the AI system operator claims discovery, while the CVE or vendor record corroborates the vulnerability but not every workflow detail.
- Secondary: credible reporting exists, but a primary source is missing.
Only direct and self-reported entries appear in the current findings list.
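The three strength levels and the inclusion rule can be sketched as a small data model. This is a minimal illustration, not Bugflation's actual schema: the names `Attribution`, `Entry`, `LISTED`, and `findings_list` are hypothetical, and the two `CVE-XXXX-...` identifiers are placeholders rather than real records.

```python
from dataclasses import dataclass
from enum import Enum


class Attribution(Enum):
    """Attribution strength, mirroring the three levels described above."""
    DIRECT = "direct"                # vendor, advisory, or primary post names the AI system
    SELF_REPORTED = "self-reported"  # operator claims discovery; CVE corroborates the bug
    SECONDARY = "secondary"          # credible reporting, but no primary source


@dataclass(frozen=True)
class Entry:
    """One ledger row (hypothetical shape, for illustration only)."""
    cve_id: str
    ai_system: str
    attribution: Attribution


# Strengths that qualify for the findings list, per the rule stated above.
LISTED = {Attribution.DIRECT, Attribution.SELF_REPORTED}


def findings_list(entries):
    """Keep only entries whose attribution is direct or self-reported."""
    return [e for e in entries if e.attribution in LISTED]


entries = [
    # Chrome release notes credit Big Sleep, so this one is direct.
    Entry("CVE-2025-9132", "Google Big Sleep", Attribution.DIRECT),
    # Placeholder IDs: the article does not give CVE numbers for these cases.
    Entry("CVE-XXXX-0001", "XBOW", Attribution.SELF_REPORTED),
    Entry("CVE-XXXX-0002", "Hypothetical system", Attribution.SECONDARY),
]

listed = findings_list(entries)
```

Modeling strength as an explicit field, rather than a yes/no flag, keeps secondary claims in the data set without letting them into the published list.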
What the record does not prove
The ledger is not a census. It does not measure all AI-assisted vulnerability research, all private vendor usage, or all rejected reports. It also does not compare model capability in a controlled benchmark.
It answers a narrower question: which public vulnerability disclosures contain auditable evidence that an AI system or AI-assisted workflow played a named role in discovery?
That narrower question is the right one for launch. A small, clean index is more useful than a large index polluted by inference.
What to watch next
The key metric is not raw count. It is the rate at which AI-attributed reports survive normal security review and become shipped fixes. Direct credits now span browsers, kernels, bootloaders, cryptographic libraries, open-source application stacks, and bug-bounty programs. The curve is still early, but it is no longer a single-vendor story.
The next and stronger signal will be repetition: the same systems producing accepted fixes across multiple releases, independent maintainers, and different bug classes. That is why the ledger tracks attribution quality, source links, and clusters rather than treating every claim as equal.
Published May 8, 2026 by Bugflation Editorial.