OpenAI says new GPT-5.5-Cyber outperforms Anthropic’s Mythos on cybersecurity benchmark

https://the-decoder.com/openai-says-new-gpt-5-5-cyber-outperforms-anthropics-mythos-on-cybersecurity-benchmark/

Publish Date: 2026-06-23 06:48:00

Source Domain: the-decoder.com

Author:

OpenAI is expanding its Daybreak cybersecurity initiative with an updated Codex Security plugin, the full GPT-5.5-Cyber model, and a partner network of more than 25 security firms and several governments.
OpenAI agrees with a point Anthropic made recently: the real bottleneck in cybersecurity has moved from finding flaws to actually patching them. To close that gap, OpenAI is shipping an updated Codex Security plugin that covers the full pipeline from discovery to patch generation, along with the full release of GPT-5.5-Cyber, a specialized model that OpenAI says sets new highs on security benchmarks. OpenAI also launched an open-source patching initiative and a partner program with more than 25 security firms.
Codex Security update closes the loop from discovery to patch
The Codex Security plugin shipped as a research preview back in March. Since then, it’s scanned over 30 million commits across more than 30,000 codebases, OpenAI says. Over 500,000 findings were automatically flagged as fixed, and human reviewers manually confirmed another 70,000.
OpenAI wants the updated plugin to act like a security engineer sitting next to every developer. It analyzes code alongside a threat model, spots flaws, checks whether affected code is actually reachable, builds a targeted patch, and verifies the result.
New in this update are deep scans of entire codebases, attack path analysis, and export to existing vulnerability management systems through SARIF files or CodeQL queries. The plugin can also triage findings from other scanners or bug bounty reports and automate patch generation in batch mode. Humans still sign off on every change, OpenAI says.
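For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log in Python. It only illustrates the general shape of the standard format that vulnerability management systems ingest; it is not the plugin's actual output. The tool name `example-scanner`, the helper `make_sarif`, and the sample finding are all made up for this example.

```python
import json


def make_sarif(tool_name, findings):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding is a hypothetical dict with the keys
    rule_id, message, path, line and (optionally) level.
    """
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule_id"],
            # SARIF levels are "none", "note", "warning" or "error".
            "level": finding.get("level", "warning"),
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


# Hypothetical finding, invented purely for illustration.
example_findings = [{
    "rule_id": "EX001",
    "level": "error",
    "message": "Possible SQL injection in query construction.",
    "path": "src/db/query.py",
    "line": 42,
}]

sarif_log = make_sarif("example-scanner", example_findings)
sarif_json = json.dumps(sarif_log, indent=2)
```

Writing `sarif_json` to a `.sarif` file gives a log that SARIF-aware tools can typically import. Real scanner output usually carries more detail, such as rule metadata and fingerprints for deduplication.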
GPT-5.5-Cyber stays locked to vetted defenders
The full version of GPT-5.5-Cyber replaces an earlier preview that mostly aimed to cut unnecessary refusals in security workflows. OpenAI calls the updated model the most capable single model for finding and patching software flaws.
GPT-5.5-Cyber leads on all key cybersecurity benchmarks, according to OpenAI. CyberGym measures whether an agent can reproduce known flaws in software environments. ExploitGym tests whether agents can turn vulnerabilities into working exploits. SEC-bench Pro evaluates long-term vulnerability discovery.

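| Model         | CyberGym | ExploitGym | SEC-bench Pro |
|---------------|----------|------------|---------------|
| GPT-5.5-Cyber | 85.6%    | 39.5%      | 69.8%         |
| Mythos 5      | 83.8%    | -          | -             |
| GPT-5.5       | 81.8%    | 25.95%     | 63.1%         |
| GPT-5.4       | 79.0%    | -          | -             |
| Claude Opus 4 | 73.1%    | -          | -             |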

The latest version of GPT-5.5-Cyber is deliberately more permissive than standard models and refuses fewer requests, OpenAI says. But only verified defenders can access it, and OpenAI ties that access to verification, monitoring, and guardrails. Most users should stick with GPT-5.5 paired with Trusted Access for Cyber and Codex Security, OpenAI says.
Over 25 security firms and several governments join the program
Through the Daybreak Cyber Partner Program, security companies can plug GPT-5.5 with Trusted Access for Cyber into their own products. Partners include Cisco, CrowdStrike, Cloudflare, Palo Alto Networks, IBM, Fortinet, Wiz, SentinelOne, Darktrace, Palantir, Accenture, PwC, and KPMG.
OpenAI is also expanding its government work. The company says it has Trusted Access partnerships with Australia, Canada, France, Germany, Japan, South Korea, the EU agency ENISA, and the UK. In the US, OpenAI is working to carry out a recently issued executive order on AI security and plans to collaborate directly with critical infrastructure operators.
OpenAI also launched Patch the Planet together with Trail of Bits, HackerOne, and Calif to bring the same patching tools to open-source software. More than 30 open-source projects have signed on, including cURL, Go, Python, Sigstore, and pyca/cryptography. Security researchers work with maintainers to validate and deduplicate flaws and patches before anything gets merged. A first five-day sprint turned up hundreds of issues and led to dozens of merged patches, OpenAI says.
