{"id":215457,"date":"2026-05-18T08:02:00","date_gmt":"2026-05-18T12:02:00","guid":{"rendered":"https:\/\/testing.news-you-need.com\/index.php\/2026\/05\/18\/how-a-government-contest-launched-a-revolution-in-ai-based-bug-hunting\/"},"modified":"2026-05-18T08:05:11","modified_gmt":"2026-05-18T12:05:11","slug":"how-a-government-contest-launched-a-revolution-in-ai-based-bug-hunting","status":"publish","type":"post","link":"https:\/\/testing.news-you-need.com\/index.php\/2026\/05\/18\/how-a-government-contest-launched-a-revolution-in-ai-based-bug-hunting\/","title":{"rendered":"How a government contest launched a revolution in AI-based bug hunting"},"content":{"rendered":"<p><a href=\"https:\/\/www.cybersecuritydive.com\/news\/ai-vulnerability-discovery-darpa-challenge-critical-infrastructure\/819494\/\">How a government contest launched a revolution in AI-based bug hunting<\/a><\/p>\n<p><a href=\"https:\/\/www.cybersecuritydive.com\/news\/ai-vulnerability-discovery-darpa-challenge-critical-infrastructure\/819494\/\">https:\/\/www.cybersecuritydive.com\/news\/ai-vulnerability-discovery-darpa-challenge-critical-infrastructure\/819494\/<\/a><\/p>\n<p>Publish Date: <a href=\"publish_date]\">2026-05-18 08:02:00<\/a><\/p>\n<p>Source Domain: <a href=\"www.cybersecuritydive.com\">www.cybersecuritydive.com<\/a><\/p>\n<p>
<\/p>\n<p>While the world alternates between panicking and fawning over Anthropic\u2019s powerful new AI model Claude Mythos and its ability to discover serious software vulnerabilities, open-source AI systems are already revolutionizing the vulnerability-hunting landscape \u2014 at a far lower cost.<br \/>\nThese increasingly sophisticated open-source tools are the product of the Defense Advanced Research Projects Agency\u2019s (DARPA)\u00a0Artificial Intelligence Cyber Challenge, a multiyear effort to spur the development of AI systems that can quickly find and fix bugs in America\u2019s sprawling web of critical infrastructure. The vulnerability-hunting systems that emerged from DARPA\u2019s contest didn\u2019t get splashy launches like Claude Mythos or OpenAI\u2019s similar new tool, but because they\u2019re open source and much cheaper to run, they could help far more infrastructure providers, businesses and independent software developers.<\/p>\n<p>With the DARPA competition in the rear-view mirror, the winning teams and other finalists are putting what they learned into practice to help secure open-source packages that quietly undergird the entire internet. While efforts to connect with critical infrastructure operators and their vendors remain nascent, DARPA and several competition winners told Cybersecurity Dive they\u2019re thrilled with how effective the new AI tools have proven.<br \/>\nAt a time when the U.S. cybersecurity workforce is stretched thin and adversaries are using AI to speed up their attacks, the nation\u2019s best hope could be automated tools that find and help fix vulnerabilities before they lead to chaos.<br \/>\nFinding bugs everywhere<br \/>\nAfter DARPA announced its challenge\u2019s three winners in August 2025, it created a $1.4 million bonus prize pot for competition finalists who used their AI systems to find and fix vulnerabilities in critically important software. 
The agency reviewed teams\u2019 proposals to scrutinize important open-source packages and tracked how they engaged with the projects\u2019 maintainers. Each of the seven competition finalists could earn up to $200,000, with a maximum of $10,000 per project.<br \/>\nBy the time the paid vulnerability-hunting spree ended in March, the teams had found 83 vulnerabilities in more than 30 commercial and open-source projects, including Android, Linux, the popular database engine SQLite and the widely used data-storage tool Redis. Of the $1.4 million in prize money the government set aside, it awarded $830,000.<br \/>\nSince then, \u201cthe teams have continued to find and produce patches for additional vulnerabilities across other projects,\u201d said Andrew Carney, the DARPA program manager who oversaw the competition and now liaises with teams in the new phase of their work.<\/p>\n<p>Team Atlanta, the DARPA contest\u2019s winner, found flaws in the U-Boot boot loader and several core Apache libraries, he said, while another finalist, 42-b3yond-6ug, identified vulnerabilities in the Linux kernel that could have let hackers cripple devices widely embedded in critical infrastructure.<br \/>\nTheori, the third-place team, has deployed its system, Xint, to find flaws in \u201call sorts of really widely used open-source projects that everyone on the internet relies on,\u201d said Tyler Nighswander, a researcher at the company. Xint has identified vulnerabilities in the popular database tools Redis, Postgres and MariaDB, as well as Python, Linux and Apple\u2019s XNU kernel, which powers MacOS and the iPhone.<\/p>\n<p>AI has been particularly useful for finding \u201clogic bugs\u201d \u2014 flawed code that traditional vulnerability-assessment software wouldn\u2019t flag as defective. 
As AI gets better at understanding context, \u201cautomated tools are able to push their boundaries more,\u201d said Michael Brown, principal security engineer at second-place team Trail of Bits.<br \/>\nAnd while finding flaws gets most of the attention, the AI systems\u2019 ability to validate their findings and their automatically generated fixes might be their real superpower.<br \/>\nCritical infrastructure organizations typically run highly customized hardware and software and often struggle to test patches on their bespoke devices, if they can patch them at all. Because the vulnerability hunters had to develop patch-validation capabilities for DARPA\u2019s competition, their latest systems contain those features.<br \/>\n\u201cThere\u2019s so much power there, and there\u2019s so much value for safety-critical, high-assurance-required systems,\u201d Carney said. \u201cWhen we do have these conversations [with infrastructure organizations], that\u2019s where we end up spending a lot of our time.\u201d<\/p>\n<p>Critical infrastructure roadblocks<br \/>\nWhile the teams have been busy fixing core internet architecture, DARPA has also tried to connect them with the operators of critical infrastructure and the vendors who supply their equipment. Much of America\u2019s computerized industrial machinery is old, insecure and poorly maintained, and DARPA hopes volunteer vulnerability hunters can root out major flaws before hackers exploit them.<br \/>\nDARPA has briefed several sector coordinating councils (SCCs) on the AI tools\u2019 potential. Carney recently spoke at a meeting of the Health Sector Coordinating Council to share the competition teams\u2019 progress. 
\u201cForums like that are what we&#8217;re really focused on,\u201d he said, \u201cbecause they\u2019re very efficient at getting the message out.\u201d<br \/>\nThe introductions that DARPA has facilitated between vulnerability hunters and critical infrastructure operators have had \u201cvarying degrees of success,\u201d according to Nighswander, who said many organizations aren\u2019t eager to embrace new technologies. \u201cIt&#8217;s been really slow. Different sectors have different adoption cycles and uptake willingness.\u201d<br \/>\nSome infrastructure firms don\u2019t understand how the AI systems would work in their environments. Others decline AI help because they think their human security teams are sufficient. Still others are interested in AI vulnerability detection but can\u2019t get the necessary permissions. \u201cTrying to figure out how to cut some of that red tape would be nice,\u201d Nighswander said, \u201cbecause so far that&#8217;s been the biggest limitation.\u201d<\/p>\n<p>Theori has signed vulnerability hunting agreements with \u201cfewer than five\u201d critical infrastructure entities, Nighswander said. 
\u201cIt\u2019s been a bit difficult to convince those slower companies [and] industries to adopt this tech.\u201d<br \/>\nThe biggest success story so far has been Trail of Bits\u2019 partnership with the Department of Health and Human Services to hunt for flaws in medical devices, a project that Brown said has fixed many vulnerabilities through strong partnerships with healthcare providers and their suppliers.<br \/>\nBecause infrastructure vendors routinely use lightweight open-source packages \u2014\u00a0especially in embedded devices \u2014\u00a0the vulnerability hunters\u2019 existing work will have significant downstream effects, even for vendors that don\u2019t want to engage directly.<br \/>\n\u201cThat\u2019s been a way of providing value and then jump-starting those conversations with the industry-specific or sector-specific companies and entities,\u201d Carney said.<\/p>\n<p>Anthropic\u2019s announcement of Claude Mythos upended the business world, but similar open-source tools have been available for months.<br \/>\nMichael M. Santiago via Getty Images<br \/>\n\u00a0<\/p>\n<p>Busting Mythos<br \/>\nWhen Anthropic announced Claude Mythos Preview and said it was too dangerous to release publicly, prompting shock and alarm across the business community, the former DARPA competitors mostly just shrugged.<br \/>\n\u201cIt\u2019s very cool,\u201d Nighswander said, but \u201cthis is the world that we\u2019ve been living in for a while now.\u201d<br \/>\nStill, the new publicity surrounding AI vulnerability detection could benefit the teams behind the open-source systems. 
\u201cIt leads people to find out that, \u2018Oh, this is a thing that my company should be worried about,\u2019\u201d Nighswander said.<br \/>\nThe DARPA contest finalists even have an advantage over OpenAI and Anthropic, because their open-source tools are far cheaper than the big AI companies\u2019 products, which can cost tens of thousands of dollars in access tokens.<br \/>\nUsing Claude for vulnerability hunting is \u201ckind of like showing up to a fancy restaurant with no prices on the menu,\u201d said Trent Brunson, Trail of Bits\u2019 director of research and development. \u201cYou know you have a large code base. You don\u2019t know what bugs you have. \u2026 Companies might spend $50,000, $75,000 on tokens and not even realize it, and then they might come up with very low-information bugs.\u201d<br \/>\nCash-strapped critical infrastructure firms might pass over OpenAI and Anthropic\u2019s tools in favor of the DARPA finalists\u2019 much cheaper but similarly effective services. \u201cMore companies are going to look at the bottom line,\u201d Brunson said, \u201crather than just throw AI tokens at it.\u201d<br \/>\nBeyond the competition<br \/>\nAs they\u2019ve left the DARPA competition behind, Theori, Trail of Bits and their peers have taken different approaches to implementing their own AI systems.<br \/>\nTheori is working with open-source package maintainers, but it has also commercialized Xint and is contracting with businesses to evaluate their products. \u201cWe\u2019ve been running that quite successfully so far,\u201d Nighswander said. Trail of Bits, by contrast, is focusing mostly on open-source packages. Commercializing its tool, Buttercup, would \u201cfundamentally\u201d change the company, Brunson said.<br \/>\nThe vulnerability hunters have had to modify their AI systems as they\u2019ve moved out of the competition environment. 
The tools need to be able to find real vulnerabilities, not the \u201csynthetic\u201d flaws that DARPA created for the challenge. They need to produce reports that humans can easily read, not just data to feed into a scoring algorithm. And they need to be able to evaluate a wider range of inputs than the heavily structured competition required.<br \/>\nTrail of Bits built an entirely new system for its work analyzing medical devices\u2019 firmware, which typically ships as compiled binary code rather than as human-readable source code. Binary code is what embedded devices actually execute, but AI has a hard time processing it because it doesn\u2019t look like natural language the way source code does. Once that problem is solved, Brunson said, \u201cthe world\u2019s our oyster.\u201d<\/p>\n<p>AI\u2019s true sea change<br \/>\nDARPA, the competition finalists and cybersecurity experts said it\u2019s almost impossible to overstate how much AI will change the process of finding vulnerabilities.<br \/>\nSoftware security assessments that used to take multiple people six months can now be done by AI in a matter of hours, often with better results, Nighswander said. \u201cThat scale and efficiency is incredible.\u201d<br \/>\nThe technology is obviously a double-edged sword. Nick Reese, the former director of emerging technology policy at the Department of Homeland Security, said the same tools that \u201cpresent a significant opportunity for security professionals\u201d also create \u201ca potential advantage for attackers if they get access to the same data.\u201d<br \/>\nBut DARPA views things optimistically. 
It took years for the self-driving cars that emerged from DARPA\u2019s first challenge in 2004 to hit the market; with the AI bug-fixing competition, Carney said, the agency never thought it\u2019d see \u201ca technical miracle\u201d that was \u201ceconomically feasible at the same time.\u201d<br \/>\n\u201cI\u2019m extraordinarily excited,\u201d Carney said, \u201cat the performance and impact that the technology continues to have.\u201d<\/p>\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>How a government contest launched a revolution in AI-based bug hunting https:\/\/www.cybersecuritydive.com\/news\/ai-vulnerability-discovery-darpa-challenge-critical-infrastructure\/819494\/ Publish Date: 2026-05-18&#8230;<\/p>\n","protected":false},"author":1,"featured_media":215458,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/imgproxy.divecdn.com\/JccB76e4oS9n7bJQTvGRyLlkntE2cHYX0YdkAYEKQUw\/g:ce\/rs:fit:770:435\/Z3M6Ly9kaXZlc2l0ZS1zdG9yYWdlL2RpdmVpbWFnZS9QWExfMjAyNTA4MDhfMTg0MTEzMjQxLmpwZw==.webp","fifu_image_alt":"","footnotes":""},"categories":[15],"tags":[26,20,24,31,27],"class_list":["post-215457","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cybersecurity","tag-ai","tag-artificial-intelligence","tag-cybersecurity","tag-exploit","tag-vulnerability"],"_links":{"self":[{"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/215457"}],"collection":[{"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/comments?post=215457"}],"version-history":[{"count":1,"href":"https:\/\/tes
ting.news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/215457\/revisions"}],"predecessor-version":[{"id":215459,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/posts\/215457\/revisions\/215459"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/media\/215458"}],"wp:attachment":[{"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/media?parent=215457"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/categories?post=215457"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/testing.news-you-need.com\/index.php\/wp-json\/wp\/v2\/tags?post=215457"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}