Back in 2016, the Pentagon’s research and development arm staged the first hacking competition fought entirely by automated systems. Dubbed the “Cyber Grand Challenge,” the event stretched for more than eight hours and ended in victory for Mayhem, a computer system built by a Pittsburgh-based team.
Mayhem would later be pitted against—and defeated by—human hackers at DEF CON, one of the world’s largest hacking conferences, held in Las Vegas that same year. In the ten years since, however, the bots may have gained the (metaphorical) upper hand.
Earlier this month, the American AI company Anthropic announced that it had developed a model called Claude Mythos, capable of uncovering and exploiting software vulnerabilities at a level surpassing “all but the most skilled humans.” The rapid pace of AI progress fuels concerns that such capabilities could be misused to strike at vulnerabilities in the software underpinning the world’s most sensitive infrastructure, from government networks to banks to hospitals. Yet supporters argue that, wielded properly, the technology can also open new avenues for cyber defense.
Anthropic has framed Mythos as a watershed moment for cybersecurity. Because it deemed the model too dangerous for public release, the company shared access with more than 40 technology firms to help them defend against attackers and partnered with 11 of them in an initiative to secure critical software. Anthropic says this arrangement will give defenders a meaningful edge over attackers, who lack access to the model.
But several analysts contend that Anthropic’s push to close cybersecurity gaps may have come too late, pointing out that existing open-source AI models can already locate and exploit vulnerabilities. “I think a lot of people took the Mythos announcement as if the ability to discover zero-day vulnerabilities hadn’t existed before, and now, suddenly, it’s here,” Stanislav Fort, founder and chief scientist of the cybersecurity startup Aisle, told The Dispatch. Zero-day vulnerabilities are software flaws unknown to defenders, which attackers can exploit before a fix exists. A well-known example is the set of flaws that let Stuxnet, a U.S.-Israeli cyberweapon, quietly sabotage Iran’s nuclear centrifuges for years before it was discovered in 2010.
The swift evolution of AI heightens the risk of such attacks, but the same technology can be harnessed to forestall them. Firms like Aisle have used it to devise methods for autonomously spotting potential weaknesses. By building a specialized system on top of open-weight models, the team has pinpointed security gaps in critical digital infrastructure, including 12 vulnerabilities in OpenSSL, the software library that underpins most encrypted online communications. It also uncovered a three-year-old security flaw in the software that shields data transmissions between NASA spacecraft and Earth.
Anthropic’s new cyberdefense initiative has also helped uncover key security risks, including in OpenBSD, an operating system used in critical infrastructure such as firewalls. For 27 years, Anthropic contended, the system harbored a bug capable of remotely crashing any computer connected to its network. Yet existing tools may be able to detect weaknesses like this, too. By pointing their own tool at the OpenBSD vulnerability and supplying “contextual hints,” Aisle researchers say they identified it at a fraction of the cost. Other AI security firms, including Vidoc Security Labs, similarly report reproducing Anthropic’s findings with publicly available models.
While Mythos is “almost certainly impressive in many respects,” Fort noted, the AI landscape is best understood as a “jagged frontier” where different systems possess distinct strengths and weaknesses, with even smaller models sometimes delivering unexpectedly robust capabilities.
Turning discovered vulnerabilities into working exploits is a harder problem, but that capability is not necessarily unique to Mythos. “Mythos, the model, is like a spectacular engine, right? But an engine by itself, perched on a stand in a lab, doesn’t win a race,” said Jamieson O’Reilly, a hacker and co-founder of Aether AI, which builds a tool that autonomously hunts for vulnerabilities. “It requires carbon fiber, ceramic brakes, top-tier suspension, aerospace-grade titanium exhaust, and, crucially, a driver,” he told The Dispatch. “Without that, it’s merely an impressive piece of hardware with nowhere to go.”
And while Anthropic keeps Mythos tightly controlled, other players—both allies and rivals—are racing to develop hacking technologies that could rival the model, often at a low cost. “No attacker is waiting for Anthropic to release Mythos to the public,” O’Reilly asserted. “They’re already building their own tooling around public models.” O’Reilly claims Aether AI managed to run a simulated assault against a government agency portal, breaching the external defenses and then escalating privileges to an “admin user” who could potentially delete sensitive government data. He also demonstrated to The Dispatch a spear-phishing Zoom attack he crafted using current AI models for under $5.
Fortunately, cyberdefenders seem to hold an advantage over would-be intruders, at least for now. “AI excels at detection, and that’s the most valuable trait for defenders. For intruders, finding vulnerabilities is just the initial step,” Lennart Maschmeyer, an assistant professor of cybersecurity at Georgia Tech, told The Dispatch. “Attackers must locate weaknesses and then produce working exploits that achieve their particular aims against a target system, even as defenders strive to locate and neutralize them.”
While Mythos marks a significant advance, O’Reilly stressed that the cybersecurity sector should not grow complacent given what current models can already do. Marcus Hutchins, the British cybersecurity expert who famously halted the 2017 WannaCry ransomware attack on hospitals worldwide from his bedroom, likewise warned about the hazards of cruder AI technology. Most attacks, he explained, do not require elaborate exploits and can be carried out far more cheaply. AI-powered phishing and deepfakes are already generating billions of dollars for attackers.
Nevertheless, the U.S. government appears to be taking the potential impact of Anthropic’s new model seriously. Despite the tech firm’s ongoing legal dispute with the Pentagon, and despite the Trump administration’s labeling of Anthropic as a supply-chain risk (a first for an American company), the government is reportedly exploring giving U.S. agencies access to a version of the model. And time may be of the essence for Washington, as other nations race to expand their own AI-hacking capabilities. In a recent interview with the Financial Times, Anthropic CEO Dario Amodei predicted that Chinese open-source models could replicate Mythos within six to twelve months.
The stakes are high: digital infrastructure underpins much of modern life, as a recent breach of the Mexican government illustrated. The intrusion, which leveraged an earlier Claude model alongside OpenAI’s ChatGPT, accessed more than 150 gigabytes of government data, including voter rolls and tax records.
Ultimately, these AI-fueled hacks put ordinary people and their personal information at risk, as Ryan Fedusiak, a fellow at the American Enterprise Institute, can attest. Becoming a victim of a state-sponsored cyberattack “forced me to reset my entire digital life, from top to bottom,” he told The Dispatch.