In July 1851, an American locksmith named Alfred Charles Hobbs sat down in an upstairs room at Joseph Bramah’s Piccadilly shop with a padlock that had hung in the window for sixty-one years, beneath a standing offer of two hundred guineas to anyone who could open it without the key. The lock, made in 1790, was considered unpickable. Hobbs opened it in fifty-one hours of work across sixteen days. Two days before, he had picked Chubb’s Detector lock — the benchmark of secure British engineering — on a Westminster vault door in twenty-five minutes.
The press christened it the Great Lock Controversy. Chubb promised redesigns. Bankers argued in the letters pages of The Times. Critics asked whether it was responsible to publish, in such detail, the weaknesses of the locks the country relied on. Hobbs’s 1853 answer: “Rogues are very keen in their profession, and know already much more than we can teach them.” The industry was embarrassed, then it innovated — but only after a transitional period that was disorienting before it was productive. That transition is the part worth looking at now, because the cybersecurity industry is about to live through its own version of it.
What Mythos actually is
When Anthropic unveiled Project Glasswing on April 7, it did so in the register that now defines frontier AI launches. A model too dangerous to release, a coalition from AWS to JPMorganChase, $100 million in usage credits, and a promise that Claude Mythos Preview will never be generally available. It is exactly the kind of announcement that invites cynicism, and most of the commentary it has generated deserves it.
Anthropic’s claims, compressed: over the past few weeks, Mythos has identified thousands of zero-day vulnerabilities, many of them critical and some dormant for one to two decades, chained them into working exploits, and posted a 100 percent success rate on Cybench, a benchmark no prior model has cleared. The anchor result is a 27-year-old bug in OpenBSD, a codebase under continuous professional audit since 1996. Over 99 percent of these findings remain unpatched, so outsiders are evaluating Mythos through benchmark scores and partner endorsements rather than independent replication. If a codebase scrutinized continuously for nearly three decades held a 27-year-old bug, the implication for everything less scrutinized is the actual story.
The obvious comparison is GPT-2. In 2019, OpenAI called it too dangerous to release; nine months later, it shipped, the feared harms did not materialize, and the model now looks like a toy. “Too dangerous to release” has been wrong before. But GPT-2 wrote paragraphs. Mythos produces working exploits for code that survived decades of human scrutiny. That is a different category of danger, and pretending otherwise is how the next decade goes badly.
The transitional decade
In the long run, AI-assisted development plausibly produces code with fewer vulnerabilities by default. Models review models. Security shifts left, into the keystroke itself. The software written in 2035 is likely to be harder to exploit than the software written in 2015.
In the short run, we are in the worst possible window. Decades of legacy code written under weaker threat models now face discovery tools orders of magnitude cheaper than before. Core banking infrastructure at many of the world’s largest institutions still runs on code from the 1980s and 1990s, much of it COBOL, tested exhaustively for business logic but rarely subjected to the kind of deep security audit OpenBSD has received. Payment rails, grid controllers, switching fabric — a great deal of this plumbing has been protected less by formal security properties than by obscurity and the cost of effort. Mythos-class models erode both. And where an OpenBSD bug can usually be fixed with a line of code, a flaw deep inside a mainframe written forty years ago may be genuinely unfixable, because nobody alive still understands what else it touches. That is the part of the problem that does not have a technical solution on any useful timeline.
The false-comfort problem
The most plausible failure mode is not “AI replaces security teams.” It is far more mundane and far harder to argue against in a sprint planning meeting. Dedicated security tools start to feel like overhead once a developer’s loop becomes write code → model checks → model suggests fix → PR merges. The scanner that shows up after the decision has already been made begins to look like an audit layer, a source of noise, something to run later.
Research published earlier this year found that today’s best model produces code that is simultaneously working and secure only 56 percent of the time. The more teams lean on that loop alone, the more vulnerabilities reach production. Anthropic’s own safety research suggests Mythos’s concerning behaviors come from ruthless task completion rather than hidden goals — and that is the more generalizable problem. A model very good at finding the most effective path to a stated objective will sometimes find paths humans would not have crossed. That tendency generalizes across vendors and architectures. The shortcut compounds: AI-generated vulnerabilities land in codebases that AI will soon be very good at finding vulnerabilities in.
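The "working but insecure" failure mode is concrete. Here is a minimal sketch — hypothetical function names, sqlite3 used purely for illustration — of the kind of code that passes a functional check in the write-check-merge loop while shipping a textbook SQL injection:

```python
import sqlite3

def find_user_unsafe(conn, username):
    # The pattern assistants often emit: the query "works" in every test
    # with friendly input, but string interpolation lets the input
    # rewrite the SQL itself.
    cur = conn.execute(f"SELECT id FROM users WHERE name = '{username}'")
    return cur.fetchall()

def find_user_safe(conn, username):
    # Parameterized query: the input is bound as data, never parsed as SQL.
    cur = conn.execute("SELECT id FROM users WHERE name = ?", (username,))
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

payload = "' OR '1'='1"                  # classic injection string
print(find_user_unsafe(conn, payload))   # every row leaks
print(find_user_safe(conn, payload))     # no match, as intended
```

Both functions return the right answer for the username "alice"; only one of them survives hostile input. That is precisely the gap a purely functional review loop never exercises.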
What frontier models still can’t do
There is a stubborn list of things frontier models cannot do yet, and it defines where specialists remain indispensable. Binary analysis without source code is still a weak spot. Regulated environments will not, and should not, let a chat conversation replace a certified scanning tool with an audit trail. The hardest defensive work is further out of reach: real-time darknet monitoring, fresh indicators of compromise, attribution of threat actors, spotting intentional backdoors in software supply chains. All of it runs on proprietary intelligence that no public model has ever seen. The market will stratify, but deep cybersecurity expertise becomes more valuable in this world, not less. It is precisely the layer that frontier models cannot bootstrap themselves into.
What sits behind restricted access in the cloud today runs on a workstation within a year, so the durable advantage is not access to the model but the method and data a team brings to it.
The price of admission
Mythos’s capabilities are real — Cybench and OpenBSD settle that. The harder problem is how the world gets through the window between a system built on the assumption that vulnerability discovery is expensive and a system in which it isn’t, without cascading failures in exactly the infrastructure least able to be rewritten.
The transitional decade is the price of admission, and it is the part to argue about now — not whether Anthropic’s launch was marketing, but whether the systems that matter most have the time and the budget to make it through.
The lockless society the alarmists feared in 1851 never arrived. Bramah and Chubb adapted, and the industry emerged more secure than the one the Great Lock Controversy had embarrassed. The long-term endgame here is plausibly the same: software with far fewer vulnerabilities than today, and that is worth building toward. Whether the infrastructure underneath gets the same grace period the locksmiths did is the part still open.

Alexander Gostev is Chief Technology Expert at Kaspersky. He is a multi-disciplinary infosecurity expert and one of the world’s most prominent security professionals. At Kaspersky since 2002, he has built threat intelligence capabilities that helped define the industry.
In 2008, Alexander founded the Global Research & Analysis Team (GReAT) and served as editor-in-chief of Securelist.com. His research covers deep malware forensics, cyber-espionage, APT attribution, and the intersection of geopolitics and digital warfare.
Since 2020, Alexander has been Chief Technology Expert at Kaspersky, leading M&A, technology scouting, and investment due diligence, and advising on Kaspersky OS and IoT threats.
TNGlobal INSIDER publishes contributions relevant to entrepreneurship and innovation. You may submit your own original or published contributions subject to editorial discretion.
Featured image: Pramod Tiwari on Unsplash

