Anthropic has kept Claude Mythos Preview, its AI model for finding software vulnerabilities, out of general release, prompting scrutiny from banks, regulators and central banks. Instead, the company is providing access through Project Glasswing to selected organisations involved in securing critical software.
Anthropic says Mythos found and exploited vulnerabilities across major operating systems and web browsers, including a 27-year-old bug in OpenBSD.
That has pushed the discussion beyond AI capability alone and into financial infrastructure. It was reported this month that UK financial regulators were holding talks with the National Cyber Security Centre and major banks about the risks linked to Mythos. The Bank of England has also said it is testing AI-related risks to the financial system, with Governor Andrew Bailey warning that Anthropic may have “found a way to crack the whole cyber risk world open.”
From model capability to banking resilience
The concern in finance is not only that Mythos appears unusually capable at cybersecurity tasks. It is that a model like this could shorten the gap between discovering a weakness and exploiting it, especially in older, highly connected banking systems. Anthropic has framed Project Glasswing as a defensive effort, arguing that the same capabilities that make advanced models dangerous in the wrong hands can also help defenders find and fix flaws faster.
Azimkhon Askarov, co-CEO and partner at fintech company CONCRYT, said the risks extend beyond individual firms.
“News that Anthropic’s new Mythos AI model has been found to identify and exploit vulnerabilities in major financial systems is a reminder that AI in financial services isn’t a straightforward good news story. Finance ministers and central bankers raising this at IMF level is significant. The concern isn’t just hypothetical: if powerful models can identify and exploit vulnerabilities in core financial infrastructure, the consequences extend far beyond individual institutions. We’re talking about the resilience of the banking system itself, the exposure of core operating systems and payment networks to bad actors, and ultimately the erosion of trust in the financial systems that underpin the global economy.
“What strikes me is the dual-use nature of this. The same AI capabilities that can expose weaknesses can, in principle, be used to find and fix them. But that only works if the people building and deploying AI in financial services are asking the hard questions about responsibility rather than purely capability.”
Anthropic’s own framing leaves room for that dual-use reading. Project Glasswing is meant to give defenders an advantage by helping them secure the software that underpins large parts of the internet and the wider economy. At the same time, the company has chosen not to release Mythos publicly because of the potential harm that misuse could cause.
The harder problem starts after the flaw is found
For banks and fintechs, Mythos raises a practical question. Once a serious weakness is identified, how quickly can the organisation confirm it, understand its exposure, and fix it without breaking something else?
Nik Kairinos, CEO of RAIDS AI, an AI safety monitoring platform, said the operational problem starts downstream.
“What makes Mythos significant is not only the capability, but what Anthropic chose to do with it. A frontier model, without instruction, surfaced a Linux kernel vulnerability that had gone unnoticed for 27 years. Restricting release to critical infrastructure partners is the right call, but it only buys time.
“When finance ministers, central bank governors, and the CEOs of major banks are publicly concerned about a single AI model, the framing has already shifted. We are no longer debating whether frontier AI creates systemic risk. We are watching institutions scramble to catch up to capabilities that are already in the wild.
“The harder problem sits downstream. You cannot prevent every zero-day from being found, by AI or otherwise. What you can do is monitor every AI system in your estate for anomalous behavior, in real time, with a continuous evidence trail. The organizations that instrumented their AI before this week are in a very different position from those still treating governance as an annual audit exercise.”
That creates different pressures across the market. A large bank may focus on patching speed and third-party dependencies across sprawling infrastructure. A fintech may focus on how much of its stack it can actually see, how quickly it can verify supplier exposure, and whether it can prove to partners and regulators that it understands the risk. Banks and regulators in Britain, Germany and other markets are reportedly already assessing those questions.
Weak foundations make the problem bigger
Several responses point to the same underlying issue: a powerful model does not improve resilience on its own. It exposes the quality of the systems around it.
Phil Cotter, CEO of AML and digital compliance firm SmartSearch, tied that point to identity, fraud and compliance.
“British banks are set to onboard AI that its own creator warned could surpass the most skilled humans at finding and exploiting software vulnerabilities. That same capability is already being used by criminals to fabricate synthetic identities, open accounts, and move money at a speed and scale that manual checks were never designed to detect.
“AI adoption in financial institutions is a necessary step in the right direction. It will reduce dependence on manual processes and outdated technologies that still underpin most compliance tasks within regulated firms. But this opportunity comes with a warning: AI built on weak foundations doesn’t just fail to stop financial crime – it risks amplifying it.
“With many firms becoming liable for criminal prosecution for failing to prevent fraud, it is critical that they have robust systems in place to help verify who they are doing business with, and provide evidence to regulators that they are complying with legal obligations. But if AI is layered over existing data gaps and fragile systems, it hands criminals a more powerful set of tools to evade a company’s compliance checks, launder money, and commit fraud at scale, while simultaneously handing a potential criminal sentence to the directors unable to detect it.”
Radi El Haj, CEO of payment infrastructure provider RS2, made a similar point from a systems angle.
“What this development highlights is a fundamental shift: AI is no longer just enhancing defensive capabilities – it is accelerating the discovery of systemic weaknesses across critical infrastructure. In this environment, the traditional timelines for identifying, patching and mitigating vulnerabilities are being compressed dramatically.”
He added: “Technology architecture is now a critical differentiator in risk management. Institutions operating on flexible, modern platforms will be better equipped to respond quickly to newly identified vulnerabilities, while those reliant on legacy systems may face increased exposure and slower remediation cycles.”
That is where the story becomes less about one model and more about operational readiness. Mythos may be restricted today, but the pressure it creates is immediate: firms need to know where their weak points are, how quickly they can patch them, and how dependent they are on suppliers whose own visibility may be limited.
Leaders still need proof, not panic
There is also a clear note of caution in the responses. Most firms discussing Mythos have not used it directly. Anthropic has published selected examples of what it says the model can do, but it has not disclosed everything it found because many of the vulnerabilities remain sensitive. That creates a familiar problem for security teams and executives: they need to decide how seriously to act before they have complete visibility of the evidence.
Lee Sult, chief investigator at cybersecurity company Binalyze, said that can distort priorities if leadership teams react to noise rather than proof.
“The uncomfortable truth about Mythos is that most people haven’t seen it, used it or had access to anything beyond the marketing. Leaders reacting to hype rather than evidence risk distorting priorities and misallocating resources, with the knock-on effect of erosion of trust with their teams. Mythos may prove meaningful, but technology was never the issue; it’s more about how organisations respond to it.
“We’ve seen it before in cybersecurity with zero-day detection at scale and Endpoint Detection and Response. New tools arrive promising to revolutionise, but underneath the flashy exterior you find automation that shifts where expertise is required rather than reducing it and often adds complexity, because humans still need to validate the findings.
“Organisations need to make decisions based on evidence rather than narrative. Leaders need to be asking what actually changes about their environment once the tool has been tested and validated.”
For banks, payments firms and fintechs, the immediate issue is whether they have the software visibility, governance and patching processes to respond if tools like this become part of mainstream cyber defence and, eventually, cyber offence. That shifts the discussion beyond the model itself and towards how firms assess and manage resilience across their systems.