Politics clouds frontier model releases

On 9 June, Anthropic released Claude Fable 5, its first public model built on the Mythos architecture, a launch we covered in full last week. Three days later it was gone. Not throttled, not geofenced, but switched off worldwide, for every user, including paying customers inside the United States. To understand why, you have to follow a sequence that started months earlier and has very little to do with the jailbreak everyone is talking about.

The story begins in February, when the Department of Defense classified Anthropic an unacceptable supply-chain risk after the company refused to drop contract terms barring military use of its models for domestic surveillance and fully autonomous weapons. That dispute cost Anthropic an arrangement worth around $200 million and left the relationship between the lab and the administration openly hostile. Trump ordered federal agencies to stop using Anthropic products. OpenAI signed a Pentagon deal days later.

When Anthropic's new model was ready, the company judged its ability to find and exploit software vulnerabilities too dangerous to release, said so publicly, and put it only into the hands of vetted partners through Project Glasswing.

The White House moved next. On 2 June, Trump signed an executive order asking AI companies to share advanced cyber-capable models with the government up to 30 days before granting partner access. The order was explicitly voluntary and stated it was not a licensing regime. On the same day, Anthropic expanded its Project Glasswing programme, putting Mythos in front of about 150 more organisations for vulnerability research. One of them was SK Telecom, a Korean carrier and Anthropic investor.

On 10 June, the day after Fable launched, Anthropic chief executive Dario Amodei published a detailed regulatory proposal. He called for mandatory third-party testing of the largest models above a compute threshold, across four named risks including cybersecurity, with the government given authority to block or reverse any model that failed, modelled on the FAA. He argued the time for voluntary governance was over.

After Fable launched, Amazon's researchers fed it code containing known vulnerabilities and asked it to review them for security flaws. The model refused, because its classifiers treated a request to find vulnerabilities as sensitive. The researchers changed three words and asked it to fix the code. It complied, patched the flaws, and produced test scripts when asked. That was the bypass. There is nothing dangerous in it. Fixing flawed code is the defensive task security teams run every day, and asking a model to fix vulnerabilities is the same request as asking it to find them. The same capability answers both. Anthropic's review found the demonstration surfaced a few previously known, minor flaws, the kind other public models turn up with no bypass at all. A thousand hours of red-teaming had found no universal jailbreak. Katie Moussouris, the only outside expert to read the underlying paper, called it standard defensive security work.

What Amazon did next makes no sense. Amazon has put at least $13 billion into Anthropic and is its primary cloud provider. Its own models, the Nova family, sit well below the frontier and do not compete with Claude. Amazon's frontier AI offering is, in large part, Anthropic itself. Its commercial interest lies in Anthropic succeeding. On 11 June, chief executive Andy Jassy was on a prearranged White House call about another matter and raised the finding. Officials sent him to Treasury Secretary Scott Bessent, who was leading the administration's Mythos response over the threat to financial infrastructure. Jassy reported, in effect, that a coding model asked to fix code had fixed code. The NSA pressed for emergency controls. Anthropic argued the finding was trivial. Officials were unmoved. Just after 5pm the next day, a letter from Commerce Secretary Howard Lutnick required government approval before either model could reach any foreign national anywhere, with a 90-minute deadline and warnings of criminal and civil liability. By around 10pm, both models were offline worldwide. Because US export law treats access by any non-citizen as an export, even inside America, Anthropic had no compliant way to keep the models running for anyone. So it turned them off.

The President explained the episode in his own terms. Asked by Axios whether he saw Anthropic and Dario Amodei as a national security threat, Trump said, "Well, not now, but a week ago maybe." He described meeting Amodei at the G7 the day before, called him a smart guy, and praised how quickly the company complied. "It's tremendous liability," he said. "People get put in prison immediately for that. You can't play games with that." Then the detail that reframes everything: "Actually it was a competitor, and a part owner, that turned Anthropic in. They didn't like what they were doing." SK Telecom's access, revoked days earlier over a long-dormant Chinese joint venture it exited in 2009, appears to have fed the same distrust in the administration.

Anthropic sent researchers to the Commerce Department this week, and the two sides are rumoured to be drafting a framework to grade the severity of model flaws and decide when the government intervenes. On the surface this looks sensible. Read it against everything that led here and it is the opposite. A severity framework only works if you can actually define, in advance, the line between a dangerous model and a safe one. The "fix this code" episode demonstrates that no such line exists. The same capability that patches a vulnerability finds it, builds the tool that monitors a network, and builds the tool that attacks one. A grading rubric cannot separate these, because they are not separate. So the framework will measure proxies (jailbreak depth, capability classes, demonstrated consequences) and call the result a safety judgement. It is the theatrical requirement made permanent. The labs will go along with it, file their assessments, and everyone will agree to treat an unanswerable question as answered.

Anthropic has also updated its privacy policy, effective 8 July, to allow checks using government ID, with the aim of confirming US citizenship and restoring access to Americans. Prediction markets, though, are not betting on a quick return. Unable to inspect what the model can do, the state has settled for inspecting who is holding it. This is a control you can actually enforce, which is exactly why it has been chosen, and it is useless against the threat it names. The well-funded foreign attacker borrows a US identity, hires a US shell, or simply uses an open model. The citizen check stops a Korean engineer at an Anthropic investor. It does not stop a state cyber unit. It moves the cost of the policy onto allies, researchers, and ordinary users, and leaves the actual adversary untouched.

Ultimately, what comes of this is a loss of trust among those outside America in their ability to rely on US labs and models. It also lessens the overall safety of our digital infrastructure. If US-based threat actors gain access to these models while defenders abroad cannot, they will have an advantage over the global software teams that build the critical software we use every day. The United States is not some US-only internet microcosm. It is connected with systems across the globe, running software built by myriad international teams. If this is the start of a new era of government intervention, it could not have got off to a more irrational and chaotic start.

Takeaways: The lesson for anyone building on AI is not about Anthropic, or Amazon, or this particular fight. It is that access to US frontier models has become a political variable, switched off overnight on a phone call and back on once a chief executive complies, with no durable rule in between and no sign that one is coming. You cannot build serious work on a dependency that behaves like infrastructure one week and a hostage the next. Treat model access the way you treat any single point of failure, and remove it. The practical move is to hold a capability you control, and the timing has never been better. Read our GLM 5.2 story for the lowdown on a very powerful new open-weight model. The frontier will keep moving and the US labs may well stay at its edge. But the past fortnight has shown that being at the edge and being reliably available are no longer the same thing, and the only capability you can truly plan around is the one running on a machine you control.
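
For teams wondering what removing that single point of failure might look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommended architecture: the names `Backend`, `ProviderUnavailable`, `complete_with_fallback`, `hosted_model` and `local_model` are all hypothetical, and the two model functions are stand-ins that make no network calls. In real use you would replace them with your own client for a hosted API and for a model running on hardware you control.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderUnavailable(Exception):
    """Raised by a backend that cannot serve the request right now."""


@dataclass
class Backend:
    name: str                       # label used in results and error messages
    complete: Callable[[str], str]  # hypothetical "prompt in, text out" callable


def complete_with_fallback(prompt: str, backends: List[Backend]) -> Tuple[str, str]:
    """Try each backend in order and return (backend_name, response_text)."""
    errors = []
    for backend in backends:
        try:
            return backend.name, backend.complete(prompt)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next backend in the list.
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stand-ins for real clients, so the sketch runs without network access.
def hosted_model(prompt: str) -> str:
    # Simulates a hosted provider that has been switched off.
    raise ProviderUnavailable("access suspended")


def local_model(prompt: str) -> str:
    # Simulates a model served from a machine you control.
    return f"[local] {prompt}"


BACKENDS = [Backend("hosted", hosted_model), Backend("local", local_model)]

if __name__ == "__main__":
    name, text = complete_with_fallback("fix this code", BACKENDS)
    print(name, text)
```

The ordering of the list is the whole policy here: the hosted model is preferred while it answers, and the local one takes over when it does not. A production version would also need timeouts, and some way of checking that the fallback model is good enough for the task.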
