ExoBrain

ExoBrain Weekly Newsletter

Politics clouds frontier model releases, Copilot Cowork's price shock, and GLM 5.2 democratises coding power

Welcome to our weekly newsletter, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our Exo agents.

This week we look at:

  • Politics clouds frontier model releases

    The US government switched off Anthropic's Claude Fable 5 worldwide three days after launch, via an export-control directive the President says a rival triggered. Access to frontier models is now a political variable, and the case for self-hosting just hardened.

  • Copilot Cowork's price shock

    Microsoft made Copilot Cowork generally available on consumption pricing, every task burning Copilot Credits at $0.01 on top of a $30 licence. The numbers suggest a product priced out of the frequent use it was built for.

  • GLM 5.2 democratises coding power

    Z.ai's GLM 5.2, an open-weight 744B model under an MIT licence, lands within a few points of Claude Opus on coding and runs on a high-memory workstation. With Fable blocked, self-hosting a near-frontier model becomes a continuity requirement.

  • News roundup

    Mega-funding rounds and China's $295bn AI plan, privacy crackdowns and digital IDs for AI agents, sharper alignment research, and rivals chipping at Nvidia's lead.

Politics clouds frontier model releases

Joel Miller

On 9 June, Anthropic released Claude Fable 5, its first public model built on the Mythos architecture, a launch we covered in full last week. Three days later it was gone. Not throttled, not geofenced, but switched off worldwide, for every user, including paying customers inside the United States. To understand why, you have to follow a sequence that started months earlier and has very little to do with the jailbreak everyone is talking about.

The story begins in February, when the Department of Defense classified Anthropic as an unacceptable supply-chain risk after the company refused to drop contract terms barring military use of its models for domestic surveillance and fully autonomous weapons. That dispute cost Anthropic an arrangement worth around $200 million and left the relationship between the lab and the administration openly hostile. Trump ordered federal agencies to stop using Anthropic products. OpenAI signed a Pentagon deal days later.

When Anthropic's new Mythos model was ready, the company judged its ability to find and exploit software vulnerabilities too dangerous to release, said so publicly, and put it only into the hands of vetted partners through Project Glasswing.

The White House moved next. On 2 June, Trump signed an executive order asking AI companies to share advanced cyber-capable models with the government up to 30 days before granting partner access. The order was explicitly voluntary and stated it was not a licensing regime. On the same day, Anthropic expanded its Project Glasswing programme, putting Mythos in front of about 150 more organisations for vulnerability research. One of them was SK Telecom, a Korean carrier and Anthropic investor.

On 10 June, the day after Fable launched, Anthropic's chief executive, Dario Amodei, published a detailed regulatory proposal. He called for mandatory third-party testing of the largest models above a compute threshold, across four named risks including cybersecurity, with the government given authority to block or reverse any model that failed, modelled on the FAA. He argued the time for voluntary governance was over.

After Fable launched, Amazon's researchers fed it code containing known vulnerabilities and asked it to review the code for security flaws. The model refused, because its classifiers treated a request to find vulnerabilities as sensitive. The researchers changed three words and asked it to fix the code instead. It complied, patched the flaws, and produced test scripts when asked. That was the bypass. There is nothing dangerous in it. Fixing flawed code is the defensive task security teams run every day, and asking a model to fix vulnerabilities is the same request as asking it to find them. The same capability answers both. Anthropic's review found the demonstration surfaced a few previously known, minor flaws, the kind other public models turn up with no bypass at all. A thousand hours of red-teaming had found no universal jailbreak. Katie Moussouris, the only outside expert to read the underlying paper, called it standard defensive security work.

What Amazon did next makes no sense. Amazon has put at least $13 billion into Anthropic and is its primary cloud provider. Its own models, the Nova family, sit well below the frontier and do not compete with Claude. Amazon's frontier AI offering is, in large part, Anthropic itself. Its commercial interest lies in Anthropic succeeding.

On 11 June, chief executive Andy Jassy was on a prearranged White House call about another matter and raised the finding. Officials sent him to Treasury Secretary Scott Bessent, who was leading the administration's Mythos response over the threat to financial infrastructure. Jassy reported, in effect, that a coding model asked to fix code had fixed code. The NSA pressed for emergency controls. Anthropic argued the finding was trivial. Officials were unmoved. Just after 5pm the next day, a letter from Commerce Secretary Howard Lutnick required government approval before either Fable 5 or Mythos could reach any foreign national anywhere, with a 90-minute deadline and warnings of criminal and civil liability. By around 10pm, both models were offline worldwide. Because US export law treats access by any non-citizen as an export, even inside America, Anthropic had no compliant way to keep the models running for anyone. So it turned them off.

The President explained the episode in his own terms. Asked by Axios whether he saw Anthropic and Dario Amodei as a national security threat, Trump said, "Well, not now, but a week ago maybe." He described meeting Amodei at the G7 the day before, called him a smart guy, and praised how quickly the company complied. "It's tremendous liability," he said. "People get put in prison immediately for that. You can't play games with that." Then came the detail that reframes everything: "Actually it was a competitor, and a part owner, that turned Anthropic in. They didn't like what they were doing." SK Telecom's access, revoked days earlier over a long-dormant Chinese joint venture it exited in 2009, appears to have fed the same distrust in the administration.

Anthropic sent researchers to the Commerce Department this week, and the two sides are rumoured to be drafting a framework to grade the severity of model flaws and decide when the government intervenes. On the surface this looks sensible. Read it against everything that led here and it is the opposite. A severity framework only works if you can define, in advance, the line between a dangerous model and a safe one. The "fix this code" episode demonstrates that no such line exists. The same capability that patches a vulnerability finds it, builds the tool that monitors a network, and builds the tool that attacks one. A grading rubric cannot separate these, because they are not separate. So the framework will measure proxies (jailbreak depth, capability classes, demonstrated consequences) and call the result a safety judgement. It is the theatrical requirement made permanent. The labs will go along with it, file their assessments, and everyone will agree to treat an unanswerable question as answered.

Anthropic has also updated its privacy policy, effective 8 July, to allow checks using government ID, with the aim of confirming US citizenship and restoring access to Americans. Prediction markets, though, are not betting on a quick return. Unable to inspect what the model can do, the state has settled for inspecting who is holding it. This is a control you can actually enforce, which is exactly why it has been chosen, and it is useless against the threat it names. The well-funded foreign attacker borrows a US identity, hires a US shell, or simply uses an open model. The citizen check stops a Korean engineer at an Anthropic investor. It does not stop a state cyber unit. It moves the cost of the policy onto allies, researchers, and ordinary users, and leaves the actual adversary untouched.

Ultimately, the result is a loss of trust: those outside America can no longer be sure they can rely on US labs and models. It also weakens the overall safety of our digital infrastructure. If US-based threat actors gain access to this capability, they will have an advantage over the global software teams that build the critical software we use every day. The United States is not a self-contained internet microcosm. It is connected to systems across the globe, running software built by myriad international teams. If this is the start of a new era of government intervention, it could not have got off to a more irrational and chaotic start.

Takeaways: The lesson for anyone building on AI is not about Anthropic, or Amazon, or this particular fight. It is that access to US frontier models has become a political variable, switched off overnight on a phone call and back on once a chief executive complies, with no durable rule in between and no sign that one is coming. You cannot build serious work on a dependency that behaves like infrastructure one week and a hostage the next. Treat model access the way you treat any single point of failure, and remove it. The practical move is to hold a capability you control, and the timing has never been better. Read our GLM 5.2 story for the lowdown on a very powerful new open-weight model. The frontier will keep moving and the US labs may well stay at its edge. But the past fortnight has shown that being at the edge and being reliably available are no longer the same thing, and the only capability you can truly plan around is the one running on a machine you control.

Copilot Cowork's price shock

Joel Miller

Microsoft made Copilot Cowork generally available this week, after a three-month Frontier preview, and confirmed how it will be charged. From 1 July, every task draws down Copilot Credits at $0.01 each, priced on the model used, the context retrieved, the tool calls made, and the runtime. Cowork also requires a Microsoft 365 Copilot licence at $30 per user per month.

This breaks from how Copilot has worked until now. The standard licence is a fixed cost with no metering. Cowork is billed on consumption because it runs longer, multi-step tasks that cost more. A typical task can use several hundred to over a thousand credits. On Microsoft's own estimator defaults, a single technical worker comes out at roughly $250 a month for just 35 mixed prompts.
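
The arithmetic behind those figures can be sketched in a few lines of Python. This is a rough illustration only. It assumes the roughly $250 estimate is the credit charge alone, excluding the $30 licence, which the estimator figure does not make clear, and the function names and the heavier-user scenario are hypothetical rather than anything from Microsoft's tooling.

```python
# Rough sketch of the Cowork pricing arithmetic quoted above.
# Assumption: the ~$250 estimate covers credits only, not the $30 licence.

CREDIT_PRICE_USD = 0.01   # price per Copilot Credit, from the announcement
LICENCE_USD = 30.0        # Microsoft 365 Copilot licence, per user per month


def implied_credits_per_prompt(credit_spend_usd: float, prompts: int) -> float:
    """Average credits each prompt must burn to reach the given spend."""
    return credit_spend_usd / CREDIT_PRICE_USD / prompts


def monthly_cost_usd(prompts: int, credits_per_prompt: float) -> float:
    """Licence plus metered credits for one user in one month."""
    return LICENCE_USD + prompts * credits_per_prompt * CREDIT_PRICE_USD


if __name__ == "__main__":
    per_prompt = implied_credits_per_prompt(250.0, 35)
    print(f"Implied credits per prompt: {per_prompt:.0f}")
    # Hypothetical heavier user: 10 tasks per working day, 22 days a month.
    print(f"220 prompts a month: ${monthly_cost_usd(220, per_prompt):,.2f}")
```

On these assumptions, $250 across 35 prompts implies an average of about 714 credits per prompt, which sits inside the "several hundred to over a thousand" range quoted above.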

Microsoft says these figures are based on the usage it saw during the preview. However they were arrived at, they render the product useless. Microsoft markets Cowork for managing email, scheduling, preparing meetings, and producing reports. These are continuous tasks, and a tool that is good at them will be used often. Consumption pricing at this level makes frequent use unaffordable. Charles Lamanna, who leads Copilot, told Axios that internal testing showed Cowork could not be offered on an unlimited-use basis. It seems it can't be offered at all.

In the same announcement, Microsoft trailed a new model, Cowork 1, due in the coming weeks and post-trained to handle tasks at substantially lower cost. Axios separately reported that Microsoft is testing a hosted version of DeepSeek V4 as a cheaper option. The plan appears to be to move most tasks onto a cheaper model and reduce the per-task cost over time. Copilot has been an unmitigated disaster for Microsoft. Cowork looked like a chance to redeem the product, but for now that's not going to happen.

Takeaways: This is the last-chance saloon for Copilot. At a reasonable price, Cowork's agentic capabilities and access to Microsoft 365 files and content would make it substantially better than the standard offering. But priced at this level, no one will use it for fear of racking up bills of thousands of dollars after maybe only a week's worth of meaningful use. So if Microsoft can't make Cowork 1 or DeepSeek deliver the kind of capabilities that are attracting more and more people to Claude, the game will be lost for good.

GLM 5.2 democratises coding power

Joel Miller

This week's chart shows Z.ai's GLM-5.2 landing within a few points of Claude Opus 4.8 on long-horizon coding tasks. It is a 744-billion-parameter mixture-of-experts model with a 1-million-token context window, and the industry response has been very positive. Many engineers now rate it as effectively on a par with Opus 4.7 and 4.8 and ahead of GPT-5.5.

GLM-5.2 ships under an MIT licence, so the weights can be downloaded, modified, and run inside an organisation's own boundary, with no provider able to revoke access. As Article 1 this week sets out, the US government's decision to disable and block Fable 5 has made that property concrete. When a frontier model can be switched off by a foreign government, the ability to host a near-equivalent model yourself stops being a preference and becomes a continuity requirement.

So how would you actually run it? The most accessible route uses Unsloth's dynamic quantisation, which shrinks the full model to 239GB at 2-bit, enough to fit on a 256GB unified-memory Mac Studio. That path suits one or two users and has been shown writing complete, working software on the first attempt. For a team, the economics change shape. A single EU-hosted three-GPU Blackwell box, with around 540GB of VRAM, can serve roughly eight heavy agentic engineers at near-frontier capability for about £400 to £540 per engineer each month, running vLLM with an NVFP4 checkpoint.
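
For readers who want to sanity-check those numbers, here is a back-of-envelope sketch in Python. It is an illustration under stated assumptions, not a sizing tool: it counts weights only (no KV cache, activations, or runtime overhead), it treats the per-engineer figure as the box cost split evenly across eight engineers, and the function names are hypothetical.

```python
# Back-of-envelope sizing for the self-hosting figures quoted above.
# Simplification: weights only; ignores KV cache, activations and overhead.

PARAMS = 744e9  # GLM-5.2 parameter count, as reported


def weights_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def effective_bits(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per weight implied by a checkpoint of the given size."""
    return size_gb * 1e9 * 8 / params


def box_cost_per_month(per_engineer_gbp: float, engineers: int = 8) -> float:
    """Total monthly cost implied by an evenly split per-engineer figure."""
    return per_engineer_gbp * engineers


if __name__ == "__main__":
    print(f"Uniform 2-bit weights: {weights_gb(2):.0f} GB")
    print(f"Bits per weight at 239 GB: {effective_bits(239):.2f}")
    print(f"Uniform 4-bit weights: {weights_gb(4):.0f} GB")
    low, high = box_cost_per_month(400), box_cost_per_month(540)
    print(f"Implied box cost: £{low:,.0f} to £{high:,.0f} a month")
```

A flat 2-bit encoding of 744 billion weights would come to 186GB, so the 239GB figure implies an average of roughly 2.6 bits per weight, which is consistent with a scheme that keeps some weights at higher precision. At a flat 4 bits the weights alone come to 372GB. Treat that as a lower bound for a real NVFP4 checkpoint, which also carries scaling metadata, but it suggests why 540GB of VRAM leaves room for context caches.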

The constraint has not disappeared. You still need fast memory measured in hundreds of gigabytes, the cheapest quantisations trade away accuracy on harder work, and a self-hosted model running at a near-100% duty cycle needs real operational care. This is a workstation and server story, not a laptop one. But the option now exists where it did not before.

Takeaways: The open-weight question has shifted from capability to control. GLM-5.2 shows a frontier-class model can be self-hosted today, under a licence no government can revoke, at a cost that competes with metered seats once usage is heavy. With Fable 5 blocked, that is no longer theoretical. The practical step for any organisation handling sensitive work is to pilot a self-hosted model now, learn what local inference really costs in memory and latency, and stop assuming that access to frontier capability is something only a vendor can grant.

News roundup

Mega-funding rounds and China's $295bn AI plan, privacy crackdowns and digital IDs for AI agents, sharper alignment research, and rivals chipping at Nvidia's lead.

AI business news

AI governance news

AI research news

AI hardware news
