ExoBrain Weekly

The adaptive thinking backlash, Nvidia “not a car” but not untouchable, and OpenAI’s super app evolves

Welcome to our weekly newsletter, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our Exo agents.

This week we look at:

  • The adaptive thinking backlash

    Anthropic’s Opus 4.7 faces user backlash due to its new adaptive thinking mode and tokenisation changes, revealing a disconnect between benchmark performance and real-world developer experience.

  • Nvidia “not a car” but not untouchable

    Jensen Huang defends Nvidia’s supply chain moat and chip durability, but Anthropic’s successful frontier training on AWS custom silicon and geopolitical tensions regarding China highlight emerging vulnerabilities.

  • OpenAI’s super app evolves

    OpenAI’s new desktop client integrates over 90 plugins and multiple tools into a single agent-centric interface, aiming to unify workflows across code, communication, and documents.

The adaptive thinking backlash

Anthropic’s Opus 4.7 faces user backlash due to its new adaptive thinking mode and tokenisation changes, revealing a disconnect between benchmark performance and real-world developer experience.

Joel Miller

2 min read

Anthropic launched Opus 4.7 this week and the reaction has been unusually negative. Within hours, power users were complaining the model felt shallower than 4.6, followed instructions worse, and burned through weekly allowances at an alarming rate. On X and Reddit, long-time Claude fans described spending hours debugging their own setups because they could not believe the new release was really the upgrade on the box. One widely shared post called it “basically 4.6 with low thinking as a default”. The word most often used was regression.

The central change sits behind a harmless-sounding phrase. Extended thinking, the old toggle that let users decide how hard Claude should reason, has been replaced with “adaptive thinking”, where the model itself decides. In Claude Code the default shifts to a new xhigh effort level; in the consumer UI the controls are thinner. Combined with a new tokeniser that maps the same text to up to 35% more tokens, and a temporary 7.5x premium multiplier in GitHub Copilot, the effect for many developers is that Opus 4.7 costs more and feels less reliable than the model it replaced.
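The tokeniser effect alone is easy to put in numbers. The sketch below is a rough illustration, not Anthropic's billing logic: the flat price, the weekly usage figure and the helper names `cost_usd` and `inflated_tokens` are all assumptions made up for the example. Only the "up to 35% more tokens" figure comes from the text above. The Copilot multiplier is left out because it applies to a different billing unit.

```python
# Hypothetical figures for illustration only; not Anthropic's actual pricing.
PRICE_PER_MILLION_TOKENS = 10.0   # assumed flat price in dollars
TOKENISER_INFLATION_MAX = 0.35    # "up to 35% more tokens", as quoted above


def cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Cost of processing `tokens` tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


def inflated_tokens(old_tokens: int, inflation: float = TOKENISER_INFLATION_MAX) -> int:
    """Worst-case token count for the same text under the new tokeniser."""
    return round(old_tokens * (1 + inflation))


old_tokens = 2_000_000                     # assumed weekly usage, old tokeniser
new_tokens = inflated_tokens(old_tokens)   # same text, worst case

print(f"old: {old_tokens} tokens, ${cost_usd(old_tokens):.2f}")
print(f"new: {new_tokens} tokens, ${cost_usd(new_tokens):.2f}")

# A fixed token allowance stretches over less text than before.
coverage = 1 / (1 + TOKENISER_INFLATION_MAX)
print(f"a fixed allowance now covers about {coverage:.0%} of the old text")
```

Under these assumptions the same workload costs 35% more at an unchanged per-token price, and a fixed weekly allowance covers roughly three quarters of the text it used to, before any extra reasoning tokens are counted.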

None of this shows up in the headline benchmarks. On SWE-bench Verified and SWE-bench Pro, 4.7 is a clear step up, and Anthropic’s partner case studies report meaningful gains on long-running agentic work. Yet on SimpleBench, which tests the kind of everyday common-sense reasoning humans actually ask for, 4.7 stumbles. That gap is the story. Benchmarks test hard things, and adaptive thinking happily spends compute when it senses difficulty. For a simple question that a careful human would still think about for ten seconds, the model may decide no reasoning is needed and fire back a confident, shallow answer.

This is not unique to Anthropic. OpenAI went through the same cycle with GPT-5’s auto-routing last year and had to restore explicit effort controls after a backlash. The lesson does not seem to have travelled. Labs keep trying to hide effort management behind clever routing because the internal maths is seductive: similar average accuracy at much lower cost. What the maths misses is that heavy users are not the average, and they notice issues within minutes.

Takeaways: The Opus 4.7 backlash is not really about one model. It is about an industry struggling to balance the cost of reasoning with demand for intelligence.

Nvidia “not a car” but not untouchable

Jensen Huang defends Nvidia’s supply chain moat and chip durability, but Anthropic’s successful frontier training on AWS custom silicon and geopolitical tensions regarding China highlight emerging vulnerabilities.

Joel Miller

3 min read

This week’s most-watched tech moment wasn’t a model launch. It was Dwarkesh Patel’s two-hour interview with Jensen Huang, and by the end the Nvidia CEO had dropped his usual composure on camera. The “Nvidia is not a car” memes did the rounds, poking fun at Jensen’s insistence that his chips can’t be commoditised the way every other piece of hardware eventually is. But it was on China where he really lost his cool.

For most of the conversation Jensen held his ground, and fairly. His case for Nvidia’s durability rests on three things: demand for AI compute keeps compounding, Nvidia’s software and systems are deeply woven into how models actually get built, and the supply chain itself is now the moat. He has locked up around $100 billion of forward commitments, scaling toward $250 billion, across TSMC wafers, advanced packaging and high-bandwidth memory. Rivals simply cannot replicate that at speed. On custom chips from Google and Amazon he tried to narrow the threat, arguing that TPU and Trainium growth today is essentially one customer, Anthropic, rather than a broad market shift.

We need to make an important correction. Last week we reported that Anthropic’s new Mythos model had been trained on Nvidia’s Blackwell. That turns out to be wrong. AWS bosses confirmed this week that Mythos was trained on AWS’s Trainium chips, running on Project Rainier, a cluster of around 500,000 custom accelerators scaling toward more than a million. It is the first genuine frontier-scale pre-training run completed without Nvidia silicon, and Jensen himself conceded on the podcast that missing Anthropic was his own failure to invest early enough. In other words, the commoditisation question is no longer theoretical.

Where Jensen actually lost his footing was China. Dwarkesh pressed him with Dario Amodei’s analogy comparing chip exports to enriched uranium, and Jensen’s answers started contradicting each other. China has all the chips it needs, but also desperately wants his. US compute is 100 times larger, but China can still aggregate enough to matter. Models cannot easily swap between accelerators, except Anthropic has just done exactly that across three architectures. He dismissed the uranium comparison as “lunacy” without offering a counter, and told Dwarkesh “you’re not talking to someone who woke up a loser”.

Why does this matter for the outlook? Because Jensen is a disciplined operator who rarely gets rattled, and the fact that he lost composure on China may tell us where his real concern lies. China was around 13% of Nvidia’s revenue before the H20 restrictions, and Jensen has publicly pitched the restored market as a $50 billion annual opportunity under the new revenue-share arrangement. Losing it, or having it eroded by Huawei’s Ascend line, is the one scenario he cannot narrate his way through.

Takeaways: Nvidia’s supply-chain moat is real and the short-term commoditisation story is overstated, but two things shifted this week. Mythos on Trainium proves hyperscaler custom chips now work at the frontier. And Jensen’s reactions on geopolitics indicated that in the long run, China remains the biggest growth unknown. The question is sell or starve: does the US retain AI dominance by denying China chips, which forces China to innovate in a constrained environment, or by selling to China, which accelerates its open-source ecosystem?

OpenAI’s super app evolves

OpenAI’s new desktop client integrates over 90 plugins and multiple tools into a single agent-centric interface, aiming to unify workflows across code, communication, and documents.

ExoBrain

1 min read

This screenshot shows OpenAI’s new Codex desktop release (on Mac only for now), or, as many describe it, a gradually evolving “super app”.

This week’s update bundled over 90 new plugins, a Skills tab, background computer use, an in-app browser, image generation and Automations into a single desktop client. The featured connectors on screen — GitHub, Slack, Notion, Linear, Statsig, Gmail, Google Calendar, Google Drive — reveal the ambition: one agent surface that spans code, comms, docs and calendars.

The wider play, led by Fidji Simo, is to fuse ChatGPT, Codex and the Atlas browser into “an agent-centric experience” where intent, not app-switching, drives the workflow. With Anthropic and Google closing in on capability, OpenAI is betting the next battleground is integration and usability.