Are the labs hitting a scaling wall?

The Information reported this week that OpenAI’s latest model, Orion, while showing improvements, hasn’t matched the dramatic leap seen between GPT-3 and GPT-4. It also reported that Google and Anthropic are facing similar hurdles with their upcoming releases. This seemed like a potential answer to the biggest question in AI: will the ‘scaling laws’ hold, and will models continue to improve as they are made bigger and trained using ever more compute?

Shortly after The Information piece, former OpenAI chief scientist Ilya Sutskever was quoted for the first time in some months, suggesting we’re moving from “the age of scaling” to “the age of wonder and discovery.” Was perhaps the leading mind in AI research signalling that the traditional approach of making models bigger and feeding them more data might be reaching its limits? So, what’s the truth?

There are two opposed camps: those who have staked their reputation on diminishing returns from scaling GPT-style transformer models (often those with a belief or stake in an alternative architecture), and those who believe there is no limit for the foreseeable future. The sceptics immediately went into overdrive on social media, treating the article as proof of their past bearish pronouncements. Sam Altman responded with characteristic confidence, posting on X: “There is no wall”, despite the article citing “multiple OpenAI sources”. This is likely because OpenAI’s recent work on its o1 family points to one potential way to continue scaling no matter what: shifting computation to “test time,” when the model is actually being used. OpenAI researcher Noam Brown noted that giving a poker-playing agent just 20 seconds to think matched the benefits of scaling up training “by 100,000 times.” As we mentioned when we covered the o1 launch, there is a new scaling law in town… one with far fewer limits.

Anthropic CEO Dario Amodei, speaking for several hours straight on the Lex Fridman podcast this week, shares this optimism, though his company’s largest training effort, for Claude 3.5 Opus, has reportedly faced performance issues. Meanwhile, Google’s next Gemini iteration is also rumoured to be falling short of internal targets. Much of this can be attributed to the need for these models to be cost-effective and viable for widespread use. GPT-4o and Claude 3.5 Sonnet are smart models that are purported to be much smaller and more profitable than the previous generation, and whilst they remain in demand the labs would need good reasons to rush out larger, more resource-hungry and less profitable products.

Rumours will abound when the topic is the most hotly debated in AI, and there have been no major releases in recent months that might provide concrete evidence. While some experts see this moment as validation that current architectures have fundamental limits, others see the very opposite.

Takeaways: Whether through smarter inference techniques, test-time computing, or entirely new approaches, AI development continues at pace. 2025 isn’t just about building bigger models, but about finding smarter ways to use them, backed by unprecedented computing power. With big tech deploying GPU clusters relentlessly, and Nvidia’s Blackwell chip yet to ship in volume, the real constraint isn’t data or algorithms, but raw computing capacity. As AI chips become more plentiful over the next 12-18 months, we will see a step change in the number of AI agents working together to solve problems, and perhaps at that point starting to improve themselves. Scaling is not a single dimension; much more is on the horizon. As Miles Brundage, former head of OpenAI’s AGI readiness effort, just posted: “Betting against AI scaling continuing to yield big gains is a bad idea. Would recommend that anyone staking their career, reputation, money etc. on such a bet reconsider it.”
