Week 12 news

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we deep-dive into the Nvidia GTC conference announcements, and specifically their new Blackwell chip. GPU compute is the fundamental driving force behind, and the main indicator of, AI progress:

  • Blackwell is a big chip and a big deal; Exa-scale ‘AI factories’ packed with these GPUs will pave the way for an age of abundant compute and Nvidia becoming the most valuable company on earth
  • Inflection and Mustafa Suleyman get subsumed by Microsoft, indicating the power of big tech and the challenge of consumer-centric foundation model building
  • GPT-5 is a few months away but OpenAI CEO Sam Altman and the rumour mill are setting high expectations

The key themes this week…

Blackwell is a big deal

At their AI ‘festival’ (GTC) this week Nvidia announced their next-generation superchip architecture, Blackwell (named after the statistician and game theorist David Blackwell). The basic numbers are predictably huge… 208 billion transistors, 4-30x more performance than the current-generation Hopper chip, plus huge boosts to memory, connectivity and density across the supporting Blackwell datacentre architecture. What’s impressive is that in 2020 we saw the first exaFLOP supercomputer (an entire building-sized machine pumping out more than 1 quintillion floating-point operations per second), yet with Blackwell, a single rack the size of the fridge in your kitchen is exa-scale. The ultimate expression of this new chip generation is what Nvidia calls the “AI Factory”: 32,000 GB200 super GPUs packed into a datacentre with more internal high-speed bandwidth than the ENTIRE Internet and 600x more power than that ultimate supercomputer from way back in… 2020.
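As a quick sanity check on these headline numbers, here is a back-of-envelope sketch (the ~20 petaFLOPS per-GPU figure is our assumption, based on Nvidia’s headline low-precision performance claims, not a quote from the keynote):

  # Back-of-envelope check on the 'AI Factory' claims (Python)
  PFLOPS_PER_GPU = 20          # assumed low-precision petaFLOPS per Blackwell GPU
  GPUS_PER_FACTORY = 32_000    # from the GTC announcement

  factory_eflops = PFLOPS_PER_GPU * GPUS_PER_FACTORY / 1_000  # petaFLOPS -> exaFLOPS
  print(f"AI factory: ~{factory_eflops:,.0f} exaFLOPS")       # ~640 exaFLOPS

  # The first exa-scale supercomputer (~2020) managed ~1 exaFLOP, so a
  # factory is indeed roughly 600x that machine, as claimed above.
  print(f"vs the 2020 exa-scale machine: ~{factory_eflops:,.0f}x")

On these assumptions the arithmetic lines up with the 600x claim.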

Takeaways: With maybe 100-200 of these AI factories to be deployed over the next few years, what does this mean for us humans? We are about to realise that we have been living in a world with a minuscule amount of computational power. Like our medieval ancestors, who could not have imagined the future abundance of food, healthcare, travel and entertainment born of the first industrial revolution, we now face the challenge of adjusting to a future of truly ‘abundant compute’. The difference here is that this will happen in the next few years, not over several centuries.

We’re also going to see a lot more AIs… GPT-4 took around 100 days to train back in 2022; it would now take fewer than 3 days in an AI factory. We’ll also see bigger, smarter models: Nvidia predicts that Blackwell kit will run models of 20+ trillion parameters (nearer to 1/4 the scale of a human brain than the roughly 1/100th we see today).
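The implied numbers can be reproduced with some simple arithmetic (a sketch; the figures are the ones quoted above, and the ~80 trillion synapse count for the brain is our assumption, chosen to match the ‘1/4’ comparison — common estimates run from 80 to several hundred trillion):

  # Implied training speedup and brain-scale comparison (Python)
  gpt4_train_days_2022 = 100    # '100 days to train back in 2022'
  blackwell_train_days = 3      # '<3 days' in an AI factory
  print(f"Implied speedup: ~{gpt4_train_days_2022 / blackwell_train_days:.0f}x")  # ~33x

  brain_synapses = 80e12        # assumption; estimates vary widely
  blackwell_params = 20e12      # '20+ trillion parameters'
  todays_params = 1e12          # implied by the '1/100th' figure
  print(f"Blackwell era: ~1/{brain_synapses / blackwell_params:.0f} of a brain")  # ~1/4
  print(f"Today:         ~1/{brain_synapses / todays_params:.0f} of a brain")     # ~1/80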

And with these AIs and ‘accelerated’ computing capabilities, Nvidia’s conference also gave a glimpse of the applications: from new intelligent enterprise software and autonomous industry, to world simulators where time is sped up 1,000x to train robots or design new chemistry and biology, to the brains of humanoid machines themselves.

Whilst Blackwell suggests we’re not running out of ‘room at the bottom’, to reference Richard Feynman’s 1959 assertion that the nano-scale would provide huge scope for machine miniaturisation, we may be nearing the buffers ‘at the top’. Power grids are straining as we build new datacentres, and the ability to plug in these AI factories, not to fabricate the chips, will become the constraining factor. This explains Amazon’s recent step in buying a nuclear-powered datacentre campus (at 960MW this could in theory power around 15 Blackwell AI factories). This post provides an excellent analysis of the power challenges, and a concerning note for us Europeans: we are far behind in building AI-capable datacentres. Today a simple chatbot doesn’t require nearby compute, but in the next few years more advanced, high-bandwidth AI applications may thrive on low latency. The future will be one of intense competition over sovereign AI factory capacity as much as the ingenuity to make use of it.
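The ‘around 15 AI factories’ figure is easy to reproduce (a sketch; the ~2kW all-in power draw per GB200, covering the chip plus cooling and networking overheads, is our assumption):

  # Power budget per 'AI Factory' (Python)
  KW_PER_GPU = 2.0              # assumed all-in kW per GB200 (chip + cooling + networking)
  GPUS_PER_FACTORY = 32_000     # from the GTC announcement

  factory_mw = KW_PER_GPU * GPUS_PER_FACTORY / 1_000   # kW -> MW
  print(f"Per factory: ~{factory_mw:.0f} MW")          # ~64 MW

  plant_mw = 960                # capacity of the Amazon campus mentioned above
  print(f"Factories per 960MW: ~{plant_mw / factory_mw:.0f}")  # ~15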

Our newly updated ExoBrain AI compute model predicts a 10x growth in available AI compute by this time next year and 50x by the end of 2026. This is hardware only and does not include the huge gains coming from more efficient models and software techniques. Where today a power user might run 4-5 model instances on and off during the day, in a few short years we’ll each be able to run 2,000-3,000, the vast majority of them autonomous, working tirelessly to advance every facet of our physical and virtual lives.
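To see how those per-user numbers fall out of the model (a sketch; the ~10x software and model-efficiency multiplier is our assumption, layered on top of the 50x hardware figure above):

  # From 4-5 instances per power user to 2,000-3,000 (Python)
  instances_today = (4, 5)      # today's on-and-off usage
  hardware_gain = 50            # ExoBrain model: 50x compute by end of 2026
  efficiency_gain = 10          # assumed gain from better models and software

  low, high = (n * hardware_gain * efficiency_gain for n in instances_today)
  print(f"Implied instances per user: {low:,}-{high:,}")   # 2,000-2,500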

Some interesting points:

  • We assume that other firms will start to compete more effectively with Nvidia, but barring some radical breakthrough, that can’t happen fast enough to shift Nvidia’s compute domination and production deployment head start. Nvidia is likely to become the most valuable company in the world.
  • This of course assumes there are no disruptions to chip output from the single TSMC facility in Taiwan where ALL of these chips are fabricated today (and that the datacentres can be built or upgraded in time; our model does not assume an impossible rate in this regard).
  • No matter how fast competitors ramp up or how many H100s are shipped, from 2025 onwards the vast majority of AI computation in this cycle (and potentially the emergence of AGI) will happen on a Blackwell, by sheer dint of its relative power.
  • It’s a cliché in the AI world to see a “you are here” marker on an exponential curve. But as of this week’s news, this is a real curve. You are here…
 

Consumer foundation AI is hard

In a surprise move this week, Microsoft announced that Mustafa Suleyman would be joining them from Inflection AI, and to all intents and purposes the lab will come under the wing of the tech giant via a licensing and staff-transfer deal. Suleyman will lead the consumer-focused AI arm at Microsoft across Copilot, Edge and Bing. Inflection posted that they would be shifting focus to enterprise use cases. Pi, their ultra-friendly assistant, had recently seen a model enhancement to near GPT-4 levels at a 40% smaller size. But with no brand awareness, distribution, API, or paid consumer subscription option, and tough competition from ChatGPT and Claude, it seems Inflection were not up for the fight. Perhaps for Microsoft, Suleyman is seen as the creative force that can stitch the consumer space together more coherently. In other lab news, Stability AI saw the departure of further key technical staff. Despite substantial GPU compute and some progress with its Stable Diffusion image models and StableLM text models, not being closely aligned to a big tech firm is again proving problematic.

Takeaways: In the current landscape, the big tech firms are the ultra-dominant forces (and perhaps now Saudi Arabia, with its new $40bn AI fund). AI labs need deep pockets, market access, and AI GPU ‘factories’, given the huge costs of researching, training and continuously deploying competitive models. Plus, consumer AI is still a challenge, with limited user maturity and willingness to pay for Internet tools. But as the compute abundance equation changes, and open-weight models grow in power and diversity, this concentration of power will lessen. Meta will release their Llama 3 open-weight model to the world in July, and this will be exactly the disruptive event Meta want to achieve (backed by a war chest of some 600,000 GPUs).

GPT-5 rumours

As the Claude 3 family of models continues to impress, with new capabilities emerging as more users switch, pressure continues to mount on OpenAI to release GPT-5. This week Sam Altman has been busy hyping up the potential of the model and some upcoming features (likely in the planning, autonomous agents and robotics spaces), while also damping down timeline expectations.

Takeaways: The latest consensus is mid-year for GPT-5, and a major step up in capability, although what that really means will be a huge indicator of future AI trends. You can hear Altman’s interview on the Lex Fridman podcast here. Several quotes have emerged from Altman’s publicity activity this week that are worth considering:

  • Firstly, that ChatGPT to date has been a revolution in AI “expectation” but not in function… this seems fair, in that much of what AI can achieve is still theoretical and there is a lot of integration work to be done. But it also suggests that future capability steps will be more significant.
  • He also suggested that companies should not delay innovation, and should learn and integrate now, so that they are ready for these significant future steps in capability as we near AGI. (Naturally we at ExoBrain would agree, but it also just seems the most common-sense approach in an unpredictable world that is about to be flooded with AI compute.)

EXO

This week’s news highlights the massive scale of AI investment and development, with deals between tech giants, billions in funding for nations and companies, and new efforts around AI governance on a global stage.

AI business news

AI governance news

AI research news

AI hardware news
