Cloudy with a chance of machine learning
Neural networks are challenging traditional supercomputing in weather forecasting, with models from Google, Microsoft, and Nvidia demonstrating superior speed and efficiency in predicting atmospheric conditions.
Joel Miller

Whilst we in the UK dodge the showers and hope for guarantees of a decent summer, climate change and the renewable energy transition are underlining how economically critical weather prediction has become. Today, most forecasting relies on classical supercomputing. The UK’s Met Office have a Cray system that they claim has “enabled an additional £2 billion of socio-economic benefits across the UK through enhanced prediction of severe weather and related hazards” since its introduction in 2016. These supercomputers use complex physics models to simulate the Earth’s atmosphere, dividing it into a grid of millions of 3D boxes and then calculating how conditions like temperature, pressure, and wind will change over time in each box to generate a forecast. The UK system uses a 300m box size for short-range 12-hour forecasts in London, while 10km boxes are used for 3–10-day national forecasts. Doubling the resolution of such a model typically requires about ten times more computing power, as there are many more 3D boxes to process.
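The “about ten times” figure can be sanity-checked with some back-of-envelope arithmetic. The sketch below is illustrative only: it assumes compute cost scales with the number of boxes multiplied by the number of time steps, and that halving the box size also halves the time step (a common constraint in grid-based simulations, but an assumption here rather than anything the Met Office states). The function name `cost_factor` is hypothetical.

```python
def cost_factor(refinement: float, spatial_dims: int = 3,
                refine_time_step: bool = True) -> float:
    """Rough multiplier on compute when each box edge shrinks by `refinement`.

    Assumption: cost ~ (number of boxes) x (number of time steps).
    """
    # Shrinking each edge by `refinement` multiplies the box count
    # by that factor once per refined spatial dimension.
    boxes = float(refinement) ** spatial_dims
    # Assumption: smaller boxes need proportionally shorter time steps.
    steps = float(refinement) if refine_time_step else 1.0
    return boxes * steps


if __name__ == "__main__":
    # Doubling horizontal resolution only (2D), with a halved time step.
    print("2D + time:", cost_factor(2, spatial_dims=2))
    # Doubling resolution in all three dimensions, with a halved time step.
    print("3D + time:", cost_factor(2, spatial_dims=3))
```

Under these assumptions, doubling the resolution costs somewhere between 8 and 16 times more compute, which brackets the rough tenfold figure quoted above.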
Now, in weather forecasting as in many other domains, neural networks trained on vast amounts of historical data are upending the traditional approaches. Google’s GraphCast and Microsoft’s Aurora, the latter announced this week, learn the patterns and relationships between atmospheric variables and generate predictions much faster than conventional tools. An early pioneer of this approach, WeatherMesh, which uses a constellation of weather balloons for sensor data, was able to compete with supercomputer physics forecasting whilst running on a single desktop GPU.
Beyond neural nets and compute, the other unlock, as ever, is data. Vast amounts of it exist in this industry, spanning decades. Microsoft’s Aurora is trained on a diverse dataset at multiple resolutions, drawn from many years of climate data from various sources. This allows the model to learn a general-purpose representation of atmospheric dynamics that can be adapted to different tasks. Like our UK national forecast, Aurora has a resolution of around 10km, and it matches or outperforms state-of-the-art weather and atmospheric chemistry models across a wide range of variables and time periods. It shows particular improvements in predicting extreme events. These models can also identify intricate correlations and dependencies that conventional numerical models may not capture.
Where there’s GPU compute there’s Nvidia, who publicised their research into AI weather forecasting this week at the Computex show in Taiwan. They touted their Earth-2 digital twin and AI models that can predict conditions down to a 1km resolution and, they claim, up to 1,000 times faster and 3,000 times more efficiently than traditional physics models. This matters particularly in Taiwan, where forecasting typhoons and their landfall can save lives. Next, they plan to develop hyperlocal forecasting, even modelling the flow of air around individual buildings.
Takeaways: There is a ‘but’ here: weather AI currently relies on the output of the big physics models, with all of the latest observations ingested, as an input, essentially a kind of giant weather ‘prompt’, before it can run its predictions. Integrating real-time data is one of the next stages of development. Existing models are trusted and in wide use, but many weather agencies around the world are evaluating these new solutions. The specific accelerant here is high-quality data, and the capability of these models to ‘learn’ a good-enough physics model from the available representative patterns. These models are small by GPT-4 standards, around one-thousandth of the size, but mighty. Problem domains with rich patterns captured in datasets are ripe for transformation.