The bell curve of AI intelligence

Our chart this week comes from aiiq.org, a project by Ryan Shea. The site aggregates seventeen public benchmarks across five reasoning dimensions, combines the results into a composite score, and calibrates that score against the human IQ scale, where 100 is the population average and each 15 points is one standard deviation.
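To make the scale concrete, here is a minimal sketch of how a standardised score maps onto IQ points and percentiles. It only illustrates the IQ convention itself (mean 100, standard deviation 15); it is not aiiq.org's calibration method, and the function names are our own, chosen for illustration.

```python
from statistics import NormalDist

# IQ convention as described above: mean 100, standard deviation 15.
IQ_MEAN = 100.0
IQ_SD = 15.0


def z_to_iq(z: float) -> float:
    """Convert a z-score (standard deviations from the mean) to IQ points."""
    return IQ_MEAN + IQ_SD * z


def iq_to_percentile(iq: float) -> float:
    """Percentage of a normal population scoring at or below the given IQ."""
    return NormalDist(mu=IQ_MEAN, sigma=IQ_SD).cdf(iq) * 100.0


if __name__ == "__main__":
    print(z_to_iq(2.0))                      # 130.0
    print(round(iq_to_percentile(130), 1))   # 97.7
```

On this convention, a score of 130 sits two standard deviations above the mean, or roughly the top 2 to 3 percent of a normally distributed human population.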

The chart plots around seventy models on that scale. The leading cluster (GPT-5.5, Gemini 3.1 Pro, Gemini 3 Pro, Opus 4.6 and GPT-5.4) sits between IQ 130 and 135. The middle of the chart, between 100 and 125, holds most current models, including Chinese open-weight releases such as DeepSeek V4 Pro, Kimi K2.5 and Qwen 3.6 alongside Western entries like Gemma 4 31B and the GPT-OSS family. The Chinese and US clusters are now indistinguishable on this measure.

This is not equivalent to human IQ, but it does show what one might expect: a predictable, normal distribution of model intelligence. What we have not yet seen, and what will be interesting to track, is which models start to populate the tails of this distribution. The site also publishes a cost-per-intelligence view, which is a useful efficiency frontier to watch and a good companion to resources like Artificial Analysis for monitoring model progress.
