
Our visual story this week follows on from the o3 launch. The image charts how quickly OpenAI models have improved on the American Invitational Mathematics Examination (AIME) as training compute rises. The grey curve shows o1 climbing from 25% to 72% as post-training compute (reinforcement learning on complex problems) grows, while the yellow o3 line passes 85% overall. AIME is a high-school, Olympiad-qualifying exam whose problems require multi-step reasoning, so strong results signal genuine mathematical skill. The steady rise implies that the 10x more post-training compute spent on o3 brought further gains. If this slope holds, o3-pro and later generations should keep improving with scale for now, though at an increasing up-front training cost, even if algorithmic improvements make them more efficient and cheaper to run day-to-day.
