ExoBrain
The Darwin Gödel machine

Recent research demonstrates that AI models are beginning to self-improve by utilising internal confidence signals, latent reasoning, and evolutionary search to optimise their own architectures and performance.

Joel Miller

For decades, researchers have dreamt of AI that never stops improving. Schmidhuber’s Gödel Machine offered a blueprint: an AI that rewrites its own code whenever it can mathematically prove that the rewrite is an improvement. That proof requirement turned out to be a very high bar, and the latest generation of AI research trades proof for pragmatism. Instead of waiting for mathematical certainty, it harnesses evolution’s blind search: generate variations, test them empirically, keep what works. In pursuit of this goal, models are beginning to teach themselves, not just through massive datasets or human feedback, but by listening to their own internal signals, reasoning beyond words, and discovering algorithms that improve the very systems that created them. Several research papers published this week point to the likely direction of AI models in the coming year, and tell a story of how AI is learning to bootstrap its own intelligence.

Traditional AI development has been resource-intensive. Past LLMs depended on millions of labelled examples and computational budgets only tech giants could afford. But the landscape is changing. The INTUITOR system from UC Berkeley demonstrates that language models can achieve competitive reasoning performance using nothing but their own confidence as a reward signal. On mathematical benchmarks, INTUITOR matches traditional methods, and on coding tasks it achieves a 65% improvement. It’s evidence that models possess richer internal signals than we’ve recognised. Meanwhile, research from Tsinghua’s LeapLab suggests that current reinforcement learning doesn’t actually teach models new reasoning patterns; it simply makes existing patterns more likely to be used. This might sound discouraging, but it could mean the opposite: we may still be barely tapping into these models’ true potential.
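To make "confidence as a reward signal" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not INTUITOR's actual implementation: confidence is measured as the average KL divergence from a uniform distribution to the model's next-token distribution, and the names `self_certainty` and `relative_rewards` are hypothetical. The probability lists are made-up toy values.

```python
import math

def self_certainty(token_distributions):
    """Average KL(U || p) over the tokens of one answer.

    Assumed confidence measure: 0 for a uniform (maximally unsure)
    distribution, larger for more peaked (confident) ones.
    """
    total = 0.0
    for probs in token_distributions:
        uniform = 1.0 / len(probs)
        # KL(U || p) = sum_i U_i * log(U_i / p_i); clamp p to avoid log(0)
        total += sum(uniform * math.log(uniform / max(p, 1e-12)) for p in probs)
    return total / len(token_distributions)

def relative_rewards(candidate_scores):
    """Centre and scale scores within one group of sampled answers, so
    answers more confident than their siblings get a positive reward."""
    mean = sum(candidate_scores) / len(candidate_scores)
    var = sum((s - mean) ** 2 for s in candidate_scores) / len(candidate_scores)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 if all scores are equal
    return [(s - mean) / std for s in candidate_scores]

# Toy next-token distributions for two sampled answers (invented numbers).
confident = [[0.97, 0.01, 0.01, 0.01], [0.90, 0.05, 0.03, 0.02]]
hesitant = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

scores = [self_certainty(confident), self_certainty(hesitant)]
rewards = relative_rewards(scores)
```

No labelled answer appears anywhere in this sketch: the only training signal is how sure the model is relative to its other attempts at the same question.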

The implications of internal learning go further than just replacing external methods. Recent work on Hybrid Reasoning Policy Optimization (HRPO) reveals that models can perform reasoning not just through explicit token generation, but within their latent representations, the continuous hidden states that exist between layers. Traditional reasoning approaches force models to “think out loud” through chains of words. But HRPO demonstrates that by gradually blending hidden states into the model’s input, models achieve superior performance on both knowledge-intensive and mathematical tasks. On challenging benchmarks like MATH, smaller models using this hybrid reasoning match or exceed much larger traditional models. These models spontaneously develop cross-lingual reasoning patterns, fluidly integrating concepts across languages within their hidden representations. They also produce more compact yet accurate responses, requiring fewer tokens because richer context is encoded in the continuous space. This is evidence that genuine reasoning can occur in the model’s inner world, beyond the tokens we observe.
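The idea of "gradually blending in hidden states" can be sketched in a few lines of Python. This is a simplified illustration, not HRPO's actual mechanism: it assumes a single scalar gate and a fixed linear ramp, where a real system would more plausibly learn the gate. The functions `blend_inputs` and `gate_schedule`, the `max_gate` parameter, and the toy vectors are all hypothetical.

```python
def blend_inputs(token_embedding, hidden_state, gate):
    """Mix the discrete token embedding with the previous hidden state.

    gate = 0.0 gives the ordinary token-only input;
    gate = 1.0 would feed the hidden state alone.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    if len(token_embedding) != len(hidden_state):
        raise ValueError("vectors must have the same length")
    return [(1.0 - gate) * e + gate * h
            for e, h in zip(token_embedding, hidden_state)]

def gate_schedule(step, total_steps, max_gate=0.5):
    """Hypothetical linear ramp: start token-only, then blend in gradually."""
    return max_gate * min(step, total_steps) / total_steps

embedding = [1.0, 0.0, 2.0]  # embedding of the sampled token (toy values)
hidden = [0.0, 4.0, 2.0]     # hidden state from the previous step (toy values)

start = blend_inputs(embedding, hidden, gate_schedule(0, 100))
end = blend_inputs(embedding, hidden, gate_schedule(100, 100))
```

Starting from a gate of zero means the model initially behaves exactly like an ordinary token-by-token reasoner, so the continuous signal is introduced without disrupting what the model already knows.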

While INTUITOR shows models can learn from internal confidence and HRPO demonstrates reasoning in latent space, Google’s AlphaEvolve, announced a few weeks ago, takes the next leap: combining these capabilities with evolutionary search to create genuinely new knowledge. AlphaEvolve pairs Gemini models with automated evaluators in an evolutionary framework, improving upon the most promising ideas over successive generations. This is starting to unlock a form of self-improvement. By discovering better matrix multiplication strategies, it reduced the training time of Gemini, the very model family that powers it, by 1%, a significant saving given its scale.
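The generate, evaluate, select loop behind this kind of evolutionary framework can be sketched as follows. It is a toy illustration and not AlphaEvolve's code: a random tweak stands in for the language model that proposes changes, a distance to a made-up target stands in for the automated evaluator, and every name and parameter here is hypothetical.

```python
import random

def evaluate(candidate):
    """Toy automated evaluator: higher is better, best possible score is 0."""
    target = [3.0, -1.0, 2.0]
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def propose_variation(parent, rng):
    """Stand-in for the model that proposes edits: tweak one value at random."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, 0.5)
    return child

def evolve(generations=200, population_size=8, survivors=2, seed=0):
    rng = random.Random(seed)
    population = [[0.0, 0.0, 0.0] for _ in range(population_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Test empirically and keep only the most promising candidates.
        ranked = sorted(population, key=evaluate, reverse=True)
        parents = ranked[:survivors]
        history.append(evaluate(parents[0]))
        # Generate variations of the survivors for the next generation.
        children = [propose_variation(rng.choice(parents), rng)
                    for _ in range(population_size - survivors)]
        population = parents + children
    return max(population, key=evaluate), history

best, history = evolve()
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next. No proof of improvement is needed, only a reliable evaluator.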

But what of a pure bootstrap? The Darwin Gödel Machine (DGM), described in a paper published yesterday by researchers at UBC and Sakana AI, takes self-improvement to its logical conclusion: AI that directly rewrites its own code to improve performance. Starting from a baseline 20% success rate on SWE-bench (real-world GitHub issues), the DGM autonomously improved itself to reach 50%, without any human intervention. Along the way it discovered improvements such as adding patch validation steps, building better file viewing tools, and generating multiple candidate solutions before selecting the best.

Today’s AI agents can be frustratingly unpredictable and limited. But we’re on an evolutionary tree and the path to breakthrough agents must run through these less-performant “ancestors”. Failed experiments are the essential stepping stones. A Darwinian process is underway, an expanding space of possible minds, and each new branch is potentially superior.

Takeaways: Experimental AI systems can now learn from internal signals, discover better algorithms, and rewrite their own code. The combination of reasoning models with evolutionary search is about to unlock genuine self-improvement. As these systems enhance the very infrastructure and algorithms that create them, we may be approaching a future where AI progress accelerates itself.