The model that built itself

Last week we looked at Andrej Karpathy’s autoresearch, a simple script that runs ML experiments while you sleep. This week, a Chinese AI lab called MiniMax showed us what happens when you scale that idea up to frontier model development. Their new model, M2.7, comes with a detailed, first-hand account of how it participated in its own creation.

The workflow is relatively simple. A researcher on MiniMax’s reinforcement learning team starts by discussing an experimental idea with the model. From there, M2.7 takes over: it reviews the literature, tracks the experiment spec, builds the data pipeline, launches the training run, monitors progress, reads logs, debugs failures, analyses metrics, and submits code fixes. The human researcher steps back in only for critical decisions. MiniMax says the model now handles 30-50% of a daily workflow that previously required multiple researchers across different teams.

But the more interesting part is what happens next. They gave M2.7 a specific task: optimise a programming scaffold. The model ran autonomously for over 100 rounds, following a loop of analysing failures, planning changes, modifying code, evaluating results, and deciding whether to keep or revert each change. It discovered optimisations the team hadn’t specified, like systematically searching for optimal sampling parameters and adding cross-file bug pattern detection. The result was a 30% performance improvement. No human touched the loop.
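For readers who want to see the shape of that loop, here is a minimal sketch in Python. It is not MiniMax’s code: the function names, the toy scoring function, and the single `temperature` setting are all assumptions made for illustration. It shows only the keep-or-revert logic described above, where each proposed change is evaluated and kept only if the score improves.

```python
import random


def optimise(config, evaluate, propose, rounds=100, seed=0):
    """Greedy keep-or-revert loop: keep a change only if the score improves."""
    rng = random.Random(seed)
    best_score = evaluate(config)
    history = []
    for i in range(rounds):
        candidate = propose(config, rng)  # plan and apply one change
        score = evaluate(candidate)       # evaluate the result
        kept = score > best_score         # decide: keep or revert
        if kept:
            config, best_score = candidate, score
        history.append((i, score, kept))
    return config, best_score, history


# Hypothetical stand-ins so the sketch runs on its own.
def toy_evaluate(config):
    # Pretend the scaffold performs best at a sampling temperature of 0.7.
    return -abs(config["temperature"] - 0.7)


def toy_propose(config, rng):
    # Nudge the temperature up or down by 0.1, leaving the input untouched.
    new = dict(config)
    new["temperature"] = round(config["temperature"] + rng.choice([-0.1, 0.1]), 2)
    return new


start = {"temperature": 0.2}
final_config, final_score, history = optimise(start, toy_evaluate, toy_propose)
```

In a real setting, `propose` would be the model analysing failures and editing code, and `evaluate` would be a benchmark run. The revert step is simply not adopting the candidate, which is why the loop can run unattended without degrading the current best.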

Then they pushed further. They pointed M2.7 at 22 machine learning competitions and gave it 24 hours to iterate autonomously. After each round, the model wrote a memory file, critiqued its own results, and fed those reflections into the next attempt. Its best run earned 9 gold medals, 5 silver, and 1 bronze, a medal rate that ties Google’s Gemini 3.1.
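The memory-file idea is easy to picture in code. The sketch below is an assumption-laden illustration, not MiniMax’s implementation: the JSON file format, the function names, and the toy `attempt` and `critique` functions are all hypothetical. It shows one way a loop could persist a self-critique after each round and feed the accumulated notes into the next attempt.

```python
import json
import tempfile
from pathlib import Path


def run_with_memory(attempt, critique, memory_path, rounds=3):
    """Run several rounds, persisting a self-critique after each one."""
    path = Path(memory_path)
    # Earlier reflections, if any, are loaded and passed to every attempt.
    notes = json.loads(path.read_text()) if path.exists() else []
    for _ in range(rounds):
        result = attempt(notes)
        notes.append({
            "round": len(notes),
            "result": result,
            "critique": critique(result, notes),
        })
        path.write_text(json.dumps(notes, indent=2))
    return notes


# Hypothetical stand-ins so the sketch runs on its own.
def toy_attempt(notes):
    # Pretend each stored reflection nudges the score up by one.
    return {"score": len(notes)}


def toy_critique(result, notes):
    return f"Scored {result['score']} with {len(notes)} earlier notes available."


def demo():
    # Two separate sessions share one memory file, so the second resumes.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "memory.json"
        first = run_with_memory(toy_attempt, toy_critique, path, rounds=2)
        second = run_with_memory(toy_attempt, toy_critique, path, rounds=2)
        return first, second
```

Writing the notes to disk rather than keeping them in memory means a later session can pick up where the last one stopped, which matters for a run that lasts 24 hours.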

What makes this useful beyond the AI world is the pattern it reveals. Any iterative R&D process, whether drug discovery, financial modelling, or product development, follows the same basic loop: plan, execute, analyse, review, iterate. MiniMax has shown that AI can already own the middle of that loop autonomously, while humans hold the edges. That boundary will shift, but the shape of the collaboration is becoming clear.

Takeaways: MiniMax’s M2.7 gives us the first detailed blueprint for how AI labs are using models in their own development. The most productive teams will be the ones that learn to hand over the middle of the loop, the execution, monitoring, and analysis, while focusing human attention where it still matters most, at the creative and strategic edges.
