Codex and the great developer displacement

Software is big business. With nearly 30 million developers worldwide and a global software market approaching $1 trillion, the industry plays a major role in the modern economy. The development process generates trillions in economic value annually, and software forms the backbone of nearly every major industry. In essence, software isn’t just a sector; it’s the engine behind much of the world’s economic activity. As such, it’s a major focus for AI progress and the bleeding edge when it comes to the adoption of agents.

Things have been heating up in recent weeks, and today OpenAI launched Codex, a cloud-based system that represents its most ambitious move yet to transform how code is written and who writes it. Powered by a new code-centric version of its o3 model, Codex’s cloud-based agents can work on multiple tasks simultaneously, handling everything from writing features to fixing bugs.

OpenAI appears to be assembling a three-tier approach to this market:

Codex in the cloud: A new cloud-based software engineering agent platform with a deceptively simple interface, designed to offer autonomous operation outside of ChatGPT.
Codex in the terminal: An open-source command line tool for developers who work in the terminal, released last month and updated today to use the new codex-1-mini model.
Windsurf (proposed $3 billion acquisition): A feature-rich desktop integrated development environment (IDE) with capabilities like “Cascade” for codebase-wide context awareness and “Flows” for agentic engineering.

This strategy looks to cover the complete spectrum of development workflows and creates multiple entry points into OpenAI’s ecosystem. Interestingly, the core of OpenAI’s newest offerings is not an existing model but “codex-1”, a version of OpenAI’s o3 specifically optimised for software engineering through reinforcement learning on real-world coding tasks. Codex-1 is now the leading model on the SWE-Bench Verified benchmark. According to OpenAI, codex-1 produces “cleaner” code than o3, adheres more precisely to instructions, and will iteratively run tests until they pass. This makes it particularly effective for complex engineering tasks that require careful attention to project-specific requirements. Internal OpenAI teams report up to a 3x improvement in code delivery when using Codex in well-maintained codebases.

ExoBrain’s initial experiences with Codex reveal a tool with a distinctly different approach from other coding agents. Rather than a developer-ready interface, Codex offers a clean, task-oriented experience designed for parallel delegation. The interface features a simple text box with two distinct modes, indicated by separate buttons: “ask” and “code”. For practical daily use, many OpenAI developers keep a to-do file and simply instruct Codex to select and fix items from it, or to generate feature plans.

According to Dan Shipper, who tested Codex on his company’s production codebase: “Codex encourages a particular style of coding agent use: It emphasises the creation of small, self-contained tasks that turn into small, easy-to-review [changes]. This makes it a good fit for use by professional engineers working on production deployments.” Codex isn’t yet trying to replace senior engineers, but rather to transform them from programmers into managers who can delegate multiple tasks simultaneously. It performs best when given well-specified, self-contained tasks on existing codebases rather than exploratory development. Looking ahead, OpenAI plans to unify Codex with its Operator, Deep Research and memory systems, creating a comprehensive AI development ecosystem, and to release a more powerful “pro” version of codex-1.

Despite being early to AI-assisted coding, contributing to GitHub Copilot and releasing the original Codex API in 2021, OpenAI finds itself playing catch-up in an increasingly crowded and valuable market. Cursor has emerged as a leader in this space. Founded by four MIT graduates in 2022, the company is now valued at $9 billion. Cursor’s platform generates nearly a billion lines of working code daily and had reached approximately $200 million in annual recurring revenue by April 2025, making it one of the fastest-growing software companies in history. Cursor 0.5, out this week, represents a move towards a multi-agent approach. With the introduction of the Background Agent feature, Cursor is no longer limited to on-screen collaborative assistance but now allows developers to run multiple agents simultaneously in parallel environments.

Meanwhile, Google’s Gemini 2.5 models and Anthropic’s Claude 3.7 continue to command a loyal following, and Cognition Labs’ Devin offers a similar task-based flow but with far greater configurability than Codex. Also this week, Google announced its highly specialised AlphaEvolve agent, which can discover new algorithms and is already finding ways to speed up Google’s global infrastructure. This relentless competition helps explain the reported $3 billion Windsurf acquisition by OpenAI, a massive sum for a company with only $40 million in annual revenue (roughly 75 times that figure). The price reflects the urgency OpenAI feels to secure its position in this market, and its need to construct a multi-layer offering.

Amid the acquisition rumours, Windsurf is not standing still. This week it released a dedicated coding model, SWE-1, focusing on what it calls “flow awareness” and on the entire software engineering process rather than just code generation. By supporting a shared timeline where AI and humans interact seamlessly across editors, terminals, and browsers, Windsurf is addressing the reality that coding is only a fraction of software development.

For businesses, the economics are compelling. When a single developer can orchestrate multiple agents simultaneously, the productivity gains could be immense. The stratification of software development roles appears inevitable, and the employment changes now look more structural than cyclical. At the top tier, we’ll see growing demand for what we might call “agent wranglers”: skilled engineers who can direct multiple AI agents, understand system architecture and business needs, and focus on the most complex and creative aspects of full solution design. The middle and lower tiers face the greatest vulnerability. Microsoft announced cuts affecting approximately 6,000 employees this week and, despite press releases to the contrary, software engineers appear to have borne the brunt of these reductions. Bloomberg reported that over 40% of the roughly 2,000 positions cut in Washington state were in development roles. As companies across the tech sector continue their layoffs (with IT unemployment rising to 5.7% in the US, according to the WSJ), we’re witnessing the beginning of what may indeed be “The Great Displacement” in software development, with demand consolidating into fewer, more elite roles.

Takeaways: The software sector serves as the canary in the coal mine for knowledge work generally. As we’ve said before, other professional fields would be wise to study this transformation closely. First, the pattern of specialised AI models following general-purpose ones will likely repeat across domains. Just as codex-1 improved upon o3 for coding tasks, we’ll see domain-tuned models for legal, medical and financial work. Second, interfaces are bifurcating into collaborative and delegative modes, because different work styles require different AI interaction paradigms. Creative fields might benefit from the collaborative approach, while financial services might favour delegation. Third, the economic benefits will accrue disproportionately to those who adapt quickly and learn to orchestrate agents. The pattern of companies investing heavily in AI whilst simultaneously reducing their technical workforce serves as an indicator of what may await other knowledge-work sectors.
