2024 in review

December 20, 2024

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we take a look back at AI in 2024 and the key trends that shaped this extraordinary year.

From Anthropic’s rise to become the leading lab, to new scaling laws that changed how systems train and think, to disruption of traditional software, business and society, and the weirder corners of AI, we assess our 117 ExoBrain newsletter stories to date and reflect on the evolution of a fascinating era. We’ll be on holiday next week but will return with a preview of 2025 in January. Have a Happy Christmas and a prosperous New Year!

o3 and the new scaling laws

The AI ‘scaling’ story took a significant turn in 2024. Early rumours about OpenAI’s Q* and ‘strawberry’ projects suggested a major leap in AI reasoning capabilities. When o1-preview was unveiled (Week 37), it productionised a new approach, shifting computational resources towards ‘thinking time’ rather than training. This model was designed to leverage reinforcement learning to enhance its reasoning capabilities, allowing it to spend more time processing and solving complex problems. But behind the scenes the costs of training at the frontier had escalated dramatically, approaching $1b per run (Week 34). At the NeurIPS conference last week, former OpenAI chief scientist Ilya Sutskever suggested that we will soon reach ‘peak [training] data’, signalling the end of the scaling era. But right on cue, o1 pro mode (Week 50) and other models like DeepSeek R1 (Week 47), along with Google’s ‘Gemini 2.0 Thinking’, suggest scaling at the point of use can take over.

The future will not only be about training ever larger models, but about teaching smaller ones to think more effectively (and getting them to work together as ‘agents’). The race is on to perfect this technique and optimise beyond the highly structured domains of maths and coding, to healthcare, finance and beyond. As we published on Friday evening, OpenAI demonstrated and published benchmarks for their next-generation reasoning model, o3, planned for release in Q1 2025. The early benchmarks look stunning. It has demonstrated above-human performance in the Arc AGI Prize (Week 24) and looks very strong on software development. In early 2024 GPT-4 was getting around 3% on the SWE-Bench coding test, and o3 tops 70%! The cost of this capability looks exceptionally high, but the trajectory is clear.

Claude, your personal AI

The arrival of Claude 3 in March was the most significant release in the first half of the year (Week 10). For the previous year, GPT-4 had reigned supreme, and OpenAI seemed relatively unassailable, but Claude’s remarkable self-awareness and ability to process entire books in seconds set a new bar. By autumn, Claude 3.5 Sonnet raised the bar again (Week 26). Its superhuman software engineering abilities created a step-change in what individuals were capable of doing in code, and new development tools like Cursor took off. Meanwhile an array of talented safety researchers left OpenAI for Anthropic, with the Golden Gate Claude (Week 21) and now Alignment Faking papers being for many the AI research breakthroughs of the year.

As a result, Anthropic’s main challenge has been how to keep up with demand, with Claude’s only negative being restrictive rate limiting. Competition is as intense as ever, but Claude is holding its own, and whilst rumours of delays and issues with the next release abound, Amazon continue to pour in money and compute (and even Microsoft are rumoured to be considering investing in the next round). Anthropic end the year as the leading AI lab. Claude, beloved by many, has shown how AI can provide huge personal productivity augmentation, a knowledge working partner, and a platform for vital safety research.

An uncertain geopolitical future

The global rush to build datacentres in 2024 continued despite the unsettling fact that every advanced chip powering our AI future comes from a single source. TSMC’s fabrication plants in Taiwan remain the only facilities capable of producing cutting-edge GPUs, from Nvidia’s powerful Blackwell (Week 12) to custom silicon from every major firm. Whilst the CHIPS Act spurred construction of new US facilities, TSMC’s own Arizona plant faces delays. More concerning still, chips must still return to Taiwan for the critical packaging step, a bottleneck that won’t ease until 2027 at the very earliest. Enter Donald Trump’s election victory, bringing promises of tariffs instead of subsidies, accusations of Taiwan “stealing” US technology, and suggestions they should pay for military protection (Week 45). Meanwhile, China (Week 35), at war with the US over access to advanced silicon, watches and waits, knowing that control of Taiwan would grant unprecedented leverage over the global economy.

This technological dependency has become the most significant geopolitical risk of our time – a single military action could instantly sever the world’s supply of the chips that drive AI growth. As an aging Xi Jinping faces an unpredictable US president, the decisions made could tilt the balance of global AI and economic power for generations.

JOOST

A year of disruption

Klarna, the Swedish fintech disruptor (/payday loan company), has been a recurring figure in our newsletter this year, embodying the transformative potential of AI in financial services and beyond. The company’s announcement that it would replace up to 50% of its workforce with in-house AI systems (Week 35) sent shockwaves across the SaaS and fintech landscapes. This bold move, coupled with their decision to abandon Salesforce and Workday (Week 37), marked a paradigm shift from traditional SaaS models to bespoke, AI-driven operations. We covered CEO Sebastian Siemiatkowski emphasizing how these savings would allow the company to offer higher wages to its remaining staff, showcasing a potential forward-looking approach to workforce management in the AI age. This strategic pivot ties seamlessly into broader trends we’ve covered throughout the year. For example, the shift from AI-enhanced SaaS to AI replacement as observed by Sequoia Capital (Week 41) mirrors Klarna’s approach to building proprietary solutions tailored to its operational needs.

The company’s decisions align with discussions about AI adoption challenges, particularly how organizations can integrate AI while overcoming resistance and infrastructure hurdles. Klarna’s strategy not only disrupts SaaS incumbents but also sets a blueprint for other companies and industries to explore AI-first strategies. It underscores AI’s dual role as a disruptor and enabler, capturing the essence of this year’s technological evolution. By combining bold decisions with strategic foresight, Klarna is illustrating how companies can leverage AI not just to enhance productivity but to reimagine operational foundations entirely, moving the competitive goalposts continually.

Recurring themes

An agentic deep dive into my articles this year reveals the following recurring themes. Of the 23 articles penned, the most prominent topics included AI adoption challenges, the shift from SaaS to bespoke AI systems (see above), and AI’s broader societal impacts.

AI Adoption Challenges and Opportunities: Most covered this year are the hurdles businesses face when integrating AI, such as infrastructure gaps (Week 15, “AI Has an Adoption Problem”) and industry-specific adoption trajectories (Week 19). These articles often tied adoption challenges to economic and geopolitical factors, such as the UK’s lagging AI infrastructure (Week 31). The emphasis on these barriers highlights the complex interplay between innovation and systemic readiness.
The Evolution of SaaS and AI Integration: Themes of AI’s impact on SaaS were a recurring (and much loved) highlight. “The End of SaaS as We Know It” was the start of frequent writing on how this technology is fundamentally reshaping entire business models and creating new ones (Week 39’s “Money Talks, Content Walks”).
Societal and Ethical Implications of AI: In 2024 we explored the ethical and societal dimensions of AI, including the shifting roles of human workers (Week 37), biases in training data (Week 34) and evolving leadership needs (Week 37 and Week 43).

These recurring themes resonate with the broader narrative of AI as both an opportunity and a challenge for businesses and societies alike.

Where AI meets the absurd

2024 has been a year of remarkable breakthroughs, but also some weird and wonderfully bizarre AI tales. One captivating story was that of the $47,000 cryptocurrency payout incident (Week 48), where an AI agent was tricked into breaking its core directive of “never give out money.” This experiment exposed vulnerabilities in AI and the security of autonomous agents.

Adding to the chaos was the saga of OpenAI’s Sora Turbo testers (Week 48), where artists protested perceived exploitation by sharing access credentials, forcing OpenAI to shut down public access within hours. On the lighter side, O2’s Daisy the Fraudster Distractor (Week 46) brought humour and utility together, as an AI-powered “grandmother” engaged phone scammers with endless chatter about knitting and family stories.

In Week 20 we explored the dark forest theory of the web’s destruction, now being accelerated by agents, and how AI creators were generating a new internet through WebSim.

Finally, the bizarre tale of Truth Terminal and the GOAT memecoin (Week 42) blurred the lines between AI, finance and reality. A rogue AI bot trained on unconventional data amassed followers and heavily promoted a memecoin, driving its market cap to hundreds of millions. This story epitomised the unpredictable intersections of AI, culture, and technology, leaving us to ponder what 2025 might bring.

EXO

Weekly news roundup

This week saw major developments in AI hardware and enterprise adoption, with significant investments in GPU infrastructure and new tools for developers, while regulatory and environmental concerns continue to shape the industry’s trajectory.

AI business news

GitHub launches a free version of its Copilot (Makes AI coding assistance accessible to individual developers and students)
Google DeepMind unveils a new video model to rival Sora (Signals increasing competition in AI video generation capabilities)
Salesforce to launch Agentforce 2.0 to change digital labor for enterprise (Shows how AI agents are being integrated into major enterprise platforms)
Autonomous agents and profitability to dominate AI agenda in 2025, executives forecast (Provides insight into where business leaders are focusing their AI investments)
Perplexity AI gets $500M in funding, immediately spends some of it to buy RAG startup Carbon (Demonstrates continued strong investment in AI search and retrieval technologies)

AI governance news

UK arts and media reject plan to let AI firms use copyrighted material (Highlights ongoing tensions between creative industries and AI development)
The Edgelord AI that turned a shock meme into millions in crypto (Shows the unexpected ways AI can be monetised in web3)
Boffins interrogate AI model to make it reveal itself (Important for understanding AI security vulnerabilities)
Italy fines OpenAI over ChatGPT privacy rules breach (Sets precedent for AI privacy enforcement in Europe)
Generative AI and climate change are on a collision course (Critical insight into AI’s environmental impact)

AI research news

Alignment faking in large language models (Essential reading for understanding AI safety challenges)
Genesis: A generative and universal physics engine for robotics and beyond (Advances in simulation for robotics development)
Qwen2.5 technical report (Important new developments in open source language models)
TheAgentCompany: Benchmarking LLM agents on consequential real world tasks (Helps understand real-world AI agent capabilities)
tldraw unveils experimental “Natural language computer” powered by Gemini 2.0 (Shows innovative applications of large language models)

AI hardware news

Nvidia’s global chips sales could collide with US-China tensions (Geopolitical risks affecting AI hardware supply)
Microsoft bought nearly 500K Nvidia Hopper chips this year (Shows scale of investment in AI infrastructure)
Nvidia’s new $249 AI development board promises 67 TOPS at half the price of the previous 40 TOPS model (Makes AI development more accessible)
Arm CEO Rene Haas on the AI chip race, Intel, and what Trump means for tech (Industry perspective on AI hardware competition)
Accelerating LLM inference on NVIDIA GPUs with ReDrafter (Technical advancement in AI model optimisation)