2024 Week 47 news

November 22, 2024

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we look at:

How DeepSeek is matching OpenAI’s capabilities with a fraction of the scale, and what this means for the US-China AI race.
Big tech’s vision of billions of AI agents transforming work, and the products and startups making it happen.
The paradox of AI, and AI PCs promising huge productivity gains while many workers struggle with complexity.

DeepSeek’s deep thought

The Chinese lab DeepSeek has exploded assumptions about AI development this week by matching the reasoning capabilities of OpenAI’s o1, just a few months after the leading lab fired the starting gun on the reasoning AI era. Their new R1-lite model not only matches o1 on key benchmarks like AIME but does so with remarkable efficiency through their ‘mixture of experts’ architecture – using a minuscule 16B parameters.

This achievement comes as the US-China Economic and Security Review Commission recommended to Congress a “Manhattan Project” for AGI, suggesting broad government control over AI compute resources. The US-China AI race is accelerating.

DeepSeek’s success builds on the emerging trend of maximising inference-time computation rather than just scaling up model size. By getting models to “think harder” when answering questions, smaller and more efficient architectures can match the capabilities of much larger systems. This approach, similar to o1’s strategy, suggests that raw size and training compute may matter less than previously thought.
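To make the “think harder” idea concrete, here is a minimal Python sketch of one well-known form of inference-time computation: sampling several answers and taking a majority vote. It is an illustration only and not DeepSeek’s or OpenAI’s actual method. The `noisy_solver` function, its 40% success rate and the answer 42 are all hypothetical stand-ins for a real model call.

```python
import random
from collections import Counter

CORRECT_ANSWER = 42  # hypothetical ground truth for the toy question


def noisy_solver(rng, p_correct=0.4):
    """Hypothetical stand-in for one sampled model answer.

    Returns the right answer with probability p_correct, otherwise one
    of several scattered wrong answers.
    """
    if rng.random() < p_correct:
        return CORRECT_ANSWER
    return rng.choice([41, 43, 44, 45])


def majority_vote(rng, n_samples):
    """Spend more inference compute: sample n answers, keep the most common."""
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(majority_vote(rng, n_samples) == CORRECT_ANSWER for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(f"samples per question: {n:2d}  accuracy: {accuracy(n):.2f}")
```

In this toy setup the same weak solver becomes noticeably more accurate as the number of samples per question grows, without any change to the underlying “model”. Real reasoning models use more sophisticated techniques, so treat this as intuition rather than a description of how R1-lite or o1 work.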

The timing is particularly noteworthy given ongoing US export controls on AI chips to China. DeepSeek’s rapid progress and its access to at least 10,000 Nvidia H100-equivalent GPUs, together with the advances of other Chinese labs, demonstrate how regulation and restrictions might reshape AI development, and possibly even accelerate it, rather than halt it.

Whilst DeepSeek (backed by High-Flyer Capital, a China-based hedge fund) open-sources its models, they still clearly conform to the kinds of content censorship required by Chinese AI regulation. The new model exhibits clear content boundaries, refusing to discuss sensitive political topics such as Taiwan (or indeed this article).

Meanwhile, US policy discussions increasingly frame AI development as a geopolitical race. The commission’s report suggests giving the Department of Defense first priority on AI compute resources, from GPUs to data centre capacity. Some policymakers are even considering restrictions on open-weight AI models, although given the nature of training AI, Chinese labs could continue to innovate independently in any case. The US will nonetheless likely step up restrictions on US investment into the Chinese AI market. The Biden administration’s compute threshold approach may be expanded significantly under Trump, whose appointees like Marco Rubio have pushed for broader constraints on US-China tech investment. The current “small yard, high fence” approach could shift to wider restrictions across multiple sectors like biotech and batteries, though some Trump-aligned business leaders with Chinese investments may push back.

Takeaways: DeepSeek’s R1-lite suggests that the next phase of AI competition will be less about who can build the biggest models and more about who can build the most efficient ones. DeepSeek’s achievement also suggests that focused technical innovation within regulatory boundaries could be more productive than government-directed moonshots. For businesses, this means opportunities might lie in optimising existing models rather than chasing ever-larger architectures. Watch for more developments in inference-time computation and efficient architectures as the field continues to evolve. You can try the new model now via DeepSeek chat.

Building a billion-agent workforce

Major tech firms are betting big on AI agents becoming the new enterprise workforce, with Microsoft, Amazon and Google all launching agent initiatives this week. Gartner predicts a third of enterprise software will incorporate such agents by the end of 2025. Salesforce has made a bold claim about releasing a billion agents by 2026, and Mark Zuckerberg has suggested there will soon be more agents than people. Microsoft announced its Azure AI Agent Service, enabling developers to build secure, stateful autonomous AI agents that can automate business processes; Amazon launched its multi-agent orchestrator for managing multiple AI agents and complex conversations; and Google unveiled an AI agent ecosystem program with a new marketplace section called AI Agent Space.

AI agents can be thought of as digital workers that can independently understand instructions, reason about how to complete tasks, and take necessary actions – whether that’s analysing data, writing code, completing general tasks or coordinating with other agents. Gartner predicts that at least 15% of daily work decisions will be made by such systems by 2028, and the overall market is considered by many to be a multi-trillion-dollar opportunity. But Gartner also warns against organisations repeating past RPA mistakes (where thousands of bots were created without proper governance or documentation). A recent analysis highlights how ‘agentic’ AI, whilst promising significant benefits, requires careful management to avoid security risks and poor customer experiences.

The race to evolve AI agents from experimental technology to well-managed, secure and sustainable production-ready systems is well and truly on. According to recent LangChain research, 58% of firms plan to adopt agents ‘soon’. Software development, traditionally the canary in the coal mine for AI adoption, is seeing the early impact. StackBlitz’s Bolt platform has been generating buzz in recent weeks, with examples such as one user transforming a $5,000, three-month development project into a two-week effort costing just $50. StackBlitz achieved $4 million in annual recurring revenue within just four weeks of launching Bolt, which is built on Claude 3.5. Wordware, which raised $30 million this week, aims to make agent creation as simple as writing in a word processor, and is already attracting enterprise customers. Wordware CEO Filip Kozera notes, “We believe we’re witnessing a paradigm shift, and AI agents represent a new kind of software… they will play a central role in driving the economy.” The VCs are also thinking big: where they once saw vertical SaaS as software used by a type of business, they now envisage vertical agent solutions taking over entire verticals and replacing those businesses.

The broader agent landscape is exploding, with hundreds of startups emerging across multiple categories, from generic web agents like Multi-On to ‘vertical’ solutions like Harvey for legal use-cases. Enterprise platforms like Microsoft’s Azure and Amazon’s Bedrock provide the infrastructure layer, while orchestration frameworks such as OpenAI’s Swarm and CrewAI enable complex multi-agent coordination. To keep these proliferating agents in check, monitoring, security and management tools such as AgentOps are springing up. Firms such as Lindy are also providing top-to-bottom low-code platforms that offer many built-in templates for general admin agents and allow users to construct their own for more role-specific activities. We’re moving toward a future where each knowledge worker might command thousands of agents, fundamentally reshaping how work gets done.

But who will ultimately build all of these agents? The answer… agents. Research shows great promise in the capabilities of meta-agents that can create other agents (although commercial solutions remain limited). This self-replicating capacity is what some are calling the ‘agent flywheel’, where an ecosystem of agents builds, deploys, and optimises itself with minimal human intervention. What started as a means to automate narrow tasks is now evolving into a vision of self-directed, self-improving digital workforces.

Takeaways: Organisations should start experimenting with AI agents now, focusing first on medium-stakes, high-value use cases, and at the same time deploying agent infrastructure, orchestration, and management. The most successful implementations will likely combine specialised vertical agents with the best-of-breed horizontal agents whilst keeping an eye on emerging meta-agents. The winners in this new landscape won’t be those with the most software developers, but those who most effectively empower their domain experts to create and deploy.

JOOST

AI’s productivity puzzle

For all the promises of AI revolutionising work, recent findings highlight a paradox: productivity, the metric AI is supposed to supercharge, is still in decline. An article in The Register this week reports that while new Intel ‘AI PCs’ aim to enhance workflows, their complexity often overwhelms users, leaving them frustrated rather than empowered. Instead of simplifying work, poorly designed tools and new AI communication overheads can create more barriers, reducing productivity and eroding trust.

Is AI being overhyped for this kind of personal productivity, or are we failing to integrate it effectively? The answer lies not in AI itself but in how it’s implemented, understood, and utilised. Misaligned expectations, skill gaps, and inadequate design can undermine AI’s transformative potential. Yet, despite these challenges, AI’s capacity to reshape productivity remains unparalleled if we learn to harness it effectively.

The new OECD report “Miracle or Myth?” compares AI to electricity and the internet as a technology that could revive stagnating productivity. For businesses like Revolut, this means efficiency gains of up to 200% in certain teams. But rather than cutting staff, they’re reinvesting these gains to scale faster and grow.

Yet there’s a clear mismatch between AI’s capabilities and workers’ readiness. Resume Genius found that while most believe AI could make them more productive, over half feel unprepared to use it well. This gap between optimism and preparedness holds back AI’s potential.

Startups show a different picture. The Scaling Through Chaos report shows they’re using AI not to replace jobs but to reimagine them. Founders expect headcounts to grow, especially in engineering and product teams. AI creates demand for roles focused on building and improving AI systems.

Some roles, particularly in marketing and customer support, may shrink as automation improves. Revolut’s Chief Marketing Officer, for instance, describes how AI lets smaller teams achieve what once needed large groups.

This points to a key truth: AI isn’t replacing humans but changing how we work. By handling routine tasks, it lets us focus on innovation and complex problems. But this needs deliberate action. As Job van der Voort of Remote notes, teams must experiment and learn by doing to overcome the adoption hurdles.

Takeaways: Organisations need to focus on training and integration alongside AI adoption. The winners will be those who use AI to amplify human capabilities and give their teams the necessary support and space to transition. Start small, learn fast, and keep humans at the centre of your strategy.

EXO

Weekly news roundup

This week’s developments show a strong focus on AI infrastructure and hardware, particularly with Nvidia’s continued dominance, while concerns about AI governance and responsible development remain at the forefront of industry discussions.

AI business news

OpenAI considers adding web browser and search partnerships (Signals potential major expansion of OpenAI’s capabilities into direct web interaction)
Figure AI’s humanoid robot ‘speeds up BMW production by 400%’ (Demonstrates real-world impact of robotics in manufacturing efficiency)
Elon Musk’s xAI doubles valuation after $5 billion funding round (Shows continued strong investor confidence in AI startups)
Taking a cue from X, Threads tests AI-powered summaries of trending topics (Indicates growing AI integration in social media content curation)
Microsoft will soon let you clone your voice for Teams meetings (Highlights advancement in practical AI voice synthesis applications)

AI governance news

Riot calls out Netflix for “disrespectful” AI-made Arcane season 2 poster (Illustrates growing tensions over AI-generated content in creative industries)
Deus in machina: Swiss church installs AI-powered Jesus (Shows AI’s expanding influence into religious and cultural spaces)
New York Times says OpenAI erased potential lawsuit evidence (Highlights legal challenges in AI training data disputes)
OpenAI releases a teacher’s guide to ChatGPT, but some educators are skeptical (Reveals ongoing challenges in AI integration in education)
Ben Affleck tells actors and writers not to worry about AI (Adds to the debate about AI’s impact on creative industries)

AI research news

The dawn of GUI agent: a preliminary case study with Claude 3.5 computer use (Advances understanding of AI interaction with graphical interfaces)
Enhancing the reasoning ability of multimodal large language models via mixed preference optimization (Improves AI reasoning capabilities across different modalities)
LLaVA-o1: let vision language models reason step-by-step (Develops more systematic visual reasoning in AI models)
Marco-o1: towards open reasoning models for open-ended solutions (Advances AI problem-solving capabilities)
RedPajama: an open dataset for training large language models (Provides new resources for open-source AI development)

AI hardware news

Nvidia stock seesaws following earnings beat, strong outlook as ‘age of AI is in full steam’ (Demonstrates continued dominance in AI hardware market)
Nvidia says its Blackwell chip is fine, nothing to see here (Addresses concerns about next-gen AI chip development)
Crusoe, a rumored OpenAI data center supplier, has secured $686M in new funds, filing shows (Indicates significant investment in AI infrastructure)
UK crashes out of global top 50 supercomputer ranking (Highlights shifts in global AI computing capabilities)
Sagence is building analog chips to run AI (Shows innovation in alternative AI hardware approaches)