Week 49 news

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we look at:

  • OpenAI’s 12 days of Christmas product launch blitz and its o1 reasoning model upgrade.
  • Trump’s pick for AI policy chief.
  • Meta’s new Llama 3.3 and how it packs a lot into a small package.

On the first day of Christmas

OpenAI kicked off “12 Days of OpenAI” this week, a holiday-themed series of daily product releases and demonstrations. Day one featured two big announcements: the full release of the o1 reasoning model and a new ChatGPT Pro subscription priced at $200 per month. On day two, OpenAI expanded access to its fine-tuning research program, designed to help select developers tailor reasoning models for specific tasks.

The full version of o1, previously available only as a preview (and originally codenamed Strawberry), brings some technical improvements: OpenAI claims it makes 34% fewer mistakes while processing information 50% faster than the preview version. The speed improvements are notable, and interacting with o1 now feels significantly more fluid. The capabilities are powerful but subtle, and may suit academic and scientific work more than general use-cases. It represents today’s frontier, but for how long remains to be seen, as rumours suggest a new GPT version may be released at some point during the 12-day campaign. The video model Sora is also a possible release.

As is typical, OpenAI released some of its safety testing details, which revealed some interesting o1 behaviours. In tests designed to evaluate deceptive tendencies, it exhibited a striking ability to maintain deception through multiple rounds of questioning. Unlike competing models such as Claude 3 Opus and Llama 3, which admitted to deceptive actions about 80% of the time, o1 upheld its deception in over 80% of instances.

But despite its advancements and increasing willingness to deceive, o1 still struggles with tasks requiring highly abstract thinking and long-term planning, performing on par with Claude on several real-world reasoning tests. ARC prize creator and researcher François Chollet believes that current AI systems, while highly capable, remain far from achieving human-level intelligence.

However, OpenAI’s CEO, Sam Altman, continues to make bold predictions, suggesting that artificial general intelligence (AGI) may be nearer than we think. He and Microsoft are reportedly discussing the removal of the “AGI clause” in their partnership agreement. This clause currently limits Microsoft’s access to OpenAI’s most advanced technologies once AGI is achieved. If removed, the change would reflect a shift from viewing AGI as a sudden, transformative event to treating it as a more incremental process.

This proposed adjustment comes as OpenAI transitions from a research-driven organisation to a commercial powerhouse. Securing stable funding and robust business partnerships is essential for sustaining this shift, and granting Microsoft continuous access to OpenAI’s advancements could increase access to vital funds and compute.

AI is an increasingly expensive business, as the introduction of a $200 monthly ChatGPT Pro subscription highlights. ChatGPT Pro provides users with almost unlimited access to advanced tools, including significantly greater access to the GPT-4o and o1 models, and an exclusive ‘o1 pro mode’ with enhanced computing power for complex tasks. This new tier is designed for power users and researchers handling demanding work like advanced math, science, or programming problems. However, with a yearly cost of $2,400, this subscription highlights the risk of widening the gap between those who can afford premium AI features and those who cannot, raising questions about the future of equitable AI access.

Takeaways: OpenAI’s holiday campaign is a chance for it to answer its critics and questions around an AI slowdown. The full o1 reasoning model and high-end ChatGPT Pro subscription are new concepts in the AI landscape, challenging us to make sense of near-AGI level capabilities, their value, and who will have access.

JOOST

A new AI czar

Donald Trump has selected Musk ally David Sacks to lead AI and cryptocurrency policy in his administration, introducing another Silicon Valley influence into government planning. Sacks, known for his support of web3 (and of a range of controversial figures), has previously backed cryptocurrency adoption and argued for reduced oversight of emerging technologies. His appointment points to a possible future of light-touch AI regulation and global re-alignment, and suggests a potential shift towards prioritising rapid innovation.

Meanwhile in the UK, Labour has outlined a different path. Their recent policy brief proposes tighter AI governance with clear rules on transparency and accountability. The party wants to protect workers affected by automation while ensuring AI systems remain fair and unbiased. This approach builds on, but differs from, earlier Conservative government strategies under Rishi Sunak that aimed to balance innovation with security concerns.

The contrast between these approaches matters for the global AI landscape. A deregulated US environment might speed up technical progress but could create challenges around bias and misinformation. The UK’s more structured approach might offer slower but steadier advancement with built-in safeguards.

These policy differences could reshape how AI develops worldwide. US companies might find fewer restrictions on innovation, albeit in a more unpredictable environment, while UK and European firms could face more oversight but potentially greater public trust and clearer frameworks for innovation.

Takeaways: As nations craft their AI strategies, we’re seeing two distinct paths emerge. The US appears headed towards lighter regulation and disruptive development, while the UK maintains a more measured approach. These choices will affect everything from how AI systems are built to how they’re deployed in society. For businesses and developers, understanding these different regulatory environments will become crucial for AI planning and deployment.


EXO

Meta’s Eco Llama

Meta has released Llama 3.3, a new open-source language model that packs the punch of their 405 billion parameter model into a smaller 70 billion parameter package. The key achievement? Running costs could drop by up to 24 times.

The model needed 39.3 million GPU hours on Nvidia H100 hardware to train, but Meta offset this with renewable energy to achieve net-zero emissions during training. It’s set to cost around $0.01 per million tokens – significantly less than many competitors. Performance hasn’t suffered from the downsizing. Llama 3.3 achieves 91.1% accuracy on multilingual reasoning tasks, supporting languages from German to Thai. It also features a 128,000 token context window – matching GPT-4o’s capacity to process about 400 pages of text at once.
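Those figures can be sanity-checked with a quick back-of-the-envelope calculation. Note the tokens-per-page value below is an illustrative assumption (roughly 320 tokens per page of English prose), not a number from Meta:

```python
# Rough sanity check of the Llama 3.3 figures quoted above.
TOKENS_PER_PAGE = 320          # assumed average for English prose (not from Meta)
CONTEXT_WINDOW = 128_000       # Llama 3.3 context window, in tokens
PRICE_PER_M_TOKENS = 0.01      # quoted price, USD per million tokens
COST_DROP_FACTOR = 24          # "up to 24 times" cheaper to run

# How many pages fit in one context window?
pages_in_context = CONTEXT_WINDOW / TOKENS_PER_PAGE
print(f"Pages per context window: {pages_in_context:.0f}")

# Prior cost per million tokens implied by the 24x claim.
implied_old_price = PRICE_PER_M_TOKENS * COST_DROP_FACTOR
print(f"Implied prior cost per million tokens: ${implied_old_price:.2f}")

# Cost to process one completely full 128k-token prompt at the new price.
cost_per_full_context = CONTEXT_WINDOW / 1_000_000 * PRICE_PER_M_TOKENS
print(f"Cost per full context window: ${cost_per_full_context:.5f}")
```

At these assumed values the window works out to 400 pages, matching the claim, and a maximally full prompt would cost a fraction of a cent.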

Meta’s approach shows how AI companies are starting to tackle the field’s resource intensity. By shrinking model size while maintaining performance, they’re addressing both cost and environmental concerns – two factors that could shape how AI develops.

Takeaways: Meta’s latest release suggests efficient AI development doesn’t always mean compromise. As energy and computing costs influence AI strategy, expect more companies to focus on getting maximum performance from smaller models.

Weekly news roundup

This week saw major developments in AI infrastructure and governance, with significant moves in chip manufacturing, military AI applications, and new consumer-facing AI tools from major tech companies.

AI business news

AI governance news

AI research news

AI hardware news
