Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…
Themes this week
JOEL
This week we look at:
- OpenAI’s 12-days of Christmas product launch blitz and o1 reasoning model upgrade.
- Trump’s pick for AI policy chief.
- Meta’s new Llama 3.3 and how it packs a lot into a small package.
On the first day of Christmas
OpenAI kicked off “12 Days of OpenAI” this week, a holiday-themed series of daily product releases and demonstrations. Day one featured two big announcements: the full release of the o1 reasoning model and a new ChatGPT Pro subscription priced at $200 per month. On day two, OpenAI expanded access to its fine-tuning research program, designed to help select developers tailor reasoning models for specific tasks.
The full version of o1, previously available only as a preview (and originally codenamed Strawberry), brings technical improvements: OpenAI claims it makes 34% fewer mistakes while processing information 50% faster than the preview version. The speed improvements are notable, and interacting with o1 now feels significantly more fluid. The capabilities are powerful but subtle, and may suit academic and scientific work more than general use cases. It represents today’s frontier, but for how long remains to be seen, as rumours suggest a new GPT version may be released at some point during the 12-day campaign. The video model Sora is also a possible release.
As is typical, OpenAI released some of its safety testing details, which revealed interesting o1 behaviours. In tests designed to evaluate deceptive tendencies, it exhibited a striking ability to maintain deception through multiple rounds of questioning. Unlike competing models such as Claude 3 Opus and Llama 3, which admitted to deceptive actions about 80% of the time, o1 upheld its deception in over 80% of instances.
But despite its advancements and increasing willingness to deceive, o1 still struggles with tasks requiring highly abstract thinking and long-term planning, performing on par with Claude on several real-world reasoning tests. ARC prize creator and researcher François Chollet believes that current AI systems, while highly capable, remain far from achieving human-level intelligence.
However, OpenAI’s CEO, Sam Altman, continues to make bold predictions, suggesting that artificial general intelligence (AGI) may be nearer than we think. He and Microsoft are reportedly discussing the removal of the “AGI clause” in their partnership agreement. This clause currently limits Microsoft’s access to OpenAI’s most advanced technologies once AGI is achieved. If removed, the change would reflect a shift from viewing AGI as a sudden, transformative event to treating it as a more incremental process.
This proposed adjustment comes as OpenAI transitions from a research-driven organisation to a commercial powerhouse. Securing stable funding and robust business partnerships is essential for sustaining this shift, and granting Microsoft continuous access to OpenAI’s advancements could increase access to vital funds and compute.
AI is an increasingly expensive business, as the introduction of a $200 monthly ChatGPT Pro subscription highlights. ChatGPT Pro provides users with almost unlimited access to advanced tools, including significantly greater access to the GPT-4o and o1 models, and an exclusive ‘o1 pro mode’ with enhanced computing power for complex tasks. This new tier is designed for power users and researchers handling demanding work like advanced math, science, or programming problems. However, with a yearly cost of $2,400, this subscription highlights the risk of widening the gap between those who can afford premium AI features and those who cannot, raising questions about the future of equitable AI access.
Takeaways: OpenAI’s holiday campaign is a chance for it to answer its critics and address questions about an AI slowdown. The full o1 reasoning model and high-end ChatGPT Pro subscription are new concepts in the AI landscape, challenging us to make sense of near-AGI level capabilities, their value, and who will have access.
JOOST
A new AI czar
Donald Trump has selected Musk ally David Sacks to lead AI and cryptocurrency policy in his administration, introducing another Silicon Valley influence into government planning. Sacks is known for his support of web3 (and of a range of controversial figures), and his appointment points to a possible future of light-touch AI regulation and global realignment. It also suggests a potential shift towards prioritising rapid innovation: Sacks has previously backed cryptocurrency adoption and argued for reduced oversight of emerging technologies.
Meanwhile in the UK, Labour has outlined a different path. Their recent policy brief proposes tighter AI governance with clear rules on transparency and accountability. The party wants to protect workers affected by automation while ensuring AI systems remain fair and unbiased. This approach builds on, but differs from, earlier Conservative government strategies under Rishi Sunak that aimed to balance innovation with security concerns.
The contrast between these approaches matters for the global AI landscape. A deregulated US environment might speed up technical progress but could create challenges around bias and misinformation. The UK’s more structured approach might offer slower but steadier advancement with built-in safeguards.
These policy differences could reshape how AI develops worldwide. US companies might find fewer restrictions on innovation, albeit in a more unpredictable environment, while UK and European firms could face more oversight but potentially greater public trust and clearer frameworks for innovation.
Takeaways: As nations craft their AI strategies, we’re seeing two distinct paths emerge. The US appears headed towards lighter regulation and disruptive development, while the UK maintains a more measured approach. These choices will affect everything from how AI systems are built to how they’re deployed in society. For businesses and developers, understanding these different regulatory environments will become crucial for AI planning and deployment.
EXO
Meta’s Eco Llama
Meta has released Llama 3.3, a new open-source language model that packs the punch of their 405 billion parameter model into a smaller 70 billion parameter package. The key achievement? Running costs could drop by up to 24 times.
The model needed 39.3 million GPU hours on Nvidia H100 hardware to train, but Meta offset this with renewable energy to achieve net-zero emissions during training. It’s set to cost around $0.01 per million tokens – significantly less than many competitors. Performance hasn’t suffered from the downsizing. Llama 3.3 achieves 91.1% accuracy on multilingual reasoning tasks, supporting languages from German to Thai. It also features a 128,000 token context window – matching GPT-4o’s capacity to process about 400 pages of text at once.
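A quick back-of-envelope check of those figures in Python – the token price and context window are as quoted above, while the ~320 tokens-per-page figure is simply what the “400 pages” claim implies, not a number from Meta:

```python
# Back-of-envelope checks on the published Llama 3.3 figures.
# Assumption: "400 pages" refers to one full context window of text.

price_per_million_tokens = 0.01  # USD, quoted running cost for Llama 3.3
context_window = 128_000         # tokens, matching GPT-4o

# Tokens per page implied by the "400 pages" claim.
tokens_per_page = context_window / 400
print(tokens_per_page)  # 320.0

# Cost to process one full context window at the quoted price.
cost_full_context = context_window / 1_000_000 * price_per_million_tokens
print(f"${cost_full_context:.5f}")  # $0.00128
```

At that price, even a maximal 400-page prompt costs a fraction of a cent, which is the scale of saving behind the “up to 24 times” running-cost claim.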
Meta’s approach shows how AI companies are starting to tackle the field’s resource intensity. By shrinking model size while maintaining performance, they’re addressing both cost and environmental concerns – two factors that could shape how AI develops.
Takeaways: Meta’s latest release suggests efficient AI development doesn’t always mean compromise. As energy and computing costs influence AI strategy, expect more companies to focus on getting maximum performance from smaller models.
Weekly news roundup
This week saw major developments in AI infrastructure and governance, with significant moves in chip manufacturing, military AI applications, and new consumer-facing AI tools from major tech companies.
AI business news
- X’s Grok AI chatbot is now available to all users (Represents growing competition in the consumer AI chatbot space and X’s push to rival established players.)
- Copilot Vision, Microsoft’s AI tool that can read your screen, launches in preview (Shows how AI is becoming more deeply integrated into everyday computing tasks.)
- Citigroup rolls out artificial intelligence tools for employees in eight countries (Demonstrates the growing adoption of AI in traditional financial institutions.)
- Nine notable innovations from AWS CEO Matt Garman’s re:Invent keynote (Highlights AWS’s strategic direction in AI and cloud computing.)
- The race is on to make AI agents do your online shopping for you (Shows how AI is evolving to handle complex consumer tasks autonomously.)
AI governance news
- Musk seeks for-profit injunction against OpenAI (Highlights ongoing tensions between key AI industry figures and debates about AI commercialisation.)
- Inside Britain’s plan to save the world from runaway AI (Shows how nations are developing strategies to manage AI risks.)
- In Sam Altman we trust? (Examines the growing influence of key AI leaders in shaping the industry’s future.)
- OpenAI is working with Anduril to supply the US military with AI (Indicates growing military applications of AI and ethical considerations.)
- Meta says AI content made up less than 1% of election-related misinformation on its apps (Provides insight into AI’s role in social media disinformation.)
AI research news
- Genie 2: A large-scale foundation world model (Represents significant progress in world modelling capabilities.)
- Google says its new AI models can identify emotions — and that has experts worried (Raises important ethical questions about AI emotion recognition.)
- Auto-RAG: Autonomous retrieval-augmented generation for large language models (Advances in making LLMs more efficient and accurate.)
- Existential conversations with large language models: Content, community, and culture (Explores philosophical implications of AI development.)
- Beyond examples: High-level automated reasoning paradigm in in-context learning via MCTS (Shows progress in AI reasoning capabilities.)
AI hardware news
- Exclusive: TSMC in talks with Nvidia for AI chip production in Arizona, sources say (Indicates strategic shifts in AI chip manufacturing locations.)
- Amazon reveals next-gen AI silicon, turns Trainium2 loose (Shows Amazon’s commitment to developing proprietary AI hardware.)
- Biden administration hits out at China’s chip industry with export controls (Highlights geopolitical tensions in AI chip manufacturing.)
- Meta to seek 1-4GW of American nuclear power for AI (Shows growing energy demands of AI infrastructure.)
- Meta’s biggest-ever datacenter won’t be nuclear powered (Demonstrates scale of infrastructure needed for AI operations.)