Week 37 news

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we look at:

  • OpenAI’s new o1 models that introduce “reasoning tokens” for enhanced problem-solving.
  • How Klarna’s bold move may be a first sign of SaaS disruption.
  • AI-generated content and cats in the 2024 US elections.

o1 and the age of reason

Huge news this week: OpenAI has unveiled its next generation of AI model, a moment many have been waiting for and one heavily trailed in “strawberry”-themed social media activity from employees and fans alike in recent weeks. Rather confusingly named “o1”, it sits on a new scale of progress, and according to OpenAI we’re now at level 1 (they had previously talked about five levels, the highest being AI systems able to operate single-handedly as entire organisations). It comes in “mini” and “preview” forms, and these models have been trained and built to “think before they speak”… this is the advanced reasoning technology that was previously codenamed after a certain red summer fruit.

The o1 models introduce a novel concept of “reasoning tokens” – internal steps the model uses to break a problem down and consider multiple approaches before generating a visible response. You can see them flashing in the corner as the model thinks before responding. The benchmarks look strong, with unprecedented performance across mathematics, coding, and scientific understanding. o1-preview scores 83% on International Mathematics Olympiad qualifying exam questions compared to GPT-4’s 13%. Perhaps most strikingly, both o1 models outperformed expert humans on PhD-level science questions in the GPQA Diamond benchmark. But benchmarks will always show these models at their best; what will be fascinating to see is how well they reason in the real and complex world.

The hope is they can make a material impact on scientific research, complex decision-making, and advanced problem-solving across industries. OpenAI says these models have shown reduced propensity to make things up (because they think about their answers), and improved adherence to safety guidelines, plus better performance on security tests. But they’ve also demonstrated concerning capabilities in areas like persuasion and biological threat creation, which OpenAI has classified as “medium risk.”

An incident during testing highlighted both the impressive problem-solving abilities and potential risks of these advanced models. When faced with a broken cybersecurity challenge, an o1 model creatively bypassed the intended solution, exploiting the testing environment in unexpected ways. While no security breach occurred due to proper isolation measures, this “wake-up call” underscores the need for robust testing and careful deployment strategies.

Currently in beta, the o1 models have limited features and access. They’re available through the API but lack support for images, tool use, streaming, and many of the parameters available in previous models. OpenAI plans to expand these features and increase rate limits in the coming weeks.
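For those wanting to experiment, a minimal sketch of a call via the OpenAI Python SDK is shown below. The model name and the reasoning-token usage field reflect the beta API as we understand it at launch and may well change, so treat this as illustrative rather than definitive.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# In the beta, o1 models accept a plain chat request, but not images, tools,
# streaming, or sampling parameters such as temperature.
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {"role": "user", "content": "How many prime numbers lie between 100 and 150?"}
    ],
)

print(response.choices[0].message.content)

# The hidden "reasoning tokens" are billed as output tokens and reported in
# the usage details, even though their content is never returned.
usage = response.usage
print("completion tokens:", usage.completion_tokens)
print("reasoning tokens:", usage.completion_tokens_details.reasoning_tokens)
```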

Pricing for the new models reflects their enhanced capabilities, with o1-preview positioned as a premium option at about $28 per 1M tokens, significantly higher than competitors like Anthropic’s Claude 3.5 Sonnet at $6 per 1M tokens (but less than Claude 3 Opus). This is aggressive pricing, but it also suggests the models must be reasonably efficient to serve, and that OpenAI continues to improve its bang per GPU buck.
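To show where a blended figure like this could come from, here is a small sketch in Python. The per-million-token list prices are the launch rates as we understand them, and the 3:1 input-to-output mix is purely our assumption, chosen to illustrate the arithmetic; note too that o1’s hidden reasoning tokens are billed as output tokens, so real-world blended costs will tend to sit above a simple split like this.

```python
# Illustrative blended price per 1M tokens, assuming a 3:1 input:output mix.
# List prices (USD per 1M tokens) are the launch rates as we understand them;
# the mix ratio is our assumption, used only to show how a single blended
# figure (roughly $28 vs $6) can be derived from separate input/output rates.
PRICES = {
    "o1-preview":        {"input": 15.00, "output": 60.00},
    "claude-3.5-sonnet": {"input": 3.00,  "output": 15.00},
}

def blended_price(model: str, input_share: float = 0.75) -> float:
    """Weighted average cost per 1M tokens for a given input/output split."""
    p = PRICES[model]
    return input_share * p["input"] + (1 - input_share) * p["output"]

for model in PRICES:
    print(f"{model}: ~${blended_price(model):.2f} per 1M tokens")
# o1-preview:        ~$26.25 per 1M tokens
# claude-3.5-sonnet: ~$6.00 per 1M tokens
```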

Takeaways: One nugget from the launch material was data on time spent thinking versus performance. The graph suggested there is more to come from using computing power while the model is analysing a problem, not just when it is being trained. METR’s analysis of the model shows that it can carry out complex technical tasks, in some cases more effectively than a human. It’s clear that o1 models are super “smart”, perhaps PhD level in some areas, but that has never meant a guarantee of super effectiveness. The question and race will now be to see how well we can match this model and its strengths to real-world opportunities, how much capability is possible with suitable instruction, and ultimately how well we can integrate it with the world to take action. These are fascinating times.

JOOST

The end of SaaS as we know it

We have written about Klarna, the Swedish fintech giant, multiple times over the last few weeks, especially about its bold move to replace 50% of its workforce through the smart use of AI. One news item that was easily missed is that Klarna recently announced its decision to stop using Salesforce as a service provider. What’s more, Workday, another major SaaS platform, is set to meet the same fate. This bold step by Klarna isn’t just a cost-cutting measure; it’s a harbinger of a seismic shift in the world of software as a service (SaaS) and artificial intelligence (AI).

Klarna’s decision to replace these established SaaS solutions with in-house AI alternatives marks a pivotal moment in the ongoing narrative of AI’s impact on the software industry. At ExoBrain, we see three distinct phases in the disruption of the traditional SaaS model by AI.

The first phase – and currently well underway – is AI-enhanced SaaS. AI acts as a powerful ally to existing SaaS solutions. Companies like Quest Labs have been at the forefront, leveraging AI to supercharge their SaaS offerings. These enhancements have ranged from more intuitive user interfaces and personalized experiences to advanced predictive analytics and automated customer support. It’s a world where AI complements and extends the capabilities of traditional SaaS, driving unprecedented growth and efficiency.

But Klarna’s recent move signals the opening of a new, more disruptive phase. This is the era of SaaS replacement, where AI doesn’t just enhance existing software—it replaces it entirely. Klarna’s decision to develop in-house AI solutions in lieu of established SaaS products like Salesforce and Workday is a prime example of this trend. It’s a bold bet on the power of AI to deliver more customized, efficient, and cost-effective solutions than off-the-shelf SaaS products. As AI tools become more sophisticated and accessible, more companies may follow Klarna’s lead, opting for bespoke AI solutions over traditional SaaS products. This trend poses a significant challenge to the conventional SaaS business model and could reshape the entire industry.

But the story doesn’t end there. As we peer into the future, we can glimpse the outlines of a third stage—one that could redefine the very concept of software itself. In this not-unlikely future, dramatic increases in computing power will enable AI to generate user-friendly interfaces on the fly, interacting directly with data in ways that bypass traditional application structures entirely. While this final chapter remains theoretical for now, early signs suggest we’re moving in this direction. The implications are profound: not only could traditional SaaS applications become obsolete, but the entire paradigm of software development and use could be transformed.

As this story unfolds, it’s clear that the relationship between AI and SaaS is far from simple. While AI-enhanced SaaS products continue to thrive in the present, forward-thinking companies like Klarna are already writing the next chapter. And beyond that lies a future where the very nature of software may be redefined.

Takeaways: For SaaS companies, adapting to these changes will be crucial for survival. For businesses relying on SaaS solutions, staying informed about these trends will be essential for making strategic technology decisions. As we continue to watch this story develop, one thing is certain: the intersection of AI and SaaS will remain a focal point of innovation and disruption in the tech world for years to come. In the end, Klarna’s decision may be remembered not just as a bold move by a single company, but as a turning point in the broader narrative of AI’s impact on software.

From cat to election memes

This week it is REALLY hard not to write about cat memes, with claims that immigrants are eating pet cats sweeping the US campaign. And with that, it is hard not to say ‘told you so!’. It was just a few weeks ago that ExoBrain presented the world with a marvel of a cat meme created using Flux.1 (integrated with Grok, and thus X). As that very cat image foreshadowed, AI has become a central player in the 2024 election cycle, with implications that extend far beyond November.

So let’s take a little look at the landscape. Who uses AI in the 2024 US elections?

The Russians use it: US intelligence reports suggest that Moscow is leveraging advanced AI techniques to boost Trump’s chances in the 2024 race. Unlike previous election cycles, these efforts are more refined, integrating “authentic American voices” and using AI to generate content rapidly and convincingly. The focus on swing states and the use of influence firms adds layers of complexity to this digital manipulation.

The candidates (likely) use it: While it’s unclear to what extent the Republicans and Democrats themselves are using AI, its presence is undeniable. From deepfake videos to AI-generated campaign materials in so-called MAGA land, the line between reality and digital fabrication is blurring. The potential for personalized propaganda tailored to individual voters’ psychological profiles raises alarming ethical questions. Much of this content may well be created by enthusiastic political followers, which is plausible given how polarised US politics has become, but it is hard not to imagine active engagement from the campaigns themselves.

Musk uses it: Elon Musk’s AI chatbot, Grok, integrated with X (formerly Twitter), has already stirred controversy. Its “anti-woke” stance and tendency to give “spicy” answers have led to the spread of misinformation about election procedures. The swift response from election officials highlights the ongoing battle between AI-generated content and factual information.

Taylor uses it (as an argument):  In a surprising turn of events, pop icon Taylor Swift cited fears around AI as a key factor in her endorsement of Kamala Harris. Swift’s concern over AI-generated deepfakes falsely depicting her endorsement of Trump underscores the technology’s potential to manipulate public opinion on a massive scale.

Beyond November: The Long-Term Impact? The influence of AI on elections won’t end with the final vote count. Experts warn that AI could become a convenient scapegoat for whichever candidate loses, potentially fuelling further distrust in the democratic process. The technology’s ability to generate convincing fake content may lead to prolonged disputes over election results and deepen existing political divides, not to mention eroding public trust.

Takeaways: As we navigate this new era of AI-influenced elections, several key questions emerge:

  1. How can we maintain the integrity of democratic processes in the face of increasingly sophisticated AI manipulation?
  2. What role should tech companies play in regulating AI-generated content on their platforms?
  3. How can voters be educated to critically evaluate the information they encounter online?
  4. What legal and ethical frameworks need to be developed to address the use of AI in political campaigns?

The convergence of AI and politics is no longer a distant future scenario – it’s our present reality. From cat memes to election memes, AI’s influence continues to grow, challenging our perceptions and testing the resilience of our democratic institutions. As we move forward, vigilance, education, and adaptive regulation will be key to ensuring that AI enhances rather than undermines our political processes.

At ExoBrain, we remain committed to monitoring these developments and providing insights to help navigate this complex landscape. The future may be AI-driven, but it’s up to us to steer it in the right direction.

EXO

Weekly news roundup

This week’s news highlights significant investments in AI companies, advancements in enterprise AI solutions, ongoing regulatory discussions, and innovative research in language models and hardware developments for AI applications.

AI business news

AI governance news

AI research news

AI hardware news
