Week 11 news

March 15, 2024

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week’s news demonstrates that large language models are increasingly breaking free from their chatbot constraints. Whilst we’re seeing demos rather than live products, the developments point to a diversification of impact and to the building blocks for general-purpose use and intelligence. General takeaway: imagine these use cases in six months, when we can ‘drop in’ new models that are massively faster, cheaper and more intelligent…

A coding AI called Devin, which can chat, research, write and run code, excites and unsettles the software development community
An impressive update from robotics firm Figure + OpenAI, presaging the rise of the large language robots, imbued with conversational features, vision understanding, and reasoning
AI doesn’t just want to be free in a physical sense; Musk promises to open-source xAI’s Grok and add to a growing array of free AI models, many of which you can download and run on a laptop

The key themes this week…

Meet Devin the AI coder

This week saw demos from a Peter Thiel-backed startup called Cognition AI. The company is run by several noted medal-winning coders, who have been training AI to perform complex real-world programming tasks. They call their automated ‘software engineer’ Devin, and it can build and debug a whole application from a single prompt. The system has four main components: a familiar chat window, plus a command window, a code editor and a browser. One of the demos shows Devin working out how to train its own AI models… this is a glimpse into a world where AI systems can build software on demand and improve or extend themselves. In the more immediate term, Devin, like systems from MultiOn, Magic.ai, Rabbit and others, shows that LLMs are becoming smart enough to use the browser and other tools to coordinate and execute actions, beyond just generating content and images. Say hello to ‘action models’.

Takeaways: Software development is ahead of most other job sectors in adopting AI. As such, it’s a window on the future AI disruption in ‘high-cognition’ jobs such as financial analysis, scientific research, healthcare, legal services and education. Across these domains we will also start to see technological job displacement and occupational polarisation. As AI is adopted, a growing divide develops between high-skill and low-skill roles (in the case of software, advanced AI development or strategic innovation versus basic process documentation or data gathering and labelling), while middle-skill roles rapidly diminish. AI job displacement across these domains will trigger new dynamics in workforce demand, economic opportunity, and social and political life, but will also increase the power malicious actors can wield.

The rise of the large language robots

Recently we reported that Figure AI raised $675 million at a $2.6 billion valuation from investors including Jeff Bezos, Nvidia, Microsoft, Intel and… OpenAI. This week they demonstrated the progress made on integrating OpenAI’s models into their humanoid machines, with this video of the robot conversing with a person, dextrously picking up items and dealing with… some washing-up. Figure have a bold vision for robots: replacing all manual labour and, like Devin, manufacturing more of themselves.

Whilst AI is disrupting the virtual world, it has also been behind a dramatic acceleration in ‘embodied’ robotic capabilities in 2023-24. We’ve all seen the Boston Dynamics videos of acrobatic androids, but LLMs are radically improving fine motor skills, detailed vision capability, memory and human interactions. Figure 01 and other robots from the likes of Tesla, Agility and 1X use multiple AI models to plan ahead, decode sensor inputs and control complex actions in parallel.

Takeaways: With BMW starting a factory trial of Figure 01s (supposedly they are ~$100k per unit), and Goldman Sachs suggesting there will be 250,000 humanoid robots joining our workforce every year by the end of the decade, the impact AI will have in this most physical form will be significant. What is less well known is that simulators, or game-like worlds, are providing the primary training ground for these new robotic developments. From DeepMind’s AlphaGo and OpenAI winning at Dota esports to F1 drivers training in the sim, virtual worlds are where new solutions can be perfected. AI simulators will be ever more important in devising, researching, training and testing new solutions of any kind; this will be a capability that every business will need to understand and exploit to stay competitive.

AI wants to be free (of charge)

Fresh from launching rockets and filing his lawsuit against OpenAI (and promising to drop it if they change their name to “ClosedAI”), Elon Musk this week stated he would open up his Grok AI to the world. No details were available as we posted this, but it is likely to be billed as a ‘truth-seeking’ AI and as such may come with open training data and code.

This will add to the hundreds of thousands of AI models now available to download from community websites and runnable on your own laptop or desktop. While the state-of-the-art (SOTA) systems have been getting bigger, many much more compact models are now available for a range of specific uses: writing, coding, chatting, image generation and so on.

Takeaways: Look out for the term open source in the context of AI models; it’s mostly used incorrectly. Models that are available for download, such as Meta’s Llama 2 or those from Mistral, are ‘open-weight’, meaning the file containing the model can be copied and run by anyone. But… this is not open-source software. The code and data used to train them are rarely shared, and the ‘weights’ or patterns encoded into their neural networks are a meaningless mass of numbers to the human eye.

Nonetheless, with a medium-powered laptop or desktop (and if you have a gaming GPU, all the better) you can download apps like LM Studio. Once installed, search for models, which you can run for just the cost of the electricity. LM Studio will find models that are popular and indicate which size (quantisation) of model will run on your machine. Try searching for Google’s Gemma 2B (a ~2 billion parameter model) and get a capable AI assistant that will run in 8GB of local RAM, without the need for an internet connection or subscription fee. Look for LLM360’s models for true open source, with everything from training data to underlying code shared with the community. The future is shaping up to see AIs of many sizes and skillsets running on every device from routers, washing machines, phones and wearables to robots, drones and autonomous vehicles. This won’t be without its risks, as this analysis of open-weight foundation models indicates.
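To make the link between quantisation and RAM concrete, here is a rough back-of-the-envelope sketch in Python. It is not how LM Studio makes its recommendation; the function names and the 30% runtime overhead are our own illustrative assumptions, and real memory use also depends on context length and the app you run.

```python
def weights_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_ram(params_billions: float, bits_per_weight: int,
                ram_gb: float, overhead: float = 0.3) -> bool:
    """True if weights plus an ASSUMED runtime overhead fit in the given RAM.

    The 30% default overhead is a placeholder for working memory,
    not a measured figure.
    """
    needed_gb = weights_size_gb(params_billions, bits_per_weight) * (1 + overhead)
    return needed_gb <= ram_gb


if __name__ == "__main__":
    # A ~2 billion parameter model at common precisions, on an 8GB machine.
    for bits in (16, 8, 4):
        size = weights_size_gb(2, bits)
        print(f"2B parameters at {bits}-bit: ~{size:.1f} GB of weights, "
              f"fits in 8 GB: {fits_in_ram(2, bits, 8)}")
```

Under these assumptions a ~2 billion parameter model fits in 8GB even at 16-bit precision, while a 7 billion parameter model would need a lower-bit quantisation to fit.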

ExoBrain x Gemini 1.5

On a final note, this week we got access to the pre-release version of Google Gemini 1.5, with its ability to process 1 million tokens. We were able to use this to speed up our news gathering process, for example. We took a week’s worth of tech news (articles, social media posts, research papers and forum conversations totalling some ~500,000 words), and Gemini 1.5 was able to read all of this in just over 90 seconds and then select, analyse and expand on all the key themes and stories. Gemini 1.5 is still very much pre-release with a number of bugs, but the potential here for research automation, enterprise search, formulating content, decision making and prediction based on large volumes of unstructured content (with minimal preparation) is huge.
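For a sense of scale, the figures above can be turned into a quick estimate. This is a sketch only: the 1.3 tokens-per-word ratio is a common rule of thumb for English text and is our assumption, not a figure from Google, and the real count depends on the model’s tokeniser.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # context size quoted in the post
WORDS = 500_000                    # approximate size of the week's news
READ_SECONDS = 90                  # "just over 90 seconds"
TOKENS_PER_WORD = 1.3              # ASSUMPTION: rough rule of thumb for English


def estimate_tokens(words: int, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Rough token count for a given number of words."""
    return round(words * tokens_per_word)


def context_share(tokens: int, window: int = CONTEXT_WINDOW_TOKENS) -> float:
    """Fraction of the context window the tokens would occupy."""
    return tokens / window


if __name__ == "__main__":
    tokens = estimate_tokens(WORDS)
    print(f"Estimated tokens: {tokens:,}")
    print(f"Share of the context window: {context_share(tokens):.0%}")
    print(f"Reading speed: ~{WORDS / READ_SECONDS:,.0f} words per second")
```

On that assumption, a week of news sits comfortably inside a single 1 million token prompt, which is why no chunking or other preparation was needed.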

EXO

This week’s AI news roundup highlights the rapid diversification of AI beyond chatbots, with advancements in enterprise software, 3D environments, and multimodal understanding. The growing power of AI is being fuelled by innovations in hardware, infrastructure, and algorithms, as well as increasing investments and competition in the industry.

AI business news

Oracle Adds 50 New Generative AI Capabilities to Oracle Fusion Cloud Applications Suite (This development showcases the growing integration of generative AI into enterprise software suites.)
OpenAI announces new board members, reinstates CEO Sam Altman (This leadership change at OpenAI is significant given the company’s influential role in the AI landscape.)
Anthropic releases Claude 3 Haiku, an AI model built for speed and affordability (This new model from Anthropic highlights the ongoing efforts to make AI more efficient.)
Cohere releases powerful ‘Command-R’ language model for enterprise use (This release underscores the growing trend of specialized AI models for business applications, akin to the SaulLM-7B model for the legal industry that we covered earlier.)
Microsoft unveils Copilot GPT Builder for Pro subscribers (This new offering from Microsoft further demonstrates the company’s push to integrate AI into its products and services, building on their partnership with OpenAI and the development of AI-powered tools.)

AI governance news

What will the EU’s proposed act to regulate AI mean for consumers? (This legislation highlights the increasing focus on AI governance and the potential impact on end-users, echoing the concerns raised in previous stories about the need for robust regulatory frameworks.)
Drone Swarms Are About to Change the Balance of Military Power (This article underscores the transformative potential of AI in the military domain and the need for international cooperation and governance to address the associated risks and challenges.)
Stealing Part of a Production Language Model (This story further emphasizes the importance of intellectual property protection and security in the AI industry, as we saw with the case of the Chinese national charged with stealing AI secrets from Google.)
AI Extinction-Level Threat Discussed in U.S. Report (This report reflects the growing concern among policymakers about the existential risks posed by advanced AI systems and the need for proactive measures to ensure their safe and responsible development, a theme that has emerged in several of our previous discussions.)
NYT responds to OpenAI’s allegations of ChatGPT manipulation (This development adds another layer to the ongoing debate around IP.)

AI research news

SIMA generalist AI agent for 3D virtual environments (This research advances the development of AI agents capable of interacting with and understanding 3D environments, building on the progress made in areas like video understanding and generation, as exemplified by models like Sora.)
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (This update to Google’s Gemini model pushes the boundaries of AI’s ability to process and understand large amounts of multimodal data, a key theme in several of the research papers we’ve discussed, such as LongRoPE.)
OpenAI Transformer Debugger (This tool could potentially help researchers and developers better understand and optimize transformer-based AI models, which have been at the forefront of many of the breakthroughs we’ve covered, including GPT-4 and Claude.)
GaLore: A Memory-Efficient Strategy for Training Large Language Models on Consumer Hardware (This research addresses the computational challenges associated with training large language models, a recurring theme in our discussions about the need for advanced hardware and more efficient algorithms to support the continued growth of AI.)
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation (This paper explores techniques for improving AI’s ability to engage in context-aware reasoning and generate coherent long-form content, building on the progress made by models like Gemini 1.5.)

AI hardware news

Cerebras Unveils World’s Fastest AI Chip (This development highlights the ongoing competition and innovation in the AI hardware space, as companies strive to create faster and more powerful chips to support the growing demands of AI workloads.)
‘Another startup that will cause gaming GPU prices to spike’: AI firm claims Radeon RX 7900 XTX GPUs are better value than Nvidia’s H100 — nearly six hundred backers believe that is the case (This story underscores the increasing demand for high-performance GPUs in the AI industry and the potential impact on the gaming market, a trend that has been evident in the recent success of companies like Nvidia.)
Korean researchers power-shame Nvidia with new neural AI chip — claim 625 times less power draw, 41 times smaller (This research showcases the ongoing efforts to develop more energy-efficient AI hardware, a crucial consideration given the growing environmental impact of AI and the need for sustainable solutions, as highlighted by Amazon’s acquisition of a data centre powered by a nuclear power station.)
Truffle-1 is an AI inference engine designed to run opensource models at home, on 60 Watts. (This new AI hardware offering adds to the diverse range of solutions being developed to support the growing AI industry, alongside players like Groq, Taalas, and Nvidia.)
Building Meta’s Giant AI Infrastructure (This article provides insights into the infrastructure challenges and solutions associated with deploying large-scale generative AI models, a key focus for major tech companies like Meta as they seek to harness the power of AI across their products and services.)