Week 11 news

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week’s news demonstrates that large language models are increasingly breaking free from their chatbot constraints. Whilst we’re seeing demos rather than live products, the developments point to a diversification of impact and to the building blocks for general-purpose use and intelligence. General takeaway: imagine these use cases in six months, when we can ‘drop in’ new models that are massively faster, cheaper and more intelligent…

  • A coding AI called Devin, which can chat, research, write and run code, excites and unsettles the software development community

  • An impressive update from robotics firm Figure + OpenAI, presaging the rise of the large language robots, imbued with conversational features, vision understanding, and reasoning

  • AI doesn’t just want to be free in a physical sense; Musk promises to open-source xAI’s Grok and add to a growing array of free AI models, many of which you can download and run on a laptop

The key themes this week…

Meet Devin the AI coder

This week saw demos from a Peter Thiel-backed startup called Cognition AI. The company is run by several noted medal-winning coders, who have been training AI to perform complex real-world programming tasks. They call their automated ‘software engineer’ Devin, and it can build and debug a whole application from a single prompt. The system has four main components: a familiar chat window, a command window, a code editor and a browser. One of the demos shows Devin working out how to train its own AI models… this is a glimpse into a world where AI systems can build software on demand and improve or extend themselves. In the more immediate term, Devin, like systems from MultiOn, Magic.ai, Rabbit and others, shows that LLMs are becoming smart enough to use the browser and other tools to coordinate and execute actions, rather than just generating content and images. Say hello to ‘action models’.

Takeaways: Software development is ahead of most other job sectors in adopting AI. As such, it’s a window on the future AI disruption of ‘high-cognition’ jobs such as financial analysis, scientific research, healthcare, legal services and education. Across these domains we will also start to see technological job displacement and occupational polarisation. As AI is adopted, a growing divide develops between high-skill and low-skill roles (in the case of software, advanced AI development or strategic innovation versus basic process documentation or data gathering and labelling), while middle-skill roles rapidly diminish. AI job displacement across these domains will trigger new dynamics in workforce demand, economic opportunity, and social and political life, but will also increase the power malicious actors can wield.

The rise of the large language robots

Recently we reported that Figure AI raised $675 million at a $2.6 billion valuation from investors including Jeff Bezos, Nvidia, Microsoft, Intel and… OpenAI. This week they demonstrated the progress made on integrating OpenAI’s models into their humanoid machines, with a video of the robot conversing with a person, dextrously picking up items and dealing with… some washing-up. Figure have a bold vision for robots: replacing all manual labour and, like Devin, manufacturing more of themselves.

Whilst AI is disrupting the virtual world, it has also been behind a dramatic acceleration in ‘embodied’ robotic capabilities in 2023-24. We’ve all seen the Boston Dynamics videos of acrobatic androids, but LLMs are radically improving fine motor skills, detailed vision capability, memory and human interactions. Figure 01 and other robots from the likes of Tesla, Agility and 1X use multiple AI models to plan ahead, decode sensor inputs and control complex actions in parallel.

Takeaways: With BMW starting a factory trial of Figure 01s (supposedly ~$100k per unit), and Goldman Sachs suggesting there will be 250,000 humanoid robots joining our workforce every year by the end of the decade, the impact AI will have in this most physical form will be significant. What is less well known is that simulators, or game-like worlds, are providing the primary training ground for these new robotic developments. From DeepMind’s AlphaGo and OpenAI winning at Dota esports to F1 drivers training in the sim, virtual worlds are where new solutions can be perfected. AI simulators will be ever more important in devising, researching, training and testing new solutions of any kind; this will be a capability that every business will need to understand and exploit to stay competitive.

AI wants to be free (of charge)

Fresh from launching rockets and filing his lawsuit against OpenAI (and promising to drop it if they change their name to “ClosedAI”), Elon Musk this week stated he would open up his Grok AI to the world. No details were available as we posted this, but it is likely to be billed as a ‘truth-seeking’ AI and as such may come with open training data and code.

This will add to the hundreds of thousands of AI models now available to download from community websites and runnable on your own laptop or desktop. As the state-of-the-art (SOTA) systems have been getting bigger, many much more compact models have become available for a range of specific uses: writing, coding, chatting, image generation etc.

Takeaways: Look out for the term ‘open source’ in the context of AI models; it’s mostly used incorrectly. Models that are available for download, such as Meta’s Llama 2 or those from Mistral, are ‘open-weight’, meaning the file containing the model can be copied and run by anyone. But… this is not open-source software. The code and data used to train them are rarely shared, and the ‘weights’ or patterns encoded into their neural networks are a meaningless mass of numbers to the human eye.

Nonetheless, with a medium-power laptop or desktop (and if you have a gaming GPU, all the better) you can download apps like LM Studio. Once it is installed, search for models, which you can run for just the cost of the electricity. LM Studio will find models that are popular and indicate which size (quantisation) of model will run on your machine. Try searching for Google’s Gemma 2B (a ~2 billion parameter model) and get a capable AI assistant that will run in 8GB of local RAM, without the need for an internet connection or subscription fee. Look for LLM360’s models for true open source, with everything from training data to underlying code shared with the community. The future is shaping up to see AIs of many sizes and skillsets running on every device, from routers, washing machines, phones and wearables to robots, drones and autonomous vehicles. This won’t be without its risks, as this analysis of open-weight foundation models indicates.
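
For a rough feel of why a ~2 billion parameter model fits comfortably in 8GB of RAM, and why quantisation matters, here is a back-of-the-envelope sketch. It assumes the weights dominate memory use and adds a flat 20% allowance for everything else; that overhead figure and the function names are our own illustrative assumptions, not how LM Studio actually makes its recommendations.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_ram(params_billions: float, bits_per_weight: int,
                ram_gb: float, overhead: float = 0.2) -> bool:
    """True if the weights plus an assumed overhead fit in the given RAM.

    The 20% default overhead is a hypothetical allowance for context and
    runtime buffers, not a measured value.
    """
    needed_gb = estimate_weights_gb(params_billions, bits_per_weight) * (1 + overhead)
    return needed_gb <= ram_gb


# Compare common quantisation levels for a ~2 billion parameter model.
for bits in (16, 8, 4):
    size = estimate_weights_gb(2, bits)
    print(f"{bits}-bit: ~{size:.1f} GB of weights, "
          f"fits in 8 GB: {fits_in_ram(2, bits, 8)}")
```

On these assumptions a 2 billion parameter model needs roughly 4 GB at 16-bit precision and about 1 GB at 4-bit, whereas a 7 billion parameter model at 16-bit would not fit in 8GB at all, which is where the smaller quantised versions come in.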

ExoBrain x Gemini 1.5

On a final note, this week we got access to the pre-release version of Google Gemini 1.5, with its ability to process 1 million tokens. We were able to use this to speed up our news gathering process, for example. We took a week’s worth of tech news (articles, social media posts, research papers and forum conversations, totalling some ~500,000 words) and Gemini 1.5 was able to read all of this in just over 90 seconds and then select, analyse and expand on all the key themes and stories. Gemini 1.5 is still very much pre-release, with a number of bugs, but the potential here for research automation, enterprise search, formulating content, decision making and prediction based on large volumes of unstructured content (with minimal preparation) is huge.
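
As a sanity check on those numbers, the sketch below converts words to tokens and works out the implied reading speed. The ratio of about four tokens for every three English words is a common rule of thumb and an assumption on our part; Gemini’s own tokeniser may give a different count, so treat the output as an estimate only.

```python
# Assumed rule of thumb: roughly 4 tokens per 3 English words.
TOKENS_PER_WORD = 4 / 3


def estimate_tokens(word_count: int, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Rough token estimate for a given number of words."""
    return round(word_count * tokens_per_word)


words = 500_000             # approximate size of the week's news corpus
context_window = 1_000_000  # Gemini 1.5 token limit quoted above
seconds = 90                # approximate processing time quoted above

tokens = estimate_tokens(words)
print(f"Estimated tokens: {tokens:,}")
print(f"Share of context window used: {tokens / context_window:.0%}")
print(f"Implied reading speed: ~{words / seconds:,.0f} words per second")
```

On that estimate the corpus uses around two thirds of the 1 million token window, leaving headroom for the instructions and the model’s response.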


EXO

This week’s AI news roundup highlights the rapid diversification of AI beyond chatbots, with advancements in enterprise software, 3D environments, and multimodal understanding. The growing power of AI is being fuelled by innovations in hardware, infrastructure, and algorithms, as well as increasing investments and competition in the industry.

AI business news

AI governance news

AI research news

AI hardware news

