Week 26 news

June 28, 2024

Welcome to our weekly news post, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our AI platform Exo…

Themes this week

JOEL

This week we look at:

Anthropic’s Claude 3.5 Sonnet release and new productivity features.
The World Fair and rise of the AI engineer.
Figma’s new design features from this week’s Figma Config 24.

Anthropic’s new model and features

Anthropic surprised many with a new model launch last week, releasing a version upgrade to their Claude family just three months after their last big update. They launched Claude 3.5 in ‘Sonnet’ form (supposedly a mid-sized model), and a week in, the industry and user response has been universally positive. The model is not only superior to all the other Claude variants, but for many has surpassed OpenAI’s GPT-4o to become the most capable AI model available.

Anthropic, despite being seen as the big AI lab with the biggest focus on safety and cautious progress, now appear to be pushing the frontier forward faster than anyone. Dario Amodei, Anthropic co-founder, told VentureBeat: “Claude 3.5 Sonnet is now the most capable, smartest, and cheapest model available on the market today.” It’s twice as fast as Claude 3 Opus at a fifth of the cost.

The model is particularly strong at coding and vision understanding, and we at ExoBrain can vouch for its superhuman coding skills and instant responses. The speed of feedback loops and the scope of what’s possible have dramatically increased with the new version, even if the sophistication of thought remains nearer to Claude 3 Opus levels. Social media has been full of entire games, such as working versions of Doom with auto-generated levels, being developed from a single prompt. There has been much speculation on how Anthropic has been able to pull this off. They have likely benefitted from the scale of compute their backer Amazon can provide, increasing the size of the model while also improving efficiency and capability with ever more carefully curated, synthetically generated data.

Whilst OpenAI have had to backtrack on plans to release their controversial voice mode, Anthropic have been busy designing new ways to interact. These centre around two interesting new concepts: ‘artifacts’ and ‘projects’. Both feel much more intuitive than the chat thread has felt to date when working on common business tasks. Artifacts get created when you work with the model to write code or a document, for example. Instead of having numerous steps in a conversation with snippets and versions of the collaborative work, the artifact window pops up and allows the changes to be reflected there as you go. This feels much more organised. OpenAI launched custom GPTs last year, but they have failed to catch on. Projects is Anthropic’s version of this, and on early testing it feels like a more natural approach. A Claude project can have multiple chats, along with special instructions and custom uploaded ‘knowledge’, and the artifacts being worked on in that project can be shared across threads. The approach is not yet perfect; we found in our testing that the model would often forget custom instructions. Early bugs aside, this user experience is going in a very positive direction.

Finally, Anthropic have also broken new ground with a beta of their Steering API. This offers a glimpse into the future of AI manipulation by allowing developers to influence the internal features of the language model (much like they demonstrated with their Golden Gate Claude experiment, forcing a version of Claude 3 to become entirely obsessed with the bridge). This opens up new possibilities for customisation and fine-tuning of AI outputs, and could lead to highly specialised AI assistants tailored for specific industries or tasks.

Takeaways: With Claude 3.5, projects and artifacts, we believe Anthropic have the strongest set of features on the market today. They’re available on the Pro plan and we would highly recommend you explore this option. With these new features and plans to release new models every few months, OpenAI, Microsoft and Google should be worried.

AI Engineer World Fair

The AI Engineer World Fair in San Francisco this week showcased the growth of the community of developers focused on building AI-powered products, with attendance quadrupling to 2,000 since the first such event in October last year. Shawn Wang (otherwise known as Swyx), host of the Latent Space podcast, conference organiser and author of the influential essay “Rise of the AI Engineer”, emphasised the dramatic acceleration in this new role. “A wide range of AI tasks that used to take five years and a research team to accomplish, now just require a spare afternoon,” Wang explained. This shift underscores the increasing accessibility of AI technology and the role of AI engineers in quickly translating model capabilities into products. The AI Engineer role is positioned as a link between the more research-oriented machine learning and data science roles and the more product-oriented software engineering roles. AI Engineers work primarily with existing models and APIs to create practical applications, rather than developing new ML models from scratch.

Friend of ExoBrain, AI Engineer, and agent expert Eddie Forson shares the following insights from the conference floor, highlighting the energy and excitement while noting the sense of a field still in its infancy. Eddie observed that AI agents are still unreliable, with “agents on rails” (predetermined workflows) being the safer option over unpredictable dynamic configurations. His takeaways from the conference include the critical importance of evaluation frameworks and quality assurance, and of building for the future: “Models are getting better fast. You should build with the future in mind. Imagine what you will be able to accomplish with better models in 3-6-12+ months, not now”.

Labs, startups and big tech were all present, demoing and launching a range of new features across the main tracks of RAG & LLM frameworks, open models, AI leadership, code generation and dev tools, AI in the Fortune 500, multimodality, evaluation, ops, GPUs and inference, and of course agents. Speakers showed a lot of interest in, and offered many ideas on, the potential for agents to transform workflows. From enhancing productivity in traditional industries to creating entirely new categories of products and services, the applications remain tantalising. The event also highlighted the growing ecosystem of tools and platforms designed to make AI development more efficient and accessible. From advanced evaluation testing to specialised cloud infrastructure, these innovations are enabling AI engineers to build and deploy solutions faster than ever before.

Takeaways: The AI Engineer World Fair provides concentrated access to a field evolving at breakneck speed. The videos on YouTube are worth your time; some are quite technical, others philosophical, but they all describe the components, trends and ideas that will make the next phase of AI products and development possible.

EXO

Figma’s new AI features

Following various AI design announcements from Adobe and Canva, this week Figma unveiled a suite of AI-powered tools at Figma Config 24 aimed at revolutionising design, presentations and product development workflows. Figma’s new AI features, currently in limited beta, promise to generate design drafts from text prompts, facilitate visual searches across team files, automate tedious tasks, and even create working prototypes from static mockups. “In a world where more software is being created and reimagined because of AI, designing and building products is everyone’s business,” said Dylan Field, Figma’s co-founder and CEO.

This development comes at a time when the integration of AI into creative workflows is rapidly accelerating. Adobe (Figma’s one-time suitor until the deal fell through) recently faced backlash over concerns about user data being used to train its Firefly AI models. In contrast, Figma has emphasised that its AI features use third-party models, and that no private customer data was used in training.

The introduction of Figma AI, alongside new tools like Figma Slides and developer-focused features, signals a broader trend of design platforms evolving into comprehensive product development ecosystems. For individual users, these AI-powered tools offer the promise of enhanced creativity and productivity. The ability to generate design drafts from text prompts or quickly prototype ideas could lower the barrier to entry for aspiring designers and entrepreneurs.

Takeaways: Looking ahead, the integration of AI into design tools is likely to accelerate the convergence of design and development processes. Figma’s new “Ready for Dev” view and Code Connect feature hint at a future where the handoff between design and implementation becomes increasingly seamless. This could lead to more rapid product development cycles but may also necessitate new approaches to project management and quality assurance.

EXO

Weekly news roundup

This week’s news highlights the rapid advancements in AI technology, from more accessible avatar creation tools to the ongoing debates around responsible AI development and deployment.

AI business news

Synthesia makes AI avatar creation more accessible for enterprises (Demonstrates the growing accessibility of AI-powered tools for enterprises, a trend we’ve discussed in the context of democratising AI.)
Gmail’s Gemini AI sidebar and email summaries are rolling out now (Highlights how AI is being integrated into mainstream productivity tools, improving user experience and efficiency.)
OpenAI launches ChatGPT app for Mac, but delays release of more advanced chat capabilities (Showcases the continued evolution of large language models and the challenges of balancing innovation with responsible development.)
Stability AI appoints new CEO and closes funding round reportedly worth $80M (Demonstrates the ongoing investment and growth in the AI industry, a trend we’ve been tracking.)
How 2 high school teens raised a $500K seed round for their API startup (yes, it’s AI) (Highlights the entrepreneurial spirit and innovation in the AI space, even among young founders.)

AI governance news

Major record companies sue Suno, Udio for ‘mass infringement’ of copyright (Demonstrates the ongoing legal and regulatory challenges around the use of AI in creative industries, a topic we’ve discussed before.)
Patch now: ‘Easy-to-exploit’ RCE in open source Ollama (Highlights the importance of security and responsible development in the AI ecosystem, a key concern for our readers.)
UK needs system for recording AI misuse and malfunctions, thinktank says (Emphasises the need for robust governance frameworks to address the risks and challenges of AI deployment, a recurring theme in our coverage.)
AI beats students in UK university exams, fooling human educators (Raises concerns about the potential misuse of AI in education and the need for appropriate safeguards, a topic we’ve discussed in the context of AI’s impact on various industries.)
Apple shelved the idea of integrating Meta’s AI models over privacy concerns, report says (Demonstrates the ongoing tensions between technological advancement and data privacy, a key consideration for our readers.)

AI research news

Symbolic Learning Enables Self-Evolving Agents (Highlights the latest advancements in AI research, exploring the potential of symbolic learning and self-evolving agents, an area of growing interest.)
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data (Showcases the impressive capabilities of large language models in extracting insights from diverse data sources, a key area of AI research.)
Evolutionary Scale · ESM3: Simulating 500 million years of evolution with a language model (Demonstrates the application of AI in modelling complex natural phenomena, such as evolutionary processes, a fascinating area of research.)
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges (Highlights the importance of evaluating the alignment and vulnerabilities of large language models, especially in sensitive applications like decision-making.)
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation (Showcases the development of new benchmarks for evaluating the performance and alignment of AI systems in personalized image generation, a key area of research.)

AI hardware news

Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs (Demonstrates the ongoing advancements in AI hardware, with the potential for more efficient and cost-effective solutions, a topic we’ve covered in the past.)
Samsung teases investment to get into the GPU game (Highlights the growing competition in the AI hardware market, as major players like Samsung expand their offerings, a trend we’ve been tracking.)
Axelera lands new funds as the AI chip market heats up (Demonstrates the continued investment and growth in the AI chip market, a key enabler of AI innovation.)
AI chip startup Axelera raises $68M to broaden offerings from edge to cloud (Highlights the importance of AI hardware solutions that can span from the edge to the cloud, a topic we’ve discussed in the context of distributed AI systems.)
Innatera books $21M in funding for its ultra-low-power AI chips (Showcases the growing demand for energy-efficient AI hardware, especially in edge computing applications, a key focus area for our readers.)