ExoBrain

ExoBrain Weekly

Manus agent hype, Copyright battles pit tech against creators, and Gemini’s native image mode arrives

Welcome to our weekly newsletter, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our Exo agents.

This week we look at:

  • Manus agent hype

    The viral success of the Manus AI agent demonstrates that product engineering enhancing familiar conversational interfaces may drive adoption more effectively than radical technological innovation.

  • Copyright battles pit tech against creators

    Tech giants OpenAI and Google lobby for unrestricted AI training on copyrighted material, sparking significant backlash from creators and governments concerned about intellectual property rights and fair compensation.

  • Gemini’s native image mode arrives

    Google enables native image generation in Gemini 2.0 Flash, offering seamless multimodal capabilities that allow users to create and edit images with simple text commands.

Manus agent hype

The viral success of the Manus AI agent demonstrates that product engineering enhancing familiar conversational interfaces may drive adoption more effectively than radical technological innovation.

Joel Miller

3 min read

This week, Manus, a new AI agent from Chinese startup Monica.im, “went viral”, capturing the attention of numerous AI commentators and polarising them into breathless and sceptical camps. Marketed cleverly through scarce invite codes (ExoBrain is still waiting, somewhere in a supposed 2-million-strong queue) and influencer-driven promotion, Manus rapidly dominated the headlines, labelled by some as another “DeepSeek moment” for China, or even a glimpse of AGI.

Yet Manus isn’t a technological breakthrough in the way DeepSeek’s R1 was. It’s essentially an interface built around existing models, Anthropic’s Claude and Alibaba’s Qwen, augmented with web browsing and command-line tools. Its strength lies not in pioneering new AI techniques, but in creatively enhancing the conversational interface users already know well.

Interestingly, Manus predominantly operates by executing code to complete varied tasks, an approach inspired by a 2024 research project known as CodeAct, in which agents use executable Python code to unify their action space. CodeAct integrates a Python interpreter so that the agent can dynamically adjust its actions based on observations. While you’re waiting for your invite code, the Manus team provides a series of recordings where you can see this code-heavy process in action.

The product is not without significant flaws. Users quickly exposed Manus’s basic security architecture, prompting the agent to share parts of its source code, revealing internal details and its reliance on Claude 3.7 Sonnet. Moreover, despite impressive demonstrations of tasks such as property research and financial analysis, many users found real-world performance inconsistent or exaggerated.
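To make the CodeAct idea mentioned above more concrete, here is a minimal sketch of a code-as-action loop. It is an illustration under stated assumptions, not Manus’s or the CodeAct authors’ implementation: the names run_action, scripted_policy and codeact_loop are hypothetical, and a hard-coded policy stands in for the language model. Each action is a snippet of Python, and whatever it prints (or the error it raises) is fed back as the observation that shapes the next action.

```python
import contextlib
import io
from typing import Optional


def run_action(code: str, namespace: dict) -> str:
    """Execute one code action and return its printed output as the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # exec is unsafe outside a sandbox; this is for illustration only.
            exec(code, namespace)
    except Exception as exc:
        # Errors are observations too, so the agent can revise its next action.
        buffer.write(f"{type(exc).__name__}: {exc}")
    return buffer.getvalue().strip()


def scripted_policy(history: list) -> Optional[str]:
    """Hypothetical stand-in for the model: choose the next code action."""
    if not history:
        # First attempt forgets to import the module it uses.
        return "print(statistics.mean(prices))"
    _, observation = history[-1]
    if "NameError" in observation:
        # Adjust the action in response to the observed error.
        return "import statistics\nprint(statistics.mean(prices))"
    return None  # No error observed, so the task is treated as done.


def codeact_loop(policy, namespace: dict, max_steps: int = 5) -> list:
    """Alternate between code actions and observations until the policy stops."""
    history = []
    for _ in range(max_steps):
        action = policy(history)
        if action is None:
            break
        observation = run_action(action, namespace)
        history.append((action, observation))
    return history


if __name__ == "__main__":
    # Variables persist in the shared namespace across steps.
    for action, observation in codeact_loop(scripted_policy, {"prices": [300, 450, 600]}):
        print(f"ACTION:\n{action}\nOBSERVATION: {observation}\n")
```

In this sketch the first action fails, the failure comes back as an observation, and the second action corrects it. A real agent would replace scripted_policy with a model call and run the code in an isolated environment.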

The Manus affair reveals important lessons about the types of innovation users actually want. While significant effort goes into refining chatbot windows or, conversely, experimenting with radical new interfaces such as Flora’s “intelligent canvas,” there’s a noticeable gap between the two. Users often resist radically new paradigms due to the comfort of chat. Manus addresses this gap, materially extending the familiar chat model by embedding autonomous, agent-like features. The hype and criticism surrounding Manus highlight genuine user interest in conversational AI that moves beyond simple exchanges to achieve meaningful automation and practical productivity boosts. Manus’s approach, extending rather than reinventing conversational interfaces, offers a potential template for future UX innovation.

As AI policy commentator Dean Ball stated: “The Western AGI obsession makes us want to conceptualise [it] as one godlike model that can do everything, and we implicitly dismiss product engineering and practical applications. You see that reflected in public policy, which is obsessed with big models, giant datacentres, and similar infrastructure. Those are the only things we seem to take seriously and value.” Could it be that clever product design, and not scale, might unshackle AI?

Takeaways: Looking forward, Manus demonstrates the value of AI product engineering that explores the space between familiarity and novelty. As we reach the limits of refining traditional chatbots and encounter resistance to adopting future paradigms, the “conversational-plus” space Manus occupies might become the centre of gravity for the next AI developments. Products like Google’s NotebookLM, Cursor for development, and now Manus suggest the next wave in AI won’t rely only on more powerful models, but on smarter ways of using models to deliver agentic capabilities through interfaces we intuitively trust. Manus isn’t revolutionary, but its success signals clear demand for conversational AI that combines familiarity with genuine autonomy. Expect this product space of intuitive yet powerful conversational agents to become a new frontier of practical innovation.

Gemini’s native image mode arrives

Google enables native image generation in Gemini 2.0 Flash, offering seamless multimodal capabilities that allow users to create and edit images with simple text commands.

ExoBrain

1 min read

Google enabled native image generation in Gemini 2.0 Flash this week, and finally we get to see what a truly multimodal model can do. Unlike previous systems that relied on separate models working together, Gemini integrates everything in one. The result? Simple text commands produce remarkably accurate images, edits, images with text, and iterative variations in seconds.

As shown here, a famous artwork can be transformed with a single instruction, with no complex prompting or technical knowledge required. This represents the first time a major tech company has shipped such seamless multimodal capabilities directly to consumers.

Takeaways: This tech will make image creation, and now editing, accessible to everyone. Expect new creative possibilities. We’re also likely to see applications we haven’t even imagined yet, perhaps in education, healthcare visualisation, or real-time collaborative storytelling. The race for multimodal AI leadership has entered a new phase, with Google currently in the lead.