ExoBrain Weekly Newsletter, 22 May 2026

Google's grand bazaar, compute as commodity, and whether AI costs more than space travel

Welcome to our weekly newsletter, a combination of thematic insights from the founders at ExoBrain, and a broader news roundup from our Exo agents.

This week we look at:

Google's grand bazaar
Google I/O 2026 launched Gemini 3.5 Flash, Omni, Spark and Antigravity 2.0 alongside dozens of other AI products. Google has nearly every asset to lead the next phase of AI, but still struggles to converge on a coherent product spine.
The compute commodity
AI compute now behaves like a maturing commodity market: production costs are still falling, but memory bandwidth scarcity, reasoning verbosity and reliability tiers are reshaping how the frontier prices inference. Cost per outcome is replacing cost per token.
Is AI more expensive than space travel?
SpaceX's record S-1 filing reveals an AI company underneath the rockets, with AI accounting for 93% of the claimed $28.5 trillion TAM and 76% of Q1 capex. The contract powering it all is cancellable on 90 days' notice.
News roundup
Leadership resets, AI job cuts, frontier-lab governance fights, multi-agent research and the next turn in AI compute supply.

Google's grand bazaar

Joel Miller

22 May 2026, 4 min read

Google I/O 2026 was the biggest AI event of the week, and probably the fullest expression yet of Google's AI strategy. Gemini 3.5, AI Mode, Gemini Omni, Spark, Antigravity 2.0, Beam, Android XR glasses, new Workspace features, new creator tools, new silicon, new subscription tiers. Google has almost every asset it needs to lead the next phase of AI. But it's yet to show that it can turn those assets into products many people trust.

The main model news was Gemini 3.5 Flash, now the default model in the Gemini app and AI Mode in Search. We've been running Gemini 3.5 Flash in daily work since launch. It's fast, and the output quality holds up against pricier tiers for most everyday tasks.

Gemini Omni is the deeper play. Google describes it as a natively multimodal generative model that can take any combination of text, image, audio and video as input and produce coherent output. Video is the launch modality, with image and audio promised later. Sundar Pichai framed it as a step towards world models that can simulate physics, culture and causality, which is straight from the DeepMind playbook on spatial intelligence and embodied understanding.

We don't buy the full "any-to-any" branding yet. Output is video-only at launch, and Veo and Imagen still exist as parallel specialist systems. GPT-4o followed a similar pattern: the "o" stood for omni, but the full omni capability never quite arrived. The bet behind Omni is still the right one. If the next jump in machine intelligence comes from training on the physical world rather than scaling text alone, Omni is Google's production test of that thesis. Google is the company best placed to run it.

Gemini Spark may matter more than it looked on stage. It is a 24/7 background agent running on Google Cloud VMs, integrated with Workspace and with consumer services like Canva and Instacart. You can think of it as a simpler version of OpenClaw.

After those three topics, the event became harder to parse. Universal Cart promised cross-merchant checkout across Search, YouTube, Gemini and Gmail. Beam with Sophie put a lifelike video AI agent inside what used to be Project Starline. Search gained generative UI, a redesigned intelligent Search Box and more AI Mode surface area. Workspace added Gmail Live, Docs Live and Keep Live for voice-driven work. Pics arrived as a Nano Banana 2 powered editing surface. There was Daily Brief, Ask YouTube, Ask Maps, Flow for music and video creation, Googlebooks as a new Android laptop category, four Android XR glasses partnerships, TPU 8t and 8i silicon, and AI Ultra subscription tiers at $100 and $200.

Each announcement is defensible on its own. Together they gave the familiar Google feeling: extraordinary capability, too many fronts, and not enough evidence that the teams are converging on a single intuitive product spine.

The strongest consolidation was in developer tooling, where it matters most for AI coding. Antigravity 2.0 pulls together the original Antigravity, Gemini CLI and Jules onto a shared engine, with the new CLI and the app as the two surfaces. Strategically, that's the right move. Tactically, the migration damaged trust. We were Antigravity 1.0 users. The 2.0 update auto-installed, then failed to authenticate, and once that was fixed we found the IDE mode we relied on had been removed in favour of an agent-only experience. Google hurried an IDE option back in, and provided free tokens to users, but the damage was done.

Takeaways: Google I/O 2026 showed a company with almost everything required to lead the next phase of AI, and not yet enough discipline to make the whole thing feel coherent. Gemini 3.5 Flash proves Google can ship a fast, capable model at scale. Omni puts DeepMind's world-model thesis into production. Spark shows that Google understands agents change computing from sessions to standing instructions. Antigravity 2.0 shows that Google can consolidate when it chooses to, but the migration pain shows how easily it can lose user trust. The bazaar is open and the merchandise is real. The next test is whether Google can close a few stalls and make the best ones indispensable.

The compute commodity

Joel Miller

22 May 2026, 3 min read

Google I/O was the main AI story this week. Our lead article focused on the product surface: Gemini 3.5, AI Mode, agents, video and Google's claim that it is now serving 3.2 quadrillion tokens a month. But Google also released Gemini 3.5 Flash at $1.50 per million input tokens and $9 per million output tokens, three times the price of the Flash model it replaces. Meanwhile Anthropic tightened Max plan restrictions and OpenAI launched Guaranteed Capacity, letting enterprises commit to one-, two- or three-year reservations of inference compute in exchange for discounts. On the surface this suggests that the anticipated token price squeeze is happening. What we are actually seeing is the mechanics of a commodity market.

In commodity markets, price starts with production. For AI, production starts with silicon, and that cost curve is still falling. NVIDIA's Blackwell generation cut cost per million tokens by a factor of roughly 35 against Hopper. Epoch AI's analysis shows the price for any fixed capability milestone falling by a factor of between 9 and 900 per year over the last three years, with a median of 50. Artificial Analysis data shows the same pattern: for a given band of model intelligence, prices keep stepping down.

Cheaper production doesn't mean cheaper access. Google disclosed at I/O that token volume is now seven times last year's level. If demand rises faster than production efficiency, the clearing price moves to the bottleneck.

For inference, the fundamental bottleneck is the aggregate of deployed memory bandwidth. High-bandwidth memory is the binding constraint on throughput today. SK Hynix and Samsung dominate HBM3E and HBM4 supply, allocations are sold through 2026, and accelerator throughput per watt is gated by memory bandwidth rather than logic-die fabrication. Jonathan Ross, formerly of Groq, put it plainly at Sohn this week: until recently, no one was really trying to squeeze more performance out of memory chips, and now they all are.

That supply response has a lead time. In the meantime, OpenAI is signing 3 GW dedicated inference deals with NVIDIA, and Anthropic is making similar arrangements with SpaceX. OpenAI's Guaranteed Capacity is best read as a forward contract for inference. Buyers who can lock in multi-year reservations get supply security and a discount. Buyers on standard rates pay a higher effective price for the same delivered tokens, and accept more variance in availability. That's a capacity premium, not a production cost increase.

The unit being traded is also changing. Commodity markets need a unit of account. Tokens used to do the job well enough, because pre-reasoning models made them roughly comparable. A token from one model was not identical to a token from another, but the comparison was usable. Reasoning models break that. Gemini 3.5 Flash defaults to dynamic thinking and burns more tokens per delivered task. Anthropic's Opus 4.7 tokeniser maps the same content to between 1.0 and 1.35 times as many billable units. Artificial Analysis benchmark runs cost more on Gemini 3.5 Flash at high effort than on the more expensive-looking Gemini 3.1 Pro.

The list price per token has not become more expensive in any clean sense. The token has become a smaller and more variable unit of work.
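To make that concrete, here is a minimal sketch of per-task cost. The Gemini 3.5 Flash list prices are the ones quoted above; the comparison model's prices, every token count, and the assumption that reasoning tokens are billed at the output rate are ours, chosen only to show how a lower per-token price can still produce a higher bill per task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    name: str
    input_per_million: float   # USD per million input tokens
    output_per_million: float  # USD per million output tokens


def task_cost(pricing, input_tokens, answer_tokens,
              reasoning_tokens=0, tokeniser_multiplier=1.0):
    """USD cost of one task.

    Assumes reasoning tokens are billed at the output rate and that the
    tokeniser multiplier scales every billable token count equally.
    """
    billed_input = input_tokens * tokeniser_multiplier
    billed_output = (answer_tokens + reasoning_tokens) * tokeniser_multiplier
    return (billed_input * pricing.input_per_million
            + billed_output * pricing.output_per_million) / 1_000_000


# List prices quoted in this article.
flash = ModelPricing("Gemini 3.5 Flash", 1.50, 9.00)
# Hypothetical pricier model: these prices are invented for illustration.
pricier = ModelPricing("hypothetical pricier model", 3.00, 15.00)

# Hypothetical workload: same prompt and answer, different thinking budgets.
flash_cost = task_cost(flash, 2_000, 500, reasoning_tokens=4_000)
pricier_cost = task_cost(pricier, 2_000, 500, reasoning_tokens=500)

print(f"{flash.name}: ${flash_cost:.4f} per task")
print(f"{pricier.name}: ${pricier_cost:.4f} per task")
```

With these made-up token counts the Flash task comes to $0.0435 against $0.0210 for the pricier model, even though Flash is cheaper on both list prices. Passing `tokeniser_multiplier=1.35` shows the upper end of the tokeniser effect described above on an otherwise identical task.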

Reliability has been productised. OpenAI now sells Standard, Priority, Flex and Scale tiers, with Priority running at roughly 1.5 to 2 times Standard rates for SLA-backed throughput, and Flex offering half-price tokens for asynchronous workloads with possible queuing. Once capacity tightens, the single price splits into peak, off-peak, interruptible and reserved supply.
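A rough sketch of what tiering does to a bill, using only the multipliers quoted above (Priority at roughly 1.5 to 2 times Standard, Flex at half price). Scale is left out because we have no multiplier for it, and the workload split and monthly spend in the example are hypothetical.

```python
# (low, high) price multipliers relative to the Standard rate.
TIER_MULTIPLIER_RANGE = {
    "standard": (1.0, 1.0),
    "priority": (1.5, 2.0),  # "roughly 1.5 to 2 times Standard"
    "flex": (0.5, 0.5),      # half-price, asynchronous, may queue
}


def blended_cost_range(standard_cost, mix):
    """Return (low, high) cost for a workload split across tiers.

    standard_cost: what the whole workload would cost at Standard rates.
    mix: tier name -> share of the workload; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("workload shares must sum to 1")
    low = sum(share * TIER_MULTIPLIER_RANGE[tier][0] for tier, share in mix.items())
    high = sum(share * TIER_MULTIPLIER_RANGE[tier][1] for tier, share in mix.items())
    return standard_cost * low, standard_cost * high


# Hypothetical buyer: 20% latency-critical, 30% standard, 50% batch work,
# on a workload that would cost $10,000 a month at Standard rates.
low, high = blended_cost_range(10_000, {"priority": 0.2, "standard": 0.3, "flex": 0.5})
print(f"blended monthly cost: ${low:,.0f} to ${high:,.0f}")
```

In this illustrative split the blended bill lands between roughly $8,500 and $9,500, below the single Standard price, because the batch share moved to Flex outweighs the Priority premium. A buyer with mostly latency-critical work would see the opposite.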

Underneath the frontier tier, the market has gone the other way. Cursor's Composer 2.5 lists at $0.50 input and $2.50 output per million tokens, with cost per task on coding workloads roughly a tenth of frontier alternatives at comparable quality. Composer is built on the open-weight Kimi K2.5 base, with most of the compute spent on Cursor's own post-training and editor integration. Open-weight models from Kimi, DeepSeek, and Qwen are abundant at the low end of the curve, and the price per useful unit of work in this segment is still falling fast.

The result is a bifurcated market. The frontier is in cost-push inflation driven by capacity scarcity, reasoning verbosity and reliability premiums. The open tier is in continued deflation driven by hardware gains, distillation, fine-tunable weights and better tooling. Two buyers in the same industry can experience opposite price trajectories depending on which segment they rely on.

Cost per token is becoming less informative because the token itself is no longer fungible across models, reasoning depths or tokenisers. Cost per outcome, measured against a defined unit of delivered work, is the metric that still holds up.
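One simple way to put a number on cost per outcome, as a sketch rather than a standard definition: divide the cost of an attempt by the rate at which attempts succeed. This assumes failed attempts are retried independently until one passes, and that "success" is checked against whatever unit of delivered work you define. The lane names, costs and success rates below are hypothetical.

```python
def cost_per_outcome(cost_per_attempt, success_rate):
    """Expected spend per successfully delivered task.

    Assumes independent retries until success, so the expected number
    of attempts is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def cheapest_lane(lanes):
    """Pick the lane with the lowest cost per outcome.

    lanes: name -> (cost_per_attempt in USD, success_rate)
    """
    return min(lanes, key=lambda name: cost_per_outcome(*lanes[name]))


# Hypothetical lanes for one task type.
lanes = {
    "frontier": (0.040, 0.95),  # pricey per attempt, rarely fails
    "budget": (0.010, 0.20),    # cheap per attempt, usually fails
}

for name, (cost, rate) in lanes.items():
    print(f"{name}: ${cost_per_outcome(cost, rate):.4f} per delivered outcome")
print("route to:", cheapest_lane(lanes))
```

Here the budget lane is four times cheaper per attempt but more expensive per delivered outcome, so the router picks the frontier lane. Swap in a budget lane that succeeds more often and the choice flips, which is the case for measuring this per task type rather than per vendor.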

Takeaways: AI compute now behaves like a commodity market with falling production costs, a memory-bandwidth bottleneck, reliability premiums, reserved capacity and surplus supply underneath the frontier. Less than 1% of AI users are power-users today, and we're already in a compute shortage. Buyers who handle the next eighteen months well will treat AI compute as procurement, not SaaS subscription management. Audit the basket of models in use, measure cost per delivered outcome rather than per token, lock in capacity only where the work justifies the premium, and build model- and harness-agnostic routing across multiple execution lanes so that a tokeniser change, a quota tightening, or a reservation shortage at any single vendor doesn't rewrite the unit economics of the whole stack.

Is AI more expensive than space travel?

Joel Miller

22 May 2026, 2 min read

This week's chart comes from SpaceX's S-1, the legal filing a company submits to US regulators ahead of an initial public offering. This is envisaged as the largest IPO of all time and it is not your conventional S-1. Alongside the audited accounts and risk factors sit phrases about extending consciousness to the stars and making life multiplanetary, wrapped around a claim to the largest actionable addressable market in human history at $28.5 trillion... enterprise AI.

But if you thought space travel and colonising the stars might be expensive, the eye-watering numbers are actually for the AI business. Despite the name, SpaceX is no longer just rockets and satellite internet. It is now, by its own framing, an AI company. AI accounts for 93% of the claimed TAM, 76% of capital expenditure in Q1 2026, and gets mentioned more than 200 times in the filing. Connectivity, the Starlink business, made money in Q1. Space, the rockets, lost $622 million. AI lost $2.47 billion on $818 million of revenue, with capex running at over $30 billion annualised.

The problem for SpaceX is what that GPU infrastructure is currently doing. Since the February merger folded xAI into the new SpaceXAI division, the Colossus build sits under the same roof as Grok. But instead of powering Grok, it is being rented to Anthropic on a $15 billion a year deal that the customer can cancel with 90 days' notice. The largest AI bet in IPO history rests on a contract thinner than a Starlink subscription.

News roundup

AI business news

Sources and documents detail Satya Nadella's effort to revamp Microsoft's senior leadership, creating a startup-style operating model to compete in the AI race (Nadella dismantling Microsoft's decades-old senior leadership structure to build a startup-style operating model signals that even the most mature enterprise AI players are betting that organizational speed is now a competitive weapon.)
Lenovo reports Q4 revenue up 27% YoY to $21.6B, above $18.7B est., net profit up 479% to $521M, above $271M est., as the PC maker pushes into AI server markets (Lenovo's 479% net profit surge and 27% revenue beat, driven by AI server demand, shows hardware incumbents are capturing real margin from the AI buildout, not just riding hype.)
StanChart to cut over 7,000 jobs, boost AI to replace 'lower-value human capital' (Standard Chartered explicitly framing 7,000+ layoffs around replacing "lower-value human capital" with AI sets a blunt new precedent for how banks publicly justify workforce reductions.)
Sources: DeepSeek execs told potential investors in its ongoing $10B round that it will prioritize groundbreaking AI research over short-term commercialization (DeepSeek telling investors in its $10B round that it will prioritise AGI research over commercialisation reveals a strategic posture that puts it on a direct collision course with Western labs, on purpose.)
OpenAI's anticipated IPO gives retail investors opportunity: Redpoint's Brescia (Redpoint's Brescia framing OpenAI's anticipated IPO as a retail-investor moment signals that the next leg of AI capital markets is being designed for participation well beyond the venture and sovereign tiers that have funded the build-out so far.)
OpenAI co-founder Andrej Karpathy joins Anthropic's pre-training team (Karpathy, OpenAI co-founder and longtime independent voice, joining Anthropic's pre-training team is the most senior talent move between frontier labs to date, and a concrete signal of where the field's centre of gravity now sits.)

AI governance news

Trump yanked AI order after David Sacks raised industry concerns (The fact that a single industry figure, David Sacks, could halt a presidential executive order mid-signing reveals just how much regulatory power AI insiders currently wield over Washington.)
Elon Musk loses landmark lawsuit against OpenAI (A jury verdict against Musk sets a legal precedent that nonprofit-to-for-profit AI conversions can be challenged in court, with direct implications for how any AI lab structures its governance.)
New Pentagon task force races to bring powerful AI tools to America's most sensitive networks (The Pentagon's new MYTHOS task force pushing frontier AI models onto America's most classified networks marks a concrete operational shift, not just a policy discussion.)
US Congress authorises $5M prize competition for deepfake detection (Congress embedding a $5M prize competition for deepfake detection inside the signed defense bill turns AI media authentication from a research problem into a funded national security mandate.)
OpenAI faces California lawsuit claiming ChatGPT advice led to fatal overdose (A wrongful death suit naming OpenAI's CEO personally over ChatGPT drug advice signals that AI liability is moving beyond the corporation toward individual executive accountability.)
Project Glasswing: what Mythos showed us (Cloudflare publishing what it learned running Anthropic's Mythos against its own infrastructure, including findings on autonomous vulnerability discovery, is a first-of-its-kind public field report on frontier-AI cyber capability from outside the labs themselves.)

AI research news

Multi-agent AI systems outperform human teams in creativity (With a Cohen's d of 1.50 across 4,541 AI-generated ideas vs. 341 human-team ideas, this is the most quantitatively rigorous challenge yet to the assumption that human creative teams hold an inherent edge over AI systems.)
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for GUI Agents (Peking University's method of auto-extracting GUI interaction trajectories from internet videos at scale sidesteps the expensive human-annotation bottleneck that has constrained computer-use agent development.)
Blind Spots in the Guard: How Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems (The finding that multi-agent debate architectures amplify prompt injection attacks by up to 9.9x reveals a security liability that grows directly with the agentic architectures enterprises are rushing to deploy.)
Steered LLM Activations are Non-Surjective (Proof that activation-steered internal states are unreachable via text prompts alone has direct consequences for AI safety and alignment: white-box control methods are fundamentally more powerful, and harder to audit, than prompt-based ones.)
Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLMs (The "knowing-doing gap", where LLMs recognise when a tool is needed but fail to invoke it, identifies a concrete, measurable failure mode that matters enormously for anyone building reliable agentic pipelines.)

AI hardware news

AMD unveils $10 billion Taiwan AI investment, ramps up TSMC 2nm production (AMD committing $10 billion to Taiwan's AI supply chain while ramping the first commercial 2nm server chips signals a direct challenge to Nvidia's infrastructure dominance just as TSMC's capacity becomes the scarce resource everyone is fighting over.)
Meta, Broadcom, Applied Materials, GlobalFoundries, and Synopsys launch a $125M "Semiconductor Hub" at UCLA to advance AI chip research and more (Five major semiconductor players pooling $125 million at a single university lab is an unusual structural move that suggests the industry sees academic R&D pipelines, not just fab capacity, as a strategic chokepoint worth owning.)
Hudson River Trading selects Lambda for AI compute infrastructure (A secretive quant trading firm quietly securing a dedicated cluster of Nvidia's latest B200 systems reveals that AI compute demand is now spreading well beyond hyperscalers into financial institutions with very different workload profiles.)
Gaia AI supercomputer launched in Kraków, Poland (A 1,000+ GPU AI supercomputer going live in Kraków is a concrete data point in the "sovereign AI" buildout across Central and Eastern Europe, a region that rarely shows up in these conversations.)
Nvidia's Vera CPU lands at leading AI labs as agentic AI demand grows (Nvidia hand-delivering the first Vera CPU racks to OpenAI, Anthropic and Oracle marks the moment it begins competing in the CPU market it has long circled, with Oracle planning to deploy hundreds of thousands from 2026.)
Trump approved a Nvidia chip for sale in China. Beijing doesn't want it (Beijing privately telling Chinese firms to walk away from the H200 chips Trump cleared for export shows Xi treating chip independence as more strategic than near-term capacity, hardening the standoff that has left December's approval still without a single shipped order.)
