Google I/O Agent Tools | AI Field Notes #31

Illustration: server scaffolding connects via web threads to developers at workstations while financial documents rise from the base, suggesting AI infrastructure racing to become capital before the build is done.

Gemini 3.5 Flash launched via the Gemini API at Google I/O this week at $1.50 per million input tokens, running 4 times faster than comparable frontier models and outperforming Gemini 3.1 Pro on agentic benchmarks, alongside Managed Agents that provision a full Linux-sandboxed agent from one API call. Simultaneously, Google opened a Chrome 149 origin trial for WebMCP, a proposed open web standard letting sites expose JavaScript functions directly to browser AI agents as machine-callable tools, replacing the pixel-parsing approach that makes current web automation brittle. On the financial side, Anthropic told investors it expects $10.9 billion in Q2 revenue and a first operating profit of $559 million, a target the company was not projecting to hit until at least 2028. OpenAI is preparing a confidential IPO filing with Goldman Sachs and Morgan Stanley, targeting a September listing near $1 trillion, which will produce the industry's first audited financials at scale.



AI Models · Google I/O 2026 Developer Highlights

Gemini 3.5 Flash: Google ships fastest frontier model at $1.50 per million input tokens

Analysis: Gemini 3.5 Flash landed in the Gemini API on May 19, priced at $1.50 per million input tokens and $9.00 per million output tokens, with a 1,048,576-token context window. It runs four times faster than comparable frontier models and beats Gemini 3.1 Pro on the benchmarks that look most like real agent work: 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas (a test suite for model-context-protocol tool use), and 84.2% on CharXiv Reasoning. That combination of price, speed, and agentic performance puts it on a short list for default model choice in agent loops, where latency and token cost multiply with every iteration. One notable default change from the preview version: thinking_level drops from high to medium unless explicitly overridden.
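To make the "cost multiplies with every iteration" point concrete, here is a minimal sketch of the arithmetic at the listed prices. The loop model is an assumption made for illustration: each step re-sends the full context, each step's output is appended to that context, and caching, thinking-token billing, and tiered pricing are ignored. The function names and token counts are hypothetical.

```typescript
// Listed prices from the item above, in US dollars per million tokens.
const INPUT_USD_PER_MTOK = 1.5;
const OUTPUT_USD_PER_MTOK = 9.0;

// Cost of a single model call at the listed rates.
function callCostUsd(inputTokens: number, outputTokens: number): number {
  const inputUsd = (inputTokens / 1e6) * INPUT_USD_PER_MTOK;
  const outputUsd = (outputTokens / 1e6) * OUTPUT_USD_PER_MTOK;
  return inputUsd + outputUsd;
}

// Hypothetical loop model: every step re-sends the whole context, and each
// step's output is appended to the context for the next step.
function loopCostUsd(
  baseInputTokens: number,
  outputTokensPerStep: number,
  steps: number
): number {
  let totalUsd = 0;
  let contextTokens = baseInputTokens;
  for (let step = 0; step < steps; step++) {
    totalUsd += callCostUsd(contextTokens, outputTokensPerStep);
    contextTokens += outputTokensPerStep; // prior output becomes input next step
  }
  return totalUsd;
}

// Example: 20,000-token starting context, 1,000 output tokens per step, 10 steps.
console.log(loopCostUsd(20000, 1000, 10).toFixed(4));
```

Under those assumptions, a ten-step loop comes to about $0.46, against roughly $0.04 for the first call alone. Input tokens account for most of the total because the context is billed again on every step.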

Source: Google I/O 2026 Developer Highlights · blog.google

AI Models · Cursor: Introducing Composer 2.5

Cursor Composer 2.5: first proprietary model matches Opus 4.7 on benchmarks at one-tenth the price

Analysis: Cursor shipped Composer 2.5 on May 18, its first in-house coding model, built on Moonshot AI's open-source Kimi K2.5 base with 85% of the compute budget spent on Cursor's own reinforcement-learning post-training, which used 25 times more synthetic coding tasks than the prior version. The model scores 79.8% on SWE-Bench Multilingual (Claude Opus 4.7 scores 80.5%), at standard API pricing of $0.50 per million input tokens, roughly one-tenth of Opus 4.7's rate. Cursor also disclosed a significantly larger model in training with SpaceXAI, using 10 times more total compute. Three coding agents (Cursor, Claude Code, Codex) now operate as a converging stack, but Cursor is the first to ship its own model into that stack rather than routing through someone else's API.

Source: Cursor: Introducing Composer 2.5 · cursor.com

AI Agents · PPC.land: WebMCP Chrome 149 origin trial

WebMCP: Chrome 149 origin trial lets websites expose JavaScript tools to AI agents

Analysis: WebMCP is a proposed open web standard from Google, announced at I/O on May 19, that lets site developers declare JavaScript functions and annotated HTML forms as structured tools for browser-based AI agents to call directly. Chrome 149 begins an origin trial for the spec now. The practical change: instead of simulating clicks and parsing pixels, agents call machine-defined tool functions with explicit JSON schema contracts, producing higher accuracy and less hallucination on multi-step web tasks. Google published companion documentation on May 18 formalizing how tools are declared and discovered. The spec is still a proposal, not a finalized standard, but the Chrome origin trial gives it a concrete deployment path, and Gemini in Chrome support is listed as coming soon.
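The idea of a tool with an explicit schema contract can be sketched without touching any browser API. Everything below is a hypothetical illustration of the pattern, not the WebMCP interface: the `ToolDescriptor` shape, the `validateArgs` and `callTool` helpers, and the `add_to_cart` tool are invented for this example, and the schema handling covers only a tiny subset of JSON Schema. The proposal itself defines how tools are actually declared and discovered.

```typescript
// All names here are hypothetical illustrations, not the WebMCP API.
type ArgType = "string" | "number" | "boolean";
type Args = { [name: string]: unknown };

interface InputSchema {
  type: "object";
  properties: { [name: string]: { type: ArgType; description: string } };
  required: string[];
}

interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: InputSchema;
  execute: (args: Args) => Promise<unknown>;
}

// Own-property check, so inherited keys such as "toString" are not accepted.
function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Check a call's arguments against the declared contract; returns the problems found.
function validateArgs(schema: InputSchema, args: Args): string[] {
  const errors: string[] = [];
  for (const name of schema.required) {
    if (!has(args, name)) errors.push("missing required argument: " + name);
  }
  for (const name of Object.keys(args)) {
    if (!has(schema.properties, name)) {
      errors.push("unknown argument: " + name);
    } else if (typeof args[name] !== schema.properties[name].type) {
      errors.push(name + " should be " + schema.properties[name].type);
    }
  }
  return errors;
}

// A site-side function exposed as a structured tool instead of a clickable button.
const addToCart: ToolDescriptor = {
  name: "add_to_cart",
  description: "Add a product to the shopping cart.",
  inputSchema: {
    type: "object",
    properties: {
      sku: { type: "string", description: "Product identifier" },
      quantity: { type: "number", description: "Units to add" },
    },
    required: ["sku", "quantity"],
  },
  execute: async (args) => ({ ok: true, sku: args["sku"], quantity: args["quantity"] }),
};

// Agent-side call path: reject malformed calls before any site code runs.
async function callTool(tool: ToolDescriptor, args: Args): Promise<unknown> {
  const errors = validateArgs(tool.inputSchema, args);
  if (errors.length > 0) throw new Error(tool.name + ": " + errors.join("; "));
  return tool.execute(args);
}
```

The design point is that a malformed call fails loudly at the contract boundary, whereas a click-simulation agent that misreads the page may act on the wrong element without any error.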

Source: PPC.land: WebMCP Chrome 149 origin trial · ppc.land

AI Agents · TechCrunch: Google introduces Gemini Spark

Gemini Spark: Google's 24/7 personal AI agent launches for AI Ultra subscribers

Analysis: Gemini Spark is a persistent personal AI agent built on Gemini 3.5, announced at I/O and opening to Google AI Ultra subscribers ($100/month) next week. It runs continuously in the background, integrated with Gmail, Docs, and Slides, with connections to Canva, OpenTable, and Instacart at launch and more partner apps coming. The design commits to standing permissions: the agent acts without the user initiating a session, executing recurring tasks like flagging hidden fees in credit card bills or summarizing active email threads. Spark competes directly with OpenAI's Tasks feature and Apple Intelligence's Siri integration model, and puts Google in a race to establish the default personal agent layer before any single product has established clear consumer trust.

Source: TechCrunch: Google introduces Gemini Spark · techcrunch.com

AI Industry · Bloomberg Law: OpenAI IPO preparation

OpenAI prepares confidential IPO filing, September listing near $1 trillion in view

Analysis: OpenAI is working with Goldman Sachs and Morgan Stanley to file a confidential IPO prospectus with the SEC, potentially as early as today, according to Bloomberg, CNBC, and the Wall Street Journal. The target is a September 2026 listing at a valuation near $1 trillion, above the $852 billion achieved in the March 2026 private round. The confidential filing precedes a public S-1 by roughly two months; once public, the filing will disclose audited financials, operating margins, revenue quality, and partner concentration for the first time. Cooley and Wachtell have been in place as legal advisors since March. The compressed timeline to a September listing leaves limited room for investors to revise risk assessments after reading the first public disclosures.

Source: Bloomberg Law: OpenAI IPO preparation · news.bloomberglaw.com

AI Industry · Winbuzzer: Anthropic targets first profit

Anthropic targets first operating profit as Q2 revenue projections reach $10.9 billion

Analysis: Anthropic told investors it expects Q2 2026 revenue of $10.9 billion, a roughly 130% sequential jump from the $4.7 billion it recorded in Q1, with a projected first operating profit of $559 million. That figure includes model training costs but excludes stock-based compensation. Last summer, the company had told investors full-year profitability was not expected before 2028. The acceleration traces to expanded compute capacity through a SpaceX partnership covering 300 megawatts, doubled Claude Code rate limits, and increased API capacity for Opus. These remain investor projections, not audited results. CEO Dario Amodei noted the company planned for 10x annual growth but saw 80x, which suggests the unit economics are real, but margin sustainability beyond this quarter is still unverified.

Source: Winbuzzer: Anthropic targets first profit · winbuzzer.com

AI Industry · TechCrunch: Hark raises $700M

Hark raises $700 million Series A to build personal AI hardware at $6 billion valuation

Analysis: Hark, a personal AI hardware startup founded by Brett Adcock (also behind Figure AI and Archer Aviation) in late 2025, raised $700 million in a Series A led by Parkway Venture Capital, reaching a $6 billion valuation. Investors include Nvidia, AMD Ventures, Intel Capital, Qualcomm Ventures, Brookfield, and Salesforce Ventures. The company has 70 employees and an Nvidia B200 GPU cluster. First multimodal models are due this summer, followed by dedicated hardware devices. The pitch is a universal AI interface: a personal layer connecting to existing digital products and services. Adcock seeded the company with $100 million of his own capital. The chip-maker investor roster is notable, suggesting that hardware-specific inference optimization is central to the product rather than a generic cloud deployment.

Source: TechCrunch: Hark raises $700M · techcrunch.com

AI Industry · TechCrunch: SpaceXAI bleeding staff

SpaceXAI exodus: all 11 co-founders gone, 50 researchers out since SpaceX merger

Analysis: All 11 original xAI co-founders have departed SpaceXAI since SpaceX absorbed the company in February. More than 50 researchers and engineers have followed. The Grok pre-training team, which builds the model's foundational capabilities, shrank to a handful of people after its lead researcher, Juntang Zhuang, left. Meta has hired at least 11 former SpaceXAI staff; Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, has taken at least 7 more. Sources cited impossible deadlines and deliberate corner-cutting on model training to meet them. SpaceX simultaneously filed its S-1 on May 20 asking public markets to value the combined company at $1.75 trillion, citing Grok's 117 million monthly active users as a key growth asset.

Source: TechCrunch: SpaceXAI bleeding staff · techcrunch.com

AI Industry · TechCrunch: xAI burn rate from SpaceX S-1

SpaceX S-1: xAI burned $6.4 billion against $3.2 billion in 2025 revenue

Analysis: SpaceX filed its public S-1 on May 20, seeking a June listing at a $1.75 trillion valuation (ticker SPCX). The filing disclosed xAI's 2025 financials for the first time: $6.4 billion in losses on $3.2 billion in revenue, against 2024's $1.56 billion loss on $2.62 billion in revenue. Q1 2026 capex for the AI segment hit $7.7 billion, implying roughly $31 billion annualized. Grok subscriptions and data licensing contributed $453 million in 2025, covering about 7% of the AI segment's annual spending. Grok AI features reached 117 million monthly active users out of 550 million combined platform users. SpaceX also disclosed plans for orbital AI data centers by 2028, suggesting the spending trajectory does not converge with revenue before that year at the earliest.
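The ratios in this item can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above, in billions of US dollars; the variable and function names are hypothetical. One reading is an assumption: the "about 7%" coverage figure is reproduced when the denominator is the $6.4 billion 2025 loss, so that is the denominator used here, with the annualized capex ratio shown for comparison.

```typescript
// Figures quoted in the item above, in billions of US dollars.
const s1Figures = {
  revenue2024: 2.62,
  loss2024: 1.56,
  revenue2025: 3.2,
  loss2025: 6.4,
  aiCapexQ1of2026: 7.7,
  grokSubsAndLicensing2025: 0.453,
};

// Straight-line annualization: assumes every quarter matches Q1.
function annualize(quarterly: number): number {
  return quarterly * 4;
}

// Share of `whole` covered by `part`, as a percentage.
function percentOf(part: number, whole: number): number {
  return (part / whole) * 100;
}

const annualizedCapex = annualize(s1Figures.aiCapexQ1of2026);

// Assumed denominator: the 2025 loss, which reproduces the quoted "about 7%".
const grokShareOfLoss = percentOf(s1Figures.grokSubsAndLicensing2025, s1Figures.loss2025);

// For comparison: the same revenue against annualized 2026 capex.
const grokShareOfCapex = percentOf(s1Figures.grokSubsAndLicensing2025, annualizedCapex);

// Year-over-year multiples for loss and revenue.
const lossMultiple = s1Figures.loss2025 / s1Figures.loss2024;
const revenueMultiple = s1Figures.revenue2025 / s1Figures.revenue2024;

console.log({ annualizedCapex, grokShareOfLoss, grokShareOfCapex, lossMultiple, revenueMultiple });
```

On these numbers, annualized capex is $30.8 billion, Grok subscriptions and licensing cover about 7.1% of the 2025 loss but only about 1.5% of annualized capex, and the loss grew about 4.1 times year over year while revenue grew about 22%.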

Source: TechCrunch: xAI burn rate from SpaceX S-1 · techcrunch.com

AI Industry · Dataconomy: Chinese models hit 61% on OpenRouter

Chinese AI models hold 45% of OpenRouter API traffic at 10 to 20 times lower cost

Analysis: Chinese models account for roughly 45% of API token traffic on OpenRouter, the multi-provider routing layer used heavily by US developers and automation tools, per April 2026 data. The share peaked at 61% in February before stabilizing. MiniMax M2.5 (a Chinese multimodal model) leads by weekly token volume. The driver is price: Chinese open-weight models run 10 to 20 times cheaper per token than leading US closed-source models, making them the default for cost-sensitive, high-volume agentic automation. US firms are running these models in production while export controls limit US chip sales to China. The result is an asymmetry: the application-layer capability gap is closing faster than the hardware gap that export controls were designed to maintain.

Source: Dataconomy: Chinese models hit 61% on OpenRouter · dataconomy.com