AI Field Notes by Michael Nemtsev

AI Agent Cost Race | AI Field Notes #58

Traders swap clockwork figures wearing price tags at a stall while a scale tips toward coins over a sketched brain, showing AI agents now competing on cost over capability.

Anthropic's Claude Sonnet 5 launched at $2 per million input tokens, about half the price of Opus, making cost the deciding factor in how developers run AI agents. The same week X shipped a hosted MCP server that lets Claude, Cursor, and Grok read the platform through your own account, Google's Nano Banana 2 Lite started making images at three cents a thousand, and Base44 trained its own model to stop renting frontier tokens. Cursor moved coding-agent control onto the phone. Away from the model race, Etched hit a $5 billion valuation selling inference-only chips, South Korea pledged over $550 billion against a memory shortage, and Ford quietly rehired 350 veteran engineers after its AI quality tools fell short.

AI Agents · TechCrunch

X launches a hosted MCP server so Claude and Cursor can read the platform

Analysis: Any AI tool can now search X, read posts, and analyze trends through the user's own account, with no custom plumbing required. X rolled out a hosted MCP server on June 30, MCP being the Model Context Protocol, a standard way for AI assistants to plug into outside services. Claude, Cursor, and Grok Build connect directly, authenticating with a person's own credentials. The limit sits in what it withholds: read-only access, no write endpoints, so an agent can mine the full stream but cannot post on its own. Posting still costs $0.015 a tweet, and $0.20 with a link.

AI Models · TechCrunch

Claude Sonnet 5: Anthropic cuts agent pricing to $2 per million input tokens

Analysis: The cheapest way to run a Claude agent just dropped to $2 per million input tokens and $10 per million output tokens through August 31, rising to $3 and $15 after that. Anthropic shipped Sonnet 5 on June 30, a midsize model built to plan, drive a browser or terminal, and run without a human watching. It scores 63.2% on agentic coding (finishing multi-step software tasks on its own) against Opus 4.8's 69.2%, edges Opus on knowledge work, and hallucinates less than Sonnet 4.6. Anthropic's own framing says the quiet part out loud: the agent race now turns on who runs agents cheapest and most reliably without oversight.
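To see what the two price tiers mean for a bill, here is a minimal Python sketch using only the four per-million rates quoted above. The function name `agent_run_cost` and the monthly token volumes are hypothetical, chosen for illustration and not taken from Anthropic.

```python
# Prices in USD per million tokens, as quoted in the item above.
PROMO = {"input": 2.00, "output": 10.00}     # through August 31
STANDARD = {"input": 3.00, "output": 15.00}  # after August 31


def agent_run_cost(input_tokens: int, output_tokens: int, promo: bool) -> float:
    """Return the USD cost of a workload at the promo or standard rates."""
    rates = PROMO if promo else STANDARD
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 50M input tokens, 5M output tokens.
promo_cost = agent_run_cost(50_000_000, 5_000_000, promo=True)
standard_cost = agent_run_cost(50_000_000, 5_000_000, promo=False)

print(f"promo: ${promo_cost:.2f}, standard: ${standard_cost:.2f}")
```

Because both rates rise by the same proportion, any mix of input and output tokens costs 1.5 times as much once the promotional window closes.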

AI Models · TechCrunch

Google's Nano Banana 2 Lite makes images at $0.034 per thousand in 4 seconds

Analysis: Four seconds and roughly three cents for a thousand images: that is Google's new floor for AI picture generation. Nano Banana 2 Lite landed June 30, a stripped-down version of Nano Banana 2 tuned for volume over polish, available through the Gemini API and AI Studio. It gives up the fuller model's quality for speed and price, aimed at anyone generating images by the batch rather than one hero shot at a time. Paired with Google's video tools, it lets a developer wire image and clip generation into a single pipeline. The number that matters is the per-image cost heading toward a rounding error.

AI Models · TechCrunch

Base44 trains its own model, Base1, to stop paying frontier labs per token

Analysis: A vibe-coding platform that hit $150 million in annual revenue by May decided it no longer wants to rent its intelligence. Base44, owned by Wix since an $80 million acquisition last June, began rolling out Base1 on June 29, a model trained on tens of millions of real user sessions on the platform. Vibe-coding means describing an app in plain English and letting the AI write the code. Founder Maor Shlomo wants tighter control of latency and cost, and eventually output that beats Claude Opus on app-building while charging users less. The bet is that a narrow model, fed one exact workflow, can undercut a general one.

AI Agents · TechCrunch

Claude Science wires 60 databases into one research workbench for scientists

Analysis: Anthropic's pitch to scientists skips the usual arms race over a smarter model. Claude Science, launched June 30, is a workbench that connects to more than 60 scientific databases, with prebuilt toolkits for genomics, protein structure, and chemistry. A lead assistant farms tasks out to sub-assistants, and a separate fact-checker verifies citations before anything ships. It runs the same Claude models everyone already has, including Opus 4.8, with no special access. Early users include Allen Institute and UCSF researchers. The wager is that a working biologist's missing piece is reliable plumbing and citation checks, the parts a general chatbot skips.

AI Industry · TechCrunch

Amazon puts $1B into forward-deployed engineers to embed agents in customers

Analysis: Getting AI to actually work inside a company has become the bottleneck, and the big labs are throwing engineers at it. AWS committed $1 billion on June 30 to a forward-deployed engineering group, staff who embed inside a customer's business to build and wire up custom agents, then leave the client able to run them alone. It follows OpenAI's $4 billion joint venture and Anthropic's $1.5 billion one, both backed by private equity. Amazon is funding its version straight from AWS instead. The common admission underneath: the model is the easy part now, and deployment is where the deals stall.

AI Agents · TechCrunch

Cursor's new phone app lets you steer coding agents from a train platform

Analysis: Coding is drifting off the desktop and onto the phone. Cursor released a mobile app on June 29 that lets a developer kick off a coding agent or check in on one started at their desk, holding a continuous conversation with a remote agent that works on the codebase in the cloud. Anthropic engineer Boris Cherny said most of his coding now happens on his phone, a sentence that would have read as a joke a year ago. Across the tools shipping this week, the daily work is shifting from writing lines to supervising the thing that writes them.

AI Industry · TechCrunch

South Korea commits over $550B to memory fabs as an AI-driven RAM shortage bites

Analysis: The AI boom has drained the world's memory chips, and Korea just pledged over half a trillion dollars to refill the well. Samsung and SK Hynix anchored commitments topping $550 billion on June 29: about $518 billion for four new memory fabrication plants and $52 billion for a high-bandwidth memory packaging hub, HBM being the stacked memory that feeds data to AI accelerators fast enough to keep them busy. Locals call the squeeze RAMageddon, a worldwide shortage of DRAM (the standard working memory in servers and PCs) set off by the data-center buildout. Existing plants in Yongin and Pyeongtaek have hit their limits. Relief arrives in years, not months.

AI Industry · TechCrunch

Etched hits $5B valuation and $1B in orders selling inference-only AI chips

Analysis: A chip that does one thing has booked $1 billion in orders. Etched, run by two Harvard dropouts, reached a $5 billion valuation on June 30 after TSMC manufactured its transformer-specialized processor earlier this year, the transformer being the design behind most modern AI models. Instead of selling raw chips, it ships full inference clusters, racks tuned only for running models and not for training them, and claims better speed, cost, and power draw than Nvidia. Backers include Jane Street, Peter Thiel, and Andrej Karpathy. The pitch is that once a model's shape is fixed, a chip built solely for that shape wins.

AI Industry · TechCrunch

California cuts a deal with Anthropic to run Claude across agencies at half price

Analysis: The largest state government in the country is standardizing on Claude at a discount. Governor Gavin Newsom and Anthropic announced a deal on June 29 giving California's state and local agencies access to Claude at roughly half price, plus training and support, for drafting documents and analyzing information. Newsom framed it as help for state workers rather than a headcount cut, saying AI should let them move faster instead of taking their jobs. The subtext is a land grab: whichever lab wins the government account sets the default tool for hundreds of thousands of public employees and the vendors who serve them.

AI Industry · TechCrunch

Ford rehires 350 veteran engineers after AI quality inspection fell short

Analysis: The automated inspection line was supposed to retire the old hands. Ford has rehired 350 veteran engineers, some former staff and some pulled from suppliers, after AI-based quality and design tools failed to hit the mark on the factory floor. They now catch failure points before parts reach assembly and retrain the AI tools that missed them. COO Kumar Galhotra admitted the company leaned too hard on automated quality systems and paid for it. The rehires helped Ford top J.D. Power's initial quality survey among mainstream brands and cut warranty costs by hundreds of millions.

AI Agents · TechCrunch

OKX opens a marketplace where AI agents hire and pay each other in stablecoins

Analysis: Software that pays other software is no longer a thought experiment. Crypto exchange OKX opened OKX AI to developers on June 30, a marketplace where AI agents hold wallets, hire one another for jobs, settle in stablecoins (crypto tokens pegged to a currency like the dollar), and build on-chain reputations. It plugs into coding tools including Claude Code and Codex, with CertiK checking wallet security and GenLayer handling disputes. OKX projects agentic commerce as a trillion-dollar market within five years, complete with one-person companies grossing over a million dollars on autonomous labor. The vision is bold. The guardrails around an agent with a spendable wallet are the part still being written.

LLM Evals · TechCrunch

Arena, the AI leaderboard labs live and die by, becomes a $100M business

Analysis: The scoreboard that decides which model looks best is now a real company. Arena, the crowdsourced leaderboard where users pit two anonymous models against each other and vote for the better answer, hit a $100 million annualized revenue run-rate on June 29, up from $30 million in January. More than 10 million human votes feed its rankings across text, coding, and images. Labs pay for its deeper evaluation service to tune models after training. That creates an obvious tension: the referee everyone trusts now sells analytics to the teams it scores, chasing the same money as Scale AI and Surge.
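For readers wondering how head-to-head votes can become a ranking, the sketch below applies a textbook Elo update to a list of (winner, loser) pairs. This is an assumption for illustration only: the item above does not say how Arena computes its scores. The function name `elo_ratings`, the model names, the starting rating, and the K-factor are all hypothetical.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Turn ordered (winner, loser) votes into Elo-style ratings.

    Illustrative only: not Arena's confirmed method.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability the winner was expected to win, given current ratings.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # An upset (low expected) moves ratings more than a predictable win.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical votes between three anonymous models.
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
ratings = elo_ratings(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Each vote moves the same number of points from loser to winner, so the total stays fixed and only relative standing changes. That is why a large volume of votes matters more than any single matchup.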
