AI Field Notes by Michael Nemtsev

AI Agent Operations | AI Field Notes #56

[Image: Mechanical clerks work at desks on thin leashes held by an overseer with a ledger, while a hand pries a padlock off a stack of computer chips nearby.]

AI agent operations became a real software category this week: tools to govern and test agents drew fresh funding, while agents themselves shipped into live workflows. Runlayer raised $30M to police what MCP-connected agents touch inside companies, and Coval raised $28M to simulate and score voice agents before they go live. Anthropic put Claude into Slack as an always-on teammate, while Taktile and Assort Health raised $110M and $120M to run banking and healthcare decisions with agents. On the model side, Qualcomm agreed to buy Chris Lattner's Modular for $3.9B to loosen Nvidia's CUDA lock-in, and OpenAI turned GPT-5.5-Cyber loose on open-source security bugs. It was a quieter news week, so some items here are drawn from a 72-hour window.

AI Industry · CNBC

Qualcomm AI software: $3.9B Modular deal aims at Nvidia's CUDA lock-in

Analysis: Breaking Nvidia's software grip is now a $3.9 billion bet. On June 24, Qualcomm agreed to buy Modular, the startup founded by Chris Lattner, the engineer behind Apple's Swift language and the LLVM compiler that sits under much of modern software. Modular's MAX engine and Mojo language let developers run AI models across chips from any vendor without rewriting code for each one, which chips away at CUDA, Nvidia's proprietary software that keeps inference work tied to its GPUs. Pair it with Qualcomm's reported talks to buy Jim Keller's chip startup Tenstorrent, and the company is assembling a roughly $14 billion stack aimed at Nvidia's weak spot. Qualcomm barely competes in data-center inference today.

AI Agents · Anthropic

Anthropic Claude Tag: an always-on AI teammate that lives in your Slack

Analysis: Claude now sits inside Slack as a permanent coworker. Anthropic launched Claude Tag on June 23, a version of its assistant that takes a request, breaks it into steps, works through them on its own, and drops the result back in the channel. It learns a company's context across channels over time, and an ambient mode lets it chase forgotten threads without being asked. Inside Anthropic, Claude Tag already approves and merges 65% of the code changes the product team submits. The pitch is a colleague with no calendar and no memory of ever going home.

LLM Evals · Tech Startups

Coval raises $28M to stress-test voice and chat agents before they go live

Analysis: Before a voice agent picks up a customer call, something should confirm it does not collapse on the hundredth strange request. Coval raised $28 million on June 24, led by the venture firm Norwest, to run that confirmation at scale. Its platform simulates thousands of conversations, then scores and reviews how voice and chat agents handle them, the way self-driving teams rack up millions of simulated miles before a car touches a road. Zoom, Deepgram, and more than 60 other companies use it. As agents move from demo to call center, the testing layer is quietly becoming the thing that decides whether anyone trusts them.
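
For readers who want a feel for the simulate-then-score loop, here is a toy Python sketch. It is not Coval's product or API: the agent, the scenarios, and the `simulate` helper are all hypothetical stand-ins, and a real platform would drive full multi-turn conversations rather than single requests.

```python
import random

def toy_agent(request: str) -> str:
    """Hypothetical agent under test: handles known intents, otherwise escalates."""
    known = {"book appointment": "booked", "cancel appointment": "cancelled"}
    return known.get(request, "escalate to human")

def simulate(agent, scenarios, runs, seed=0):
    """Replay randomly sampled scenarios and return the fraction that passed."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    passed = 0
    for _ in range(runs):
        request, expected = rng.choice(scenarios)
        if agent(request) == expected:
            passed += 1
    return passed / runs

# Hypothetical (request, expected outcome) pairs, including one odd request.
SCENARIOS = [
    ("book appointment", "booked"),
    ("cancel appointment", "cancelled"),
    ("what is the meaning of life", "escalate to human"),
]

pass_rate = simulate(toy_agent, SCENARIOS, runs=1000)
print(f"pass rate: {pass_rate:.1%}")
```

The number that matters is the pass rate on the strange requests, since those are the calls that break an agent in production.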

AI Agents · FinSMEs

Runlayer raises $30M to police what corporate AI agents actually touch

Analysis: As companies hand AI agents access to their internal systems, someone has to watch what those agents reach for. Runlayer raised a $30 million Series A on June 24, led by the venture firm Felicis, to be that watcher. It wraps a control layer around MCP (Model Context Protocol, the open standard for plugging agents into company data and tools), adding approval gates and a full audit log of every action an agent takes. Instacart and dbt Labs are among the early customers. The wager is that agent governance becomes its own software market, the way identity management once did.
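
The idea of an approval gate plus an audit log can be sketched in a few lines of Python. This is an illustration only, not Runlayer's implementation and not the MCP SDK: `GovernedTools`, the tool names, and the approver callback are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable

@dataclass
class GovernedTools:
    """Hypothetical wrapper: every tool call is gated and logged."""
    tools: dict[str, Callable[..., Any]]
    needs_approval: set[str]
    approver: Callable[[str, dict], bool]
    audit_log: list[dict] = field(default_factory=list)

    def call(self, agent_id: str, tool: str, **args: Any) -> Any:
        # Sensitive tools need a yes from the approver; others pass through.
        allowed = tool not in self.needs_approval or self.approver(tool, args)
        # Log the attempt whether or not it was allowed.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "args": args,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{tool} denied for {agent_id}")
        return self.tools[tool](**args)

gate = GovernedTools(
    tools={
        "read_doc": lambda doc_id: f"contents of {doc_id}",
        "delete_doc": lambda doc_id: f"deleted {doc_id}",
    },
    needs_approval={"delete_doc"},
    approver=lambda tool, args: False,  # stand-in for a reviewer who says no
)

result = gate.call("agent-1", "read_doc", doc_id="q3-plan")
try:
    gate.call("agent-1", "delete_doc", doc_id="q3-plan")
except PermissionError as err:
    print(err)
print(len(gate.audit_log), "actions logged")
```

Note that the denied call is still written to the log, so the record shows what an agent tried to do as well as what it did.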

LLM Evals · OpenAI

OpenAI Patch the Planet: GPT-5.5-Cyber set loose on open-source bugs

Analysis: An AI model wrote 24 working privilege-escalation exploits against the Linux kernel, then helped close the holes it found. That is the demonstration behind Patch the Planet, an open-source security push OpenAI launched on June 22 with the security firms Trail of Bits and HackerOne. It runs on GPT-5.5-Cyber, which scores 85.6% on CyberGym, a benchmark that checks whether a model can find and fix real software flaws, against 81.8% for standard GPT-5.5. More than 30 projects signed on, including Python and cURL. Each fix still passes a human reviewer before it lands.

AI Models · Tech Times

ByteDance Seedance 2.5: native 30-second AI video without stitching

Analysis: Thirty seconds of AI video in one shot, with no clip-stitching, is the leap ByteDance is claiming with Seedance 2.5, shown on June 23 at its Volcano Engine conference. Earlier models generated a few seconds at a time and broke when asked for more, so editors faked length by joining clips and hiding the seams. Seedance 2.5 runs in a closed enterprise beta now, with a public launch targeted for early July. It lands while OpenAI's Sora and Google's Veo fight over the same ground. For anyone who edits video, the floor on good-enough footage keeps rising.

AI Industry · PR Newswire

xCures raises $46M to turn messy medical records into decision-ready data

Analysis: A cancer patient's history is scattered across labs, hospitals, and imaging centers, often as scanned documents nobody can read quickly. xCures raised $46 million in a Series B, announced June 25, to fix that with AI that structures those records into what it calls decision-ready data. Hospitals use it to generate a patient history on demand and to estimate how long a surgery will take. The Oakland company counts 25 enterprise clients, including the diagnostics firms Exact Sciences and Caris, and grew from $3 million to $10 million in recurring revenue last year. The bottleneck in medicine is rarely the doctor; it is the paperwork.

AI Industry · Tech Startups

Taktile raises $110M to automate the calls banks once paid analysts to make

Analysis: Decisions that used to need an analyst, whether to approve a loan or flag a transaction as fraud, are the product Taktile just raised $110 million to run. The Series C, led by Goldman Sachs Alternatives and closed on June 24, backs a platform that lets banks and insurers combine AI agents with hard rules and human sign-off. It already powers millions of decisions a day, and customers report 75% fewer false fraud alerts and tens of millions saved in claims handling. The rules and the oversight stay in place. The human who weighed each case by hand is the line item under pressure.
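
The combination of hard rules, a model score, and human sign-off can be pictured as a small decision function. The Python sketch below is a guess at the general pattern, not Taktile's platform: the field names, the thresholds, and the `decide` function are all hypothetical.

```python
def decide(application: dict, model_score: float) -> str:
    """Hypothetical loan decision: rules first, then the model, else a human."""
    # Hard rules run first and cannot be overridden by the model.
    if application["age"] < 18:
        return "decline"
    # Large amounts always go to a person, whatever the model thinks.
    if application["amount"] > 50_000:
        return "human_review"
    # The model only decides when it is confident in either direction.
    if model_score >= 0.8:
        return "approve"
    if model_score <= 0.3:
        return "decline"
    # Everything in the uncertain middle band needs human sign-off.
    return "human_review"

print(decide({"age": 34, "amount": 12_000}, model_score=0.91))
print(decide({"age": 34, "amount": 80_000}, model_score=0.91))
print(decide({"age": 34, "amount": 12_000}, model_score=0.55))
```

In a setup like this, the share of cases landing in the middle band determines how much analyst time is still needed.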

AI Agents · Tech Startups

Assort Health raises $120M to run the front desk of American healthcare

Analysis: The phone tree and the scheduling desk at your clinic are the next jobs going to software. Assort Health raised $120 million in a Series C on June 24 at a $1.2 billion valuation, led by Menlo Ventures, to put AI agents across the patient journey: booking, intake, referrals, refills, and billing. It says it has handled 190 million patient interactions and grew revenue twentyfold in 15 months. Health systems are short-staffed and drowning in administrative work, which makes the front office an easy target. The quiet question is what happens to the people who used to answer those calls.

AI Agents · PR Newswire

Attention raises $30M to put AI agents in charge of sales busywork

Analysis: The busywork around every sales deal (the follow-up email, the CRM update, the next-step nudge) is what Attention's AI now wants to handle on its own. The New York company raised a $30 million Series B, announced June 23, and says it runs more than 20 million agent actions a month, with revenue up fourfold in a year. Sales-operations staff have long done this glue work by hand. Attention drafts and sends the message, then updates the record and queues the next move. More than 500 companies use it, among them Scale and BambooHR. The job being eaten is the coordination, while the selling stays human.
