AI Field Notes by Michael Nemtsev

Open Model Beats the Frontier | AI Field Notes #51

An open model just beat the paid frontier, and enterprise agent platforms shipped en masse: GLM-5.2 launched under a free license and topped the leading coding benchmarks, while AWS and Databricks both turned agent-building into configuration. Z.ai's GLM-5.2 runs at about a sixth of the cost of the closed leaders, and anyone can self-host it. AWS shipped Continuum, an agent that finds and patches security holes on its own, and put its spec-first Kiro coder on the iPhone. Google retires its Gemini CLI today, pushing developers onto Antigravity, and Noam Shazeer is leaving Google for OpenAI two years after a $2.7 billion deal brought him back.

AI Models · VentureBeat

GLM-5.2: open-weight coding model beats GPT-5.5 at a sixth of the price

Analysis: A 753-billion-parameter model anyone can download now outscores the paid frontier on the coding tests teams rely on, and costs about a sixth as much to run. Z.ai, the Chinese lab behind the GLM series, released GLM-5.2 on June 17 under an MIT license (free to use, modify, and resell), with the weights on Hugging Face and a 1-million-token context window. It scored 62.1 on SWE-bench Pro, a test of real software-engineering tasks, against GPT-5.5's 58.6, and became the first open model to score above 80 on Terminal-Bench. Run the open weights on your own hardware and the code stays in-house; use the hosted API and your prompts travel to servers in China.

AI Agents · SiliconANGLE

AWS Continuum: an AI agent that finds, proves, and patches security holes

Analysis: Hand a backlog of security flaws to an AI system and let it prove which ones are real, then fix them. That is AWS Continuum, launched June 17 in gated preview, which ingests a company's vulnerability backlog, ranks each flaw by whether the affected code is actually reachable and deployed, then builds a working exploit in a sandbox to confirm the danger before writing a patch. It starts in a learn mode with a person checking every call, then shifts to an enforce mode that automates the fix outright. The vulnerability triage that filled a security engineer's week becomes a queue that clears itself.
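The triage order described above can be sketched in a few lines. This is an illustration only, not Continuum's API: the `Finding` fields, the severity score, and the action names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical fields for illustration; not Continuum's actual schema.
    id: str
    severity: float                  # assumed CVSS-style score, 0 to 10
    reachable: bool                  # is the vulnerable code path reachable?
    deployed: bool                   # is the affected code running anywhere?
    exploit_confirmed: bool = False  # did a sandboxed exploit succeed?


def rank(findings):
    """Put reachable-and-deployed findings first, then sort by severity."""
    return sorted(
        findings,
        key=lambda f: (f.reachable and f.deployed, f.severity),
        reverse=True,
    )


def next_action(finding, mode):
    """Pick the next step for a finding in 'learn' or 'enforce' mode."""
    if mode not in ("learn", "enforce"):
        raise ValueError(f"unknown mode: {mode}")
    if not finding.exploit_confirmed:
        return "unconfirmed"  # no working exploit yet, so no patch is written
    return "needs_human_approval" if mode == "learn" else "auto_patch"
```

Note that a critical flaw in code that is not reachable sorts below a moderate flaw that is both reachable and deployed, which is the point of ranking on exposure rather than on severity alone.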

AI Agents · AWS News Blog

AWS makes AgentCore generally available: production agents from a config file

Analysis: Building a production AI agent on AWS no longer means hand-writing the orchestration loop. Amazon made Bedrock AgentCore generally available on June 17 at its New York summit, letting developers define an agent's model, tools, skills, and instructions in configuration and run it without coding the plumbing that ties those pieces together. It is AWS planting a flag in the same ground as the agent frameworks, the LangGraphs and CrewAIs and vendor SDKs that developers have been stitching together by hand. The pitch is fewer moving parts. The price is one more runtime that gets to decide how your agents behave.
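To make "an agent from a config file" concrete, here is a minimal sketch of the general idea. It is not AgentCore's schema or SDK: the config keys simply mirror the four pieces named above, and `lookup_order`, `TOOL_REGISTRY`, and `build_agent` are hypothetical names invented for the example.

```python
import json

# Hypothetical config shape; the field names mirror the four pieces the
# article lists and are NOT AgentCore's real schema.
CONFIG = json.loads("""
{
  "model": "example-model",
  "instructions": "Answer questions about order status.",
  "tools": ["lookup_order"],
  "skills": []
}
""")


def lookup_order(order_id):
    """Stand-in tool; a real one would call an order system."""
    return {"order_id": order_id, "status": "shipped"}


# Maps tool names used in config to the Python callables that implement them.
TOOL_REGISTRY = {"lookup_order": lookup_order}


def build_agent(config, registry):
    """Resolve tool names from config into callables, failing fast on typos."""
    missing = [name for name in config["tools"] if name not in registry]
    if missing:
        raise KeyError(f"unknown tools in config: {missing}")
    return {
        "model": config["model"],
        "instructions": config["instructions"],
        "tools": {name: registry[name] for name in config["tools"]},
        "skills": list(config.get("skills", [])),
    }
```

The trade-off the item describes shows up even at this scale: the config stays short, but everything about how tools are resolved and invoked now lives in the runtime rather than in your own code.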

AI Agents · Google Developers Blog

Google retires Gemini CLI on June 18, pushing developers to Antigravity

Analysis: Starting today, Google stops serving Gemini CLI requests for its Pro, Ultra, and free individual tiers, and points those developers at Antigravity CLI instead. The command-line tool, used to drive Gemini from a terminal, gets folded into Google's agent-first platform, which keeps the pieces people built around it: agent skills, hooks, subagents, and extensions, now called plugins. Code Assist for GitHub stops taking new installs the same day, with existing requests winding down over the following weeks. Enterprise and standard licenses keep working untouched. For anyone who scripted Gemini CLI into a build pipeline or a nightly job, June 18 is a hard migration deadline.

AI Industry · Salesforce

Salesforce and Databricks link up so AI agents inherit who's allowed to see what

Analysis: Letting an AI agent touch company data is the easy part; making sure it only sees what the person behind it is cleared to see is the hard part. Salesforce and Databricks expanded their partnership on June 16 to handle exactly that, with federated authentication, identity mapping, and access controls that carry a user's permissions across both systems. MuleSoft Agent Scanner for Databricks and zero-copy data sharing went generally available, while a Slack Genie app and MCP-driven integrations arrive later in 2026. The plumbing reads as dull, but it is what decides whether an agent is safe to point at real data.
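The permission-inheritance idea reduces to two lookups before any query runs: map the acting user to a principal on the other system, then check that principal's grants. The sketch below is a toy under stated assumptions. The identifiers, the table name, and the `agent_query` helper are hypothetical and do not come from either vendor's API.

```python
# Hypothetical identity map: a user on one system -> a principal on the other.
IDENTITY_MAP = {"sf:alice": "dbx:alice@example.com"}

# Hypothetical grants: which tables each principal may read.
TABLE_GRANTS = {"dbx:alice@example.com": {"sales.accounts"}}


def agent_query(acting_user, table, identity_map=IDENTITY_MAP, grants=TABLE_GRANTS):
    """Build a query only if the human behind the agent may read the table."""
    principal = identity_map.get(acting_user)
    if principal is None:
        raise PermissionError(f"no mapped identity for {acting_user}")
    if table not in grants.get(principal, set()):
        raise PermissionError(f"{principal} may not read {table}")
    # The agent acts with the user's rights, never with broader rights of its own.
    return f"SELECT * FROM {table}"
```

The design choice worth noticing is that the check keys on the person, not the agent: the same agent gets a different answer depending on who is behind the request.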

AI Agents · Databricks Blog

Databricks Agent Bricks now runs a quadrillion tokens a year for agent builders

Analysis: Agents built on Databricks burned through more than a quadrillion tokens in the past year, and on June 16 the company turned its Agent Bricks experiment into a full platform to handle more of them. At its San Francisco summit, Databricks opened the tool to any model and any framework, wired in Kimi alongside OpenAI, Anthropic, Gemini, and Qwen, and struck a deal with SpaceX to host the Grok models inside the warehouse. It also added MCP, a standard way for agents to reach external tools, to its catalog, so those agents can pull from Slack, Jira, GitHub, and Google Drive under a user's existing permissions.

AI Agents · TechTimes

Kiro goes mobile: AWS's spec-first coding agent lands on iOS

Analysis: Before it writes a line of code, Kiro writes the spec. AWS used its June 17 summit to put that coding agent on the iPhone, in a gated-preview iOS app, and to add a Kiro Pro Max tier with higher limits and access to the newest frontier models. Kiro, the successor to Amazon Q Developer, forces a requirements-then-design-then-tasks sequence before the agent builds anything, with requirements written in EARS (Easy Approach to Requirements Syntax) notation, a rigid sentence format. The sequence is meant to stop the agent charging off in the wrong direction. More than 2,700 Southwest Airlines developers are already using it to modernize Southwest.com.
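For readers who have not met EARS, the format boils down to a handful of sentence templates built around "the <system> shall <response>". The sketch below classifies a requirement against the five commonly cited templates. It illustrates the notation only and is not how Kiro validates specs; the regular expressions are a rough approximation, and the sample sentences are invented.

```python
import re

# The five commonly cited EARS templates, most specific first.
EARS_PATTERNS = [
    ("unwanted behaviour", re.compile(r"^If .+, then the .+ shall .+", re.I)),
    ("event-driven", re.compile(r"^When .+, the .+ shall .+", re.I)),
    ("state-driven", re.compile(r"^While .+, the .+ shall .+", re.I)),
    ("optional feature", re.compile(r"^Where .+, the .+ shall .+", re.I)),
    ("ubiquitous", re.compile(r"^The .+ shall .+", re.I)),
]


def classify(requirement):
    """Return the name of the EARS template a sentence fits, or None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS:
        if pattern.match(text):
            return name
    return None


# Invented sample requirements, one that fits EARS and one that does not.
print(classify("When the user submits the form, the system shall validate the email."))
print(classify("Make login faster"))
```

A vague note like "Make login faster" matches no template, which is the rigidity doing its job: the requirement has to name a system and a checkable response before anything gets built.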

AI Industry · TechCrunch

Anthropic joins $915M carbon-removal coalition as AI's power bill climbs

Analysis: Anthropic became the first AI startup to join Frontier, the carbon-removal buying club started by Stripe and Google, as the coalition added $915 million in commitments on June 17 and pushed its total past $1.8 billion. The money pre-buys credits for ways of pulling carbon dioxide out of the air, from ocean chemistry to direct air capture, which lowers the risk for those projects and helps them scale. The timing is not subtle. AI's data centers are driving electricity demand up sharply, and a public climate commitment is a useful thing to point at while your compute footprint keeps growing.

AI Industry · The Star (Reuters)

Noam Shazeer leaves Google for OpenAI, two years after a $2.7B return

Analysis: Google paid roughly $2.7 billion two years ago to bring Noam Shazeer back, and on June 17 he announced he is leaving for OpenAI anyway. Shazeer, a co-lead of Google's Gemini models and a co-inventor of the transformer, the architecture under every modern large language model, joins OpenAI as its lead for architecture research as the company moves toward a public offering. Losing the person who co-led your flagship model to your closest rival, after paying a fortune to acquire him, suggests that the talent market at the very top is still a seller's market, and that money does not buy loyalty at this level.
