AI News | Field Notes by Michael Nemtsev

AI Coding Agent Push | AI Field Notes #13


AI coding agents took a real step out of the chat window this week. OpenAI's Codex now drives the desktop in the background, opening pull requests and Jira tickets while you work elsewhere, and Anthropic moved Claude Security to public beta so teams can scan codebases for vulnerabilities. Google's Gemini CLI added persistent memory and local Gemma routing in v0.40.0, and Alibaba open-sourced Qwen-Scope for steering open-weight models. Meanwhile the Pentagon picked eight AI vendors and pointedly skipped Anthropic, Meta confirmed 8,000 layoffs to fund $115 billion in compute, and Q1 GDP came in at 2% growth alongside 80,000 tech layoffs. If you ship code, set your agent permissions before someone else's agent does it for you.

Pentagon AI deal: eight vendors signed, Anthropic left out

Analysis: The Pentagon (US Department of Defense) said Friday it signed agreements with eight AI companies (OpenAI, Google, Microsoft, Amazon Web Services, Nvidia, SpaceX, Reflection, and Oracle) to deploy their models inside the military's most secure classified networks (Impact Level 6 and 7). One company was not on the list: Anthropic, the AI lab behind Claude, which the administration designated a supply chain risk in March after Anthropic refused unrestricted use of Claude for autonomous weapons and domestic surveillance. The Pentagon's CTO Emil Michael said relying on one vendor was irresponsible. Anthropic is suing in San Francisco and Washington to overturn the order. In the meantime, its rivals collect the contracts, and the Pentagon's GenAI.mil platform now serves 1.3 million personnel.

AI Agents, AI Models · OpenAI Codex changelog

OpenAI Codex update: agents now drive your desktop while you work

Analysis: OpenAI rolled out a major Codex update this week, turning its coding agent into something closer to a desktop coworker. Codex now operates the user's computer in the background, clicking and typing with its own cursor across other apps while the human keeps working in parallel. The April 30 release notes added persisted /goal workflows for long-running tasks, configurable permission profiles, first-class Amazon Bedrock authentication for AWS shops, and 90 new plugins covering Jira, GitLab, CircleCI, and Microsoft 365. OpenAI says more than 3 million developers use Codex weekly, and points to internal Finance and Comms teams running multi-day automations on top of it. Quietly, this is the moment 'agent' stopped meaning 'chatbot with tools' and started meaning 'another login on your machine.'

AI Agents, LLM Evals · Business Standard

Claude Security goes public beta as AI vulnerability hunting accelerates

Analysis: Anthropic moved Claude Security from research preview to public beta for Enterprise customers on May 1. The product is powered by Claude Opus 4.7 and lets security teams scan codebases for vulnerabilities and generate fixes, with a reported false positive rate near 5% versus 30 to 60% for traditional static analysis (tools that look at code without running it). The release is a hedge against Anthropic's other model, Mythos, which the company says can autonomously discover and exploit bugs at the level of elite human researchers. CrowdStrike, Microsoft Security, Palo Alto Networks, SentinelOne, and Wiz are embedding Opus 4.7 into their tools. RunSybil's CEO told Black Hat Asia this week that the gap from bug discovery to working exploit has collapsed from five months in 2023 to ten hours in 2026. The arsonist now sells fire extinguishers.

AI Industry · Bloomberg

Common Crawl revolt: news publishers move to cut off AI training data

Analysis: The News/Media Alliance, which represents NBCUniversal, CNN, USA Today, McClatchy, Vox Media, and hundreds of regional outlets, sent a formal letter on April 29 demanding that Common Crawl, the nonprofit web archive whose scrapes have trained nearly every major language model, remove their content and explicitly bar AI training use. Bloomberg first reported the action on April 30. Given how many publishers are requesting opt-out, this is one of the largest coordinated moves yet against the dataset that underpins models like GPT, Claude, and Llama. About 79% of the world's top news websites already block at least one AI training crawler, but Common Crawl is the back door, since it scrapes once and makes the archive available to anyone. If the letter sticks, the cheapest source of high-quality training text for the next generation of models just got more expensive.
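For readers wondering what "blocking a crawler" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes a site that disallows `CCBot` (Common Crawl's crawler user agent) in robots.txt while allowing everyone else. The robots.txt contents and the example.com URL are illustrative, not taken from any publisher. Note that robots.txt is a voluntary signal going forward, which is a different mechanism from the removal of already-archived content that the letter demands.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block Common Crawl's crawler, allow all others.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical article URL, used only for the check.
ARTICLE = "https://example.com/news/story"

ccbot_allowed = is_allowed(ROBOTS_TXT, "CCBot", ARTICLE)
other_allowed = is_allowed(ROBOTS_TXT, "Mozilla/5.0", ARTICLE)

print("CCBot allowed:", ccbot_allowed)
print("Other agents allowed:", other_allowed)
```

The check only tells you what a well-behaved crawler should do. Nothing in the protocol enforces it.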

AI Industry · Bloomberg

KKR launches $10B AI infra firm Helix under ex-AWS chief Selipsky

Analysis: Private equity giant KKR confirmed on April 30 that it secured more than $10 billion to launch Helix Digital Infrastructure, a new company that will build and operate AI data centers, power generation, transmission, and connectivity for hyperscalers (giant cloud providers like AWS, Google Cloud, and Azure). Adam Selipsky, who ran AWS during its growth past $100 billion in revenue, will be CEO and chair. Helix is positioned as a full-stack landlord, owning the land, the power plant, the cooling, and the fiber. That scope is unusual for a data center play. KKR has already committed $42 billion across digital infrastructure and another $20 billion across power and renewables. The hyperscalers are now building so much capacity that even they are renting it from a private equity-owned utility company.

AI Industry, AI Models · Tom's Hardware

Huawei AI chip revenue jumps 60% to $12B as Nvidia stalls in China

Analysis: Huawei's AI chip revenue is on track to hit roughly $12 billion in 2026, up 60% from $7.5 billion in 2025, the Financial Times reported this week, as the company's Ascend 950PR processor enters mass production and Nvidia's H200 sales to China stay stuck in regulatory limbo. Nvidia's China data center revenue once accounted for 25% of its data center business. The 950PR is positioned for inference (running a trained model to serve users, as opposed to training it), which is becoming the bigger AI workload as agents proliferate. Western analysts still rate Nvidia's best chips roughly five times more powerful than Huawei's best, and that gap is widening on paper. For Chinese labs running DeepSeek V4 and Qwen, 'good enough and available' beats 'world class and embargoed.'

AI Industry · Future Forwarded

Q1 GDP shows 2% growth and 80,000 tech layoffs in the same quarter

Analysis: The Bureau of Economic Analysis released its Q1 2026 advance estimate on April 30, showing the US economy grew at a 2.0% annualized rate, a rebound from Q4 2025's 0.5%. A material driver: data center construction. The same quarter, tech industry layoffs hit roughly 80,000, with Layoffs.fyi tracking more than 92,000 cuts year to date and Nikkei Asia attributing about 48% of those directly to AI. Economist Claudia Sahm, who designed the Sahm rule recession indicator, has been pointing at the gap: GDP looks fine, corporate profits look fine, and the same companies posting record earnings are cutting white-collar staff at a pace not seen since the pandemic. The growth is going into data center construction. The wages line stays flat.

AI Agents, AI Models · Gemini CLI GitHub releases

Gemini CLI v0.40.0 ships tiered memory and auto-generated skills

Analysis: Google's Gemini CLI hit version 0.40.0 on April 30 with three changes that matter to developers building agents on top of it. A tiered memory system gives the agent persistent context across sessions instead of restarting fresh every time. Auto-generated 'skills' let the CLI study past sessions and turn them into reusable functions, a kind of self-curated playbook. Experimental local routing to Gemma, Google's open-weight model family, lets some calls run on the developer's machine instead of Google's servers, cutting cost and latency for cheap queries. None of these are headline features for an end user. For anyone building a serious coding agent on Gemini, all three change how the thing works under the hood, and what it costs to run.
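To make the local-routing idea concrete, here is a rough sketch of what a routing decision could look like. This is not Gemini CLI's actual logic, which the release summary above does not describe. The `Route` type, the `choose_route` function, the word-count threshold, and the `needs_tools` flag are all hypothetical names and heuristics chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Route:
    target: str  # "local" (on-device open-weight model) or "remote" (hosted model)
    reason: str


def choose_route(prompt: str, needs_tools: bool = False,
                 max_local_words: int = 40) -> Route:
    """Hypothetical heuristic: keep short, tool-free prompts on the local model."""
    if needs_tools:
        return Route("remote", "tool use requested")
    if len(prompt.split()) > max_local_words:
        return Route("remote", "prompt too long for the local model")
    return Route("local", "short, tool-free query")


short_query = choose_route("Rename this variable to snake_case")
tool_query = choose_route("Open a pull request for this branch", needs_tools=True)
long_query = choose_route("explain " * 100)

for route in (short_query, tool_query, long_query):
    print(route.target, "-", route.reason)
```

A real router would likely weigh more signals than prompt length, but the trade-off is the same: anything answered locally costs no API call and no network round trip.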

AI Models, LLM Evals · Qwen GitHub

Qwen-Scope: Alibaba open-sources interpretability tools for steering models

Analysis: Alibaba's Qwen team released Qwen-Scope on April 30, an open suite of sparse autoencoders for the Qwen open-weight model family. Sparse autoencoders are a relatively new interpretability technique, a way to take the otherwise opaque internal activations of a large language model and turn them into discrete, human-readable features that can be observed or directly manipulated. The release lets developers steer Qwen's outputs by editing internal features instead of crafting longer prompts, classify training data automatically, and trace failure modes like multilingual confusion. Anthropic built similar tools for Claude but kept them closed. Qwen-Scope shipping under an open license puts a research tool that until recently lived only inside frontier labs onto every laptop with a GitHub account.
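For a feel of what "editing internal features" means, here is a toy sketch of a sparse autoencoder in NumPy. It does not use Qwen-Scope's code or API. The sizes are tiny, the weights are random stand-ins for a trained autoencoder, and the function names (`encode`, `decode`, `steer`) are hypothetical. The point is only the mechanics: encode an activation into feature strengths, bump one feature, and decode back.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, N_FEATURES = 8, 32  # toy sizes; real autoencoders are far larger

# Random weights stand in for a trained sparse autoencoder (assumption).
W_enc = rng.normal(size=(D_MODEL, N_FEATURES))
b_enc = np.zeros(N_FEATURES)
W_dec = rng.normal(size=(N_FEATURES, D_MODEL))
b_dec = np.zeros(D_MODEL)


def encode(activation: np.ndarray) -> np.ndarray:
    """Map a model activation to non-negative feature strengths (ReLU)."""
    return np.maximum(0.0, (activation - b_dec) @ W_enc + b_enc)


def decode(features: np.ndarray) -> np.ndarray:
    """Rebuild an activation as a weighted sum of decoder directions."""
    return features @ W_dec + b_dec


def steer(activation: np.ndarray, feature_idx: int, delta: float) -> np.ndarray:
    """Raise one feature by delta, then decode the edited feature vector."""
    features = encode(activation)
    features[feature_idx] += delta
    return decode(features)


activation = rng.normal(size=D_MODEL)      # stand-in for one token's activation
reconstruction = decode(encode(activation))
steered = steer(activation, feature_idx=3, delta=2.0)

# The edit moves the activation along feature 3's decoder direction.
print(np.allclose(steered - reconstruction, 2.0 * W_dec[3]))
```

In a trained autoencoder each decoder direction would correspond to a learned, ideally interpretable feature, and the steered activation would be written back into the model's forward pass. Here the directions are random, so only the arithmetic is meaningful.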
