Claude Code Limits Doubled | AI Field Notes #19

Pen-scratch cover: Create a hand-drawn pen scratch editorial illustration that shows the heavy doors of a great vault thrown open mid-room, a flood of small ge

Anthropic doubled Claude Code's five-hour rate limits and removed peak-hour throttling for paid subscribers, after a SpaceX compute deal added 300 megawatts (over 220,000 GPUs) at the Colossus 1 data center in Memphis. OpenAI shipped three new realtime voice models, including a live translator that handles speech in 70-plus languages. Google made Gemini 3.1 Flash-Lite generally available at $0.25 per million input tokens. Microsoft disclosed two remote-code-execution flaws in Semantic Kernel that turn prompt injection into shell access on the host.

AI Agents LLM Evals AI Models AI Industry

Latest issue · About

AI Agents ·Anthropic

Claude Code rate limits doubled: Anthropic adds 220,000 GPUs via SpaceX

AnalysisAnthropic doubled the five-hour rate limit on Claude Code for every paid plan (Pro, Max, Team, seat-based Enterprise) on May 6, and removed the peak-hours throttling that kicked in on Pro and Max during US working hours. The supply behind it: a deal with SpaceX to lease the entire Colossus 1 data center in Memphis, Tennessee, worth more than 300 megawatts of new capacity (over 220,000 Nvidia GPUs) coming online inside the month. Anthropic API tier-1 input limits on Opus jumped from 30,000 tokens per minute to 500,000. The same announcement floated multiple gigawatts of orbital data centers as a future direction, which is the line you read twice. The thing developers will actually feel is fewer 'you have hit your limit' walls during the workday.

Read the source for Claude Code rate limits doubled: Anthropic adds 220,000 GPUs via Spac… · Anthropic · anthropic.com

AI Models ·Google Cloud

Gemini 3.1 Flash-Lite GA: $0.25 per million tokens, 2.5x faster than 2.5

AnalysisGoogle moved Gemini 3.1 Flash-Lite into general availability on May 7, with input pricing at $0.25 per million tokens and output at $1.50. Google's own Artificial Analysis figures show a 2.5x speed-up on time to first token and a 45% gain on output speed compared with Gemini 2.5 Flash. The model is positioned for high-volume agentic workloads (tool calling, classification, structured extraction) where latency and unit cost matter more than reasoning depth. It is available in the Gemini API and Vertex AI. The interesting datapoint is the price. At $0.25 input, the cost of running a low-stakes agent step on Gemini Flash-Lite is now roughly an order of magnitude below GPT-5.5 and Claude Opus, which forces a real pricing comparison rather than a benchmark one.

Read the source for Gemini 3.1 Flash-Lite GA: $0.25 per million tokens, 2.5x faster than… · Google Cloud · cloud.google.com

AI Models ·OpenAI

OpenAI ships three voice models: realtime translate, GPT-5-class reasoning

AnalysisOpenAI added three voice models to its realtime API on May 7. The headline one is gpt-realtime-2, the first voice model with GPT-5-class reasoning. Its context window jumps from 32K to 128K tokens, with parallel tool calls, audible 'let me check that' preambles, and adjustable reasoning effort. Live translation gets its own model, gpt-realtime-translate, which takes speech in 70-plus languages and returns translated audio in 13. Speech-to-text is split out as gpt-realtime-whisper, a streaming transcriber. Pricing for the flagship realtime-2 lands at $32 per million audio input tokens and $64 per million audio output. Translate is $0.034 a minute. The shape of 'voice agent' just shifted from a wrapper over a chat model into its own product line.

Read the source for OpenAI ships three voice models: realtime translate, GPT-5-class reas… · OpenAI · openai.com

AI Agents ·GitHub Changelog

GitHub Copilot CLI runs Claude as a critic for GPT, and back

AnalysisGitHub Copilot CLI now lets developers pair models from rival labs in the same coding session. The Rubber Duck review agent (the second-opinion checker that catches architectural mistakes and cross-file conflicts the orchestrator missed) now dispatches a Claude-powered critic when your main model is GPT, and a GPT-5.5 critic when your main model is Claude. Released May 7. The point is structural rather than benchmark-driven: two models from different labs and different training data are more likely to disagree usefully than two prompts to the same model. Microsoft is baking that disagreement into the workflow at the CLI level, sidestepping the single-model lock-in that has shaped most coding tools so far. To use it, run copilot and toggle /experimental on.

Read the source for GitHub Copilot CLI runs Claude as a critic for GPT, and back · GitHub Changelog · github.blog

AI Agents ·Microsoft Security

Microsoft Semantic Kernel RCE: prompt injection turns into shell access

AnalysisMicrosoft disclosed two remote code execution vulnerabilities in Semantic Kernel, its open-source AI agent framework, on May 7. CVE-2026-26030 affects the Python SDK's InMemoryVectorStore filter functionality, where unsafe evaluation of filter expressions lets a crafted prompt run code in the application context. CVE-2026-25592 sits in the Python code interpreter plugin, which let agents read and write files on the host without path validation. Microsoft demonstrated a single attacker-controlled prompt launching calc.exe on the agent's host. The fix is to upgrade Semantic Kernel .NET to 1.71.0 or higher, and the Python package to 1.39.4 or higher. Microsoft also notes this is a class of bug developers should expect to keep finding. Filter strings and tool plugins are now part of the attack surface, the same way SQL queries became one in 2002.

Read the source for Microsoft Semantic Kernel RCE: prompt injection turns into shell acce… · Microsoft Security · microsoft.com

AI Industry ·Storyboard18

Freshworks cuts 500 jobs after CEO says half its code is AI-written

AnalysisFreshworks, a publicly listed customer-support software company, cut 500 jobs (11% of staff) on May 7 while reporting Q1 revenue of $228.6 million, up 16% year on year. CEO Dennis Woodside attributed the cuts to AI productivity, saying 'over half of our code is written by AI' and that the layoffs targeted 'rote work that technology can put to rest.' The company beat analyst estimates. The shape is hard to miss: profitable, growing, cutting headcount. That is the structural break from previous tech layoff waves, where revenue stalls came first. Freshworks took an $8 million one-time restructuring charge against its earnings beat. The pattern showing up across Coinbase, PayPal, Microsoft, and now Freshworks is the same story Woodside told plainly: revenue grew, AI did the work, the line cost less.

Read the source for Freshworks cuts 500 jobs after CEO says half its code is AI-written · Storyboard18 · storyboard18.com

AI Industry ·Nvidia

Nvidia commits up to $3.2B in Corning for AI fiber-optic factories

AnalysisNvidia and Corning, the optical-fiber maker that supplies most of the world's data-center fiber, announced on May 6 a multi-year partnership built around three new manufacturing plants in North Carolina and Texas. The deal gives Nvidia rights to invest up to $3.2 billion in Corning, a $500 million up-front purchase of share rights, and 3,000 new manufacturing jobs. Corning will lift US optical-connectivity manufacturing capacity 10x and US fiber output by more than 50%. The mechanic worth flagging: an AI cluster the size Nvidia is selling now needs more fiber per GPU than every previous data-center generation, because the GPUs are interconnected with optics rather than copper. Nvidia is locking in supply the same way TSMC's customers lock in wafer slots. It is a quiet capacity move dressed up as a stock-price move.

Read the source for Nvidia commits up to $3.2B in Corning for AI fiber-optic factories · Nvidia · nvidianews.nvidia.com

LLM Evals ·OpenAI

OpenAI rolls out GPT-5.5-Cyber to vetted security teams

AnalysisOpenAI announced on May 7 a limited preview of GPT-5.5-Cyber, a variant of GPT-5.5 tuned for offensive security work, including reverse-engineering malware, finding vulnerabilities, building exploits, and running pentests (simulated attacks against a system to find weaknesses). Access is gated to 'vetted defenders' working on critical systems, and most users will need to enable Advanced Account Security to call the model after June 1. The UK AI Security Institute (AISI) called the model 'one of the strongest we have tested on our cyber tasks.' It solved one of AISI's 32-step corporate-network attack ranges end-to-end on 1 of 10 attempts, an exercise estimated to take a human expert 20 hours. AISI's framing is careful, but the read is that autonomous end-to-end attacks against weakly secured small enterprises are now within a frontier model's reach.

Read the source for OpenAI rolls out GPT-5.5-Cyber to vetted security teams · OpenAI · openai.com