Claude Opus 4.8 trades benchmark bragging for catching its own bad code
Analysis
Opus 4.8 sells on honesty more than horsepower. Anthropic (the AI lab behind Claude) says its new model, out 41 days after the last at the same price ($5 per million input tokens, $25 per million output), is about four times less likely to let a flaw in code it wrote pass unremarked, and quicker to flag when it is unsure. The low-latency fast mode now costs a third of what it did. For anyone who merges what a model writes, a system that catches its own bad code beats another point on a coding benchmark.