Claude Opus 4.8 as Incremental Improvement: Fewer Hallucinations, Greater Transparency

Key takeaway: Claude Opus 4.8 reduces hallucinations by approximately 75 percent by abstaining more often on uncertain questions instead of giving unfounded answers.

Anthropic has released Claude Opus 4.8 and deliberately markets the model as a “moderate, but tangible” improvement over its predecessor. The core advance is significantly higher factual reliability coupled with increased transparency about uncertainties.

The new model was optimized with a modified training approach: Opus 4.8 flags uncertainty more consistently and makes fewer unsupported claims. This matters especially for code generation, where, according to the evaluation, the model overlooks faulty sections roughly four times less often. On six established benchmarks, Opus 4.8 had the lowest error rate, but it achieves this mainly by holding back on questionable queries rather than by answering additional questions correctly.
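
The distinction between a lower error rate and more correct answers can be made concrete with a small sketch. The numbers below are invented for illustration only and are not taken from any benchmark; the `EvalResult` type is a hypothetical helper. It shows how a model that abstains more often can cut its error rate sharply while its share of correct answers stays the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Outcome counts for one model on one benchmark (hypothetical helper)."""
    correct: int
    wrong: int
    abstained: int

    @property
    def total(self) -> int:
        return self.correct + self.wrong + self.abstained

    @property
    def error_rate(self) -> float:
        # Abstentions count as neither correct nor wrong.
        return self.wrong / self.total

    @property
    def accuracy(self) -> float:
        return self.correct / self.total


# Invented numbers, for illustration only (not benchmark results).
baseline = EvalResult(correct=70, wrong=20, abstained=10)
cautious = EvalResult(correct=70, wrong=5, abstained=25)

for name, result in (("baseline", baseline), ("cautious", cautious)):
    print(f"{name}: error rate {result.error_rate:.0%}, accuracy {result.accuracy:.0%}")
```

In this made-up case the error rate falls from 20 percent to 5 percent while accuracy stays at 70 percent, because the formerly wrong answers became abstentions rather than correct answers.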

Pricing is unchanged at $5 per million input tokens and $25 per million output tokens (identical to 4.5/4.6/4.7). Fast Mode now costs $30/$150 per million tokens, which is cheaper than previous Fast Mode versions, but it is available only to organizations with Research Preview access. Both the knowledge cutoff and the training data cutoff are January 2026. The context window remains 1,000,000 tokens, with a maximum output length of 128,000 tokens.
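
To make the per-million-token prices tangible, here is a minimal cost estimate in Python. It uses only the rates quoted above and assumes that the Fast Mode pair $30/$150 refers to input/output in the same way as the standard pair; the function name `estimate_cost_usd` and the mode labels are hypothetical, and cache discounts or other billing details are not modeled.

```python
from decimal import Decimal

# USD per million tokens as (input, output), from the prices quoted in the article.
# Assumption: the Fast Mode figures $30/$150 are input/output, like the standard pair.
RATES = {
    "standard": (Decimal("5"), Decimal("25")),
    "fast": (Decimal("30"), Decimal("150")),
}
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int, mode: str = "standard") -> Decimal:
    """Return the estimated cost of one request in USD (caching not modeled)."""
    if mode not in RATES:
        raise ValueError(f"unknown mode: {mode!r}")
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_rate, output_rate = RATES[mode]
    return (Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate) / MILLION


# Example: 200,000 input tokens and 10,000 output tokens.
print(estimate_cost_usd(200_000, 10_000))          # standard pricing
print(estimate_cost_usd(200_000, 10_000, "fast"))  # Fast Mode pricing
```

For this example request, the estimate comes to $1.25 at standard pricing and $7.50 in Fast Mode.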

Relevant for agent loops: Mid-Conversation System Messages allow system-level instructions to be inserted after user turns without repeating the complete system prompt, which increases prompt cache hits and reduces input costs. The minimum prompt length for caching drops to 1,024 tokens (from 4,096 in 4.7), so shorter prompt prefixes also qualify for caching, which supports cost-efficient operation.
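
The effect of the lower caching threshold can be sketched as a simple eligibility check. This is an illustration under stated assumptions, not the actual API behavior: the model labels are placeholders rather than real model identifiers, the function `is_cacheable` is hypothetical, and the threshold is assumed to be inclusive.

```python
# Minimum prompt length for caching, in tokens, as reported in the article.
# The keys are placeholder labels, not real API model identifiers.
MIN_CACHEABLE_TOKENS = {
    "opus-4.7": 4096,
    "opus-4.8": 1024,
}


def is_cacheable(prefix_tokens: int, model: str) -> bool:
    """Return True if a prompt prefix is long enough to be cached.

    Assumption: the threshold is inclusive (a prefix of exactly the
    minimum length qualifies).
    """
    if model not in MIN_CACHEABLE_TOKENS:
        raise ValueError(f"unknown model label: {model!r}")
    return prefix_tokens >= MIN_CACHEABLE_TOKENS[model]


# A 2,000-token system prompt falls between the two thresholds.
for model in ("opus-4.7", "opus-4.8"):
    print(model, is_cacheable(2_000, model))
```

A 2,000-token prefix would be too short to cache under the 4,096-token threshold but qualifies under the 1,024-token one, which is where the saving for short, frequently repeated system prompts would come from.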


Source: simonwillison.net · Published May 29, 2026
Lumi AI News — AI-assisted curation in accordance with Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.2.0.
