ITBench-AA: Frontier Models Fall Short of 50-Percent Mark on Enterprise IT Tasks

June 1, 2026
AI Models, Claude AI, Claude Code

Current frontier models achieve a success rate below 50 percent on the new ITBench-AA benchmark, which evaluates agentic IT capabilities. The result reveals a significant gap between current model capabilities and the production readiness needed for autonomous IT tasks.


Lumi AI News
