ITBench-AA: Frontier Models Fall Short of 50-Percent Mark on Enterprise IT Tasks

1 June 2026

AI Models, Claude AI, Claude Code

Current frontier models achieve a success rate of less than 50 percent on the new ITBench-AA benchmark for evaluating agentic IT capabilities, revealing a significant gap between model capabilities and production readiness for autonomous IT tasks.