GauntletBench: New Benchmark Reveals Limitations of AI Agents

26 June 2026
AI Models, Claude Code, Claude Cowork

Current AI agents fail at complex visual tasks in professional applications far more often than previous benchmarks suggest.