A new multi-agent harness built around planner, generator, and evaluator agents enables Claude to develop full-stack applications autonomously over multi-hour runs. Explicit context resets and structured handoffs between agent sessions are key to its success.
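To make the control flow concrete, here is a minimal sketch of how a planner, generator, and evaluator might be wired together. It is an illustration under stated assumptions, not the harness itself: the `Handoff` fields, the `run_harness` function, the retry limit, and the stub agents are all hypothetical, and in a real system each callable would wrap a separate model session. The point it shows is that each session receives only the structured handoff, never the previous session's transcript, which is one way to model a context reset.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Handoff:
    """Hypothetical structured handoff: the only state that survives a context reset."""
    goal: str
    remaining: List[str] = field(default_factory=list)
    completed: List[str] = field(default_factory=list)
    feedback: str = ""


# Hypothetical agent signatures; each call stands in for a fresh agent session.
Planner = Callable[[str], List[str]]                 # goal -> ordered steps
Generator = Callable[[str, str], str]                # (step, feedback) -> artifact
Evaluator = Callable[[str, str], Tuple[bool, str]]   # (step, artifact) -> (passed, feedback)


def run_harness(goal: str, plan: Planner, generate: Generator,
                evaluate: Evaluator, max_attempts: int = 3) -> Handoff:
    """Run plan -> (generate -> evaluate)* for each step, passing only handoff data."""
    handoff = Handoff(goal=goal, remaining=plan(goal))
    while handoff.remaining:
        step = handoff.remaining[0]
        for _ in range(max_attempts):
            # Context reset: the generator sees the step and the evaluator's
            # last feedback, not any earlier conversation history.
            artifact = generate(step, handoff.feedback)
            passed, handoff.feedback = evaluate(step, artifact)
            if passed:
                break
        else:
            raise RuntimeError(f"step {step!r} failed after {max_attempts} attempts")
        handoff.completed.append(handoff.remaining.pop(0))
        handoff.feedback = ""  # feedback is scoped to a single step
    return handoff


# Stub agents for illustration only; real agents would call a model.
def stub_plan(goal: str) -> List[str]:
    return ["schema", "api", "ui"]


def stub_generate(step: str, feedback: str) -> str:
    # Produce a revised artifact when the evaluator has left feedback.
    return f"{step}:v2" if feedback else f"{step}:v1"


def stub_evaluate(step: str, artifact: str) -> Tuple[bool, str]:
    # Reject the first draft of the "api" step to exercise the retry path.
    if step == "api" and artifact.endswith("v1"):
        return False, "add input validation"
    return True, ""


if __name__ == "__main__":
    result = run_harness("todo app", stub_plan, stub_generate, stub_evaluate)
    print(result.completed)
```

Keeping the handoff as a small, explicit record is a design choice in this sketch: it forces every piece of state a later session depends on to be written down, so a session can be restarted at any step boundary.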