The Bottom Line: An innovative multi-agent harness design with context resets instead of compression solves the problem of coherence loss in long-running application development. Claude can now develop high-quality full-stack applications in multi-hour autonomous sessions without human intervention.

An innovative harness design with generator and evaluator agents enables Claude to create high-quality frontend designs and develop complete applications across multi-hour autonomous coding sessions – without human intervention.

Developing language models for complex software projects requires novel engineering approaches. A core problem lies in coherence loss during long tasks: as the context window grows, models tend to degrade in quality or even prematurely conclude their work – a phenomenon known as “Context Anxiety.”

The solution does not lie in context compression, but in strategic context resets. Unlike traditional compression, where earlier conversation segments are summarized, a reset provides the agent with a completely clean slate. This eliminates Context Anxiety entirely, but requires structured handoff artifacts that carry sufficient state for the next agent.

The three-agent system developed consists of: a planner that decomposes specifications into manageable tasks; a generator that sequentially implements these tasks; and an evaluator that reliably and stylishly assesses outputs. The evaluator uses concrete, measurable criteria to translate subjective judgments such as “Is this design good?” into objective metrics – inspired by Generative Adversarial Networks.

This architecture enabled successful multi-hour autonomous coding sessions for complex full-stack applications, with structured handoffs maintaining consistent context between sessions.

Source: www.anthropic.com

Share on:

Multi-Agent Architecture for Long-Running Application Development

Lumi AI News

Legal

Topics