Tangram: Static KV-Cache Compression for Faster Multi-Turn LLM Serving

16 June 2026
AI Models, Claude Code

Tangram makes per-attention-head memory budgets statically predictable, eliminating the fragmentation and latency drag caused by dynamic KV-cache compression.