Latent Context Language Models: Scalable KV-Cache Compression for Long Contexts

10 June 2026
AI Models, Claude Code

LCLMs use an encoder-decoder architecture to compress KV caches at ratios of up to 1:16, more efficiently than previous methods, while reducing peak memory consumption and processing time.