VaSE: Stochastic KV-Cache Eviction for Reasoning Models

3 June 2026
AI Models, Claude Code

VaSE achieves higher accuracy than existing sparse-attention methods at 4x KV-cache compression, reducing the memory bottleneck of reasoning models.