Language Compression in LLMs: Output Reduces Costs, Input Increases Them

26 June 2026

AI Models

Output compression reduces inference costs by 1.4–3x, while input compression increases them by an average of 1.15x because models respond to imprecise prompts with longer answers.
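To make the asymmetry concrete, here is a minimal arithmetic sketch of how a shorter prompt can still raise the total cost of a request. Every number in it is an assumption chosen for illustration: the per-token prices, the token counts, and the size of the compression effects are hypothetical and are not taken from the measurements summarized above. The only structural assumption is that output tokens are priced higher than input tokens.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request in USD; prices are given per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical prices (USD per million tokens), output priced above input.
INPUT_PRICE = 1.0
OUTPUT_PRICE = 5.0

# Baseline request: 1000 prompt tokens, 500 answer tokens (made-up values).
baseline = request_cost(1000, 500, INPUT_PRICE, OUTPUT_PRICE)

# Input compression: prompt shrinks to 600 tokens, but the answer to the
# less precise prompt grows to 650 tokens (made-up values).
compressed_input = request_cost(600, 650, INPUT_PRICE, OUTPUT_PRICE)

# Output compression: same prompt, answer cut to 250 tokens (made-up values).
compressed_output = request_cost(1000, 250, INPUT_PRICE, OUTPUT_PRICE)

input_cost_ratio = compressed_input / baseline      # > 1 means more expensive
output_saving_ratio = baseline / compressed_output  # > 1 means cheaper

print(f"input compression cost ratio:  {input_cost_ratio:.2f}x")
print(f"output compression saving:     {output_saving_ratio:.2f}x")
```

With these illustrative values, the compressed prompt saves 400 input tokens but pays for 150 extra output tokens at the higher rate, so the request ends up costing more than the baseline, while halving the answer cuts the cost substantially.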