
STRIDE: Tracking Training Data Influence in LLMs via Sparse Recovery

The Bottom Line: STRIDE formalizes training data attribution as a sparse recovery problem in activation space and runs about an order of magnitude (13×) faster than gradient-based methods.

Researchers introduce STRIDE, a method for tracing model predictions back to individual training examples in large language models. The method reports 13× faster computation than previous approaches by working with activations instead of parameter gradients.

Training Data Attribution (TDA) aims to trace a model’s predictions back to its training data. The gold standard is causal intervention: observing how the model changes when data is added or removed. For large language models, however, repeated retraining is prohibitively expensive.

STRIDE shifts the problem from parameter space to activation space. Rather than tracking gradients across billions of parameters, which is impractical at this scale, the method learns lightweight “steering operators” that capture the behavioral changes produced by training on data subsets. By measuring how these operators influence test predictions, the influence of each individual training example is recovered via sparse linear decomposition.
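The sparse decomposition step can be illustrated with a small synthetic sketch. Everything here is an assumption made for illustration and is not taken from the paper: the additive model (a subset's measured effect is the sum of its members' influences), the random subset design, the problem sizes, and the function names `simulate_subset_effects` and `ista_lasso`. The solver is plain ISTA for the Lasso, a generic sparse recovery routine swapped in for whatever STRIDE actually uses.

```python
import numpy as np


def simulate_subset_effects(n=200, k=5, m=100, noise=0.01, seed=0):
    """Toy data: m random subsets of n training examples, k of which matter.

    Assumption (not from the paper): the effect of a subset on a test
    prediction is the sum of the influences of its members, plus noise.
    """
    rng = np.random.default_rng(seed)
    support = rng.choice(n, size=k, replace=False)
    x_true = np.zeros(n)
    x_true[support] = rng.uniform(1.0, 2.0, size=k) * rng.choice([-1.0, 1.0], size=k)
    # A[i, j] = 1 if training example j is in subset i.
    A = (rng.random((m, n)) < 0.5).astype(float)
    y = A @ x_true + noise * rng.standard_normal(m)
    return A, y, x_true, support


def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_lasso(A, y, lam_frac=0.05, n_iter=2000):
    """Minimize 0.5 * ||Ac x - yc||^2 + lam * ||x||_1 with ISTA.

    Columns of A and the vector y are centered first, which removes the
    constant offset that a 0/1 membership matrix would otherwise introduce.
    """
    Ac = A - A.mean(axis=0)
    yc = y - y.mean()
    lam = lam_frac * np.max(np.abs(Ac.T @ yc))
    step = 1.0 / np.linalg.norm(Ac, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = Ac.T @ (Ac @ x - yc)
        x = soft_threshold(x - step * grad, step * lam)
    return x


A, y, x_true, support = simulate_subset_effects()
x_hat = ista_lasso(A, y)
top_k = np.argsort(-np.abs(x_hat))[: len(support)]
print("true influential examples:     ", sorted(support.tolist()))
print("recovered influential examples:", sorted(top_k.tolist()))
```

In this toy setting, 100 subset-level measurements are used to estimate 200 per-example influences, which is only possible because most influences are assumed to be zero.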

The method formulates the problem in terms of compressive sensing. Empirically, STRIDE achieves state-of-the-art results on LLM pre-training data with 13× faster computation than previous methods. Practical applications include data selection, detection of data contamination, and qualitative model analysis.
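For orientation, the textbook compressive sensing setup can be written as follows. The symbols are illustrative assumptions rather than the paper's notation: $n$ stands for the number of training examples, $m$ for the number of measured data subsets, $x$ for the vector of per-example influences, and $k$ for the number of examples with non-negligible influence.

```latex
y = A x + \varepsilon, \qquad
A \in \mathbb{R}^{m \times n}, \quad
\|x\|_0 \le k \ll n

\hat{x} = \arg\min_{x \in \mathbb{R}^{n}} \|x\|_1
\quad \text{subject to} \quad \|A x - y\|_2 \le \eta

m \gtrsim C \, k \log(n / k)
```

The last line is the standard sample-complexity result for suitably random measurement matrices: the number of measurements needed grows with the sparsity $k$ and only logarithmically with $n$. Under that reading, far fewer measurements than training examples would be needed, provided influence is sparse.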


Source: arxiv.org · Published June 2, 2026
Lumi AI News — AI-assisted curation in accordance with Art. 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.2.9.
