
MemTrain: Self-Supervised Memory Training for LLM Agents

Bottom line: MemTrain enhances memory capabilities of LLM agents through self-supervised pretraining based on two complementary reconstruction tasks, without requiring costly annotated data.

Researchers have developed MemTrain, a framework for pre-optimizing the contextual memory of large language models without annotated training data. The approach trains on two coupled proxy tasks built from unlabeled Wikipedia text and reports improvements of up to 17.67 points over direct task-specific training.

The core problem is that LLM agents must store and retrieve information over long interaction sequences. Previous approaches typically train memory end to end with reinforcement learning on concrete tasks. This is labor-intensive and expensive: high-quality annotated problems for memory-intensive scenarios are difficult to obtain, and the resulting training data is often not diverse enough to teach general memory behaviors.

MemTrain addresses this with two proxy tasks trained simultaneously on unlabeled Wikipedia text: (1) an end-to-end reconstruction objective, in which the model must restore masked entities after multiple rounds of memory updates, which rewards memory stability as judged by the final outcome; and (2) an intermediate-memory recall task, in which the model must reconstruct deleted historical information from intermediate memory states, which promotes consistent compression and memory completeness. Both objectives are optimized jointly with GRPO (Group Relative Policy Optimization).
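To make the two reward signals and the GRPO step more concrete, here is a minimal Python sketch. It is not taken from the MemTrain paper: the function names, the `[MASK]` token, the exact-match scoring, and the equal weighting of the two rewards are all illustrative assumptions. The only part that follows the general GRPO recipe is the last function, which standardizes each reward against the other rollouts in its group (some implementations use a sample rather than a population standard deviation).

```python
from statistics import mean, pstdev

MASK = "[MASK]"  # hypothetical mask token, chosen for illustration


def mask_entities(text, entities):
    """Replace each listed entity with MASK; return the masked text and the hidden answers."""
    masked, answers = text, []
    for ent in entities:
        if ent in masked:
            masked = masked.replace(ent, MASK)
            answers.append(ent)
    return masked, answers


def reconstruction_reward(predicted, answers):
    """Fraction of hidden entities recovered (assumed scoring: case-insensitive exact match)."""
    if not answers:
        return 0.0
    hits = sum(p.strip().lower() == a.lower() for p, a in zip(predicted, answers))
    return hits / len(answers)


def combined_reward(final_reward, recall_reward, weight=0.5):
    """Blend the end-to-end and intermediate-recall rewards (the weighting is an assumption)."""
    return weight * final_reward + (1.0 - weight) * recall_reward


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize each reward within its group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, each rollout for the same masked passage would be scored with `combined_reward`, and the resulting group of scores passed to `group_relative_advantages`, so that rollouts are rewarded for beating their siblings rather than for an absolute score.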

Experiments on long-text QA and retrieval-augmented QA benchmarks show consistent improvements across various models: when followed by downstream training on memory-intensive reasoning tasks, MemTrain yields gains of up to 17.67 points over direct task-specific post-training.


Source: arxiv.org · Published June 1, 2026
Lumi AI News — AI-assisted curation pursuant to Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.2.9.
