
Structure-Aware Curriculum Learning for LLMs via Manifold Bandits

The bottom line: Structured curriculum learning strategies that leverage task relationships in latent space achieve better downstream performance than pure difficulty prioritization.

Researchers propose Bayesian Manifold Curriculum (BMC), an approach to selecting training tasks for Large Language Models. Instead of prioritizing difficult problems in isolation, the method accounts for the geometric structure of the model's latent space and the relationships between tasks.

Training Large Language Models with reinforcement learning requires efficient strategies for selecting training tasks. Previous adaptive curriculum learning methods typically treat task selection as a standard bandit problem with independent arms, focusing primarily on prompts of medium difficulty. However, this approach overlooks the structured and heterogeneous nature of the task space.
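The independent-arm baseline described above can be illustrated with a minimal sketch. It is not taken from the paper: the class name `IndependentArmCurriculum`, the Beta posteriors over pass rates, and the "closest to 0.5" selection rule are assumptions chosen only to show what treating prompts as unrelated arms with a medium-difficulty preference might look like.

```python
import random


class IndependentArmCurriculum:
    """Hypothetical baseline: every prompt is an independent arm.

    Each prompt keeps its own Beta(alpha, beta) posterior over its pass rate;
    no information is shared between prompts.
    """

    def __init__(self, num_prompts, seed=0):
        self.alpha = [1.0] * num_prompts  # 1 + observed successes
        self.beta = [1.0] * num_prompts   # 1 + observed failures
        self.rng = random.Random(seed)

    def select(self):
        # Thompson-style draw of a plausible pass rate for every prompt,
        # then pick the prompt whose draw is closest to medium difficulty.
        draws = [self.rng.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return min(range(len(draws)), key=lambda i: abs(draws[i] - 0.5))

    def update(self, prompt, solved):
        # Only the chosen prompt's posterior changes.
        if solved:
            self.alpha[prompt] += 1.0
        else:
            self.beta[prompt] += 1.0
```

Because `update` touches only the selected prompt, nothing learned about one task transfers to related tasks, which is the limitation the article attributes to this family of methods.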

The newly presented Bayesian Manifold Curriculum (BMC) instead formulates task selection as a manifold-structured bandit problem with endogenous non-stationarity: tasks are linked through the model’s latent representation, and sampling decisions can deliberately shape how learning signals distribute across this space. The approach organizes problems into a hierarchical task tree and uses Bayesian learning to guide selection.
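As a rough illustration of Bayesian selection over a hierarchical task tree, the sketch below runs Thompson sampling from the root down to a leaf and then updates every node on the chosen path, so evidence about one task also shifts beliefs about its siblings. This is a simplified stand-in, not the BMC algorithm itself: `TaskNode`, `select_path`, `update_path`, the binary "productive" signal, and the `discount` factor (a crude way to forget old evidence as the model changes) are all assumptions made for the example.

```python
import random


class TaskNode:
    """Hypothetical tree node with a Beta posterior over 'productive' outcomes."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.alpha = 1.0
        self.beta = 1.0

    def is_leaf(self):
        return not self.children


def select_path(root, rng):
    # Descend the tree, at each level taking the child with the highest
    # posterior sample (Thompson sampling applied per level).
    path = [root]
    node = root
    while not node.is_leaf():
        node = max(node.children,
                   key=lambda c: rng.betavariate(c.alpha, c.beta))
        path.append(node)
    return path


def update_path(path, productive, discount=0.9):
    # Shrink old evidence toward the prior (assumed way to handle
    # non-stationarity), then add the new observation to every node on the
    # path so that related tasks share statistics.
    for node in path:
        node.alpha = 1.0 + discount * (node.alpha - 1.0)
        node.beta = 1.0 + discount * (node.beta - 1.0)
        if productive:
            node.alpha += 1.0
        else:
            node.beta += 1.0


# Usage: a tiny two-level tree with made-up task names.
algebra = TaskNode("algebra", [TaskNode("linear"), TaskNode("quadratic")])
geometry = TaskNode("geometry", [TaskNode("triangles")])
root = TaskNode("math", [algebra, geometry])
rng = random.Random(0)
path = select_path(root, rng)
update_path(path, productive=True)
```

The design point this is meant to convey is the shared update: unlike an independent-arm bandit, a result on one leaf also changes how often its whole branch is sampled.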

Empirical results show that different sampling strategies trade off three dimensions: productivity (strength of the learning signal), diversity (coverage of the task manifold), and utility (relevance to the evaluations). The experiments suggest that focusing purely on difficult tasks is insufficient for strong downstream performance. Instead, they indicate that task selection needs to account for both the structure of the task space and the type of task.


Source: arxiv.org · Published June 17, 2026
Lumi AI News — AI-assisted curation in accordance with Article 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.7.1.
