CMU-AIRe/MRT

Optimizing Test-Time Compute via Meta Reinforcement Finetuning

This repository contains the code for our paper, "Optimizing Test-Time Compute via Meta Reinforcement Finetuning." The paper introduces an approach to optimizing test-time compute through meta reinforcement learning, with the aim of balancing the efficiency and discovery capabilities of large language models (LLMs).

Citation

If you use our work or codebase in your research, please cite our paper:

@misc{qu2025optimizingtesttimecomputemeta,
      title={Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning},
      author={Yuxiao Qu and Matthew Y. R. Yang and Amrith Setlur and Lewis Tunstall and Edward Emanuel Beeching and Ruslan Salakhutdinov and Aviral Kumar},
      year={2025},
      eprint={2503.07572},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.07572},
}

About

Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
