We employ Monte Carlo Tree Search (MCTS) to generate token-level translation preference pairs (Prefixed data) for model training and testing. For comparison, we also use conventional approaches to generate sequence-level translation preference pairs (Arbitrary data). We partition both datasets into training and testing subsets for evaluation.
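The following is a minimal sketch (not the repository's actual pipeline) of how token-level prefixed preference pairs might be assembled. It assumes an MCTS run over partial translations yields value estimates for candidate next tokens that share a prefix; the highest- and lowest-valued candidates form a pair. All names and the example values are hypothetical.

```python
# Hypothetical sketch of building token-level (prefixed) preference pairs
# from MCTS value estimates; not the repository's actual data pipeline.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PrefixPreferencePair:
    source: str          # source sentence
    prefix: str          # shared partial translation (the "prefix")
    chosen_token: str    # next token with the higher MCTS value estimate
    rejected_token: str  # next token with the lower MCTS value estimate


def build_prefix_pairs(source: str,
                       prefix: str,
                       token_values: Dict[str, float]) -> List[PrefixPreferencePair]:
    """Turn MCTS value estimates for candidate next tokens into a preference pair."""
    if len(token_values) < 2:
        return []
    ranked = sorted(token_values.items(), key=lambda kv: kv[1], reverse=True)
    best_token, _ = ranked[0]
    worst_token, _ = ranked[-1]
    return [PrefixPreferencePair(source, prefix, best_token, worst_token)]


# Hypothetical usage with made-up value estimates:
pairs = build_prefix_pairs(
    source="Das Wetter ist heute schön.",
    prefix="The weather is",
    token_values={" nice": 0.82, " good": 0.74, " bad": 0.11},
)
print(pairs[0].chosen_token, ">", pairs[0].rejected_token)
```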
To assess the effectiveness of our custom dataset, we train our Implicit Process Reward Model on both the Prefixed data and the Arbitrary data. Our models are available on Hugging Face.
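As a rough illustration, the sketch below computes per-token implicit rewards under the common log-likelihood-ratio formulation (reward of each token ≈ β · [log π(y_t) − log π_ref(y_t)]). This formulation, the β value, and the model IDs are assumptions for illustration only; substitute the actual Hugging Face checkpoints released with the project, and note that tokenization alignment is simplified here.

```python
# Hypothetical sketch of scoring a translation with an implicit process reward
# model via a policy/reference log-likelihood ratio; model IDs are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

POLICY_ID = "your-org/implicit-process-rm"   # placeholder, not a released checkpoint
REF_ID = "your-org/reference-model"          # placeholder, not a released checkpoint
BETA = 0.1                                   # assumed scaling coefficient

tokenizer = AutoTokenizer.from_pretrained(POLICY_ID)
policy = AutoModelForCausalLM.from_pretrained(POLICY_ID).eval()
reference = AutoModelForCausalLM.from_pretrained(REF_ID).eval()


@torch.no_grad()
def token_rewards(prompt: str, translation: str) -> torch.Tensor:
    """Per-token implicit rewards for `translation` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + translation, return_tensors="pt").input_ids

    def target_logprobs(model):
        # Log-probability each model assigns to every next token in the sequence.
        logits = model(full_ids).logits[:, :-1, :]
        logprobs = torch.log_softmax(logits, dim=-1)
        targets = full_ids[:, 1:]
        return logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

    start = prompt_ids.shape[1] - 1  # keep only the translation tokens
    return BETA * (target_logprobs(policy) - target_logprobs(reference))[:, start:]
```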
Finally, we deploy our Implicit Process Reward Model in both Test-time Alignment and Hypothesis Ensembling frameworks, demonstrating significant performance improvements across evaluation metrics.
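One simple way such a reward model can be used at test time is best-of-N reranking, sketched below by reusing the hypothetical `token_rewards` helper above. Summing token rewards into a sequence score is an assumption for illustration, not necessarily the exact aggregation used for Test-time Alignment or Hypothesis Ensembling in the paper.

```python
# Hypothetical best-of-N selection with the process RM; the aggregation scheme
# (summing token rewards) is an assumption, not the paper's exact method.
def rerank(prompt: str, candidates: list) -> str:
    """Return the candidate translation with the highest aggregate reward."""
    scores = {cand: token_rewards(prompt, cand).sum().item() for cand in candidates}
    return max(scores, key=scores.get)


best = rerank(
    "Translate German to English: Das Wetter ist heute schön.\n",
    ["The weather is nice today.", "The weather today is bad."],
)
print(best)
```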
We also provide code for generating token-level prefixed translation preference pairs and for using our Process RM to obtain sequence-level results. These scripts are available in the `code` directory.
Citation
```bibtex
@article{feng2025mtrewardtree,
  title={MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling},
  author={Feng, Zhaopeng and Ren, Jiahan and Su, Jiayuan and Zheng, Jiamei and Tang, Zhihang and Wang, Hongwei and Liu, Zuozhu},
  journal={arXiv preprint arXiv:2503.12123},
  year={2025}
}
```