Multilingual PRM is a framework for training and evaluating process reward models (PRMs) for multilingual reasoning. This repository provides code, datasets, and scripts to reproduce the experiments from our paper.
- Multilingual PRM800K: vicky23456/prm800k-phrase2
- Multilingual Math Shepherd: vicky23456/multilingual-mathshepherd
Download the datasets to the `/data` folder:

```shell
# Example: using huggingface-cli
huggingface-cli download vicky23456/multilingual-PRM800K --local-dir /data
huggingface-cli download vicky23456/multilingual-mathshepherd --local-dir /data
```

Train the Multilingual PRM model:

```shell
bash sft.sh
```

Sample N candidates:

```shell
sh infer.sh
```

Run Best-of-N evaluation:

```shell
sh best-of-n.sh
```

Our experiments demonstrate that process reward models can effectively generalize reasoning across languages, outperforming standard reward models in multilingual settings. See our paper for detailed results and analysis.
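The Best-of-N procedure can be sketched as follows: sample N candidate solutions, score each step of a candidate with the PRM, aggregate the per-step scores into one solution-level score, and return the highest-scoring candidate. This is a minimal illustrative sketch, not the repository's implementation; `score_step` is a hypothetical stand-in for the actual PRM call inside `infer.sh`/`best-of-n.sh`.

```python
# Hypothetical sketch of Best-of-N selection with a process reward model (PRM).
# score_step is a toy placeholder so the sketch is runnable; a real PRM would
# return the model's estimated probability that the reasoning step is correct.

def score_step(step: str) -> float:
    # Placeholder heuristic standing in for a PRM forward pass.
    return 1.0 if "correct" in step else 0.5

def prm_score(solution: list[str]) -> float:
    # Aggregate per-step scores; taking the minimum is a common choice,
    # since a single bad step invalidates the whole chain of thought.
    return min(score_step(step) for step in solution)

def best_of_n(candidates: list[list[str]]) -> list[str]:
    # Return the candidate solution with the highest aggregated PRM score.
    return max(candidates, key=prm_score)

candidates = [
    ["step 1 correct", "step 2 wrong"],
    ["step 1 correct", "step 2 correct"],
]
print(best_of_n(candidates))  # → ['step 1 correct', 'step 2 correct']
```

Other aggregation functions (e.g., the product or mean of step scores) are also used in the literature; the minimum is shown here only as one common option.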
If you use this work or datasets, please cite:
```bibtex
@article{wang2025demystifying,
  title={Demystifying Multilingual Chain-of-Thought in Process Reward Modeling},
  author={Wang, Weixuan and Wu, Minghao and Haddow, Barry and Birch, Alexandra},
  journal={arXiv preprint arXiv:2502.12663},
  year={2025}
}
```