This code builds on the codebase from "Reward Model Ensembles Help Mitigate Overoptimization". It provides setup and scripts for:
- Supervised fine-tuning (SFT)
- Reward model (RM) training
- Policy optimization
Pretrained SFT Models: For experiments involving Pythia models, you can use the following pretrained SFT checkpoints:
Pythia 70M SFT on AlpacaFarm: https://huggingface.co/tlc4418/pythia_70m_sft
Pythia 1.4B SFT on AlpacaFarm: https://huggingface.co/tlc4418/pythia_1.4b_sft_policy
For experiments with the 44M and 1.3B RMs, you can use the provided SFT checkpoints to train the RMs; for experiments with the 2.6B RM, you need to train the SFT model yourself and then use it for RM training.
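For orientation, here is a minimal sketch of loading one of the checkpoints above with Hugging Face transformers; the prompt format and sampling settings are illustrative assumptions, not the repo's defaults:

```python
# Minimal sketch: load the Pythia 1.4B SFT policy from the Hugging Face Hub and
# sample a completion. The prompt format and generation settings below are
# illustrative assumptions, not the repo's defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tlc4418/pythia_1.4b_sft_policy"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "### Instruction:\nDescribe reward overoptimization in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```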
Reward Model (RM) Training: You can train the reward model using the following datasets:
- AlpacaFarm preference data
- Preferences collected from the 1.4B SFT policy on AlpacaFarm instructions
- Custom training data
Refer to the dataset preparation guide here: Dataset Guide (Coste et al.). The training data can be filtered using filterdata.py.
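As a rough illustration of the first option, the AlpacaFarm preference data can be pulled from the Hugging Face Hub before filtering; the dataset id and subset name below are assumptions and may differ from the format filterdata.py expects:

```python
# Rough sketch: inspect AlpacaFarm human preference data from the Hugging Face Hub.
# The dataset id and subset name are assumptions; adjust them to whatever format
# filterdata.py and the RM training scripts actually expect.
from datasets import load_dataset

prefs = load_dataset("tatsu-lab/alpaca_farm", "alpaca_human_preference")
print(prefs)  # available splits and sizes

first_split = next(iter(prefs.values()))
print(first_split[0])  # one example preference record
```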
For the Skywork experiments, we feed the Skywork-Reward-Preference-80K-v0.2 training data through the Skywork-Reward-Llama-3.1-8B-v0.2 RM, save the final embeddings and rewards, and then normalize the rewards before training the EBRM.
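A hedged sketch of this step is below. It assumes the Skywork RM is a standard sequence-classification reward model whose last-layer hidden state at the final token serves as the embedding; batching, chat templating, and the exact normalization used in this repo may differ:

```python
# Sketch: score Skywork preference data with the Skywork RM, keeping the final
# hidden state (embedding) and scalar reward for each chosen response, then
# z-score normalize rewards before EBM/EBRM training. Dtype, batching, field
# names, and the normalization scheme are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

rm_name = "Skywork/Skywork-Reward-Llama-3.1-8B-v0.2"
tokenizer = AutoTokenizer.from_pretrained(rm_name)
rm = AutoModelForSequenceClassification.from_pretrained(
    rm_name, torch_dtype=torch.bfloat16, device_map="auto"
)
rm.eval()

data = load_dataset("Skywork/Skywork-Reward-Preference-80K-v0.2", split="train")

embeddings, rewards = [], []
with torch.no_grad():
    for example in data.select(range(8)):  # small slice for illustration only
        # 'chosen' is assumed to be a list of chat messages; apply the chat template.
        text = tokenizer.apply_chat_template(example["chosen"], tokenize=False)
        inputs = tokenizer(text, return_tensors="pt").to(rm.device)
        out = rm(**inputs, output_hidden_states=True)
        embeddings.append(out.hidden_states[-1][0, -1].float().cpu())  # final-token embedding
        rewards.append(out.logits[0, 0].float().cpu())                 # scalar reward

rewards = torch.stack(rewards)
rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # z-score normalization
embeddings = torch.stack(embeddings)
torch.save({"embeddings": embeddings, "rewards": rewards}, "skywork_rm_features.pt")
```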
EBM Model Training: Once the RM data is prepared, you can train the Energy-Based Model (EBM) using: python -m src.reward_modeling.ebm_training.ebm_nce_plus
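The actual architecture and NCE+ objective live in src.reward_modeling.ebm_training.ebm_nce_plus. Purely to illustrate the idea, a toy energy head over (embedding, reward) pairs trained with a noise-contrastive-style loss might look like the sketch below; the layer sizes, noise scheme, loss details, and feature file name are all assumptions:

```python
# Toy illustration of an energy-based reward model head: an MLP assigns an energy
# to an (embedding, reward) pair and is trained to give lower energy to the true
# reward than to noise-perturbed rewards (an NCE-style contrast). This is NOT the
# repo's ebm_nce_plus implementation; sizes and the noise scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnergyHead(nn.Module):
    def __init__(self, embed_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + 1, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, embedding: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
        # Concatenate the RM embedding with the (normalized) scalar reward.
        return self.net(torch.cat([embedding, reward.unsqueeze(-1)], dim=-1)).squeeze(-1)

def nce_style_loss(energy_head, embedding, reward, num_negatives: int = 8, noise_std: float = 1.0):
    """Contrast the true reward against noise-perturbed rewards for the same embedding."""
    pos_energy = energy_head(embedding, reward)                              # (batch,)
    noise = noise_std * torch.randn(reward.shape[0], num_negatives)
    neg_rewards = reward.unsqueeze(-1) + noise                               # (batch, K)
    neg_energy = energy_head(
        embedding.unsqueeze(1).expand(-1, num_negatives, -1), neg_rewards
    )                                                                        # (batch, K)
    # Treat the true reward as the "correct class" among K+1 candidates (InfoNCE-style).
    logits = torch.cat([-pos_energy.unsqueeze(-1), -neg_energy], dim=-1)
    targets = torch.zeros(reward.shape[0], dtype=torch.long)
    return F.cross_entropy(logits, targets)

# Usage sketch on the saved features from the previous step (file name assumed).
feats = torch.load("skywork_rm_features.pt")
head = EnergyHead(embed_dim=feats["embeddings"].shape[-1])
loss = nce_style_loss(head, feats["embeddings"].float(), feats["rewards"].float())
loss.backward()
```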