GitHub - night-chen/InF_Embed: Official Code Repository for paper "Towards Better Instruction Following Retrieval Models"

IF_Embed

This repository contains the implementation of IF-Embed.

Setup

Create a conda environment and install the required packages:

# Create a conda environment named 'if_embed' with Python 3.10
conda create -n if_embed python=3.10 -y

# Activate the environment
conda activate if_embed

# Install required Python packages
pip install -r requirements.txt

# Install flash-attn
python -m pip install flash_attn
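
After installation, a quick sanity check can confirm that the environment matches the steps above. The snippet below is a minimal sketch and not part of the repository: the helper name `check_environment` is hypothetical, and it only tests whether the interpreter is at least Python 3.10 and whether the listed top-level modules can be found, without importing them.

```python
import importlib.util
import sys


def check_environment(modules, min_python=(3, 10)):
    """Report whether Python is new enough and each top-level module can be found.

    Hypothetical helper for illustration; it does not import the modules.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


# flash_attn is the import name of the flash-attn package installed above.
print(check_environment(["flash_attn"]))
```

A `False` entry for `flash_attn` suggests the last installation step did not complete in the active environment.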

Train with different configurations:

Modify the key configurations in update_args.py, where you can define a list of training jobs to run sequentially:

experiments = [
    {"model_type": "basic", "model": "Qwen/Qwen2.5-1.5B", "pooling": "last", "share_encoder": True, "num_train_epochs": 2, "contrast_mode": "qk", "data_reverse": False, "padding_side": "left", "train_file": "aarontrinh02/ms_marco_synthetic_data"},
]

Refer to run.py for the full set of hyperparameters. A single command runs the listed training jobs in sequence:

python update_args.py
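
The sequential-job pattern can be sketched as follows. This is an illustration under stated assumptions, not the contents of update_args.py: the helper names `to_cli_args` and `build_commands`, the `--key value` flag convention, and the use of run.py as the entry point are all hypothetical. The sketch prints each command instead of executing it.

```python
import shlex


def to_cli_args(experiment):
    """Flatten one experiment dict into `--key value` strings (assumed convention)."""
    args = []
    for key, value in experiment.items():
        args.extend([f"--{key}", str(value)])
    return args


def build_commands(experiments, script="run.py"):
    """Return one command (as an argument list) per experiment, in list order."""
    return [["python", script, *to_cli_args(exp)] for exp in experiments]


# Two shortened example jobs; the keys and values follow the configuration shown above.
experiments = [
    {"model_type": "basic", "model": "Qwen/Qwen2.5-1.5B", "contrast_mode": "qk"},
    {"model_type": "map", "model": "Qwen/Qwen2.5-1.5B", "contrast_mode": "no_trick"},
]

for command in build_commands(experiments):
    # Printing keeps the sketch side-effect free. To launch the jobs one after
    # another, pass each `command` to subprocess.run(command, check=True).
    print(shlex.join(command))
```

Building each command as an argument list rather than a single string avoids shell-quoting problems with values such as model names that contain slashes.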

Evaluation

For evaluation, we also provide one-line commands for both Bright and MAIR:

### For Bright
python bright_update_args.py

### For MAIR
python mair_update_args.py

The Loss Map

| `model_type` | `contrast_mode` | Corresponding Loss |
| --- | --- | --- |
| basic | qk | $\ell^{\text{uni}}_{P}$ |
| basic | kq | $\ell^{\text{uni}}_{IQ}$ |
| basic | only_neg | $\ell^{\text{uni}}_{I}$ |
| map | no_trick | $\ell^{\text{uni}}_{P, IQ}$ |
| map | qk_with_neg | $\ell^{\text{uni}}_{P, I}$ |
| map | kq_with_neg | $\ell^{\text{uni}}_{I, IQ}$ |
| map | no_trick_with_neg | $\ell^{\text{uni}}_{P, I, IQ}$ |
| map_add | no_trick | $\ell^{\text{multi}}_{P, IQ}$ |
| map_add | qk_with_neg | $\ell^{\text{multi}}_{P, I}$ |
| map_add | kq_with_neg | $\ell^{\text{multi}}_{I, IQ}$ |
| map_add | no_trick_with_neg | $\ell^{\text{multi}}_{P, I, IQ}$ |
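
When scripting experiments, it can help to have the loss map as a lookup keyed by the two configuration values. The sketch below only transcribes the table; the names `LOSS_MAP` and `loss_name` are hypothetical and do not come from the repository, and the returned string is a plain-text rendering of the loss symbol.

```python
# Keys are (model_type, contrast_mode); values are (family, subscript terms),
# transcribed from the loss map table.
LOSS_MAP = {
    ("basic", "qk"): ("uni", ("P",)),
    ("basic", "kq"): ("uni", ("IQ",)),
    ("basic", "only_neg"): ("uni", ("I",)),
    ("map", "no_trick"): ("uni", ("P", "IQ")),
    ("map", "qk_with_neg"): ("uni", ("P", "I")),
    ("map", "kq_with_neg"): ("uni", ("I", "IQ")),
    ("map", "no_trick_with_neg"): ("uni", ("P", "I", "IQ")),
    ("map_add", "no_trick"): ("multi", ("P", "IQ")),
    ("map_add", "qk_with_neg"): ("multi", ("P", "I")),
    ("map_add", "kq_with_neg"): ("multi", ("I", "IQ")),
    ("map_add", "no_trick_with_neg"): ("multi", ("P", "I", "IQ")),
}


def loss_name(model_type, contrast_mode):
    """Return a plain-text name for the loss selected by the two settings."""
    key = (model_type, contrast_mode)
    if key not in LOSS_MAP:
        raise ValueError(f"No loss listed for model_type={model_type!r}, "
                         f"contrast_mode={contrast_mode!r}")
    family, terms = LOSS_MAP[key]
    return "l^" + family + "_{" + ", ".join(terms) + "}"


print(loss_name("map", "qk_with_neg"))
```

Rejecting unlisted pairs with an error, for example `("basic", "no_trick")`, makes a mistyped configuration fail early rather than silently selecting a different loss.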

