IDMR

The official repository for IDMR: Towards Instance-Driven Precise Visual Correspondence in Multimodal Retrieval (ICCV 2025 Findings).

Model

Installation

git clone https://github.com/BwLiu01/IDMR.git
cd IDMR
pip install -r requirements.txt

Inference & Examples

  • Inference examples with Gradio: IDMR-Demo
  • Inference locally:
python inference.py
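
For reference, a minimal Python sketch of the retrieval step is shown below, assuming you already have query and candidate embeddings (e.g. produced by inference.py). The embedding dimension and candidate count are illustrative placeholders; only the L2 normalization and cosine-similarity scoring mirror the training setup (--normalize True).

# Minimal retrieval sketch: rank candidates by cosine similarity.
# The embeddings below are random placeholders; in practice they come from the model.
import torch
import torch.nn.functional as F

query_emb = torch.randn(1, 4096)    # one query embedding (dimension is illustrative)
cand_embs = torch.randn(100, 4096)  # candidate embeddings

query_emb = F.normalize(query_emb, dim=-1)  # IDMR trains with --normalize True
cand_embs = F.normalize(cand_embs, dim=-1)
scores = query_emb @ cand_embs.T            # cosine-similarity scores, shape (1, 100)
print(scores.topk(5, dim=-1).indices)       # indices of the top-5 candidates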

Data

We release both the training and test splits of IDMR on Hugging Face Datasets.
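
A minimal download sketch with huggingface_hub is given below; the dataset repo id is a placeholder (use the actual IDMR dataset ids on Hugging Face), and the local directory is chosen to match the paths expected by the training script.

# Sketch: fetch a released IDMR split. The repo_id is a placeholder.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="<org>/IDMR-train",     # placeholder: replace with the actual dataset id
    repo_type="dataset",
    local_dir="./data/IDMR/train",  # matches DATA_DIR / IMAGE_DIR in the training script
)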

Training

Run the following script to train IDMR.

# Backbone checkpoint and data/output paths
MODEL_NAME=OpenGVLab/InternVL2_5-8B
DATA_DIR=./data/IDMR/train/parquet
IMAGE_DIR=./data/IDMR/train/images
OUTPUT_DIR=./ckpt/IDMR-8B

# Weights & Biases logging (replace with your own key)
WANDB_API_KEY=YOUR_WANDB_API_KEY
wandb login --relogin $WANDB_API_KEY

# LoRA fine-tuning on 8 GPUs with GradCache-chunked contrastive batches
torchrun --nproc_per_node=8 --master_port=22459 --max_restarts=0 train.py \
 --model_name $MODEL_NAME --model_backbone internvl_2_5 --bf16 --pooling last \
 --dataset_name $DATA_DIR \
 --lora_target_modules qkv,wqkv,wo,w1,w2,w3 \
 --subset_name MMEB_train IDMR_train_coco IDMR_train_objects365 IDMR_train_openimages \
 --image_dir $IMAGE_DIR \
 --max_len 1024 --output_dir $OUTPUT_DIR --logging_steps 20 \
 --lr_scheduler_type linear --learning_rate 2e-5 --num_train_epochs 1 \
 --warmup_steps 120 --save_steps 100 --normalize True \
 --temperature 0.02 --per_device_train_batch_size 64 \
 --lora --lora_r 8 \
 --grad_cache True --gc_q_chunk_size 8 --gc_p_chunk_size 8 --wandb True

Evaluation

Please run eval.sh.

  • Evaluates both the in-domain and out-of-domain splits.
  • Evaluation data can be downloaded directly from IDMR-test; a minimal download sketch follows below.
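
A minimal sketch, assuming IDMR-test is hosted as a Hugging Face dataset repo (the exact id below is a placeholder) and that eval.sh reads the data from ./data/IDMR/test:

# Sketch: fetch the evaluation split, then run the evaluation script.
import subprocess
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="<org>/IDMR-test",     # placeholder: replace with the actual dataset id
    repo_type="dataset",
    local_dir="./data/IDMR/test",  # assumed location; adjust to what eval.sh expects
)
subprocess.run(["bash", "eval.sh"], check=True)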

Citation

@article{liu2025idmr,
  title   = {IDMR: Towards Instance-Driven Precise Visual Correspondence in Multimodal Retrieval},
  author  = {Bangwei Liu and Yicheng Bao and Shaohui Lin and Xuhong Wang and Xin Tan and Yingchun Wang and Yuan Xie and Chaochao Lu},
  journal = {arXiv preprint arXiv:2504.00954},
  year    = {2025}
}

Acknowledgement

We adapted code from VLM2Vec, a comprehensive framework for turning MLLMs into embedding models.
