Our work follows the configuration requirements of ViLBERT; please read ViLBERT's README.md for more information.
- Download the datasets, for example MIRFLICKR and COCO.
- For image feature extraction, please refer to ViLBERT's Data Setup.
- Fine-tune the network on the classification task to accelerate training convergence (see the sketch below). For example:
python pre_teacher_main.py --from_pretrained <pretrained_model_path>
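As a rough illustration of this warm-up step, here is a minimal PyTorch sketch of fine-tuning a classification head. The class name, feature dimension, and label count (24, as commonly used for MIRFLICKR's concept labels) are illustrative assumptions, not the actual code in pre_teacher_main.py.

```python
import torch
import torch.nn as nn

# Hypothetical warm-up: fine-tune a classification head on top of the
# backbone's fused features before the hashing stage. The feature dimension
# (1024) and label count (24) are illustrative assumptions.
class ClassificationHead(nn.Module):
    def __init__(self, feature_dim: int, num_labels: int):
        super().__init__()
        self.fc = nn.Linear(feature_dim, num_labels)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.fc(features)

head = ClassificationHead(feature_dim=1024, num_labels=24)
criterion = nn.BCEWithLogitsLoss()  # multi-label datasets use BCE over logits
optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)

features = torch.randn(8, 1024)                 # stand-in for backbone features
labels = torch.randint(0, 2, (8, 24)).float()   # multi-hot labels
optimizer.zero_grad()
loss = criterion(head(features), labels)
loss.backward()
optimizer.step()
```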
- Teacher network: set the "from_pretrained" argument and start training; a sketch of a possible hashing head follows the command.
python train_teacher.py --from_pretrained <pretrained_model_path>
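For intuition, here is a minimal sketch of what a hashing head in the teacher network might look like; the layer structure, feature dimension, and code length are assumptions, not the repository's actual architecture. A tanh relaxation keeps the codes differentiable during training, with sign() quantization applied at retrieval time.

```python
import torch
import torch.nn as nn

# Hypothetical hashing head for the teacher: a linear projection followed by
# tanh, so training works on a continuous relaxation of binary codes.
class HashHead(nn.Module):
    def __init__(self, feature_dim: int, code_len: int):
        super().__init__()
        self.fc = nn.Linear(feature_dim, code_len)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.fc(features))    # codes in (-1, 1)

head = HashHead(feature_dim=1024, code_len=32)
relaxed = head(torch.randn(8, 1024))
binary = torch.sign(relaxed)                    # {-1, +1} codes for retrieval
```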
- Student network: set the teacher network model path and start training; a sketch of a distillation objective follows the command.
python train_student.py --from_pretrained <teacher_network_model_path>
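The student is trained to mimic the teacher's hash codes. Below is a minimal sketch of a plain MSE distillation term under that assumption; the actual objective in train_student.py may combine this with further hashing losses.

```python
import torch
import torch.nn.functional as F

# Hypothetical distillation term: push the student's relaxed hash codes
# toward the teacher's codes with MSE. This is only an illustrative objective,
# not the loss actually implemented in train_student.py.
def distillation_loss(student_codes: torch.Tensor,
                      teacher_codes: torch.Tensor) -> torch.Tensor:
    # The teacher is frozen, so its codes are detached from the graph.
    return F.mse_loss(student_codes, teacher_codes.detach())

student_feats = torch.randn(8, 32, requires_grad=True)
student_codes = torch.tanh(student_feats)       # student's relaxed codes
teacher_codes = torch.sign(torch.randn(8, 32))  # precomputed teacher codes
loss = distillation_loss(student_codes, teacher_codes)
loss.backward()
```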
If you find this work helpful in your research, please cite:
@article{wang2022ckdh,
  title={Crossmodal knowledge distillation hashing},
  author={Wang, Jinhui and Jin, Lu and Li, Zechao and Tang, Jinhui},
  journal={SCIENTIA SINICA Technologica},
  year={2022},
  doi={10.1360/SST-2021-0214}
}
Thanks to the authors of ViLBERT for their help in this work.