# IREG

Official implementation of the paper: "Whether you can locate or not? Interactive Referring Expression Generation"

## 🔥 News

- 2024.3.20: Released the codebase.
- 2023.7.26: Our paper was accepted to the ACM MM 2023 Main Track.

## Step 1: Feature extraction

First, download two sets of data (a download sketch follows the list):

1. COCO2014: train images and train/val annotations
2. RefCOCO: RefCOCO, RefCOCO+, and RefCOCOg
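
The commands below sketch one way to fetch everything. The COCO URLs are the official ones; the RefCOCO mirrors are an assumption (the UNC mirrors used by the `refer` toolkit), so fall back to that toolkit if they move:

```bash
# COCO2014 train images and train/val annotations (official mirrors)
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip

# RefCOCO / RefCOCO+ / RefCOCOg annotations (assumed UNC mirrors)
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip
wget https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip
```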

The feature extraction code is under `misc/feature_extraction` and comes in two variants (see the sketch after this list):

1. Extract features of the 36 proposal bounding boxes fed to the base model: the code is in `refcocog_proposal.py`.
2. Extract features of a given (target) box: the code is in `refcocog_target.py`.
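
A minimal invocation sketch. Both scripts keep their data and output paths internally, so check each file before running; the bare calls below are an assumption:

```bash
cd misc/feature_extraction
python refcocog_proposal.py   # features for the 36 proposal boxes per image
python refcocog_target.py     # features for the given target boxes
```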

Feature extraction requires detectron2; just follow the installation link in the VL-T5 GitHub repository. It can be installed with a single command.
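
For reference, the generic one-liner from the detectron2 install docs (the VL-T5 repo may pin a specific version, so prefer its instructions):

```bash
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
```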

## Pretrained Checkpoints

Download from (a placement sketch follows the list):

1. VLT5 `Epoch30.pth`: https://drive.google.com/drive/folders/12Acv2YLQSxgrx_-4mahUvqNikcz7XfPi
2. OFA RefCOCO, RefCOCO+, and RefCOCOg base checkpoints; details: https://github.com/OFA-Sys/OFA/blob/main/checkpoints.md#finetuning-ofa-base
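
Where you store them is up to you, since the launch scripts take explicit weight paths (see the `scripts` section below). A hypothetical layout, with the OFA file name assumed from its checkpoints page:

```bash
mkdir -p ckpt
mv Epoch30.pth ckpt/              # VLT5 checkpoint
mv refcoco_base_best.pt ckpt/     # OFA base checkpoint (one per dataset; name assumed)
```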

## Environment setup

Python version 3.7.4:

```bash
pip install -r requirements.txt
```
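
For example, with a fresh conda environment (the name `ireg` is our choice, not the repo's):

```bash
conda create -n ireg python=3.7.4
conda activate ireg
pip install -r requirements.txt
```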

## Start

```bash
cd Dialog
bash scripts/REG_VLT5.sh 2 refcoco unc 0 1 25552
```
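
The positional arguments are consumed by `scripts/REG_VLT5.sh`; our reading of them below is unverified, so confirm against the script itself:

```bash
# bash scripts/REG_VLT5.sh <nGPU> <dataset> <splitBy> ... <port>   (assumed)
#   2        -> number of GPUs
#   refcoco  -> dataset (refcoco / refcoco+ / refcocog)
#   unc      -> split convention ("unc" for RefCOCO; RefCOCOg uses "umd"/"google")
#   0 1      -> script-specific flags (check the script)
#   25552    -> port for distributed training
```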

## Project structure

### 1.1 ckpt

Stores all checkpoints produced during training.

### 1.2 misc

Includes feature extraction, bad-referring-expression collection, testing, visualization, draft code, etc.

### 1.3 OFA

A copy of the OFA base repository, with the `refcoco_eval` part modified.

### 1.4 scripts

The training launch scripts; change the data and pretrained-model weight paths accordingly.

### 1.5 src

The main code is here.

#### 1.5.1 eval_utils

The referring-expression test codebase.

#### 1.5.2 modeling

The main model files, written in Hugging Face style.

#### 1.5.3 tools

Utility functions, argument files, distributed-training helpers, the base training framework, etc. all live here.

`reg_data.py`, `reg_model.py`, `reg.py`: mainly responsible for base and RL training. The tests here cover only the most basic one-shot test.

`multitask_reg_data.py`, `multitask_reg_model.py`, `multitask_reg.py` are mainly responsible for:

- Dialog training, which only requires modifying the Dataset; in practice a `DialogDataset` is added to `reg_data`.
- Dialog testing, whose flow is a little more involved because each generated expression must be checked against the OFA grounding model; in practice this is a single function in `multitask_reg_model`.
- The main loop of `multitask_reg.py`, which changes the logic so that one extra base model serves the test: in `multitask_reg`, `self.model` is the refiner and `basemodel` is the base model.

## ✒ Citation

Please cite our paper if you find it helpful :)

```bibtex
@misc{ye2023locate,
      title={Whether you can locate or not? Interactive Referring Expression Generation},
      author={Fulong Ye and Yuxing Long and Fangxiang Feng and Xiaojie Wang},
      year={2023},
      eprint={2308.09977},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
