ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos (NeurIPS 2024 D&B Poster Paper)

ReXTime is designed to test AI models' temporal reasoning within video events, focusing on understanding cause and effect across different video segments. It provides 921 validation samples and 2,143 test samples.

|🏠Project Page | 🐙Github | 🤗Huggingface Dataset | 🏆Leaderboard | 📖Paper |

[Teaser figure]

Table of Contents

  Getting Started
  Inference Demo
  Evaluation
  Acknowledgement
  License
  Cite

Getting Started

Clone this repo

git clone https://github.com/ReXTime/ReXTime.git
cd ReXTime

Clone dataset from Huggingface

git clone https://huggingface.co/datasets/ReXTime/ReXTime
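
If the annotations on the Hub are in a standard format, they may also be loadable directly with the Hugging Face datasets library instead of cloning. This is an assumption on our part, and the git clone above is the documented route.

# An assumption, not a documented path: load the Hub dataset directly.
from datasets import load_dataset

ds = load_dataset("ReXTime/ReXTime")  # fetches the annotation files from the Hub
print(ds)                             # inspect the available splits and fields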

Source video downloading

  1. ActivityNet

Download the raw video data from the Download page at the ActivityNet official website. You need to fill in their request form to get 7-day access to download the videos from the drive folders. You can find the form here.

  2. QVHighlights

Download the raw video data from the link provided by Moment-DETR and extract the archive:

wget https://nlp.cs.unc.edu/data/jielei/qvh/qvhilights_videos.tar.gz
tar -xvzf qvhilights_videos.tar.gz

Directory structure

.
├── videos/                                     # Path to the QVHighlights raw videos, can be anywhere.
│   ├── 9c_w8HU3hqc_210.0_360.0.mp4             # Video 1
│   └── efCSWDWjm6g_360.0_510.0.mp4             # Video 2
├── Anet_videos_15fps_short256/                 # Path to the ActivityNet raw videos, can be anywhere.
│   ├── v_5R3h6lxne90.mp4                       # Video 1
│   └── v_aQ-F9wr0HQ4.mp4                       # Video 2
├── ReXTime/                                    # Code repo
│   ├── ReXTime/                                # Huggingface dataset repo
│   ├── evaluation/                             # Evaluation code
│   ├── demo/                                   # Inference demo script
│   └── requirements.txt                        # Packages for environment
...
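
Before running the demos, it can help to confirm the layout above. The snippet below is a minimal sanity check we add for illustration (not part of the repo); the two paths are the example locations from the tree and should be adjusted to yours.

# Minimal sanity check (illustration only): count the raw videos in each directory.
from pathlib import Path

anet_dir = Path("Anet_videos_15fps_short256")   # ActivityNet raw videos
qvh_dir = Path("videos")                        # QVHighlights raw videos

for name, vid_dir in [("ActivityNet", anet_dir), ("QVHighlights", qvh_dir)]:
    count = len(list(vid_dir.glob("*.mp4"))) if vid_dir.is_dir() else 0
    print(f"{name}: {vid_dir} contains {count} .mp4 files")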

Install dependencies

conda create --name=rextime python=3.10 -y
conda activate rextime
pip install -r requirements.txt

Inference Demo

Here we provide an inference demo for open-source models and one for proprietary models. In the scripts below, you need to modify the path to the dataset repo and the paths to the two raw-video directories. For the proprietary-model demo, you also need to fill in your API key.

Open source MLLM demo:

python ./demo/inference.py \
    --dataset_path ./ReXTime \
    --anet_vid_dir ${Path to the ActivityNet video directory} \
    --qvh_vid_dir ${Path to the QVHighlights video directory}

Proprietary MLLM demo:

OPENAI_API_KEY="sk-***********************************" python ./demo/request.py \
    --dataset_path ./ReXTime \
    --anet_vid_dir ${Path to the ActivityNet video directory} \
    --qvh_vid_dir ${Path to the QVHighlights video directory}

Evaluation

Below is an example of an output/submission file in .jsonl format. For the moment grounding assessment, you only need to provide "qid" and "pred_relevant_windows". For the multiple-choice VQA assessment, you only need to provide "qid" and "ans". For the grounding VQA assessment, you need to provide "qid", "pred_relevant_windows", and "ans"; here the predicted answer should be conditioned on the predicted time span.

{"qid": "anet_val384", "pred_relevant_windows": [[0.0, 15.8304]], "ans": "A"}
{"qid": "qvh_val114", "pred_relevant_windows": [[0.0, 25.50]], "ans": "A"}
...
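
If you generate predictions in Python, writing the file is one JSON object per line. The sketch below is illustrative; the predictions list is a hypothetical stand-in for your model's outputs, using the three keys in the format shown above.

# Illustrative sketch: write predictions as one JSON object per line (.jsonl).
import json

predictions = [  # hypothetical model outputs in the required format
    {"qid": "anet_val384", "pred_relevant_windows": [[0.0, 15.8304]], "ans": "A"},
    {"qid": "qvh_val114", "pred_relevant_windows": [[0.0, 25.50]], "ans": "A"},
]

with open("submission.jsonl", "w") as f:
    for pred in predictions:
        f.write(json.dumps(pred) + "\n")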

Modify the file paths in the following and run:

python ./evaluation/rextime_eval.py \
    --submission_path ${submission_path} \
    --gt_path ${gt_path} \
    --save_path ${save_path}

We only provide the ground-truth file for the validation set, in 'data/rextime_val.jsonl'. To evaluate on the test set, please submit your prediction file to the ReXTime Leaderboard.
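
For intuition on how moment grounding is typically scored, the sketch below computes the temporal IoU between a predicted window and a ground-truth window. It is an illustration of the standard metric, not the benchmark's exact implementation; see evaluation/rextime_eval.py for the authoritative scoring.

# Temporal IoU between two [start, end] spans in seconds (illustration only).
def temporal_iou(pred, gt):
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

print(temporal_iou([0.0, 15.8304], [5.0, 20.0]))  # ~0.54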

Acknowledgement

License

The annotation files are under the CC BY-NC-SA 4.0 license. All the code is under the MIT license; see LICENSE.

Cite

BibTeX:

@article{chen2024rextime,
  title={ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos},
  author={Chen, Jr-Jen and Liao, Yu-Chien and Lin, Hsi-Che and Yu, Yu-Chu and Chen, Yen-Chun and Wang, Yu-Chiang Frank},
  journal={arXiv preprint arXiv:2406.19392},
  year={2024}
}
