Official implementation of "Character-Centric Understanding of Animated Movies" (ACM MM 2025).

zhrgui/Animated_AD

Character-Centric Understanding of Animated Movies

Zhongrui Gui1, Junyu Xie1, Tengda Han1, Weidi Xie2, Andrew Zisserman1

1 Visual Geometry Group (VGG), University of Oxford
2 School of Artificial Intelligence (SAI), Shanghai Jiao Tong University

Project page

Method and Evaluation

In this work, we automatically construct an audio-visual character bank to enable audio-visual recognition of animated characters. We then use the recognition results for two downstream tasks: Audio Description (AD) Generation and Character-Aware Subtitling. The main components are listed below.

Pipeline

  • See here for constructing the Audio-Visual Character Bank.
  • See here for Audio-Visual Recognition for Animated Characters.
  • See here for Application on Downstream Tasks.

Evaluation

  • Videos can be downloaded here.
  • All annotations and the corresponding meta-information can be found here.
  • Evaluation scripts, including Character Box mIoU, Character Name AP, and Audio Recognition AP, can be found here. For CRITIC and CIDEr, please refer to the original AutoAD repository.
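
For readers unfamiliar with the metric names above, the following is a minimal sketch of how a box mIoU and an average precision (AP) score are commonly computed. It is not the repository's evaluation script. The function names, the (x1, y1, x2, y2) box format, the one-to-one pairing of predicted and ground-truth boxes, and the non-interpolated AP definition are all assumptions made for illustration; please use the linked evaluation scripts for reported numbers.

```python
from typing import Sequence

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Sequence[float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], gts: Sequence[Box]) -> float:
    """Mean IoU over boxes that are assumed to be already paired one-to-one."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    return sum(box_iou(p, g) for p, g in zip(preds, gts)) / len(preds)


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP: mean of the precision at each positive item,
    with items ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))          # overlap 1, union 7
    print(average_precision([0.9, 0.8, 0.7], [1, 0, 1]))
```

The actual scripts may differ in how predictions are matched to ground truth and in how AP is interpolated, so numbers from this sketch are not comparable to the paper's results.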

Predicted Results

  • The visual character recognition results can be downloaded here.
  • The audio character recognition results can be downloaded here.
  • The AD predictions (by Qwen2-VL w/ LLaMA3 or VideoLLaMA2 w/ LLaMA3) can be downloaded here.
  • The character-aware subtitling results can be downloaded here.

Installation

The base environment builds mainly on DINOv2 and SAM2. To set up the required dependencies, follow the instructions below:

conda env create -f conda.yaml
conda activate animated_ad

cd ..
git clone https://github.com/facebookresearch/sam2.git && cd sam2
pip install -e .

This environment is set up for automatic construction of the character bank and for visual character recognition.
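
After installation, a quick check that the key packages are importable can save debugging time later. The sketch below is a hypothetical helper, not part of the repository. It assumes that `conda.yaml` installs PyTorch under the module name `torch` and that the editable SAM2 install exposes a top-level `sam2` module; adjust the list if your setup differs.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report whether each top-level module can be found, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names are assumptions; see the lead-in above.
    for name, found in check_modules(["torch", "sam2"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it from outside the cloned `sam2` directory, so that a local folder with the same name is not mistaken for the installed package.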

Citation

If you find this repository helpful, please consider citing our work! 😊

@article{gui2025character,
  title={Character-Centric Understanding of Animated Movies},
  author={Gui, Zhongrui and Xie, Junyu and Han, Tengda and Xie, Weidi and Zisserman, Andrew},
  journal={arXiv preprint arXiv:2509.12204},
  year={2025}
}

References

AutoAD-Zero: https://github.com/Jyxarthur/AutoAD-Zero
Qwen2-VL: https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct
LLaMA3: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
