Character-Centric Understanding of Animated Movies

Zhongrui Gui¹, Junyu Xie¹, Tengda Han¹, Weidi Xie², Andrew Zisserman¹

¹ Visual Geometry Group (VGG), University of Oxford
² School of Artificial Intelligence (SAI), Shanghai Jiao Tong University

Method and Evaluation

In this work, we automatically construct an audio-visual character bank to enable audio-visual recognition of animated characters. We then use the recognition results for two downstream tasks: Audio Description (AD) Generation and Character-Aware Subtitling. The main components are listed below.

Pipeline

See here for constructing the Audio-Visual Character Bank.
See here for Audio-Visual Recognition for Animated Characters.
See here for Application on Downstream Tasks.
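
To illustrate the general idea of recognising a character against a bank, here is a minimal sketch. It is not the code in this repository: it assumes that each character is stored as a list of exemplar feature vectors and that a query is assigned to the character whose exemplar has the highest cosine similarity. The names `cosine_similarity`, `recognise`, `toy_bank`, and the `threshold` value are all hypothetical, and the actual pipeline may match features differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognise(query, bank, threshold=0.5):
    """Return (name, score) of the best-matching character.

    `bank` maps a character name to a list of exemplar vectors.
    The name is None when no exemplar reaches `threshold`
    (an assumed rejection rule, not taken from the paper).
    """
    best_name, best_score = None, -1.0
    for name, exemplars in bank.items():
        for exemplar in exemplars:
            score = cosine_similarity(query, exemplar)
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy bank with made-up 3-dimensional features (illustration only).
toy_bank = {
    "character_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "character_b": [[0.0, 1.0, 0.0]],
}
name, score = recognise([0.8, 0.2, 0.0], toy_bank)
print(name, round(score, 3))
```

In practice the vectors would come from a visual or audio encoder rather than being written by hand; the lookup logic stays the same.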

Evaluation

Videos can be downloaded here.
All annotations and the corresponding meta-information can be found here.
Evaluation scripts for Character Box mIoU, Character Name AP, and Audio Recognition AP can be found here. For CRITIC and CIDEr, please refer to the original AutoAD repository.
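
For readers unfamiliar with these metrics, the sketch below shows one common way to compute box IoU, mean IoU, and average precision. It is an illustration under stated assumptions, not the repository's evaluation script: boxes are assumed to be `(x1, y1, x2, y2)` tuples, predictions are assumed to be already paired one-to-one with ground-truth boxes, and AP is the non-interpolated variant. The function names are hypothetical, and the released scripts may use different conventions.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(pred_boxes, gt_boxes):
    """Mean IoU over boxes that are already paired index by index."""
    if len(pred_boxes) != len(gt_boxes):
        raise ValueError("pred_boxes and gt_boxes must have equal length")
    if not gt_boxes:
        return 0.0
    total = sum(box_iou(p, g) for p, g in zip(pred_boxes, gt_boxes))
    return total / len(gt_boxes)


def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy inputs (made up for illustration).
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1])
print(iou, ap)
```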

Predicted Results

The visual character recognition results can be downloaded here.
The audio character recognition results can be downloaded here.
The AD predictions (by Qwen2-VL w/ LLaMA3 or VideoLLaMA2 w/ LLaMA3) can be downloaded here.
The character-aware subtitling results can be downloaded here.

Installation

The base environment builds mainly on DINOv2 and SAM2. To set up the required dependencies, follow the instructions below:

conda env create -f conda.yaml
conda activate animated_ad

cd ..
git clone https://github.com/facebookresearch/sam2.git && cd sam2
pip install -e .

This environment is set up for automatic construction of the character bank and for visual character recognition.
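
After installation, a quick check like the one below can confirm that the key packages are importable. This is a hypothetical helper, not part of the repository. It assumes the importable module names are `torch` and `sam2` (the name used by the SAM2 package once installed); adjust the list to match your setup.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed module names; edit this list for your environment.
status = check_modules(["torch", "sam2"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

The check only looks the modules up without importing them, so it runs quickly and does not need a GPU.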

Citation

If you find this repository helpful, please consider citing our work! 😊

@article{gui2025character,
  title={Character-Centric Understanding of Animated Movies},
  author={Gui, Zhongrui and Xie, Junyu and Han, Tengda and Xie, Weidi and Zisserman, Andrew},
  journal={arXiv preprint arXiv:2509.12204},
  year={2025}
}

References

AutoAD-Zero: https://github.com/Jyxarthur/AutoAD-Zero
Qwen2-VL: https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct
LLaMA3: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
