Official PyTorch implementation for the paper:
DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations, CVPR 2025.
Ziqiao Peng, Yanbo Fan, Haoyu Wu, Xuan Wang, Hongyan Liu, Jun He, Zhaoxin Fan
Paper | Project Page | Code
Comparison of single-role models (Speaker-Only and Listener-Only) with DualTalk. Unlike single-role models, which lack key interaction elements, DualTalk supports transitions between speaking and listening roles, multi-round conversations, and natural interaction.
Environment requirements:
- Linux
- Python 3.6+
- PyTorch 1.12.1
- CUDA 11.3
- ffmpeg
- MPI-IS/mesh
- pytorch3d
Clone the repo:
git clone https://github.com/ZiqiaoPeng/DualTalk.git
cd DualTalk
Create the conda environment:
conda create -n dualtalk python=3.8.8
conda activate dualtalk
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
python install_pytorch3d.py
Before installation, you need to create an account on the FLAME website and have your login and password ready; the installation script will ask you to provide them. Then run the install.sh script to download the FLAME data and install the environment.
bash install.sh
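After installation, a minimal sanity check (not part of the repo) can confirm that the CUDA build of PyTorch and pytorch3d import correctly:
# sanity_check.py -- hypothetical helper, not shipped with the repo
import torch
import pytorch3d

print("torch:", torch.__version__)            # expected: 1.12.1+cu113
print("CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)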
Download the pretrained model:
# If you are in China, you can set up a mirror.
# export HF_ENDPOINT=https://hf-mirror.com
pip install huggingface-hub
huggingface-cli download ZiqiaoPeng/DualTalk --local-dir model
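Equivalently, if you prefer the Python API over the CLI, the same download can be done with huggingface_hub (a sketch; the CLI command above is the intended route):
# download_model.py -- equivalent to the CLI command above
from huggingface_hub import snapshot_download

snapshot_download(repo_id="ZiqiaoPeng/DualTalk", local_dir="model")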
Given the audio and blendshape data, run:
python demo.py --audio1_path ./demo/xkHwlcDSOjc_sub_video_109_000_speaker2.wav --audio2_path ./demo/xkHwlcDSOjc_sub_video_109_000_speaker1.wav --bs2_path ./demo/xkHwlcDSOjc_sub_video_109_000_speaker1.npz
The results will be saved to the result_DualTalk folder.
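If you want to see what the conditioning .npz file contains before running the demo, a quick inspection sketch (it simply lists whatever arrays are stored; no key names are assumed):
# inspect_npz.py -- list the arrays stored in a FLAME/blendshape .npz file
import numpy as np

data = np.load("./demo/xkHwlcDSOjc_sub_video_109_000_speaker1.npz")
for key in data.files:
    print(key, data[key].shape, data[key].dtype)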
Download the dataset from DualTalk_Dataset, unzip it, and place it in the data folder. The data folder format is as follows:
data/
├── train/
│   ├── xxx_speaker1.wav   # Speaker 1 audio files
│   ├── xxx_speaker1.npz   # Speaker 1 FLAME parameters
│   ├── xxx_speaker2.wav   # Speaker 2 audio files
│   ├── xxx_speaker2.npz   # Speaker 2 FLAME parameters
│   └── ...
├── test/
│   ├── xxx_speaker1.wav
│   ├── xxx_speaker1.npz
│   ├── xxx_speaker2.wav
│   ├── xxx_speaker2.npz
│   └── ...
└── ood/                   # Out-of-distribution test data
    ├── xxx_speaker1.wav
    ├── xxx_speaker1.npz
    ├── xxx_speaker2.wav
    ├── xxx_speaker2.npz
    └── ...
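For reference, a hypothetical sketch (not part of the training code) of how each conversation's speaker1/speaker2 files can be paired by their shared filename prefix:
# pair_data.py -- hypothetical helper: group each conversation's four files
from pathlib import Path

def pair_conversations(split_dir):
    """Map conversation id -> the two speakers' audio and FLAME files."""
    split_dir = Path(split_dir)
    pairs = {}
    for npz in sorted(split_dir.glob("*_speaker1.npz")):
        conv_id = npz.name[: -len("_speaker1.npz")]
        pairs[conv_id] = {
            "audio1": split_dir / f"{conv_id}_speaker1.wav",
            "flame1": split_dir / f"{conv_id}_speaker1.npz",
            "audio2": split_dir / f"{conv_id}_speaker2.wav",
            "flame2": split_dir / f"{conv_id}_speaker2.npz",
        }
    return pairs

print(len(pair_conversations("data/train")), "training conversations found")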
- To train the model, run:
  python main.py
  You can find the trained models in the save_DualTalk folder.
- To test the model, run:
  python test.py
  The results will be saved to the result_DualTalk folder.
- To visualize the results, run:
  cd render
  python render_dualtalk_output.py
  You can find the outputs in the result_DualTalk folder.
- To stitch the two speakers' videos, run (an illustrative ffmpeg sketch follows after this list):
  cd render
  python two_person_video_stitching.py
- To evaluate the model, run:
  cd metric
  python metric.py
  You can find the metric results in the metric folder.
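As referenced in the stitching step above, one way to combine two rendered videos side by side is a plain ffmpeg hstack call. The sketch below is only an illustration of the idea; the file names are placeholders and it may not match what two_person_video_stitching.py actually does:
# stitch_sketch.py -- illustrative only; file names are placeholders
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-i", "speaker1_render.mp4",   # rendered video of speaker 1
    "-i", "speaker2_render.mp4",   # rendered video of speaker 2
    "-filter_complex", "[0:v][1:v]hstack=inputs=2[v]",
    "-map", "[v]",                 # audio handling omitted for brevity
    "two_person.mp4",
], check=True)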
If you find this code useful, please consider citing:
@inproceedings{peng2025dualtalk,
  title={DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations},
author={Peng, Ziqiao and Fan, Yanbo and Wu, Haoyu and Wang, Xuan and Liu, Hongyan and He, Jun and Fan, Zhaoxin},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={21055--21064},
year={2025}
}