This repository provides a demo implementation of MoCha: Towards Movie-Grade Talking Character Synthesis, built on top of HunyuanVideo.
We fine-tune HunyuanVideo on the Hallo3 dataset. Due to differences in training data, model scale, and training strategy, this demo does not fully reproduce the performance of the original MoCha model, but it reflects the core design and serves as a baseline for further research and study.
This implementation supports two generation modes:
- st2v: speech + text → video
- sti2v: speech + image + text → video
Many thanks to the community for sharing: an emotional narrative, created with light manual editing of clips generated by MoCha, has surpassed 1 million views on X.
- 🔥[2025-12-27]: Released a demo implementation built on HunyuanVideo — 🤗 Checkpoints and Code
conda env create -f environment.yml
conda activate mocha
This environment is tested with:
- Python 3.11
- PyTorch 2.4.1 + CUDA 12.1
- diffusers 0.36.0
- transformers 4.49.0
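To confirm your environment matches the tested versions, a short check script can help. The snippet below is a minimal sketch (the file name sanity_check.py is just an example) and assumes the packages above are already installed.

```python
# sanity_check.py -- optional sketch to confirm the tested versions listed above.
import sys
import torch
import diffusers
import transformers

print("Python        :", sys.version.split()[0])   # expected 3.11.x
print("PyTorch       :", torch.__version__)        # expected 2.4.1
print("CUDA build    :", torch.version.cuda)       # expected 12.1
print("CUDA available:", torch.cuda.is_available())
print("diffusers     :", diffusers.__version__)    # expected 0.36.0
print("transformers  :", transformers.__version__) # expected 4.49.0
```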
Download the MoCha transformer checkpoint to a local path:
python download_ckpt.py
After downloading, record the local path to the checkpoint file (e.g. model.ckpt).
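If you prefer to fetch the checkpoint manually instead of running download_ckpt.py, something like the following works with huggingface_hub. Note that the repo id and filename below are placeholders, not the actual values used by download_ckpt.py; check that script for the real ones.

```python
# Hypothetical manual download; repo_id and filename are placeholders --
# see download_ckpt.py for the actual values.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="your-org/MoCha-demo",  # placeholder
    filename="model.ckpt",          # placeholder
    local_dir="./checkpoints",
)
print("Checkpoint saved to:", ckpt_path)
```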
Speech + Text → Video (st2v)
python inference.py \
--task st2v \
--audio_path demos/man_1.mp3 \
--output_path demos/output.mp4 \
--transformer_ckpt_path /path/to/your/model.ckpt
Speech + Image + Text → Video (sti2v)
python inference.py \
--task sti2v \
--audio_path demos/man_1.mp3 \
--i2v_img_path demos/man_1.png \
--output_path demos/output.mp4 \
--transformer_ckpt_path /path/to/your/model.ckpt
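For batch runs, the two commands above can be wrapped in a small driver script. The sketch below simply shells out to inference.py with the same flags shown above; the job list, output names, and checkpoint path are illustrative.

```python
# run_batch.py -- minimal sketch for driving inference.py over several inputs.
# Only the CLI flags match the commands above; paths are illustrative.
import subprocess

CKPT = "/path/to/your/model.ckpt"

jobs = [
    # (task, audio, reference image or None, output)
    ("st2v",  "demos/man_1.mp3", None,              "demos/output_st2v.mp4"),
    ("sti2v", "demos/man_1.mp3", "demos/man_1.png", "demos/output_sti2v.mp4"),
]

for task, audio, image, output in jobs:
    cmd = [
        "python", "inference.py",
        "--task", task,
        "--audio_path", audio,
        "--output_path", output,
        "--transformer_ckpt_path", CKPT,
    ]
    if image is not None:  # sti2v additionally takes a reference image
        cmd += ["--i2v_img_path", image]
    subprocess.run(cmd, check=True)
```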
🌟 If you find this work useful, please consider citing:
@article{wei2025mocha,
title={MoCha: Towards Movie-Grade Talking Character Synthesis},
author={Wei, Cong and Sun, Bo and Ma, Haoyu and Hou, Ji and Juefei-Xu, Felix and He, Zecheng and Dai, Xiaoliang and Zhang, Luxin and Li, Kunpeng and Hou, Tingbo and others},
journal={arXiv preprint arXiv:2503.23307},
year={2025}
}