RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
Ye Fang*,
Zeyi Sun*,
Shangzhan Zhang,
Tong Wu,
Yinghao Xu,
Pan Zhang,
Jiaqi Wang,
Gordon Wetzstein,
Dahua Lin
*Equal Contribution
✨ [2025/4/2] The GitHub code, project page, video, and Hugging Face demo are released!
✨ [2025/1/27] We released the RelightVid paper!
- 🔥 We propose RelightVid, a flexible framework for realistic and temporally consistent video relighting that outperforms existing baselines.
- 🔥 We build LightAtlas, a large-scale video dataset with real-world and 3D-rendered lighting pairs to provide rich illumination priors.
- 🔥 By incorporating temporal layers, the framework ensures strong frame-to-frame consistency while maintaining high-quality relighting throughout the video.
- 🔥 Supporting diverse inputs such as text prompts, background videos, and HDR maps, RelightVid allows for versatile and adaptive lighting manipulation across various video scenarios.
```bash
git clone https://github.com/Aleafy/RelightVid.git
cd RelightVid
conda create -n relitv python=3.10
conda activate relitv
pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```

To reproduce the results, please download the following models and organize them according to the directory structure below:
You can download all the models here.
```
RelightVid
├── models
│   ├── realistic-vision-v51                 // stable diffusion base model
│   │   ├── text_encoder
│   │   │   ├── config.json
│   │   │   └── model.safetensors
│   │   ├── tokenizer
│   │   │   ├── merges.txt
│   │   │   ├── special_tokens_map.json
│   │   │   ├── tokenizer_config.json
│   │   │   └── vocab.json
│   │   ├── unet
│   │   │   └── diffusion_pytorch_model.safetensors
│   │   ├── vae
│   │   │   ├── config.json
│   │   │   └── diffusion_pytorch_model.safetensors
│   ├── iclight_sd15_fbc.safetensors         // ic-light weights
│   ├── relvid_mm_sd15_fbc.pth               // relightvid motion weights
```
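Before launching inference, it can save time to verify that the checkpoints are actually in place. A minimal sketch, assuming you run it from the repository root (the paths simply mirror the tree above):

```shell
# Check that the required model files from the directory tree above exist.
for f in \
    models/realistic-vision-v51/unet/diffusion_pytorch_model.safetensors \
    models/realistic-vision-v51/vae/diffusion_pytorch_model.safetensors \
    models/iclight_sd15_fbc.safetensors \
    models/relvid_mm_sd15_fbc.pth
do
    if [ -f "$f" ]; then
        echo "ok:      $f"
    else
        echo "missing: $f"
    fi
done
```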
```bash
python inference.py \
    --input "./assets/input/lion.mp4" \
    --mask "./assets/mask/lion" \
    --bg_cond "./assets/video_bg/stage_light2.mp4" \
    --output_path "output/lion_stagelight2.mp4"
```

Here, `--input` is the original video, `--mask` is the per-frame foreground mask directory, `--bg_cond` is the background lighting condition video, and `--output_path` is where the relit result will be saved. You can freely combine any input video with any background video.
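Because any input can be paired with any background, batch relighting is just a nested loop over the two asset sets. A hypothetical sketch (the clip and background names are illustrative, and the commands are echoed as a dry run rather than executed):

```shell
# Dry run: print one inference command per (input clip, background) pair.
# Clip and background names below are examples; substitute your own assets.
for clip in lion woman; do
    for bg in universe1 beach; do
        echo python inference.py \
            --input "./assets/input/${clip}.mp4" \
            --mask "./assets/mask/${clip}" \
            --bg_cond "./assets/video_bg/${bg}.mp4" \
            --output_path "output/${clip}_${bg}.mp4"
    done
done
```

Dropping the `echo` turns the dry run into an actual batch job.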
Click for more example bash commands of RelightVid
```bash
python inference.py --input "./assets/input/woman.mp4" --mask "./assets/mask/woman" --bg_cond "./assets/video_bg/universe1.mp4" --output_path "output/woman_universe1.mp4"
python inference.py --input "./assets/input/woman.mp4" --mask "./assets/mask/woman" --bg_cond "./assets/video_bg/beach.mp4" --output_path "output/woman_beach.mp4"
python inference.py --input "./assets/input/man.mp4" --mask "./assets/mask/man" --bg_cond "./assets/video_bg/tunnel.mp4" --output_path "output/man_tunnel.mp4"
python inference.py --input "./assets/input/man2.mp4" --mask "./assets/mask/man2" --bg_cond "./assets/video_bg/fantasy.mp4" --output_path "output/man2_fantasy.mp4"
python inference.py --input "./assets/input/lion.mp4" --mask "./assets/mask/lion" --bg_cond "./assets/video_bg/stage_light1.mp4" --output_path "output/lion_stagelight1.mp4"
python inference.py --input "./assets/input/truck.mp4" --mask "./assets/mask/truck" --bg_cond "./assets/video_bg/universe3.mp4" --output_path "output/truck_universe3.mp4"
python inference.py --input "./assets/input/truck.mp4" --mask "./assets/mask/truck" --bg_cond "./assets/video_bg/universe1.mp4" --output_path "output/truck_universe1.mp4"
python inference.py --input "./assets/input/glass.mp4" --mask "./assets/mask/glass" --bg_cond "./assets/video_bg/snow.mp4" --output_path "output/glass_snow.mp4"
python inference.py --input "./assets/input/dance.mp4" --mask "./assets/mask/dance" --bg_cond "./assets/video_bg/sunscape.mp4" --output_path "output/dance_sunscape.mp4"
```

- [✔] Release the arXiv paper and the project page.
- [✔] Release the relighting inference code and Hugging Face demo using the background condition.
- [ ] Release the relighting inference code using more (text/HDR) conditions. (Expected in May)
- [ ] Release the LightAtlas data augmentation pipeline and datasets. (Expected by the end of June)
- [ ] Release training and evaluation code.
If you find our work helpful for your research, please consider giving us a star ⭐ and a citation 📝:
```bibtex
@article{fang2025relightvid,
  title={RelightVid: Temporal-Consistent Diffusion Model for Video Relighting},
  author={Fang, Ye and Sun, Zeyi and Zhang, Shangzhan and Wu, Tong and Xu, Yinghao and Zhang, Pan and Wang, Jiaqi and Wetzstein, Gordon and Lin, Dahua},
  journal={arXiv preprint arXiv:2501.16330},
  year={2025}
}
```
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
- AnimateDiff: A text-to-video diffusion framework that extends image diffusion models with lightweight temporal modules for generating coherent and controllable animations.
- IC-Light: Enables intuitive lighting control in image generation by integrating editable illumination conditions into diffusion models.
- Relightful Harmonization: Introduces a method for consistent subject relighting in diverse scenes using synthetic data and background guidance.
- Switchlight: Performs intrinsic decomposition of portraits into identity and lighting components, enabling flexible and realistic relighting through disentangled editing.