RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
Ye Fang*, Zeyi Sun*, Shangzhan Zhang, Tong Wu, Yinghao Xu, Pan Zhang, Jiaqi Wang, Gordon Wetzstein, Dahua Lin

*Equal Contribution

📜 News

✨ [2025/4/2] The GitHub code, project page, video, and Hugging Face demo are released!

✨ [2025/1/27] We released the RelightVid paper on arXiv!

💡 Highlights

  • 🔥 We propose RelightVid, a flexible framework for realistic and temporally consistent video relighting that outperforms existing baselines.
  • 🔥 We build LightAtlas, a large-scale video dataset of real-world and 3D-rendered lighting pairs that provides rich illumination priors.
  • 🔥 Temporal layers give the framework strong frame-to-frame consistency while maintaining high-quality relighting throughout the video.
  • 🔥 By supporting diverse inputs such as text prompts, background videos, and HDR maps, RelightVid enables versatile, adaptive lighting manipulation across a wide range of video scenarios.

💾 Installation

git clone https://github.com/Aleafy/RelightVid.git
cd RelightVid

conda create -n relitv python=3.10 
conda activate relitv

pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
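If the install succeeded, a quick sanity check can catch version mismatches before the first run. The sketch below is illustrative (the `check_env` helper is not part of the repo); it verifies the Python version and that PyTorch can see a CUDA device, matching the cu118 wheels installed above:

```python
# Minimal environment sanity check (illustrative helper, not part of RelightVid).
import sys

def check_env(min_python=(3, 10)):
    """Return a list of human-readable problems with the current environment."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ expected, "
            f"found {sys.version.split()[0]}"
        )
    try:
        import torch
        if not torch.cuda.is_available():
            problems.append("CUDA is not available to PyTorch (cu118 wheels expected)")
    except ImportError:
        problems.append("torch is not installed")
    return problems

print(check_env() or "environment looks OK")
```

An empty list means the environment matches the setup above; otherwise each entry names one thing to fix.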

📦 Model Preparation

To reproduce the results, please download the following models and organize them according to the directory structure below:

You can download all the models from here.

RelightVid
├── models
│   ├── realistic-vision-v51                              // stable diffusion base model
│   │   ├── text_encoder
│   │   │   ├── config.json
│   │   │   └── model.safetensors
│   │   ├── tokenizer
│   │   │   ├── merges.txt
│   │   │   ├── special_tokens_map.json
│   │   │   ├── tokenizer_config.json
│   │   │   └── vocab.json
│   │   ├── unet
│   │   │   └── diffusion_pytorch_model.safetensors
│   │   ├── vae
│   │   │   ├── config.json
│   │   │   └── diffusion_pytorch_model.safetensors
│   ├── iclight_sd15_fbc.safetensors                      // ic-light weights
│   ├── relvid_mm_sd15_fbc.pth                            // relightvid motion weights
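As a convenience, the layout above can be verified programmatically before running inference. This is an illustrative sketch (the `missing_models` helper is not part of the repo); the file list is taken directly from the directory tree above:

```python
# Check that the downloaded checkpoints match the expected layout
# (illustrative helper, not part of RelightVid; file names from the tree above).
from pathlib import Path

EXPECTED = [
    "models/realistic-vision-v51/text_encoder/model.safetensors",
    "models/realistic-vision-v51/tokenizer/vocab.json",
    "models/realistic-vision-v51/unet/diffusion_pytorch_model.safetensors",
    "models/realistic-vision-v51/vae/diffusion_pytorch_model.safetensors",
    "models/iclight_sd15_fbc.safetensors",
    "models/relvid_mm_sd15_fbc.pth",
]

def missing_models(root="."):
    """Return the expected model files that are absent under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]

if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing model files:", *missing, sep="\n  ")
    else:
        print("All expected model files found.")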

⚡ Quick Start

Perform video relighting with a customized background condition:

python inference.py \
  --input "./assets/input/lion.mp4" \
  --mask "./assets/mask/lion" \
  --bg_cond "./assets/video_bg/stage_light2.mp4" \
  --output_path "output/lion_stagelight2.mp4"

Here, --input is the original video, --mask is the per-frame foreground mask directory, --bg_cond is the background lighting condition video, and --output_path is where the relit result will be saved. You can freely combine any input video with any background video.

More example bash commands for RelightVid:
python inference.py --input "./assets/input/woman.mp4" --mask "./assets/mask/woman" --bg_cond "./assets/video_bg/universe1.mp4" --output_path "output/woman_universe1.mp4"
python inference.py --input "./assets/input/woman.mp4" --mask "./assets/mask/woman" --bg_cond "./assets/video_bg/beach.mp4" --output_path "output/woman_beach.mp4"
python inference.py --input "./assets/input/man.mp4" --mask "./assets/mask/man" --bg_cond "./assets/video_bg/tunnel.mp4" --output_path "output/man_tunnel.mp4"
python inference.py --input "./assets/input/man2.mp4" --mask "./assets/mask/man2" --bg_cond "./assets/video_bg/fantasy.mp4" --output_path "output/man2_fantasy.mp4"
python inference.py --input "./assets/input/lion.mp4" --mask "./assets/mask/lion" --bg_cond "./assets/video_bg/stage_light1.mp4" --output_path "output/lion_stagelight1.mp4"
python inference.py --input "./assets/input/truck.mp4" --mask "./assets/mask/truck" --bg_cond "./assets/video_bg/universe3.mp4" --output_path "output/truck_universe3.mp4"
python inference.py --input "./assets/input/truck.mp4" --mask "./assets/mask/truck" --bg_cond "./assets/video_bg/universe1.mp4" --output_path "output/truck_universe1.mp4"
python inference.py --input "./assets/input/glass.mp4" --mask "./assets/mask/glass" --bg_cond "./assets/video_bg/snow.mp4" --output_path "output/glass_snow.mp4"
python inference.py --input "./assets/input/dance.mp4" --mask "./assets/mask/dance" --bg_cond "./assets/video_bg/sunscape.mp4" --output_path "output/dance_sunscape.mp4"
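To relight several videos in one go, the commands above can be generated from (input, background) pairs, since the asset paths follow a fixed pattern. The sketch below is illustrative (the `build_cmd` helper is hypothetical; the asset names are the repo's example files) and only prints each command rather than executing it:

```python
# Build inference.py command lines for a batch of (input, background) pairs
# (illustrative helper, not part of RelightVid; asset names from the examples above).

JOBS = [
    ("lion", "stage_light1"),
    ("woman", "beach"),
    ("truck", "universe1"),
]

def build_cmd(name, bg, out_dir="output"):
    """Assemble the inference.py command line for one (input, background) pair."""
    return [
        "python", "inference.py",
        "--input", f"./assets/input/{name}.mp4",
        "--mask", f"./assets/mask/{name}",
        "--bg_cond", f"./assets/video_bg/{bg}.mp4",
        "--output_path", f"{out_dir}/{name}_{bg}.mp4",
    ]

for name, bg in JOBS:
    print(" ".join(build_cmd(name, bg)))
    # To actually run each job instead of printing it:
    # subprocess.run(build_cmd(name, bg), check=True)
```

Each printed line matches the form of the example commands above, so any input/background combination can be added to `JOBS`.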

📝 TODO List

  • [✔] Release the arXiv paper and the project page.
  • [✔] Release the relighting inference code and Hugging Face demo using the background condition.
  • Release the relighting inference code using more (text/HDR) conditions. (Expected in May)
  • Release LightAtlas data augmentation pipeline and datasets. (Expected by the end of June)
  • Release training and evaluation code.

✒️ Citation

If you find our work helpful for your research, please consider giving us a star ⭐ and a citation 📝.

@article{fang2025relightvid,
  title={RelightVid: Temporal-Consistent Diffusion Model for Video Relighting},
  author={Fang, Ye and Sun, Zeyi and Zhang, Shangzhan and Wu, Tong and Xu, Yinghao and Zhang, Pan and Wang, Jiaqi and Wetzstein, Gordon and Lin, Dahua},
  journal={arXiv preprint arXiv:2501.16330},
  year={2025}
}

📄 License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

📚 Related Works

  • AnimateDiff: A text-to-video diffusion framework that extends image diffusion models with lightweight temporal modules for generating coherent, controllable animations.
  • IC-Light: Enables intuitive lighting control in image generation by integrating editable illumination conditions into diffusion models.
  • Relightful Harmonization: Introduces a method for consistent subject relighting in diverse scenes using synthetic data and background guidance.
  • Switchlight: Performs intrinsic decomposition of portraits into identity and lighting components, enabling flexible, realistic relighting through disentangled editing.
