SVAD creates photorealistic 3D human avatars from a single input image. Our approach combines video diffusion models with 3D Gaussian Splatting to generate high-fidelity, animatable avatars while preserving the subject's identity and appearance details.
Key Features:
- Single-image input to full-body 3D avatar generation
News:
- [2025/09/09] Model files have been hosted on Hugging Face.
- [2025/05/08] Code has been released.
- [2025/03/30] SVAD has been accepted to the CVPR 2025 Workshop.
Requirements:
- CUDA 12.1
- Python 3.9
- PyTorch 2.3.1
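As a quick sanity check after you finish the installation steps below (our suggestion, not an official step), you can confirm that the environment matches these versions:
nvcc --version  # should report CUDA 12.1
python --version  # should report Python 3.9.x
python -c "import torch; print(torch.__version__, torch.version.cuda)"  # expect 2.3.1 and 12.1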
First you need to clone the repo:
git clone https://github.com/yc4ny/SVAD.git
cd SVAD
Our default installation method is based on Conda package and environment management:
cd envs/
conda env create --file svad.yaml
conda activate svad

Install required packages via pip:
pip install pip==24.0 # For pytorch-lightning==1.6.5
pip install cmake # For dlib
pip install -r svad.txt

Install CUDA-related packages:
conda install -c "nvidia/label/cuda-12.1.1" cuda-nvcc cuda-cudart-dev libcurand-dev
conda install -c nvidia cuda-profiler-api

Install MMPose:
pip install --no-cache-dir -U openmim
pip install mmdet
pip install mmcv==2.2.0 -f https://download.openmmlab.com/mmcv/dist/cu121/torch2.3/index.html
pip install mmpose
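Since mmcv is pinned to a build that matches CUDA 12.1 and PyTorch 2.3, it is worth verifying that the MMPose stack imports cleanly (a suggested check, not an official step):
python -c "import mmcv, mmdet, mmpose; print(mmcv.__version__, mmdet.__version__, mmpose.__version__)"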
Install the modified version of 3DGS:
cd diff_gaussian_rasterization_depth/
python setup.py install
cd .. # Return to envs directory

Install 3DGS Renderer:
cd diff-gaussian-rasterization
pip install .
cd ../../ # Return to base avatar directory

All required model checkpoints and assets are hosted on Hugging Face:
You can clone the models directly with git lfs:
git lfs install
git clone https://huggingface.co/yc4ny/SVAD-models

To ensure seamless execution of our pipeline, certain package modifications are required. Please apply the fixes detailed in FIX.md before running the code. These modifications address compatibility issues with dependencies and will prevent runtime errors during execution.
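If git lfs is not available on your machine, the same model repository can be fetched with the Hugging Face CLI instead (an alternative we suggest here, not part of the official instructions):
pip install -U huggingface_hub
huggingface-cli download yc4ny/SVAD-models --local-dir SVAD-models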
We provide a script to automate the entire pipeline for avatar generation from a single image. The script supports parallel processing, so you can train multiple avatars at the same time with one avatar per GPU.
Put your input image in data/{name_of_image}/ref.png.
In our example, we place the image at data/female-3-casual/ref.png.
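The resulting layout for this example looks like this:
data/
└── female-3-casual/
    └── ref.png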
# For a single avatar
bash script.sh <name_of_image> <gpu_id>
# Examples for parallel processing on multiple GPUs
bash script.sh female-3-casual 0 # Running on GPU 0
bash script.sh thuman_181 1 # Running on GPU 1

If you have correctly installed the environment and applied the package fixes in FIX.md, there should be no problem running the automated script. However, if there are issues, we also provide a step-by-step guide to train your avatar. Follow the details in TRAIN.md.
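For reference, the two example jobs above can be launched concurrently from a single shell with standard background jobs (a minimal sketch, not part of the official script):
bash script.sh female-3-casual 0 &  # avatar 1 on GPU 0
bash script.sh thuman_181 1 &  # avatar 2 on GPU 1
wait  # block until both trainings finish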
After training your avatar, you can render the rotating avatar in a neutral pose:
cd avatar/main/
python get_neutral_pose.py --subject_id {SUBJECT_ID} --test_epoch 4

You can animate your avatar with a motion sequence:
python animation.py --subject_id {SUBJECT_ID} --test_epoch 4 --motion_path {MOTION_PATH}

The rendering code is based on ExAvatar, so to prepare motion sequences, please refer to this link.
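For example, assuming the subject id matches the image folder name used earlier (the motion path below is a placeholder; substitute a sequence prepared following the ExAvatar instructions):
python get_neutral_pose.py --subject_id female-3-casual --test_epoch 4
python animation.py --subject_id female-3-casual --test_epoch 4 --motion_path <path_to_prepared_motion>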
Parts of the code are taken or adapted from the following repos:
If you find this code useful for your research, please consider citing the following paper:
@inproceedings{choi2025svad,
  title={SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation},
  author={Choi, Yonwoo},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={3137--3147},
  year={2025}
}
