wufan-cse/IC-World

This is the official implementation of the paper IC-World: In-Context Generation for Shared World Modeling. The implementation is built on FastVideo & DanceGRPO, and supports Wan2.1-I2V-14B with efficient multi-node training and 4-step inference.

Updates

  • [2026.01.10]: 🔥 We released the training & inference code.
  • [2025.12.28]: We released the dataset used in IC-World.
  • [2025.12.13]: We released the evaluation code.
  • [2025.12.03]: 🔥 We released the paper on arXiv!

If you have any research or engineering inquiries, feel free to open an issue or email us directly at [email protected].

Getting Started

Weights Preparation

  1. Our trained model can be downloaded from fffan/IC-World-I2V-14B.
  2. The LEPARD model can be downloaded here.
  3. The Pi3 model can be downloaded from Hugging Face; please refer here.
  4. The SpatialTrackerV2 model consists of two parts: a Front model and an Offline model.

Arrange the downloaded weights as follows:
IC-World/weights
    ├── IC-World-I2V-14B
    ├── lepard/pretrained/3dmatch/model_best_loss.pth
    ├── Pi3
    │   ├── config.json
    │   ├── model.safetensors
    ├── SpatialTrackerV2_Front
    │   ├── config.json
    │   ├── model.safetensors
    └── SpatialTrackerV2-Offline    
        ├── config.json
        └── model.safetensors
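
After downloading, a quick way to confirm everything is in place is to check the paths against the tree above. The helper below is a hypothetical sketch, not part of the repo; it assumes the weights root is ./weights:

```python
from pathlib import Path

# Expected files/directories under IC-World/weights, per the tree above.
EXPECTED = [
    "IC-World-I2V-14B",
    "lepard/pretrained/3dmatch/model_best_loss.pth",
    "Pi3/config.json",
    "Pi3/model.safetensors",
    "SpatialTrackerV2_Front/config.json",
    "SpatialTrackerV2_Front/model.safetensors",
    "SpatialTrackerV2-Offline/config.json",
    "SpatialTrackerV2-Offline/model.safetensors",
]

def missing_weights(root="weights"):
    """Return the expected entries that are not present under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]

if __name__ == "__main__":
    for p in missing_weights():
        print("missing:", p)
```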

Installation

# clone the code
git clone https://github.com/wufan-cse/IC-World.git
cd IC-World
git submodule update --init --recursive

# create environment
conda create -n icworld python=3.10
conda activate icworld

pip install -e .

Inference

python inference.py \
    --pretrained_model_name_or_path ./weights/IC-World-I2V-14B \
    --lora_weights_path ./weights/IC-World-I2V-14B \
    --lora_weight_name pytorch_lora_weights.safetensors \
    --input_image1 ./assets/img.png \
    --input_image2 ./assets/img1.png \
    --prompt "" \
    --height 480 \
    --width 832 \
    --num_frames 49 \
    --fps 16 \
    --guidance_scale 1.0 \
    --num_inference_steps 4 \
    --seed 42 \
    --output output.mp4
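
The output clip length follows directly from --num_frames and --fps: 49 frames at 16 fps is about 3.06 seconds. Wan-style video VAEs also typically expect frame counts of the form 4k + 1, so it is worth checking this if you change --num_frames (the 4k + 1 constraint is an assumption here, not stated by the repo):

```python
def clip_duration(num_frames: int, fps: int) -> float:
    """Output clip length in seconds."""
    return num_frames / fps

def is_valid_frame_count(n: int) -> bool:
    """Wan-style video VAEs typically require 4k + 1 frames (assumption)."""
    return n % 4 == 1

print(clip_duration(49, 16))     # 3.0625
print(is_valid_frame_count(49))  # True
```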

Training

There are two training settings; set the environment variables below for the one you want:

  1. static_scene_dynamic_camera_train
  2. dynamic_scene_static_camera_train
# preprocessing with 8 H20 GPUs
# setup PROMPT_FILE & OUTPUT_DIR for the two settings
export PROMPT_FILE="./data/IC-World-dataset/static_scene_dynamic_camera_train.txt"
export OUTPUT_DIR="./data/preprocess/static_scene_dynamic_camera_train"

bash scripts/preprocess/preprocess_wan_rl_embeddings_ic_world.sh

# use the following script to train on 8 H20 GPUs, or on other GPUs with at least 80 GB memory (e.g., H200)
# setup DATA_JSON_PATH
export DATA_JSON_PATH="./data/preprocess/static_scene_dynamic_camera_train/videos2caption.json"
bash scripts/finetune/finetune_wan_i2v_grpo_ic_world.sh 
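
For the second setting, the same two scripts apply; the paths below are assumed to mirror the naming pattern of the first setting (verify them against the released dataset):

```shell
# setting 2: dynamic scene + static camera (paths assumed, mirroring setting 1)
export PROMPT_FILE="./data/IC-World-dataset/dynamic_scene_static_camera_train.txt"
export OUTPUT_DIR="./data/preprocess/dynamic_scene_static_camera_train"
export DATA_JSON_PATH="${OUTPUT_DIR}/videos2caption.json"
# then rerun the same preprocess and finetune scripts as above
```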

Evaluation

More details can be found in benchmark.

# calculate the geometry consistency score
python fastvideo/models/geometry_model.py \
    --video_dir <your_own_directory> \
    --confidence_threshold 0.1 \
    --interval 5

# calculate the motion consistency score
python fastvideo/models/motion_model.py \
    --video_dir <your_own_directory> \
    --grid_size 10 \
    --interval 5

Arguments:

  • --video_dir: Path to the input video directory. Note that each video is a horizontal combination of two sub-videos. (Default: assets)
  • --confidence_threshold: Confidence threshold for point filtering (choose from: 0.1, 0.5, 0.7).
  • --grid_size: Grid size of the query points (choose from: 10, 20, 30).
  • --interval: Frame sampling interval.
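
Since each evaluation video is a horizontal concatenation of two sub-videos, a frame can be split back into its halves with a simple slice. The helper below is a hypothetical sketch, not part of the repo:

```python
import numpy as np

def split_combined_frame(frame: np.ndarray):
    """Split a horizontally concatenated frame (H, 2W, C) into two (H, W, C) halves."""
    w = frame.shape[1]
    assert w % 2 == 0, "combined frame width must be even"
    return frame[:, : w // 2], frame[:, w // 2 :]

# Example: a 480 x 1664 combined frame splits into two 480 x 832 sub-frames
combined = np.zeros((480, 1664, 3), dtype=np.uint8)
left, right = split_combined_frame(combined)
print(left.shape, right.shape)  # (480, 832, 3) (480, 832, 3)
```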

Video Demos of IC-World

Static scene + Dynamic camera

  • In the left demo, the letter above the door first appears in the left view and later reappears in the right view.
  • In the right demo, the advertising tag on the lower-right table first appears in the right view and subsequently reappears in the left view.
demo1.mp4
demo2.mp4

Dynamic scene + Static camera

demo3.mp4

TODOs

  • Support more video foundation models.
  • Release checkpoints (before 2026.01.31).
  • Release training & inference codes.
  • Release dataset.
  • Release inference codes.
  • Release evaluation metrics codes.
  • Release paper.

Acknowledgement

We learned from and reused code from the following projects:

FastVideo, DanceGRPO, Wan2.1, LightX2V and Diffusers.

We thank the authors for their contributions to the community!

Citation

If you find IC-World useful and insightful for your research, please consider giving it a star ⭐ and citing it.

@article{wu2025icworld,
  title={IC-World: In-Context Generation for Shared World Modeling},
  author={Wu, Fan and Wei, Jiacheng and Li, Ruibo and Xu, Yi and Li, Junyou and Ye, Deheng and Lin, Guosheng},
  journal={arXiv preprint arXiv:2512.02793},
  year={2025}
}
