This is the official implementation of the paper IC-World: In-Context Generation for Shared World Modeling. The implementation is based on FastVideo & DanceGRPO and supports Wan2.1-I2V-14B with efficient multi-node training and 4-step inference.
- [2026.01.10]: 🔥 We released training & inference codes.
- [2025.12.28]: We released the dataset used in IC-World.
- [2025.12.13]: We released evaluation codes.
- [2025.12.03]: 🔥 We released the paper on arXiv!
If you have any research or engineering inquiries, feel free to open an issue or email us directly at [email protected].
- Our trained model can be downloaded from fffan/IC-World-I2V-14B.
- The LEPARD model can be downloaded here.
- The Pi3 model can be downloaded from Hugging Face; please refer here.
- The SpatialTrackerV2 model consists of two parts: the Front model and the Offline model.
```
IC-World/weights
├── IC-World-I2V-14B
├── lepard/pretrained/3dmatch/model_best_loss.pth
├── Pi3
│   ├── config.json
│   └── model.safetensors
├── SpatialTrackerV2_Front
│   ├── config.json
│   └── model.safetensors
└── SpatialTrackerV2-Offline
    ├── config.json
    └── model.safetensors
```

```shell
# clone the code
git clone https://github.com/wufan-cse/IC-World.git
cd IC-World
git submodule update --init --recursive

# create the environment
conda create -n icworld python=3.10
conda activate icworld
pip install -e .
```

```shell
python inference.py \
    --pretrained_model_name_or_path ./weights/IC-World-I2V-14B \
    --lora_weights_path ./weights/IC-World-I2V-14B \
    --lora_weight_name pytorch_lora_weights.safetensors \
    --input_image1 ./assets/img.png \
    --input_image2 ./assets/img1.png \
    --prompt "" \
    --height 480 \
    --width 832 \
    --num_frames 49 \
    --fps 16 \
    --guidance_scale 1.0 \
    --num_inference_steps 4 \
    --seed 42 \
    --output output.mp4
```

There are two settings; adapt the environment variables accordingly:
- static_scene_dynamic_camera_train
- dynamic_scene_static_camera_train
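Before launching inference, the argument combination shown above can be sanity-checked offline. This is only a sketch: the 4k+1 frame-count rule and the 16-pixel divisibility of height/width are assumptions based on Wan2.1's typical temporal and spatial downsampling factors, not constraints documented in this repo.

```python
# Sanity-check the inference arguments used above (assumed Wan2.1-style constraints).
def check_args(height: int, width: int, num_frames: int, fps: int) -> float:
    # Assumption: the VAE downsamples time by 4, so frame counts are 4k + 1.
    assert (num_frames - 1) % 4 == 0, "num_frames should be 4k + 1"
    # Assumption: spatial dims must be divisible by 16 (VAE stride x patch size).
    assert height % 16 == 0 and width % 16 == 0, "height/width should be multiples of 16"
    return num_frames / fps  # clip duration in seconds

duration = check_args(height=480, width=832, num_frames=49, fps=16)
print(f"clip duration: {duration:.2f}s")  # 49 frames at 16 fps -> 3.06s
```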
```shell
# preprocessing with 8 H20 GPUs
# set PROMPT_FILE & OUTPUT_DIR for each of the two settings
export PROMPT_FILE="./data/IC-World-dataset/static_scene_dynamic_camera_train.txt"
export OUTPUT_DIR="./data/preprocess/static_scene_dynamic_camera_train"
bash scripts/preprocess/preprocess_wan_rl_embeddings_ic_world.sh
```
```shell
# use the following script for training with 8 H20 GPUs,
# or other GPUs with more than 80GB of memory (e.g., H200)
# set DATA_JSON_PATH
export DATA_JSON_PATH="./data/preprocess/static_scene_dynamic_camera_train/videos2caption.json"
bash scripts/finetune/finetune_wan_i2v_grpo_ic_world.sh
```

More details can be found in the benchmark.
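The training script reads the preprocessed index referenced by `DATA_JSON_PATH`. Its exact schema is not documented here; purely as an illustration, the sketch below loads and filters a hypothetical index whose records carry `"video"` and `"caption"` keys (the real field names produced by the preprocessing script may differ).

```python
import json

def load_index(path: str) -> list[dict]:
    """Load a videos2caption-style JSON index (hypothetical schema)."""
    with open(path) as f:
        records = json.load(f)
    # Keep only records that carry both assumed fields.
    return [r for r in records if r.get("video") and r.get("caption")]

if __name__ == "__main__":
    import os, tempfile
    # In-memory stand-in for the JSON file written by preprocessing.
    data = [{"video": "a.mp4", "caption": "a scene"}, {"video": "b.mp4"}]
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(data, f)
    print(len(load_index(f.name)))  # 1 (the record missing a caption is dropped)
    os.unlink(f.name)
```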
```shell
# calculate the geometry consistency score
python fastvideo/models/geometry_model.py \
    --video_dir <your_own_directory> \
    --confidence_threshold 0.1 \
    --interval 5
```

```shell
# calculate the motion consistency score
python fastvideo/models/motion_model.py \
    --video_dir <your_own_directory> \
    --grid_size 10 \
    --interval 5
```

Arguments:
- `--video_dir`: Path to the input video directory. Note that each video is a horizontal combination of two sub-videos. (Default: `assets`)
- `--confidence_threshold`: Confidence threshold for point filtering (choose from 0.1, 0.5, 0.7).
- `--grid_size`: Grid size of query points (choose from 10, 20, 30).
- `--interval`: Frame sampling interval.
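Since each evaluated video is a horizontal concatenation of two sub-videos, the metric scripts presumably split every frame down the middle, sample frames at `--interval`, and track a uniform grid of `--grid_size` query points. A minimal NumPy sketch of that preprocessing follows; the actual implementation in `geometry_model.py` / `motion_model.py` may differ.

```python
import numpy as np

def split_views(frame: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a horizontally combined frame (H, 2W, C) into left/right views."""
    w = frame.shape[1] // 2
    return frame[:, :w], frame[:, w:]

def sample_frames(frames: np.ndarray, interval: int) -> np.ndarray:
    """Keep every `interval`-th frame, mirroring the --interval argument."""
    return frames[::interval]

def query_grid(height: int, width: int, grid_size: int) -> np.ndarray:
    """Uniform (grid_size x grid_size) query points, as for --grid_size."""
    ys = np.linspace(0, height - 1, grid_size)
    xs = np.linspace(0, width - 1, grid_size)
    return np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)

# Example: 49 frames of a 480x1664 combined video, interval 5, grid size 10.
video = np.zeros((49, 480, 1664, 3), dtype=np.uint8)
left, right = split_views(video[0])
print(left.shape, right.shape)         # (480, 832, 3) (480, 832, 3)
print(sample_frames(video, 5).shape)   # (10, 480, 1664, 3)
print(query_grid(480, 832, 10).shape)  # (100, 2)
```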
- In the left demo, the letter above the door first appears in the left view and later reappears in the right view.
- In the right demo, the advertising tag on the lower-right table first appears in the right view and subsequently reappears in the left view.
demo1.mp4 | demo2.mp4 | demo3.mp4
- Support more video foundation models.
- Release checkpoints (before 2026.01.31).
- Release training & inference codes.
- Release dataset.
- Release inference codes.
- Release evaluation metrics codes.
- Release paper.
We learned and reused code from the following projects:
FastVideo, DanceGRPO, Wan2.1, LightX2V and Diffusers.
We thank the authors for their contributions to the community!
If you find IC-World useful and insightful for your research, please consider giving a star ⭐ and citation.
@article{wu2025icworld,
title={IC-World: In-Context Generation for Shared World Modeling},
author={Wu, Fan and Wei, Jiacheng and Li, Ruibo and Xu, Yi and Li, Junyou and Ye, Deheng and Lin, Guosheng},
journal={arXiv preprint arXiv:2512.02793},
year={2025}
}