WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion via Training-Free Guidance
Chenxi Song1, Yanming Yang1, Tong Zhao1, Ruibo Li2, Chi Zhang1*
1AGI Lab, Westlake University
2The College of Computing and Data Science, Nanyang Technological University
*Corresponding Author
- Paper released on arXiv
- Project page available
- Code and implementation details (coming soon)
- Inference pipeline and usage manual
- [2025.09] arXiv preprint is available.
- [2025.09] Project page is online.
Welcome to WorldForge, a training-free framework that unlocks the world-modeling potential of video diffusion models for controllable, highly realistic 3D/4D generation. Our method leverages the rich latent world priors of large-scale video diffusion models to achieve precise trajectory control and photorealistic content generation without any additional training.
- Training-Free Framework: No additional training or fine-tuning required, preserving pretrained knowledge
- Intra-Step Recursive Refinement (IRR): Refines the latent repeatedly within each denoising step to inject the target trajectory with high precision
- Flow-Gated Latent Fusion (FLF): Leverages optical flow similarity to decouple motion from appearance in latent space
- Dual-Path Self-Corrective Guidance (DSG): Adaptively corrects trajectory drift through guided and unguided path comparison
- 3D Scene Generation: Generate controllable 3D scenes from single-view images
- 4D Video Re-cam: Dynamic trajectory-controlled re-rendering of video content
- Video Editing: Support for object removal, addition, face swapping, and subject transformation
WorldForge adopts a warping-and-repainting pipeline with three complementary mechanisms (see the sketch after this list):
- IRR (Intra-Step Recursive Refinement): Enables precise trajectory injection through recursive optimization
- FLF (Flow-Gated Latent Fusion): Selectively injects trajectory guidance into motion-related channels
- DSG (Dual-Path Self-Corrective Guidance): Maintains trajectory consistency and visual fidelity
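To make the interplay of these mechanisms concrete, here is a minimal pseudocode sketch of one way a warp-and-repaint sampler could combine them. This is our own illustration under stated assumptions, not the official implementation: all names (`denoiser`, `warped_lat`, `valid_mask`, `flow_gate`, `irr_iters`, `dsg_scale`) are hypothetical, and the denoiser is assumed to return an estimate of the clean latent.

```python
import torch

@torch.no_grad()
def worldforge_style_sample(denoiser, warped_lat, valid_mask, flow_gate,
                            sigmas, irr_iters=2, dsg_scale=1.5):
    """Hedged sketch of a training-free warp-and-repaint sampler.

    warped_lat: video latent warped along the target camera trajectory
                (valid where valid_mask == 1, holes elsewhere).
    flow_gate:  per-channel gate in [0, 1], e.g. shape (C, 1, 1, 1),
                derived offline from optical-flow similarity.
    sigmas:     decreasing noise levels, len(sigmas) >= 2.
    """
    z = torch.randn_like(warped_lat) * sigmas[0]
    x0 = warped_lat  # fallback if the loop body never runs
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # IRR: refine repeatedly at the SAME noise level, re-injecting
        # the trajectory signal each pass before stepping down.
        for _ in range(irr_iters):
            # DSG: run a guided and an unguided path, then extrapolate
            # (CFG-style) to self-correct trajectory drift.
            x0_guided = denoiser(z, sigma, cond=warped_lat)
            x0_free = denoiser(z, sigma, cond=None)
            x0 = x0_free + dsg_scale * (x0_guided - x0_free)

            # FLF: inject warped content only through the flow-derived
            # channel gate, so motion-related channels follow the target
            # trajectory while appearance channels keep the model's own
            # prediction.
            injected = valid_mask * warped_lat + (1 - valid_mask) * x0
            x0 = flow_gate * injected + (1 - flow_gate) * x0

            # Re-noise back to the current level for the next IRR pass.
            z = x0 + torch.randn_like(x0) * sigma

        # One ordinary ancestral step down to the next noise level.
        z = x0 + torch.randn_like(x0) * sigma_next
    return x0
```

The DSG line mirrors classifier-free guidance: extrapolating from the unguided toward the guided prediction suppresses trajectory drift without any retraining, which is what keeps the whole pipeline training-free.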
We showcase diverse capabilities including:
- 3D Scene Generation: Immersive fly-through experiences of artworks, AIGC content, portrait photography, and city walks
- 4D Video Re-cam: Camera arc rotation, local close-ups, outpainting, viewpoint transfer, and video stabilization
- Video Editing: Object removal/addition, face swapping, subject transformation, try-on applications
# Installation instructions will be provided upon code release
# Requirements: Python 3.8+, PyTorch, etc.
# Code and detailed usage instructions coming soon
# Example usage will be provided here
# Detailed documentation will be available upon release

Our method demonstrates superior performance compared to existing SOTA methods:
- 3D Scene Generation: More consistent scene content under novel viewpoints with improved detail and trajectory accuracy
- 4D Video Re-cam: Realistic high-quality content re-rendering along target trajectories
- Quantitative Metrics: Superior results in realism, trajectory consistency, and visual fidelity
@misc{song2025worldforgeunlockingemergent3d4d,
      title={WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance},
      author={Chenxi Song and Yanming Yang and Tong Zhao and Ruibo Li and Chi Zhang},
      year={2025},
      eprint={2509.15130},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2509.15130},
}

We thank the research community for their valuable contributions to video diffusion models and 3D/4D generation. Special thanks to the following open-source projects that inspired and supported our work:
- Wan-2.1 - Large-scale video generation model
- SVD (Stable Video Diffusion) - Video diffusion model by Stability AI
- VGGT - Feed-forward 3D reconstruction (camera, depth, and point-map estimation)
- ReCamMaster - Trajectory-controlled video generation
- TrajectoryCrafter - Trajectory-based video synthesis
- NVS-Solver - Novel view synthesis solution
- ViewExtrapolator - View extrapolation for 3D scenes
- DepthCrafter - Video sequence depth estimation
- Mega-SAM - Video depth and pose estimation
For questions and discussions, please feel free to contact:
- Chenxi Song: [email protected]
- Chi Zhang: [email protected]