NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction

[ICLR 2026] This repository contains the official implementation of NOVA3R. Given unposed multi-view images, NOVA3R recovers complete, non-overlapping 3D geometry, reconstructing both visible and occluded regions in a physically plausible way.

Weirong Chen, Chuanxia Zheng, Ganlin Zhang, Andrea Vedaldi, Daniel Cremers
ICLR 2026

[Paper] [Project Page]

Requirements

Python: 3.10
PyTorch: 2.2+ with CUDA 12.1+
GPU: NVIDIA GPU with ≥24 GB VRAM (48 GB recommended). Evaluated on an NVIDIA L40S GPU.
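
Before running the setup script, it can help to check the local environment against the list above. The following is a minimal sketch, not part of the repository: `check_requirements` and `detect_torch` are hypothetical helper names, the CUDA toolkit version is not checked, and VRAM is measured in decimal GB so that a nominal 24 GB card is not flagged.

```python
import sys

# Minimums copied from the Requirements list above.
REQUIRED_PYTHON = (3, 10)
MIN_TORCH = (2, 2)
MIN_VRAM_GB = 24


def check_requirements(python_version, torch_version, vram_gb):
    """Return a list of human-readable problems; an empty list means OK.

    python_version: tuple such as sys.version_info
    torch_version:  string such as "2.2.0+cu121", or None if not installed
    vram_gb:        total GPU memory in GB, or None if no CUDA GPU was found
    """
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append("README lists Python 3.10")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    else:
        # Drop the local build tag ("+cu121") and keep major.minor only.
        major_minor = tuple(
            int(part) for part in torch_version.split("+")[0].split(".")[:2]
        )
        if major_minor < MIN_TORCH:
            problems.append("PyTorch 2.2 or newer is required")
    if vram_gb is None:
        problems.append("no CUDA GPU detected")
    elif vram_gb < MIN_VRAM_GB:
        problems.append("less than 24 GB of VRAM available")
    return problems


def detect_torch():
    """Return (torch version or None, VRAM of GPU 0 in GB or None)."""
    try:
        import torch
    except ImportError:
        return None, None
    vram_gb = None
    if torch.cuda.is_available():
        vram_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    return torch.__version__, vram_gb
```

Calling `check_requirements(sys.version_info, *detect_torch())` returns an empty list when all three checks pass.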

Installation

# Clone with submodules
git clone --recursive https://github.com/wrchen530/nova3r.git
cd nova3r

# Automated setup
bash setup.sh

# Download checkpoints
bash scripts/download_checkpoints.sh

See docs/INSTALL.md for manual installation.

Demo

Run 3D reconstruction on your own images:

conda activate nova3r

# Single image (scene-level)
python demo_nova3r.py \
  --images demo/examples/scene_1.png \
  --ckpt checkpoints/scene_n1/checkpoint-last.pth \
  --resolution 518 392

# Two images (multi-view, scene-level)
python demo_nova3r.py \
  --images demo/examples/scrream_scene09_200.png demo/examples/scrream_scene09_275.png \
  --ckpt checkpoints/scene_n2/checkpoint-last.pth \
  --resolution 518 392

Output .ply point clouds and .mp4 360° videos are saved to demo/outputs/<image_name>/ (configurable with --output_dir).
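
To script the demo from Python, the command line can be assembled programmatically. This is a sketch under stated assumptions: `build_demo_command` and `CKPT_BY_NUM_VIEWS` are hypothetical names, only the flags shown above are used, and the checkpoint is chosen by the number of input images as in the two examples. Whether either checkpoint accepts a different number of views is not stated here, so other counts are rejected.

```python
import shlex

# Checkpoint per number of input views, as used in the two demo commands.
CKPT_BY_NUM_VIEWS = {
    1: "checkpoints/scene_n1/checkpoint-last.pth",
    2: "checkpoints/scene_n2/checkpoint-last.pth",
}


def build_demo_command(images, resolution=(518, 392), output_dir=None):
    """Build the argument list for demo_nova3r.py for one or two images."""
    images = [str(path) for path in images]
    if len(images) not in CKPT_BY_NUM_VIEWS:
        raise ValueError("this sketch only covers 1 or 2 input images")
    cmd = [
        "python", "demo_nova3r.py",
        "--images", *images,
        "--ckpt", CKPT_BY_NUM_VIEWS[len(images)],
        "--resolution", str(resolution[0]), str(resolution[1]),
    ]
    if output_dir is not None:
        cmd += ["--output_dir", str(output_dir)]
    return cmd


# Print the command in copy-pasteable form without running it.
print(shlex.join(build_demo_command(["demo/examples/scene_1.png"])))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root with the `nova3r` environment active.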

Point Cloud AE

Reconstruct a point cloud using the Stage 1 point-conditioned autoencoder:

# Point cloud autoencoding from a SCRREAM scene
python demo_nova3r_ae.py \
  --input_ply demo/examples/scrream_scene09.ply \
  --ckpt checkpoints/scene_ae/checkpoint-last.pth \
  --num_queries 50000

Python API

from demo_nova3r import predict

# Single image → 3D point cloud
pts3d = predict(
    ckpt_path="checkpoints/scene_n1/checkpoint-last.pth",
    image_paths=["path/to/image.png"],
    resolution=(518, 392),
    output_path="output.ply",
)
# pts3d is a numpy array of shape (N, 3)
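
Since `predict` returns a plain `(N, 3)` array, the result can be inspected with NumPy alone. The sketch below is illustrative and not part of the repository: `summarize_points` and `subsample` are hypothetical helpers, and dropping non-finite rows is a defensive assumption rather than something the model is known to need.

```python
import numpy as np


def summarize_points(pts3d):
    """Return point count, centroid, and axis-aligned bounds of an (N, 3) array."""
    pts = np.asarray(pts3d, dtype=np.float64)
    if pts.ndim != 2 or pts.shape[1] != 3:
        raise ValueError("expected an array of shape (N, 3)")
    # Keep only rows whose three coordinates are all finite.
    finite = pts[np.isfinite(pts).all(axis=1)]
    if finite.shape[0] == 0:
        raise ValueError("no finite points to summarize")
    return {
        "num_points": int(finite.shape[0]),
        "centroid": finite.mean(axis=0),
        "bbox_min": finite.min(axis=0),
        "bbox_max": finite.max(axis=0),
    }


def subsample(pts3d, max_points, seed=0):
    """Randomly keep at most max_points rows (without replacement)."""
    pts = np.asarray(pts3d)
    if len(pts) <= max_points:
        return pts
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(pts), size=max_points, replace=False)
    return pts[keep]
```

For example, `summarize_points(pts3d)["bbox_max"] - summarize_points(pts3d)["bbox_min"]` gives the extent of the reconstruction along each axis, and `subsample(pts3d, 100_000)` yields a lighter cloud for quick viewing.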

Checkpoints

Download all checkpoints:

bash scripts/download_checkpoints.sh

Model	Training Dataset	Input	Checkpoint	Size
Pts2Pts (AE)	3DFront + Scannetpp	point cloud	`checkpoints/scene_ae/checkpoint-last.pth`	262 MB
Img2Pts (N=1)	3DFront + Scannetpp	1 image	`checkpoints/scene_n1/checkpoint-last.pth`	5.8 GB
Img2Pts (N=2)	3DFront + Scannetpp	2 images	`checkpoints/scene_n2/checkpoint-last.pth`	5.8 GB

Checkpoints are hosted on HuggingFace.
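
After the download finishes, a quick check that every file from the table is in place can save a failed run later. A minimal sketch, assuming it is run from (or pointed at) the repository root; `missing_checkpoints` is a hypothetical helper and file sizes are not verified.

```python
from pathlib import Path

# Model name -> relative checkpoint path, copied from the table above.
EXPECTED_CHECKPOINTS = {
    "Pts2Pts (AE)": "checkpoints/scene_ae/checkpoint-last.pth",
    "Img2Pts (N=1)": "checkpoints/scene_n1/checkpoint-last.pth",
    "Img2Pts (N=2)": "checkpoints/scene_n2/checkpoint-last.pth",
}


def missing_checkpoints(repo_root="."):
    """Return the names of models whose checkpoint file is not present."""
    root = Path(repo_root)
    return [
        name
        for name, relative_path in EXPECTED_CHECKPOINTS.items()
        if not (root / relative_path).is_file()
    ]
```

An empty list from `missing_checkpoints()` means all three files were found; otherwise, rerunning `bash scripts/download_checkpoints.sh` is the first thing to try.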

Evaluation

Reproduce benchmark results:

# Download datasets
bash scripts/download_datasets.sh

# SCRREAM evaluation (1-view / 2-view)
bash scripts/eval/eval_scrream_n1_stage2.sh --data_root /path/to/datasets
bash scripts/eval/eval_scrream_n2_stage2.sh --data_root /path/to/datasets

See docs/EVALUATION.md for detailed instructions.

BibTeX

If you find NOVA3R useful for your research and applications, please cite us using this BibTeX entry:

@inproceedings{chennova3r,
  title={NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction},
  author={Chen, Weirong and Zheng, Chuanxia and Zhang, Ganlin and Vedaldi, Andrea and Cremers, Daniel},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

License

This project is licensed under the Apache License 2.0. See LICENSE for full terms. Third-party code (e.g., DUSt3R, CroCo, VGGT, TripoSG) retains its original license.

Acknowledgments

We build on prior advances in multi-view 3D reconstruction, global scene representations, and flow-based generative models. Our codebase uses code from VGGT, DUSt3R, TripoSG, and LaRI. We sincerely thank the authors for their wonderful work and for releasing their code and data processing scripts.
