Bohan Li - 李博涵
Email: bohan.li77_at_gmail.com [GitHub] [Google Scholar] |
I'm a Ph.D. student at Shanghai Jiao Tong University (SJTU) and Eastern Institute of Technology (EIT), Ningbo, advised by Prof. Xin Jin, Prof. Wenjun Zeng, Prof. Chao Ma, and Prof. Xiaokang Yang. I'm a Visiting Scholar of BME and ECE at the National University of Singapore (NUS), where I work with Prof. Yueming Jin. I also work with Prof. Hao Zhao at Tsinghua University. I received my Master's degree from South China University of Technology (SCUT) and my Bachelor's degree from Northeastern University (NEU). I have also spent time at BAAI, Changcheng, Lixiang, MEGVII, IDEA, Tencent AILab, NetEase AILab, PhiGent, and ZTE.
I have broad research interests in 3D computer vision and world modeling, including autonomous vehicles and robotics, 3D scene comprehension and generation, 3D structured information processing, representation disentanglement, and AI for Science.
|
OmniNWM: Omniscient Driving Navigation World Models Bohan Li*, Zhuang Ma*, Dalong Du*, Baorui Peng, Zhujin Liang, Zhenqiang Liu, Chao Ma, Yueming Jin, Hao Zhao, Wenjun Zeng, Xin Jin†. arXiv |
|
Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction Bohan Li, Jiajun Deng, Yasheng Sun, Xiaofeng Wang, Xin Jin†, Wenjun Zeng. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) |
|
OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation Bohan Li, Xin Jin†, Jianan Wang, Yukai Shi, Yasheng Sun, Xiaofeng Wang, Zhuang Ma, Baao Xie, Chao Ma, Xiaokang Yang, Wenjun Zeng. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) |
|
UniScene: Unified Occupancy-centric Driving Scene Generation Bohan Li*, Jiazhe Guo*, Hongsi Liu*, Yingshuang Zou*, Yikang Ding*, Xiwu Chen, Hu Zhu, Feiyang Tan, Chi Zhang, Tiancai Wang, Shuchang Zhou, Li Zhang, Xiaojuan Qi, Hao Zhao, Mu Yang, Wenjun Zeng, Xin Jin†. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025) |
|
NaviNeRF++: Towards Interpretable 3D Reconstruction via Unsupervised Disentangled Representation Learning Baao Xie, Zequn Zhang, Huanting Guo, Qiuyu Chen, Hu Zhu, Bohan Li, Wenjun Zeng, Xin Jin†. IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI) |
|
Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method Bohan Li, Xin Jin†, Hu Zhu, Hongsi Liu, Ruikai Li, Jiazhe Guo, Kaiwen Cai, Chao Ma, Yueming Jin, Hao Zhao, Xiaokang Yang, Wenjun Zeng. arXiv |
|
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond Zheng Zhu*†, Xiaofeng Wang*, Wangbo Zhao*, Chen Min*, Bohan Li*, Nianchen Deng*, Min Dou*, Yuqi Wang*, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang. arXiv |
|
ORV: 4D Occupancy-centric Robot Video Generation Xiuyu Yang*, Bohan Li*, Shaocong Xu, Nan Wang, Chongjie Ye, Zhaoxi Chen, Minghan Qin, Yikang Ding, Xin Jin, Hang Zhao, Hao Zhao†. arXiv |
|
Challenger: Affordable Adversarial Driving Video Generation Zhiyuan Xu*, Bohan Li*, Huan-ang Gao, Mingju Gao, Yong Chen, Ming Liu, Chenxu Yan, Hang Zhao, Shuo Feng, Hao Zhao†. Conference on Robot Learning (CoRL 2025 SAFE-ROL Workshop Oral) |
|
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation Wenyao Zhang*, Hongsi Liu*, Bohan Li*, Jiawei He, Zekun Qi, Yunnan Wang, Shengyang Zhao, Xinqiang Yu, Wenjun Zeng, Xin Jin†. International Conference on Computer Vision (ICCV 2025) |
|
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting Nan Wang, Yuantao Chen, Lixing Xiao, Weiqing Xiao, Bohan Li, Zhaoxi Chen, Chongjie Ye, Shaocong Xu, Saining Zhang, Ziyang Yan, Pierre Merriaux, Lei Lei, Tianfan Xue, Hao Zhao†. Neural Information Processing Systems (NeurIPS 2025) |
|
One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation Zheng Geng, Nan Wang, Shaocong Xu, Chongjie Ye, Bohan Li, Zhaoxi Chen, Sida Peng, Hao Zhao†. Conference on Robot Learning (CoRL 2025 Oral) |
|
DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation Jiazhe Guo, Yikang Ding, Xiwu Chen, Shuo Chen, Bohan Li, Yingshuang Zou, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Zhiheng Li, Hao Zhao†. International Conference on Computer Vision (ICCV 2025) |
|
MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction Yingshuang Zou, Yikang Ding, Chuanrui Zhang, Jiazhe Guo, Bohan Li, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Haoqian Wang†. British Machine Vision Conference (BMVC 2025) |
|
TAPTRv2: Attention-based Position Update Improves Tracking Any Point Hongyang Li, Feng Li, Hao Zhang, Tianhe Ren, Shilong Liu, Bohan Li, Zhaoyang Zeng, Lei Zhang†. Neural Information Processing Systems (NeurIPS 2024) |
|
Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion Bohan Li, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin†, Wenjun Zeng. European Conference on Computer Vision (ECCV 2024) |
|
Closed-Loop Unsupervised Representation Disentanglement with β-VAE Distillation and Diffusion Probabilistic Feedback Xin Jin*†, Bohan Li*, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng. European Conference on Computer Vision (ECCV 2024) |
|
Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, Xiaofeng Wang, Yunnan Wang, Xin Jin†, Wenjun Zeng. International Joint Conference on Artificial Intelligence (IJCAI 2024 Oral) |
|
One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin†, Wenjun Zeng. AAAI Conference on Artificial Intelligence (AAAI 2024) |
|
NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation Baao Xie*, Bohan Li*, Zequn Zhang, Junting Dong, Xin Jin†, Jingyu Yang, Wenjun Zeng. International Conference on Computer Vision (ICCV 2023) |
|
Robust Scale-Aware Stereo Matching Network James Okae, Bohan Li, Juan Du†, Yueming Hu. IEEE Transactions on Artificial Intelligence (TAI) |
|
Project Leader & Research Intern Topic: Large-scale 4D Reconstruction and Generation, Controllable World Modeling |
|
Project Leader & Research Intern Topic: Efficient World Modeling |
|
Project Leader & Research Intern Topic: Scalable Multi-modal Driving Scene Generation |
|
Project Leader & Research Intern Topic: UniScene: Unified Occupancy-centric Driving Scene Generation |
|
Research Engineer Topic: Large-scale 3D City Scene Generation and Reconstruction |
|
Research Intern Topic: Robust 3D Perception and Reconstruction |
|
Research Intern Topic: Multi-modal Driving Scene Generation |
|
Research Intern Topic: Occupancy-based Scene Generation, Long-term Consistent Perception |
|
Engineer & Research Intern Topic: Semantic Scene Completion, Depth Estimation, World Models |
|
Research Intern Topic: Monocular Depth Estimation, Real-time Stereo Matching |
2026 CSIG First Doctoral Student Forum
2025 PhD National Scholarship
2024 Ningbo Association for Science and Technology Major Scientific Achievement Award
2024 Huatai Securities Doctoral Science and Technology Scholarship
IJCAI 2024 Oral Presentation
Conference Reviewer:
CVPR 2025-, ICCV 2025-, ECCV 2024-, NeurIPS 2024-, ICLR 2026-, AAAI 2025-, IJCAI 2025-
Journal Reviewer:
IEEE T-PAMI, IEEE T-IP, IEEE T-MM, IEEE T-CSVT, IEEE T-AI, IEEE RA-L