Bohan Li - 李博涵

Email: bohan.li77_at_gmail.com

[GitHub] [Google Scholar]


Biography

I'm a Ph.D. student at Shanghai Jiao Tong University (SJTU) and Eastern Institute of Technology (EIT), Ningbo, advised by Prof. Xin Jin, Prof. Wenjun Zeng, Prof. Chao Ma, and Prof. Xiaokang Yang. I'm also a Visiting Scholar in BME and ECE at the National University of Singapore (NUS), where I work with Prof. Yueming Jin. I also work with Prof. Hao Zhao at Tsinghua University. I received my Master's degree from South China University of Technology (SCUT) and my Bachelor's degree from Northeastern University (NEU). I have also spent time at BAAI, Changcheng, Lixiang, MEGVII, IDEA, Tencent AILab, NetEase AILab, PhiGent, and ZTE.

Research

I have broad research interests in 3D computer vision and world modeling, including autonomous vehicles and robotics, 3D scene comprehension and generation, 3D structured information processing, representation disentanglement, and AI for Science.

News

  • We have one paper accepted to IEEE TPAMI: good news on the last day of 2025.
  • We have two papers accepted to IEEE TPAMI.
  • We have a paper accepted to NeurIPS 2025.
  • We have a paper accepted to CoRL 2025 (Oral).
  • We have two papers accepted to ICCV 2025.
  • We have a paper accepted to CVPR 2025.
  • We have a paper accepted to NeurIPS 2024.
  • We have two papers accepted to ECCV 2024.
  • We have a paper accepted to IJCAI 2024 (Oral).
  • We have a paper accepted to AAAI 2024.
  • We have a paper accepted to ICCV 2023.
  • I worked as a Computer Vision Algorithm Engineer at Tencent AI Lab.

Selected Publications (*Equal Contribution, †Corresponding Author)

    OmniNWM: Omniscient Driving Navigation World Models
    Bohan Li*, Zhuang Ma*, Dalong Du*, Baorui Peng, Zhujin Liang, Zhenqiang Liu, Chao Ma, Yueming Jin, Hao Zhao, Wenjun Zeng, Xin Jin†.

    arXiv
    [ paper ] [ project page ] [ code ]

    Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction
    Bohan Li, Jiajun Deng, Yasheng Sun, Xiaofeng Wang, Xin Jin†, Wenjun Zeng.

    IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)
    [ paper ] [ project page ]

    OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation
    Bohan Li, Xin Jin†, Jianan Wang, Yukai Shi, Yasheng Sun, Xiaofeng Wang, Zhuang Ma, Baao Xie, Chao Ma, Xiaokang Yang, Wenjun Zeng.

    IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)
    [ paper ]

    UniScene: Unified Occupancy-centric Driving Scene Generation
    Bohan Li*, Jiazhe Guo*, Hongsi Liu*, Yingshuang Zou*, Yikang Ding*, Xiwu Chen, Hu Zhu, Feiyang Tan, Chi Zhang, Tiancai Wang, Shuchang Zhou, Li Zhang, Xiaojuan Qi, Hao Zhao, Mu Yang, Wenjun Zeng, Xin Jin†.

    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025)
    [ paper ] [ project page ] [ code ]

    NaviNeRF++: Towards Interpretable 3D Reconstruction via Unsupervised Disentangled Representation Learning
    Baao Xie, Zequn Zhang, Huanting Guo, Qiuyu Chen, Hu Zhu, Bohan Li, Wenjun Zeng, Xin Jin†.

    IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)
    [ paper ]

    Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method
    Bohan Li, Xin Jin†, Hu Zhu, Hongsi Liu, Ruikai Li, Jiazhe Guo, Kaiwen Cai, Chao Ma, Yueming Jin, Hao Zhao, Xiaokang Yang, Wenjun Zeng.

    arXiv
    [ paper ] [ project page ] [ code ]

    Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
    Zheng Zhu*†, Xiaofeng Wang*, Wangbo Zhao*, Chen Min*, Bohan Li*, Nianchen Deng*, Min Dou*, Yuqi Wang*, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang.

    arXiv
    [ paper ] [ project page ]

    ORV: 4D Occupancy-centric Robot Video Generation
    Xiuyu Yang*, Bohan Li*, Shaocong Xu, Nan Wang, Chongjie Ye, Zhaoxi Chen, Minghan Qin, Yikang Ding, Xin Jin, Hang Zhao, Hao Zhao†.

    arXiv
    [ paper ] [ project page ] [ Dataset ] [ code ]

    Challenger: Affordable Adversarial Driving Video Generation
    Zhiyuan Xu*, Bohan Li*, Huan-ang Gao, Mingju Gao, Yong Chen, Ming Liu, Chenxu Yan, Hang Zhao, Shuo Feng, Hao Zhao†.

    Conference on Robot Learning (CoRL 2025 SAFE-ROL Workshop Oral)
    [ paper ] [ project page ] [ Dataset ] [ code ]

    Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation
    Wenyao Zhang*, Hongsi Liu*, Bohan Li*, Jiawei He, Zekun Qi, Yunnan Wang, Shengyang Zhao, Xinqiang Yu, Wenjun Zeng, Xin Jin†.

    International Conference on Computer Vision (ICCV 2025)
    [ paper ] [ code ]

    Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
    Nan Wang, Yuantao Chen, Lixing Xiao, Weiqing Xiao, Bohan Li, Zhaoxi Chen, Chongjie Ye, Shaocong Xu, Saining Zhang, Ziyang Yan, Pierre Merriaux, Lei Lei, Tianfan Xue, Hao Zhao†.

    Neural Information Processing Systems (NeurIPS 2025)
    [ paper ] [ project page ] [ code ]

    One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation
    Zheng Geng, Nan Wang, Shaocong Xu, Chongjie Ye, Bohan Li, Zhaoxi Chen, Sida Peng, Hao Zhao†.

    Conference on Robot Learning (CoRL 2025 Oral)
    [ paper ] [ project page ] [ Huggingface ] [ code ]

    DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation
    Jiazhe Guo, Yikang Ding, Xiwu Chen, Shuo Chen, Bohan Li, Yingshuang Zou, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Zhiheng Li, Hao Zhao†.

    International Conference on Computer Vision (ICCV 2025)
    [ paper ] [ project page ] [ code ]

    MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction
    Yingshuang Zou, Yikang Ding, Chuanrui Zhang, Jiazhe Guo, Bohan Li, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Haoqian Wang†.

    British Machine Vision Conference (BMVC 2025)
    [ paper ] [ project page ] [ code ]

    TAPTRv2: Attention-based Position Update Improves Tracking Any Point
    Hongyang Li, Feng Li, Hao Zhang, Tianhe Ren, Shilong Liu, Bohan Li, Zhaoyang Zeng, Lei Zhang†.

    Neural Information Processing Systems (NeurIPS 2024)
    [ paper ]

    Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion
    Bohan Li, Jiajun Deng, Wenyao Zhang, Liang, Dalong Du, Xin Jin†, Wenjun Zeng.

    European Conference on Computer Vision (ECCV 2024)
    [ paper ] [ code ]

    Closed-Loop Unsupervised Representation Disentanglement with β-VAE Distillation and Diffusion Probabilistic Feedback
    Xin Jin*†, Bohan Li*, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng.

    European Conference on Computer Vision (ECCV 2024)
    [ paper ]

    Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion
    Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, Xiaofeng Wang, Yunnan Wang, Xin Jin†, Wenjun Zeng.

    International Joint Conference on Artificial Intelligence (IJCAI 2024 Oral)
    [ paper ] [ code ]

    One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception
    Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin†, Wenjun Zeng.

    AAAI Conference on Artificial Intelligence (AAAI 2024)
    [ paper ]

    NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation
    Baao Xie*, Bohan Li*, Zequn Zhang, Junting Dong, Xin Jin†, Jingyu Yang, Wenjun Zeng.

    International Conference on Computer Vision (ICCV 2023)
    [ paper ] [ code ]

    Robust Scale-Aware Stereo Matching Network
    James Okae, Bohan Li, Juan Du†, Yueming Hu.

    IEEE Transactions on Artificial Intelligence (TAI)
    [ paper ]

    Experiences

    BAAI
    Project Leader & Research Intern
    Topic: Large-scale 4D Reconstruction and Generation, Controllable World Modeling

    Changcheng
    Project Leader & Research Intern
    Topic: Efficient World Modeling

    Lixiang
    Project Leader & Research Intern
    Topic: Scalable Multi-modal Driving Scene Generation

    MEGVII
    Project Leader & Research Intern
    Topic: UniScene: Unified Occupancy-centric Driving Scene Generation

    Tencent AILab
    Research Engineer
    Topic: Large-scale 3D City Scene Generation and Reconstruction

    NetEase AILab
    Research Intern
    Topic: Robust 3D Perception and Reconstruction

    NavInfo
    Research Intern
    Topic: Multi-modal Driving Scene Generation

    IDEA
    Research Intern
    Topic: Occupancy-based Scene Generation, Long-term Consistent Perception

    PhiGent Robotics
    Engineer & Research Intern
    Topic: Semantic Scene Completion, Depth Estimation, World Models

    ZTE
    Research Intern
    Topic: Monocular Depth Estimation, Real-time Stereo Matching

    Certificate

    Academic Services


    © Bohan Li | Last updated: September, 2025