Wuyang LI

I am a Research Scientist (since Feb 2025) in the Visual Intelligence for Transportation Lab (VITA) at École Polytechnique Fédérale de Lausanne (EPFL). I feel extremely fortunate to work under the supervision of Prof. Alexandre Alahi.

My research journey began with Domain Adaptive Object Detection, which combines two core concepts: perception and domain transfer. Interestingly, this was not only the starting point of my work; its underlying philosophy has also profoundly shaped all of my subsequent research. I firmly believe that breakthroughs in any field rely on two pillars: the depth of perception within that field and the effective transfer of insights from other fields. This philosophy has led my research to span a remarkably broad range of topics, covering four main areas:

  • AI for Human Life: Video Generative Model (Current), 3D Spatial Reasoning
  • AI for Autonomous Mobility: 3D Occupancy Prediction, Open-vocabulary/Cross-domain/Open-set Object Detection
  • AI for Scientific Innovation: Nanophotonics, Meta Optics, Graph-based Learning, Computational Photography
  • AI for Transforming Medicine: Brain MRI Analysis, Medical Report Generation, 4D Surgical Simulation

Before that, I worked as a Postdoctoral Researcher (2024–2025) at the Chinese University of Hong Kong (CUHK) and completed my PhD (2020–2023) at the City University of Hong Kong (CityUHK) with Early Graduation, supervised by Prof. Yixuan Yuan. During my PhD, I focused on visual perception for autonomous driving, thoroughly addressing the dual challenges posed by out-of-distribution data and domain shifts. I was also fortunate to work with Prof. Bo Han. I completed my undergraduate studies (2016–2020) at Tianjin University.

I will enter the job market in 2026, seeking Research Scientist positions in video generation, creative AI, and related fields. I am open to opportunities in any location (e.g., the US, Switzerland, China). If you have any open positions and think I might be a good fit, please feel free to reach out via email ([email protected]) or WeChat (conv-bn-relu).

E-mail  /  CV  /  Google Scholar  /  Github  /  LinkedIn

profile photo

Figure 1: Wuyang arrives in France to attend ICCV 2023 and, through a space-time journey, begins searching for job opportunities in 2026.

πŸ’₯ News

  • [10-2025] We are excited to open source Stable Video Infinity, potentially making end-to-end filming realistic!
  • [09-2025] 3 papers, VoxDet, See&Trek, and IR3D-Bench are accepted by NeurIPS 2025! Congrats to all the co-authors! VoxDet is selected as Spotlight!
  • [07-2025] Our MetaScope is selected as Highlight in ICCV 2025!
  • [06-2025] 2 papers are accepted by ICCV 2025! Our work, MetaScope, the pioneering attempt to unify three types of sciences (optical, biomedical, and computer), received all full scores (6, 6, 6) in the final rating!
  • [04-2025] 1 co-authored paper, ToothMaker, is accepted by TMI 2025! Congrats to Weihao!
  • [03-2025] 1 co-authored paper about LLM is accepted by TMI 2025! Congrats to Yiwen!
  • [02-2025] 2 co-authored papers (FlexGS and TAO) are accepted by CVPR 2025! Congrats to all the co-authors!
  • [01-2025] 2 co-authored papers (InstantSplamp and PDH-Diffusion) are accepted by ICLR 2025! Congrats to Chenxin and Yufan!
  • [01-2025] 1 co-authored paper (Hide-In-Motion) is accepted by ICRA 2025! Congrats to Hengyu!
  • [12-2024] 2 co-authored papers (U-KAN and DPA) are accepted by AAAI 2025! Congrats to Yuanfan!
  • [12-2024] 1 co-authored paper on 3D GS watermarking has been accepted by ICASSP 2025! Congrats to Hengyu!
  • [11-2024] 1 co-authored paper on an MRI phenotype prediction foundation model is accepted by TMI 2024! Congrats to Zhibin!
  • [10-2024] I am selected as the Top Reviewer at NeurIPS 2024!
  • [09-2024] 1 paper about SAM for uncertainty modeling is accepted by NeurIPS 2024. Congrats to Chenxin!
  • [Milestone] From 08/2021 to 08/2024, my first-author works have been selected as Oral at 4 CV conferences: CVPR, ICCV, ECCV, and AAAI!
  • [08-2024] Our work CLIFF is selected as an Oral Presentation at ECCV 2024!
  • [07-2024] We are organizing the first workshop on multi-modal medical foundation models at NeurIPS; Submission Link
  • [07-2024] 1 paper using diffusion to tackle the open-vocabulary issue from a probabilistic viewpoint is accepted by ECCV 2024
  • [06-2024] 6 papers are accepted by MICCAI 2024! Congrats to the co-authors!
  • [04-2024] Our work on metasurfaces and stereo vision has been selected as the Cover Paper in ACS Photonics!
  • [12-2023] I passed my PhD defense with Early Graduation!!
  • [08-2023] Our work SOMA is selected as Oral Presentation in ICCV.
  • [07-2023] 2 papers are accepted by ICCV 2023.
  • [06-2023] Successfully secured seed funding for our startup team!
  • [02-2023] 1 paper is accepted by CVPR 2023
  • [01-2023] 1 paper is accepted by TPAMI 2023
  • [06-2022] Our work SIGMA is selected as a CVPR Best Paper Finalist [33/8161]!
  • [05-2022] 1 paper is accepted by MICCAI 2022 (Early Accept, Oral)
  • [03-2022] 2 papers are accepted by CVPR 2022 (one Oral Presentation).
  • [10-2021] Our work SCAN is accepted by AAAI 2022 (Oral Presentation).

πŸ“‘ Selected Publication

* denotes equal contribution; Highlighted papers are representative first-author works

ArXiv 2025 Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Wuyang Li, Wentao Pan, Po-Chien Luan, Yang Gao, Alexandre Alahi
project page / paper / youtube / code
Key Words: Long Video Generation; End-to-end Filming; Human Talking/Dancing Animation
Summary: Stable Video Infinity (SVI) generates videos of ANY length with high temporal consistency, plausible scene transitions, and controllable streaming storylines in ANY domain. SVI incorporates Error-Recycling Fine-Tuning, a new type of efficient training that recycles the Diffusion Transformer (DiT)’s self-generated errors into supervisory prompts, thereby encouraging DiT to actively correct its own errors.
ArXiv 2025 Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models
Mariam Hassan, Bastien Van Delft, Wuyang Li, Alexandre Alahi
project page / paper / code (coming)
Key Words: Video Factorization; Text-to-Video Diffusion Models
Summary: We propose Factorized Video Generation (FVG), a simple yet effective pipeline that decomposes text-to-video generation into three stages: reasoning, composition, and temporal synthesis.
ArXiv 2025 RAP: 3D Rasterization Augmented End-to-End Planning
Lan Feng, Yang Gao, Γ‰loi Zablocki, Quanyi Li, Wuyang Li, Sichao Liu, Matthieu Cord, Alexandre Alahi
project page / paper / code
Key Words: End-to-End Planning; 3D Rasterization; Data Scaling
Summary: We propose RAP, a Raster-to-Real feature-space alignment that bridges the sim-to-real gap without requiring pixel-level realism. RAP ranks 1st on the Waymo Open Dataset Vision-based End-to-End Driving Challenge 2025 (UniPlan entry) and its leaderboard, as well as on NAVSIM v1 navtest and NAVSIM v2 navhard.
NeurIPS 2025 Spotlight VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection
Wuyang Li, Zhuy Yu, Alexandre Alahi
project page / paper / code
Key Words: 3D Semantic Occupancy Prediction; Dense Object Detection
Summary: 3D semantic occupancy prediction aims to reconstruct the 3D geometry and semantics of the surrounding environment. With dense voxel labels, prior works typically formulate it as a dense segmentation task, independently classifying each voxel without instance-level perception. In contrast, VoxDet addresses semantic occupancy prediction with an instance-centric formulation inspired by dense object detection, using a VoxNT trick to freely transfer voxel-level class labels into instance-level offset labels.
NeurIPS 2025 See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
Pengteng Li, Pinhao Song, Wuyang Li, Weiyu Guo, Huizai Yao, Yijie Xu, Dugang Liu, Hui Xiong
paper
Key Words: Spatial Understanding; Multimodal Large Language Model
Summary: We introduce SEE&TREK, the first training-free prompting framework tailored to enhance the spatial understanding of Multimodal Large Language Models (MLLMs) under vision-only constraints. While prior efforts have incorporated modalities like depth or point clouds to improve spatial reasoning, purely visual spatial understanding remains underexplored. SEE&TREK addresses this gap by focusing on two core principles: increasing visual diversity and motion reconstruction.
ICCV 2025 Highlight MetaScope: Optics-Driven Neural Network for Ultra-Micro Metalens Endoscopy
Wuyang Li*, Wentao Pan*, Xiaoyuan Liu*, Zhendong Luo, Chenxin Li, Hengyu Liu, Din Ping Tsai, Mu Ku Chen, Yixuan Yuan
project page / paper / code (coming)
Key Words: Metalens, Computation Photography, Endoscopy, Optical Imaging
Summary: Unlike conventional endoscopes limited by millimeter-scale thickness, metalenses operate at the micron scale, serving as a promising solution for ultra-miniaturized endoscopy. However, metalenses suffer from intensity decay and chromatic aberration. To address this, we developed MetaScope, an optics-driven neural network for metalens-based endoscopy, offering a promising pathway for next-generation ultra-miniaturized medical imaging devices.
NeurIPS 2025 IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
Parker Liu, Chenxin Li, Zhengxin Li, Yipeng Wu, Wuyang Li, Zhiqin Yang, Zhenyue Zhang, Yunlong Lin, Sirui Han, Brandon Y. Feng
project page / paper / code
Key Words: 3D Scene Understanding; Vision-Language Model; Inverse Rendering
Summary: We propose IR3D-Bench, a benchmark that challenges VLMs to demonstrate real scene understanding by actively recreating 3D structures from images using tools. This "understanding-by-creating" approach probes the generative and tool-using capacity of vision-language agents (VLAs), moving beyond the descriptive or conversational capacity measured by traditional scene understanding benchmarks.
AAAI 2025 Top-1 most influential paper U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
Chenxin Li*, Xinyu Liu*, Wuyang Li*, Cheng Wang*, Hengyu Liu, Yifan Liu, Zhen Chen, Yixuan Yuan
project page / paper / code
Key Words: Kolmogorov-Arnold Networks; Medical Image Segmentation/Generation; Medical Backbone
Summary: We propose the first KAN-based medical backbone, U-KAN, which can be seamlessly integrated with existing medical image segmentation and generation models to boost their performance with minimal computational overhead. This work has been cited more than 250 times in one year.
CVPR 2025 FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
Hengyu Liu, Yuehao Wang, Chenxin Li, Ruisi Cai, Kevin Wang, Wuyang Li, Pavlo Molchanov, Peihao Wang, Zhangyang Wang
project page / paper / code
Key Words: Efficient Gaussian Splatting; Flexible Rendering
Summary: We propose FlexGS, which can be trained once and seamlessly adapt to varying computational constraints, eliminating the need for costly retraining or finetuning for each configuration / hardware constraint. Given an input specifying the desired model size, our method selects and transforms a subset of Gaussians to meet the memory requirements while maintaining considerable rendering performance.
ECCV 2024 Oral CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
Wuyang Li, Xinyu Liu, Jiayi Ma, Yixuan Yuan
paper / code / video
Key Words: Open-Vocabulary Object Detection; Diffusion Model
Summary: This work aims to detect objects in the unseen classes. We explore the advanced generative paradigm with distribution perception and propose a novel framework based on the diffusion model, coined Continual Latent Diffusion (CLIFF), which formulates a continual distribution transfer among the object, image, and text latent space probabilistically.
NeurIPS 2024 Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM
Chenxin Li*, Yuzhi Huang*, Wuyang Li, Hengyu Liu, Xinyu Liu, Qing Xu, Zhen Chen, Yue Huang, Yixuan Yuan
project page / paper / code
Key Words: SAM; Ambiguous Segmentation;
Summary: While vision foundation models such as SAM demonstrate potent universality, they often give ambiguous and uncertain predictions. This paper takes a unique path, exploring how this flaw can be turned into an advantage when modeling inherently ambiguous data distributions.
ICCV 2023 Oral Novel Scenes & Classes: Towards Adaptive Open-set Object Detection
Wuyang Li, Xiaoqing Guo, Yixuan Yuan
paper / code
Key Words: Object Detection; Distributions Shift; Out-Of-Distribution
Summary: Previous generalizable object detectors transfer the model to a novel domain free of labels. In the real world, however, novel domains contain not only novel scenes but also novel-class objects, which prior work ignores. We therefore formulate and study a more practical setting, Adaptive Open-set Object Detection (AOOD), which considers both novel scenes and novel classes in real-world scenarios.
ICCV 2023 MRM: Masked Relation Modeling for Medical Image Pre-Training with Genetics
Qiushi Yang, Wuyang Li, Baopu Li, Yixuan Yuan
paper
Key Words: Multi-modal Pretraining; Medical Imaging Analysis
Summary: We propose leveraging genetics to boost image pre-training and present a masked relation modeling (MRM) framework. Instead of explicitly masking input data as in previous MIM methods, which loses disease-related semantics, we design relation masking to mask out token-wise feature relations at both the self- and cross-modality levels.
ACS Photonics 2024 Cover Paper Stereo Vision Meta-lens-assisted Driving Vision
Xiaoyuan Liu, Wuyang Li, Takeshi Yamaguchi, Zihan Geng, Takuo Tanaka, Din Ping Tsai, Mu Ku Chen
paper
Key Words: Metalens; Stereo Vision; Autonomous Driving
Summary: The meta-lens, a novel flat optical device, uses an artificial nanoantenna array to manipulate light properties. In this work, we use a metalens to enhance the stereo vision system for autonomous driving, achieving superior performance with reduced physical size and weight.
CVPR 2023 Adjustment and Alignment for Unbiased Open Set Domain Adaptation
Wuyang Li, Jie Liu, Bo Han, Yixuan Yuan
paper / code / video
Key Words: Open Set Domain Adaptation (OSDA); Causal Theory
Summary: This work aims to transfer a model from a label-rich domain to a label-free one containing novel-class samples. Existing works overlook the abundant novel-class semantics hidden in the source domain, leading to biased model learning and transfer. To address this, we propose a novel causality-driven solution based on the unexplored front-door adjustment theory and implement it with a theoretically grounded framework, coined AdjustmeNt aNd Alignment (ANNA), to achieve unbiased OSDA.
CVPR 2022 Oral + Best Paper Finalist SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection
Wuyang Li, Xinyu Liu, Yixuan Yuan
paper / code / 知乎
Key Words: Domain Adaptive Object Detection (DAOD); Graph Matching
Summary: DAOD leverages a labeled domain to learn an object detector that generalizes to a novel domain free of annotations. Recent advances align class-conditional distributions by narrowing down cross-domain prototypes. Despite their great success, they ignore the significant within-class variance and domain-mismatched semantics. To solve these issues, we propose a novel SemantIc-complete Graph MAtching (SIGMA) framework, which completes mismatched semantics and reformulates the adaptation with graph matching.
CVPR 2022 Towards Robust Adaptive Object Detection under Noisy Annotations
Xinyu Liu, Wuyang Li, Qiushi Yang, Baopu Li, Yixuan Yuan
paper / code
Key Words: Object Detection; Domain Shift; Noisy Label
Summary: Existing domain adaptation methods assume that the source domain labels are completely clean, yet large-scale datasets often contain error-prone annotations due to instance ambiguity, which may lead to a biased source distribution and severely degrade the performance of the domain adaptive detector. In this paper, we make the first effort to formulate this noisy setting and propose a Noise Latent Transferability Exploration (NLTE) framework to address it.
AAAI 2022 Oral SCAN: Cross Domain Object Detection with Semantic Conditioned Adaptation
Wuyang Li, Xinyu Liu, Xiwen Yao, Yixuan Yuan
paper / code
Key Words: Object Detection; Domain Shift; Graph-based Learning
Summary: In this work, we empirically discover that the key factor behind the performance drop in cross-domain object detection is the misalignment of semantic information, rather than the bounding box regression or centerness scores. We address this issue by introducing cross-domain semantic-conditioned kernels, implemented through a graph-based learning framework.
TPAMI 2023 SIGMA++: Improved Semantic-complete Graph Matching for Domain Adaptive Object Detection
Wuyang Li, Xinyu Liu, Yixuan Yuan
paper / code
Key Words: Domain Adaptive Object Detection; Hypergraph Matching
Summary: We propose SIGMA++, an improved version of the pair-wise SIGMA framework that incorporates high-order hypergraph matching. This enhancement effectively addresses domain misalignment issues by enabling group-level adaptation. SIGMA++ achieved the best results on all the popular DAOD benchmarks.
MICCAI 2022 Oral Intervention & Interaction Federated Abnormality Detection with Noisy Clients
Xinyu Liu, Wuyang Li, Yixuan Yuan
paper / code
Key Words: Federated Learning; Noisy Label; Causal Theory
Summary: This paper studies a practical yet challenging federated learning problem, namely Federated abnormality detection with noisy clients (FADN). We make the first effort to reason about FADN with a structural causal model and identify the main issue behind the performance deterioration: recognition bias. To tackle this, we propose an Intervention & Interaction FL framework (FedInI), which uses causal theory to achieve unbiased learning.
TIP 2021 HTD: Heterogeneous Task Decoupling for Two-Stage Object Detection
Wuyang Li, Zhen Chen, Baopu Li, Dingwen Zhang, Yixuan Yuan
paper / code
Key Words: Generic Object Detection; Graph-based Learning
Summary: This work aims to develop a more effective object detector for generic usage. We propose HTD, which discovers the heterogeneous feature demands of classification and regression and resolves them via task-decoupled designs: enhancing inter-object semantic interaction in the classification branch and boosting border information in the regression branch. HTD achieves the best result on the MS COCO benchmark.
CVPR 2025 Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline
Yuzhi Huang, Chenxin Li, Haitao Zhang, Zixu Lin, Yunlong Lin, Hengyu Liu, Wuyang Li, Xinyu Liu, Jiechao Gao, Yue Huang, Xinghao Ding, Yixuan Yuan
project page / paper / code
Key Words: SAM; Video Anomaly Detection; Tracking
Summary: We propose an innovative VAD framework called Track Any Object (TAO), a granular video anomaly detection pipeline that, for the first time, integrates the detection of multiple fine-grained anomalous objects into a unified framework.
ICLR 2025 InstantSplamp: Fast and Generalizable Steganography Framework for Generative Gaussian Splatting
Chenxin Li, Hengyu Liu, Zhiwen Fan, Wuyang Li, Yifan Liu, Panwang Pan, Yixuan Yuan
project page / paper / code
Key Words: 3D Gaussian Splatting; Efficient Steganography
Summary: We propose InstantSplamp (Instant Splitting Stamp), a framework that seamlessly integrates the 3D steganography pipeline into large 3D generative models without introducing explicit additional time costs.
ArXiv 2025 X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography
Yifan Liu, Wuyang Li, Weihao Yu, Chenxin Li, Alexandre Alahi, Max Meng, Yixuan Yuan
project page / paper
Key Words: Feed-forward Gaussian Reconstruction Model; Sparse-view X-ray
Summary: We present X-GRM (X-ray Gaussian Reconstruction Model), a large feedforward model for reconstructing 3D CT from sparse-view 2D X-ray projections.
TMI 2025 ToothMaker: Realistic Panoramic Dental Radiograph Generation via Disentangled Control
Weihao Yu, Xiaoqing Guo, Wuyang Li, Xinyu Liu, Hui Chen, Yixuan Yuan
paper / code
Key Words: Dental Radiograph Generation; Diffusion Model
Summary: We take the first attempt to investigate diffusion-based teeth X-ray image generation and propose ToothMaker, a novel framework specifically designed for the dental domain.
MICCAI 2024 From Static to Dynamic Diagnostics: Boosting Medical Image Analysis via Motion-Informed Generative Videos
Wuyang Li, Xinyu Liu, Qiushi Yang, Yixuan Yuan
paper
Key Words: Medical Video Generation; Semi-Supervised Diagnosis
Summary: We enhance semi-supervised diagnosis with generative videos, enabling joint learning across the image and video modalities.
ICLR 2024 Synthesizing Realistic fMRI: A Physiological Dynamics-Driven Hierarchical Diffusion Model for Efficient fMRI Acquisition
Yufan Hu, Yu Jiang, Wuyang Li, Yixuan Yuan
paper / code
Key Words: Brain fMRI Generation; Physiological Dynamics; Diffusion Model
Summary: We propose a Physiological Dynamics-Driven Hierarchical Diffusion Model that integrates brain hierarchical regional interactions through hypergraph-based functional connectivity and multifractal dynamics to generate physiologically realistic fMRI signals with preserved scale-invariant characteristics.
TMI 2024 FM-APP: Foundation Model for Any Phenotype Prediction via fMRI to sMRI Knowledge Transfer
Zhibin He, Wuyang Li, Yifan Liu, Xinyu Liu, Junwei Han, Tuo Zhang, Yixuan Yuan
paper / code
Key Words: Brain fMRI Analysis; Phenotype Prediction;
Summary: Predicting individual-level non-neuroimaging phenotypes (e.g., fluid intelligence) from brain imaging data is a fundamental goal of neuroscience. We propose the first foundation model for any phenotype prediction via fMRI-to-sMRI knowledge transfer.
MICCAI 2024 LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction
Hengyu Liu, Yifan Liu, Chenxin Li, Wuyang Li, Yixuan Yuan
project page / paper / code
Key Words: 4D Gaussian Splatting; Light-weight Reconstruction; Surgical Simulation
Summary: We introduce a Lightweight 4D Gaussian Splatting framework (LGS) that relieves the efficiency bottlenecks of both rendering and storage for dynamic endoscopic reconstruction.
MICCAI 2024 Endora: Video Generation Models as Endoscopy Simulators
Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan
project page / paper / code
Key Words: Medical Video Generation; Surgical Simulation
Summary: We propose Endora, the first medical video generation model that simulates intraoperative endoscopy with high-quality and diverse videos, which can be used for novel surgical training.
MICCAI 2024 When 3D Partial Points Meets SAM: Tooth Point Cloud Segmentation with Sparse Labels
Yifan Liu, Wuyang Li, Cheng Wang, Hui Chen, Yixuan Yuan
paper
Key Words: SAM; Tooth Point Cloud Segmentation; Data-efficient Learning
Summary: We leverage SAM to address the severe label scarcity in 3D tooth point cloud segmentation, achieving good performance with only a 0.1% label ratio.
TNNLS 2023 Decoupled Unbiased Teacher for Source-Free Domain Adaptive Medical Object Detection
Xinyu Liu, Wuyang Li, Yixuan Yuan
paper / code
Key Words: Source-free Domain Adaptive Object Detection; Causal Theory
Summary: We identify biases at the data, model, and prediction levels in SFDA and address them with causal intervention.

πŸ“„ Other Publication

Spatial Understanding

  • SCAN++: Enhanced Semantic Conditioned Adaptation for Domain Adaptive Object Detection
    Wuyang Li, Xinyu Liu, Yixuan Yuan
    IEEE Transactions on Multimedia (TMM), 2022
  • Hide-in-Motion: Embedding Steganographic Copyright Information into 4D Gaussian Splatting Assets
    Hengyu Liu, Chenxin Li, Wentao Pan, Zhiqin Yang, Yifeng Yang, Yifan Liu, Wuyang Li, Yixuan Yuan.
    IEEE International Conference on Robotics and Automation (ICRA), 2025
  • Universal Domain Adaptive Object Detection via Dual Probabilistic Alignment
    Yuanfan Zheng, Jinlin Wu, Wuyang Li, Zhen Chen.
    Proceedings of the AAAI Conference on Artificial Intelligence, 2025
  • ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting
    Yifeng Yang*, Hengyu Liu*, Chenxin Li*, Yining Sun, Wuyang Li, Yifan Liu, Yiyang Lin, Yixuan Yuan, Nanyang Ye.
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025
  • Eventvl: Understand event streams via multimodal large language model
    Pengteng Li, Yunfan Lu, Pinghao Song, Wuyang Li, Huizai Yao, Hui Xiong.
    arXiv preprint arXiv:2501.13707, 2025

Medical Imaging Analysis

  • Joint polyp detection and segmentation with heterogeneous endoscopic data
    Wuyang Li, Chen Yang, Jie Liu, Xiaoqing Guo, Yixuan Yuan
    International Symposium on Biomedical Imaging (ISBI) Workshop, 2021
  • DiffRect: Latent Diffusion Label Rectification for Semi-supervised Medical Image Segmentation
    Xinyu Liu, Wuyang Li, Yixuan Yuan
    Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024
  • LLM-guided Decoupled Probabilistic Prompt for Continual Learning in Medical Image Diagnosis
    Yiwen Luo, Wuyang Li, Chen Cheng, Xiang Li, Tianming Liu, Yixuan Yuan
    IEEE Transactions on Medical Imaging (TMI), 2024, Under Revision
  • GRAB-Net: Graph-based Boundary-aware Network for Medical Point Cloud Segmentation
    Yifan Liu, Wuyang Li, Jie Liu, Hui Chen, Yixuan Yuan
    IEEE Transactions on Medical Imaging (TMI), 2023
  • Medical Federated Learning with Joint Graph Purification for Noisy Label Learning
    Zhen Chen, Wuyang Li, Xiaohan Xing, Yixuan Yuan
    Medical Image Analysis (MedIA), 2023
  • GAGM: Geometry-aware graph matching framework for weakly supervised gyral hinge correspondence
    Zhibin He, Wuyang Li, Tianming Liu, Xiang Li, Junwei Han, Tuo Zhang, Yixuan Yuan.
    Medical Image Analysis (MedIA), 2025
  • F2TNet: FMRI to T1w MRI Knowledge Transfer Network for Brain Multi-phenotype Prediction
    Zhibin He, Wuyang Li, Yu Jiang, Zhihao Peng, Pengyu Wang, Xiang Li, Tianming Liu, Junwei Han, et al.
    Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024
  • H2GM: A Hierarchical Hypergraph Matching Framework for Brain Landmark Alignment
    Zhibin He, Wuyang Li, Tuo Zhang, Yixuan Yuan
    Medical Image Computing and Computer Assisted Intervention (MICCAI), 2023

πŸ’‘ Service

Journals Reviewer

  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  • IEEE Transactions on Image Processing (TIP)
  • IEEE Transactions on Automation Science and Engineering (TASE)
  • IEEE Transactions on Multimedia (TMM)
  • IEEE Transactions on Medical Imaging (TMI)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  • IEEE Transactions on Intelligent Vehicles (TIV)
  • IEEE Transactions on Intelligent Transportation Systems (TITS)
  • Pattern Recognition (PR)

Conferences Reviewer

  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • IEEE International Conference on Computer Vision (ICCV)
  • European Conference on Computer Vision (ECCV)
  • Conference on Neural Information Processing Systems (NeurIPS)
  • International Conference on Machine Learning (ICML)
  • International Conference on Learning Representations (ICLR)
  • AAAI Conference on Artificial Intelligence (AAAI)

πŸ… Selected Honors

  • [2023] Outstanding Academic Performance Award (OAPA), CityU
  • [2023] Research Tuition Scholarship (RTS), CityU
  • [2022] Outstanding Academic Performance Award (OAPA), CityU
  • [2022] Research Tuition Scholarship (RTS), CityU
  • [2018] National Scholarship (Top 2% student)
  • [2017] National Scholarship (Top 2% student)
  • [2017] Tianjin Mathematical Competition (Second Prize)
  • [2017-2020] Outstanding Student Scholarship (Top 10% student)

πŸŽ“ Education

Tianjin University (TJU), China

Sep. 2016 - Jun. 2020: Bachelor's degree in Communication Engineering.

GPA: 3.83/4.00, 91.3/100, Ranking 6/120

NUS (Suzhou) Research Institute (NUSRI), China

Sep. 2019 - Jun. 2020: Exchange program in Electrical and Computer Engineering.

Supervisor: Prof. Zhiying Zhou

Completed the project: Towards Webpage-based Augmented Reality (AR)

City University of Hong Kong (CityU), China

Sep. 2020 - Dec. 2023: Ph.D. study in Electrical Engineering.

Supervisor: Prof. Yixuan Yuan

πŸ‘₯ Leadership

Founder & Director, ScholaGO Education Technology Company Limited

To gain a deeper understanding of technology, I founded ScholaGO Education Technology Company Limited (學旅通教育科技有限公司) with four co-founders to develop an innovative educational product aimed at converting static knowledge into an immersive, interactive, multi-modal adventure. Our company is supported by HKSTP, HK Tech 300, and Alibaba Cloud. My ultimate goal is to develop valuable technologies and products to improve the national happiness index.

🎨 Personal Interests

  • Painting and Designing: I used to do sketch training alongside art-school candidates and have a solid foundation in graphic design. I have a strong interest in user-needs analysis and product design.
  • I am looking for the opportunity to establish a start-up team and create some awesome high-tech products.

We steal this website from this guy