Fu-En Yang
I am a Research Scientist at NVIDIA Research, working on Adaptive Physical Intelligence: developing efficient, adaptive AI systems for vision-language-action (VLA) models, world modeling, embodied reasoning, and physical AI.
I received my Ph.D. from National Taiwan University (NTU) in Jul. 2023, supervised by Prof. Yu-Chiang Frank Wang. Previously, I was a research intern at NVIDIA Research (Feb. 2023 - Aug. 2023), focusing on efficient model personalization and vision-language models. I was also a Ph.D. program researcher at ASUS AICS from Sep. 2020 to Oct. 2022, specializing in visual transfer learning.
Prior to my Ph.D., I received my Bachelor's degree from the Department of Electrical Engineering at National Taiwan University in 2018.
Email / CV / Google Scholar / LinkedIn / Twitter / Github
Selected Publications
My research goal is to advance Embodied and Physical AI research, developing fast-adapting, self-evolving AI agents that seamlessly integrate dynamics, reasoning, and action in physical environments. I focus on vision-language-action models that enable intelligent agents to understand and interact with the world through multimodal reasoning, world modeling for predictive understanding of dynamic environments, and embodied reasoning that bridges abstract cognition with physical reality. I am driven by the vision that AI should not merely process information, but adaptively learn from and intelligently respond to the rich complexity of physical experience, ultimately creating more capable and contextually aware artificial agents. Full list of publications here.
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
Chi-Pin Huang,
Yunze Man,
Zhiding Yu,
Min-Hung Chen,
Jan Kautz,
Yu-Chiang Frank Wang,
Fu-En Yang
arXiv, 2026  
paper / arXiv / project
Fast-ThinkAct achieves efficient vision-language-action reasoning by compressing lengthy textual chain-of-thought into a few continuous latents, enabling predictable inference for real-time robotic control. It supports efficient embodied reasoning, long-horizon planning, few-shot adaptation, and robust failure recovery via compact yet expressive latent reasoning.
Academic Services
- Area Chair: NeurIPS 2025 Workshop GenProCC
- Conference Program Committee/Reviewer: CVPR 2026, ICLR 2026, AAAI 2026, WACV 2026, NeurIPS 2025, ICCV 2025, ICML 2025, CVPR 2025, ICLR 2025, ICLR 2025 WS SCOPE, AAAI 2025, ACM MM 2025, NeurIPS 2024, ECCV 2024, ICML 2024, CVPR 2024, AAAI 2024, ACCV 2024, ICIP 2024, NeurIPS 2023, ICCV 2023, CVPR 2023, AAAI 2023, WACV 2023, ICIP 2023, ACCV 2022, CVPR 2022, AAAI 2022, WACV 2022, AAAI 2021, ICIP 2020, AAAI 2020
- Journal Reviewer: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Computer Vision and Image Understanding (CVIU), ACM Computing Surveys (CSUR)
Awards
- Honorable Mention at 2023 TAAI Ph.D. Thesis Award, Nov. 2023
- NTU Presidential Award for Graduate Students, Sep. 2023
- Merit Award at the 16th IPPR Doctoral Thesis Award, Aug. 2023
Teaching Assistant
- Deep Learning for Computer Vision, Spring 2019
- Computer Vision: from recognition to geometry, Fall 2018