


Mingfei Han 0002
Person information
- affiliation: Mohamed bin Zayed University of Artificial Intelligence, UAE
- affiliation: Monash University, Faculty of Information Technology, Melbourne, VIC, Australia
- affiliation: CSIRO Data61, Eveleigh, NSW, Australia
- affiliation: Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Beijing, China
Other persons with the same name
- Mingfei Han 0001: Beijing Proteome Research Center, Department of Bioinformatics, China
- Mingfei Han 0003: ByteDance Seed, China
2020 – today
- 2025
[c13]Mingfei Han, Liang Ma, Kamila Zhumakhanova, Ekaterina Radionova, Jingyi Zhang, Xiaojun Chang, Xiaodan Liang, Ivan Laptev:
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation. CVPR 2025: 27586-27596
[c12]Mingfei Han, Linjie Yang, Xiaojun Chang, Lina Yao, Heng Wang:
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos. ICLR 2025
[c11]Harsh Singh, Rocktim Jyoti Das, Mingfei Han, Preslav Nakov, Ivan Laptev:
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotic Manipulation. IROS 2025: 20386-20393
[i21]Haihong Hao, Mingfei Han, Changlin Li, Zhihui Li, Xiaojun Chang:
CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation. CoRR abs/2505.16663 (2025)
[i20]Bingqian Lin, Yunshuang Nie, Khun Loun Zai, Ziming Wei, Mingfei Han, Rongtao Xu, Minzhe Niu, Jianhua Han, Liang Lin, Cewu Lu, Xiaodan Liang:
EvolveNav: Self-Improving Embodied Reasoning for LLM-Based Vision-Language Navigation. CoRR abs/2506.01551 (2025)
[i19]Liang Ma, Jiajun Wen, Min Lin, Rongtao Xu, Xiwen Liang, Bingqian Lin, Jun Ma, Yongxin Wang, Ziming Wei, Haokun Lin, Mingfei Han, Meng Cao, Bokui Chen, Ivan Laptev, Xiaodan Liang:
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly. CoRR abs/2506.08708 (2025)
[i18]Jinxing Zhou, Zhihui Li, Yongqiang Yu, Yanghao Zhou, Ruohao Guo, Guangyao Li, Yuxin Mao, Mingfei Han, Xiaojun Chang, Meng Wang:
Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation. CoRR abs/2506.23271 (2025)
[i17]Jinxing Zhou, Yanghao Zhou, Mingfei Han, Tong Wang, Xiaojun Chang, Hisham Cholakkal, Rao Muhammad Anwer:
Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation. CoRR abs/2508.04418 (2025)
[i16]Mingfei Han, Haihong Hao, Jinxing Zhou, Zhihui Li, Yuhui Zheng, Xueqing Deng, Linjie Yang, Xiaojun Chang:
Self-Consistency as a Free Lunch: Reducing Hallucinations in Vision-Language Models via Self-Reflection. CoRR abs/2509.23236 (2025)
[i15]Longtao Jiang, Mingfei Han, Philip Chen, Yongqiang Yu, Feng Zhao, Xiaojun Chang, Zhihui Li:
Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models. CoRR abs/2509.23919 (2025)
[i14]Rocktim Jyoti Das, Harsh Singh, Diana Turmakhan, Muhammad Abdullah Sohail, Mingfei Han, Preslav Nakov, Fabio Pizzati, Ivan Laptev:
BLAZER: Bootstrapping LLM-based Manipulation Agents with Zero-Shot Data Generation. CoRR abs/2510.08572 (2025)
[i13]Zirui Song, Yuan Huang, Junchang Liu, Haozhe Luo, Chenxi Wang, Lang Gao, Zixiang Xu, Mingfei Han, Xiaojun Chang, Xiuying Chen:
Beyond Survival: Evaluating LLMs in Social Deduction Games with Human-Aligned Strategies. CoRR abs/2510.11389 (2025)
[i12]Yongxin Wang, Zhicheng Yang, Meng Cao, Mingfei Han, Haokun Lin, Yingying Zhu, Xiaojun Chang, Xiaodan Liang:
CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal. CoRR abs/2512.19554 (2025)
- 2024
[j2]Mingfei Han, Yali Wang, Mingjie Li, Xiaojun Chang, Yi Yang, Yu Qiao:
Progressive Frame-Proposal Mining for Weakly Supervised Video Object Detection. IEEE Trans. Image Process. 33: 1560-1573 (2024)
[c10]Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang:
Video Recognition in Portrait Mode. CVPR 2024: 21831-21841
[c9]Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang:
LongVLM: Efficient Long Video Understanding via Large Language Models. ECCV (33) 2024: 453-470
[c8]Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang:
Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition. ACM Multimedia 2024: 4640-4649
[i11]Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang:
LongVLM: Efficient Long Video Understanding via Large Language Models. CoRR abs/2404.03384 (2024)
[i10]Panwen Hu, Jin Jiang, Jianqi Chen, Mingfei Han, Shengcai Liao, Xiaojun Chang, Xiaodan Liang:
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration. CoRR abs/2411.04925 (2024)
[i9]Harsh Singh, Rocktim Jyoti Das, Mingfei Han, Preslav Nakov, Ivan Laptev:
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation. CoRR abs/2411.17636 (2024)
[i8]Yongxin Wang, Meng Cao, Haokun Lin, Mingfei Han, Liang Ma, Jin Jiang, Yuhao Cheng, Xiaodan Liang:
EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation. CoRR abs/2412.04903 (2024)
[i7]Mingfei Han, Liang Ma, Kamila Zhumakhanova, Ekaterina Radionova, Jingyi Zhang, Xiaojun Chang, Xiaodan Liang, Ivan Laptev:
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation. CoRR abs/2412.08591 (2024)
- 2023
[c7]Mingfei Han, Yali Wang, Zhihui Li, Lina Yao, Xiaojun Chang, Yu Qiao:
HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation. ICCV 2023: 13368-13377
[c6]Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang:
Mask Propagation for Efficient Video Semantic Segmentation. NeurIPS 2023
[i6]Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang:
Mask Propagation for Efficient Video Semantic Segmentation. CoRR abs/2310.18954 (2023)
[i5]Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang:
Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition. CoRR abs/2312.02226 (2023)
[i4]Mingfei Han, Linjie Yang, Xiaojun Chang, Heng Wang:
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos. CoRR abs/2312.10300 (2023)
[i3]Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang:
Video Recognition in Portrait Mode. CoRR abs/2312.13746 (2023)
- 2022
[c5]Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang, Yu Qiao:
Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition. CVPR 2022: 2980-2989
[c4]Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang:
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection. ECCV (34) 2022: 358-375
[i2]Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang, Yu Qiao:
Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition. CoRR abs/2204.02148 (2022)
[i1]Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang:
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection. CoRR abs/2207.10448 (2022)
- 2020
[j1]Shiyu Xuan, Shengyang Li, Mingfei Han, Xue Wan, Gui-Song Xia:
Object Tracking in Satellite Videos by Improved Correlation Filters With Motion Estimations. IEEE Trans. Geosci. Remote. Sens. 58(2): 1074-1086 (2020)
[c3]Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao:
Mining Inter-Video Proposal Relations for Video Object Detection. ECCV (21) 2020: 431-446
2010 – 2019
- 2019
[c2]Shiyu Xuan, Shengyang Li, Zifei Zhao, Mingfei Han:
The Multi-task Fully Convolutional Siamese Network with Correlation Filter Layer for Real-Time Visual Tracking. PRCV (1) 2019: 123-134
[c1]Xiaojun Chang, Wenhe Liu, Po-Yao Huang, Changlin Li, Fengda Zhu, Mingfei Han, Mingjie Li, Mengyuan Ma, Siyi Hu, Guoliang Kang, Junwei Liang, Liangke Gui, Lijun Yu, Yijun Qian, Jing Wen, Alexander G. Hauptmann:
MMVG-INF-Etrol@TRECVID 2019: Activities in Extended Video. TRECVID 2019
last updated on 2026-01-30 00:39 CET by the dblp team
all metadata released as open data under CC0 1.0 license