Xiaotian Han
Redmond, WA 98052
I’m currently a Researcher at OpenAI, focusing on multimodal research. Previously, I was a Senior Research Scientist at ByteDance Seed and a Senior Applied Scientist on the Microsoft Azure AI Computer Vision team. Before joining Microsoft, I received my M.S. degree from Duke University and my B.S. degree from the University of Science and Technology of China (USTC).
My research experience spans computer vision, multimodal learning, reinforcement learning, and deep learning.
news
| Nov 3, 2024 | 🎉 Exciting News! 🎉 Two papers, DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation and Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model, have been accepted by the 2024 Conference on Neural Information Processing Systems (NeurIPS 2024). One paper, InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning, has been accepted by the 4th MATH-AI Workshop at NeurIPS 2024. |
|---|---|
| Jul 30, 2024 | 🎉 Exciting News! 🎉 We’ve been focusing on enhancing the capabilities of multimodal language models in math, coding, and STEM. We’ve summarized some of the latest research papers and are thrilled to share them with the community. GitHub repo: Awesome-Multimodal-LLM-for-Math-STEM. |
| Mar 12, 2024 | Our paper COCO is “ALL” You Need for Visual Instruction Fine-tuning has been accepted by the 2024 IEEE International Conference on Multimedia and Expo (ICME 2024). |
latest posts
| Sep 19, 2024 | Math Reasoning for Multimodal Large Language Models |
|---|---|
| Jan 18, 2024 | Multimodal Large Language Models Sharing Series -- 1 |
| Dec 1, 2023 | InfiMM-Eval Benchmark Released |