Mark Weber

I am a Research Scientist at Amazon Frontier AI & Robotics (FAR) working on foundational models. Before that, I was a PhD student at the Technical University of Munich and a member of the Dynamic Vision and Learning Group as well as the Computer Vision Group. My advisors were Professor Laura Leal-Taixé and Professor Daniel Cremers.

During my PhD, I did several research internships. I spent 9 months doing research at Google and 9 months doing research at ByteDance Seed (TikTok), both under the supervision of Liang-Chieh Chen, as well as 7 additional months of research, also at ByteDance Seed.

Before that, I obtained my Bachelor's and Master's degrees at RWTH Aachen University, where I worked with Dr. Aljosa Osep and Professor Bastian Leibe.

Email  /  Google Scholar  /  LinkedIn  /  Twitter

Research

My research interests lie in multi-modal tokenization, generative modeling, and scene understanding. Please refer to my Google Scholar for a list of publications that is usually up to date.

Highlights

An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu*, Mark Weber*, Xueqing Deng, Xiaohui Shen, Daniel Cremers, Liang-Chieh Chen
NeurIPS, 2024

Embedding-free Image Generation via Bit Tokens
Mark Weber, Lijun Yu, Xueqing Deng, Qihang Yu, Xiaohui Shen, Daniel Cremers, Liang-Chieh Chen
TMLR, featured & reproducibility certifications, 2024

STEP: Segmenting and Tracking Every Pixel
Mark Weber, Jun Xie, Maxwell Collins, Yukun Zhu, Paul Voigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljosa Osep, Laura Leal-Taixe, Liang-Chieh Chen
NeurIPS Track on Datasets and Benchmarks, 2021

All

SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding
Tanveer Hannan, Shuaicong Wu, Mark Weber, Suprosanna Shit, Jindong Gu, Rajat Koner, Aljoša Ošep, Laura Leal-Taixé, Thomas Seidl
Structural Priors for Vision Workshop at ICCV'25, 2025

The Diashow Paradox: Stronger 3D-Aware Representations Emerge from Image Sets, Not Videos
Nguyen Tien Duc, Anna Sonnweber, Mark Weber, Nikita Araslanov, Daniel Cremers
Preprint, 2025

Embedding-free Image Generation via Bit Tokens
Mark Weber, Lijun Yu, Xueqing Deng, Qihang Yu, Xiaohui Shen, Daniel Cremers, Liang-Chieh Chen
TMLR, featured & reproducibility certifications, 2024

An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu*, Mark Weber*, Xueqing Deng, Xiaohui Shen, Daniel Cremers, Liang-Chieh Chen
NeurIPS, 2024

MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation
Linyan Yang, Lukas Hoyer, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Daniel Cremers, Marc Pollefeys, Luc Van Gool
ECCV, 2024

DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation
Aysim Toker*, Lukas Kondmann*, Mark Weber, Marvin Eisenberger, Andrés Camero, Jingliang Hu, Ariadna Pregel Hoderlein, Çağlar Şenaras, Timothy Davis, Daniel Cremers, Giovanni Marchisio, Xiao Xiang Zhu, Laura Leal-Taixé
CVPR, 2022

STEP: Segmenting and Tracking Every Pixel
Mark Weber, Jun Xie, Maxwell Collins, Yukun Zhu, Paul Voigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljosa Osep, Laura Leal-Taixe, Liang-Chieh Chen
NeurIPS Track on Datasets and Benchmarks, 2021

DeepLab2: A TensorFlow Library for Deep Labeling
Mark Weber*, Huiyu Wang*, Siyuan Qiao*, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen
Tech report, arXiv

4D Panoptic LiDAR Segmentation
Mehmet Aygun, Aljosa Osep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixé
CVPR, 2021

Single-shot Panoptic Segmentation
Mark Weber, Jonathon Luiten, Bastian Leibe
IROS, 2021

4D Generic Video Object Proposals
Aljosa Osep, Paul Voigtlaender, Mark Weber, Jonathon Luiten, Bastian Leibe
ICRA, 2020

Visual Person Understanding Through Multi-task and Multi-dataset Learning
Kilian Pfeiffer, Alexander Hermans, Istvan Sarandi, Mark Weber, Bastian Leibe
GCPR, 2019

Conference Workshops

Over the course of my PhD, I co-organized multiple workshops in the area of segmentation and tracking.

Towards Spatiotemporal Action Grounding in Videos: 8th Workshop on MOT25
ICCV 2025

2nd Workshop on Tracking and Its Many Guises
CVPR 2023

How Far Can Synthetic Data Take Us? 7th Workshop on Benchmarking Multi-Target Tracking
CVPR 2022

Segmenting and Tracking Every Point and Pixel: 6th Workshop on Benchmarking Multi-Target Tracking
ICCV 2021


Thanks to Jon Barron for the website template.