


MMM 2024, Amsterdam, The Netherlands - Part IV
- Stevan Rudinac, Alan Hanjalic, Cynthia C. S. Liem, Marcel Worring, Björn Þór Jónsson, Bei Liu, Yoko Yamakata:
  MultiMedia Modeling - 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 - February 2, 2024, Proceedings, Part IV. Lecture Notes in Computer Science 14557, Springer 2024, ISBN 978-3-031-53301-3

FMM: Special Session on Foundation Models for Multimedia
- Jun Wu, Mingxin He, Yang Liu, Jingjie Lin, Zeyu Huang, Dayong Ding:
  Removing Stray-Light for Wild-Field Fundus Image Fusion Based on Large Generative Models. 3-16
- Yuma Honbu, Keiji Yanai:
  Training-Free Region Prediction with Stable Diffusion. 17-31
- Lei Wang, Jiabang He, Shenshen Li, Ning Liu, Ee-Peng Lim:
  Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites. 32-45
- Can Zhang, Zhiqiang Wang, Yuan Zhang, Xuanya Li, Kai Hu:
  GDTNet: A Synergistic Dilated Transformer and CNN by Gate Attention for Abdominal Multi-organ Segmentation. 46-57
- Xinyue Liu, Gang Yang, Yang Zhou, Yajie Yang, Weichen Huang, Dayong Ding, Jun Wu:
  Fine-Grained Multi-modal Fundus Image Generation Based on Diffusion Models for Glaucoma Classification. 58-70
- Lantao Wang, Chao Ma:
  Adapting Pretrained Large-Scale Vision Models for Face Forgery Detection. 71-85

ICDAR: Special Session on Intelligent Cross-Data Analysis and Retrieval
- Fuyang Yu, Zhen Wang, Dongyuan Li, Peide Zhu, Xiaohui Liang, Xiaochuan Wang, Manabu Okumura:
  Towards Cross-Modal Point Cloud Retrieval for Indoor Scenes. 89-102
- Nhat-Hao Pham, Khanh-Linh Vo, Mai Anh Vu, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen:
  Correlation Visualization Under Missing Values: A Comparison Between Imputation and Direct Parameter Estimation Methods. 103-116
- Rupayan Mallick, Jenny Benois-Pineau, Akka Zemmari:
  IFI: Interpreting for Improving: A Multimodal Transformer with an Interpretability Technique for Recognition of Risk Events. 117-131
- Kha-Luan Pham, Minh-Khoi Nguyen-Nhat, Anh-Huy Dinh, Quang-Tri Le, Manh-Thien Nguyen, Anh-Duy Tran, Minh-Triet Tran, Duc-Tien Dang-Nguyen:
  Ookpik- A Collection of Out-of-Context Image-Caption Pairs. 132-144
- Viet-Tham Huynh, Trong-Thuan Nguyen, Quang-Thuc Nguyen, Mai-Khiem Tran, Tam V. Nguyen, Minh-Triet Tran:
  LUMOS-DM: Landscape-Based Multimodal Scene Retrieval Enhanced by Diffusion Model. 145-158

XR-MACCI: Special Session on eXtended Reality and Multimedia - Advancing Content Creation and Interaction
- Helmut Neuschmied, Werner Bailer:
  Mining Landmark Images for Scene Reconstruction from Weakly Annotated Video Collections. 161-174
- Panagiotis Vrachnos, Marios Krestenitis, Ilias Koulalis, Konstantinos Ioannidis, Stefanos Vrochidis:
  A Framework for 3D Modeling of Construction Sites Using Aerial Imagery and Semantic NeRFs. 175-187
- Maria Pegia, Björn Þór Jónsson, Anastasia Moumtzidou, Sotiris Diplaris, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris:
  Multimodal 3D Object Retrieval. 188-201
- Ioannis Kontostathis, Evlampios Apostolidis, Vasileios Mezaris:
  An Integrated System for Spatio-temporal Summarization of 360-Degrees Videos. 202-215

Brave New Ideas
- Mingliang Liang, Zhouran Liu, Martha A. Larson:
  Mutant Texts: A Technique for Uncovering Unexpected Inconsistencies in Large-Scale Vision-Language Models. 219-233
- Rômulo Vieira, Débora C. Muchaluat-Saade, Pablo César:
  Exploring Artificial Intelligence for Advancing Performance Processes and Events in Io3MT. 234-248

Demonstrations
- Masatoshi Hamanaka:
  Implementation of Melody Slot Machines. 251-257
- Faiga Alawad, Pål Halvorsen, Michael A. Riegler:
  E2Evideo: End to End Video and Image Pre-processing and Analysis Tool. 258-264
- Loris Sauter, Tim Bachmann, Heiko Schuldt, Luca Rossetto:
  Augmented Reality Photo Presentation and Content-Based Image Retrieval on Mobile Devices with AR-Explorer. 265-270
- Evlampios Apostolidis, Konstantinos Apostolidis, Vasileios Mezaris:
  Facilitating the Production of Well-Tailored Video Summaries for Sharing on Social Media. 271-278
- Mehdi Houshmand Sarkhoosh, Sayed Mohammad Majidi Dorcheh, Cise Midoglu, Saeed Shafiee Sabet, Tomas Kupka, Dag Johansen, Michael A. Riegler, Pål Halvorsen:
  AI-Based Cropping of Soccer Videos for Different Social Media Representations. 279-287
- Werner Bailer, Mihai Dogariu, Bogdan Ionescu, Hannes Fassold:
  Few-Shot Object Detection as a Service: Facilitating Training and Deployment for Domain Experts. 288-294
- Boyu Xu, Ghazaleh Tanhaei, Lynda Hardman, Wolfgang Hürst:
  DatAR: Supporting Neuroscience Literature Exploration by Finding Relations Between Topics in Augmented Reality. 295-300
- Tengteng Dong, Fangyuan Liu, Xinke Wang, Yishun Jiang, Xiwei Zhang, Xiao Sun:
  EmoAda: A Multimodal Emotion Interaction and Psychological Adaptation System. 301-307

Video Browser Showdown
- Takayuki Hori, Kazuya Ueki, Yuma Suzuki, Hiroki Takushima, Hayato Tanoue, Haruki Sato, Takumi Takada, Aiswariya Manoj Kumar:
  Waseda_Meisei_SoftBank at Video Browser Showdown 2024. 311-316
- Florian Spiess, Luca Rossetto, Heiko Schuldt:
  Exploring Multimedia Vector Spaces with vitrivr-VR. 317-323
- Ralph Gasser, Rahel Arnold, Fynn Faber, Heiko Schuldt, Raphael Waltenspül, Luca Rossetto:
  A New Retrieval Engine for Vitrivr. 324-331
- Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo:
  VISIONE 5.0: Enhanced User Interface and AI Models for VBS2024. 332-339
- Jakub Lokoc, Zuzana Vopálková, Michael Stroh, Raphael Buchmueller, Udo Schlegel:
  PraK Tool: An Interactive Search Tool Based on Video Data Services. 340-346
- Omar Shahbaz Khan, Hongyi Zhu, Ujjwal Sharma, Evangelos Kanoulas, Stevan Rudinac, Björn Þór Jónsson:
  Exquisitor at the Video Browser Showdown 2024: Relevance Feedback Meets Conversational Search. 347-355
- Nick Pantelidis, Maria Pegia, Damianos Galanopoulos, Konstantinos Apostolidis, Klearchos Stavrothanasopoulos, Anastasia Moumtzidou, Konstantinos Gkountakos, Ilias Gialampoukidis, Stefanos Vrochidis, Vasileios Mezaris, Ioannis Kompatsiaris, Björn Þór Jónsson:
  VERGE in VBS 2024. 356-363
- Konstantin Schall, Nico Hezel, Kai Uwe Barthel, Klaus Jung:
  Optimizing the Interactive Video Retrieval Tool Vibro for the Video Browser Showdown 2024. 364-371
- Klaus Schoeffmann, Sahar Nasirihaghighi:
  DiveXplore at the Video Browser Showdown 2024. 372-379
- Zhixin Ma, Jiaxin Wu, Chong Wah Ngo:
  Leveraging LLMs and Generative Models for Interactive Known-Item Video Search. 380-386
- Guihe Gu, Zhengqian Wu, Jiangshan He, Lin Song, Zhongyuan Wang, Chao Liang:
  TalkSee: Interactive Video Retrieval Engine Using Large Language Model. 387-393
- Thao-Nhu Nguyen, Le Minh Quang, Graham Healy, Binh T. Nguyen, Cathal Gurrin:
  VideoCLIP 2.0: An Interactive CLIP-Based Video Retrieval System for Novice Users at VBS2024. 394-399
- Gia-Huy Vuong, Van-Son Ho, Tien-Thanh Nguyen-Dang, Xuan-Dang Thai, Tu-Khiem Le, Minh-Khoi Pham, Van-Tu Ninh, Cathal Gurrin, Minh-Triet Tran:
  ViewsInsight: Enhancing Video Retrieval for VBS 2024 with a User-Friendly Interaction Mechanism. 400-406















