


22nd CBMI 2025: Dublin, Ireland
- International Conference on Content-Based Multimedia Indexing, CBMI 2025, Dublin, Ireland, October 22-24, 2025. IEEE 2025, ISBN 979-8-3315-5500-9

- Nathanya Queby Satriani, Djordje Slijepcevic, Markus Schedl, Matthias Zeppelzauer:
Explanatory Interactive Machine Learning for Bias Mitigation in Visual Gender Classification. 1-8 - Hermann Fürntratt, Werner Bailer:
Facilitating Interactive Image Labelling Using Fine-Tuned SAM2. 1-5 - David Luna-García, Iván Martín-Fernández, Sergio Esteban Romero, Manuel Gil-Martín, Fernando Fernández Martínez:
Exploring the Effect of Size, Architecture and Fine-Tuning Hyperparameters on Large Visual-Language Model Adaptation for Video Memorability Prediction. 1-7 - Ioannis Kontostathis, Evlampios Apostolidis, Vasileios Mezaris:

TSalV360: A Method and Dataset for Text-driven Saliency Detection in 360-Degrees Videos. 1-8 - Xu Ji, Haizhao Sun, Yu Ning, Ming Wu, Chuang Zhang:

GTR: General Handwritten Lines Text Recognition Dataset. 1-7 - Bruno Korbar, Andrew Zisserman:

Personalizing Retrieval Using Joint Embeddings; or "the Return of Fluffy". 1-8 - Pavan K. Rachabathuni, Andrea Ciamarra, Roberto Caldelli, Marco Bertini:

Text-Oriented Image Query Representation for Zero-Shot Composed Image Retrieval. 1-7 - Javier Carreno, Khuong An Nguyen, Zhiyuan Luo, Andrew Fish:

Probabilistic Fusion Model for Multi-Label Media Content Classification. 1-7 - Zhengxu Tang, Zizheng Wang, Luning Wang, Zitao Shuai, Chenhao Zhang, Siyu Qian, Yirui Wu, Bohao Wang, Haosong Rao, Zhenyu Yang, Chenwei Wu:

SeqBench: Benchmarking Sequential Narrative Generation in Text-to-Video Models. 1-7 - Sushant Gautam, Cise Midoglu, Vajira Lasantha Thambawita, Michael A. Riegler, Pål Halvorsen, Mubarak Shah:

SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding. 1-8 - Nicolas Martin, Philippe Mulhem, Jean-Pierre Chevallet:
Melanoma Segmentation with SAM-Like Models: Assessing the Influence and Limits of Bounding Box Input. 1-7 - Nirmal Elamon, Rouzbeh Davoudi:

Beyond CNNs: Efficient Fine-Tuning of Multi-Modal LLMs for Object Detection on Low-Data Regimes. 1-6 - Muhammad Hamza, Hafsa Ilyas, Junaid Mir, Ali Javed, Muhammad Haroon Yousaf, Ahmed Zoha:

MSS: A Multilingual Spoofed Speech Dataset with Code-Switching for Anti-Spoofing Measures. 1-7 - Ruxandra Tapu, Bogdan Mocanu:

Lip Reading Across Languages: A Cross-Modal Framework Leveraging Foundation Models. 1-7 - Johanna Kallio, Jussi Liikka, Satu-Marja Mäkelä, Atte Kinnula, Elena Vildjiounaite:
Understanding Indoor Context in an Office Environment: An Empirical Study on Air Stuffiness Perception. 1-6 - Hugo Carlesso, Maria Eliza Patulea, Moncef Garouani, Radu Tudor Ionescu, Josiane Mothe:

GeMix: Conditional GAN-Based Mixup for Improved Medical Image Augmentation. 1-7 - Mohammed Althubyani, Zhijin Meng, Shengyuan Xie, Francisco Cruz, Imran Razzak, Mukesh Prasad, Eduardo B. Sandoval, A. Baki Kocaballi:

MERCI: A Multimodal Dataset for Personalised and Emotionally-Aware Dialogues. 1-7 - Sandeep Kalari, Mohan Sunkara, Dominik Soós, Vikas Ashok, Ravi Mukkamala:

ReViewQwen: An Explainable Vision-Language Model for Discrepancy Detection in Multimodal E-Commerce Reviews. 1-7 - Masatoshi Hamanaka:

BandNaviHD: Band-Member Backtrack Interface Based on Member History Information. 1-4 - Chahine-Nicolas Zede, Laurent Caraffa, Valérie Gouet-Brunet:

DSI-3D: Differentiable Search Index for Point Clouds Retrieval. 1-7 - Mehdi Houshmand Sarkhoosh, Cise Midoglu, Saeed Shafiee Sabet, Tomas Kupka, Pål Halvorsen:

Hockey2D: A Keypoint-Based Framework for Ice Hockey Rink Localization and Object Mapping. 1-7 - Chafic Abou Akar, Christian Beddawi, Marc Kamradt, Abdallah Makhoul:

A New Pipeline for Extracting and Clustering Sub-Images from Unannotated Complex Image Datasets. 1-7 - Alireza Siyavashi, Christian Herglotz:

Toward an Energy-Efficient and Explainable Neural Network Architecture for Detection of Breast Cancer in Mammography. 1-7 - Tariq Al Shoura, Reza Razavi, Mohammad Moshirpour:

EcoStream: A Resource Utilization and Power Consumption Dataset in Multimedia Streaming for Sustainability Analysis. 1-8 - Meiyu Li, Wei Ai, Naeemul Hassan:

A Survey of Information Disorder on Video-Sharing Platforms. 1-10 - Robin Schön, Julian Lorenz, Katja Ludwig, Daniel Kienle, Rainer Lienhart:

MMMS: Multi-Modal Multi-Surface Interactive Segmentation. 1-8 - Ayse Vildan Nurdag, Mete Mert Birdal, Yusuf Yazici, Baris Özcan, Erkut Arican:
Media Search: A Multi-Stage Image Retrieval Framework with Enriched Image Captioning. 1-6 - Ander Etxezarreta, Jenny Benois-Pineau, Renaud Péteri, Lucas Bardisbanian, Aymar de Rugy:

First-Person Human Sensing for Upper Limb Neuroprosthesis Control: 6D Pose Estimation of Objects to Grasp. 1-6 - Anastasiia Potiagalova, Joemon J. Jose, Benjamin R. Cowan, Gareth J. F. Jones:

A Comparative Study of Conversational and Conventional Search Methods for Image Retrieval. 1-7 - Kiyotaka Matsue, Kenta Umene, Nghia Dao, Hieu Nguyen, Manh Phan:

FPN-Based Multi-Scale Feature Fusion for Robust 3D Pedestrian Detection in Crowded Scenes. 1-7 - M. M. Mahabubur Rahman, Jelena Tesic:

Accelerating Vector Search at Scale: BAM-ANN with Batch-Aware Memory-Disk Hybrid Indexing. 1-7 - Hoang-Bao Le, Allie Tran, Binh T. Nguyen, Liting Zhou, Cathal Gurrin:

Vision Projector: Improving Zero-Shot Composed Image Retrieval at Inference. 1-5 - Kazuya Ueki, Ryo Muto, Takuya Wada, Ryota Akaba, Genesis Faith Fernandez:

U-Cker: Initial Development of an Interactive Video Retrieval System for Novice Users. 1-5 - Jonas Geiger, Marta Moscati, Shah Nawaz, Markus Schedl:

Music4All A+A: A Multimodal Dataset for Music Information Retrieval Tasks. 1-7 - Mohammad El Sakka, Caroline De Pourtales, Lotfi Chaâri, Josiane Mothe:
AgriPotential: A Novel Multi-Spectral and Multi-Temporal Remote Sensing Dataset for Agricultural Potentials. 1-6 - Thao Thi Phuong Dao, Tan-Cong Nguyen, Trong-Le Do, Truong Hoang Viet, Nguyen Chi Thanh, Huynh Nguyen Thuan, Do Vo Cong Nguyen, Minh-Khoi Pham, Mai-Khiem Tran, Viet-Tham Huynh, Trong-Thuan Nguyen, Trung-Nghia Le, Thanh-Nhan Vo, Tam V. Nguyen, Minh-Triet Tran, Thanh Dinh Le:

Toward Content-Based Indexing and Retrieval of Head and Neck CT With Abscess Segmentation. 1-8 - Younes Kebour, Smaïl Niar, Nacim Ihaddadene, Abdelghani Bekrar, Hammouda Elbez:

Zero-Shot Vision-Language Model for Event Detection in Smart Surveillance. 1-8 - Xinyang Shan, Yuanyuan Xu, Tian Xia, Yin-Shan Lin:

Rethinking Wine Tasting for Chinese Consumers: A Service Design Approach Enhanced by Multimodal Personalization. 1-5 - Bruno Henriques, Benjamin Allaert, Nicolas Sutton-Charani, Pierre Slangen, Jean-Philippe Vandeborre:

TREB: Temporal Refinement of Egocentric Body Pose. 1-6 - Minh-Nhat Nguyen, Trong-Nghia Tran, Minh-Triet Tran, Duc-Tien Dang-Nguyen, Trong-Le Do:

Robust Multimedia Verification of Cheapfakes and Deepfakes via External Context Leveraging. 1-8 - Minh-Quang Le, Graham Healy, Liting Zhou, Cathal Gurrin:

Anonymisation of Visual Lifelogs using Diffusion Models and Large Language Models. 1-7 - Mehdi Houshmand Sarkhoosh, Cise Midoglu, Saeed Shafiee Sabet, Tomas Kupka, Pål Halvorsen:

VoiceVision: AI-Powered Speaker-Aware Cropping and Content Indexing for Multi-Speaker Videos. 1-5 - Hichem Sahbi:

Label-Efficient Skeleton-Based Recognition with Stable Graph Convnets. 1-8 - Omar Shahbaz Khan, Ujjwal Sharma, Stevan Rudinac, Björn Þór Jónsson:

Examining Performance Disparities Between Expert and Novice Users in Interactive Video Retrieval. 1-4 - Thomas Eleftheriadis, Evlampios Apostolidis, Vasileios Mezaris:

An Experimental Study on Generating Plausible Textual Explanations for Video Summarization. 1-8 - Carl De Sousa Trias, Mihai Mitrea:

NN Watermarking for Face Segmentation Task. 1-8 - Elena Vildjiounaite, Vesa Kyllönen, Johanna Kallio, Pauli Räsänen:
Semi-Supervised Approach to Detect Human Discontent from Real-Life Behaviour Data. 1-7 - Guanqi Zhan, Yuanpei Liu, Kai Han, Weidi Xie, Andrew Zisserman:

ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval. 1-8 - Florian Spiess, Heiko Schuldt:
Novice-Friendly Video Retrieval in Mixed Reality with Vitrivr-VR. 1-3 - Oumaima Marsi, Sebastien Ambellouis, José Mennesson, Cyril Meurie, Anthony Fleury, Charles Tatkeu:

Masked Spikformer: Gaussian based and Random Spike Masking for Energy-Efficient Spiking Transformers. 1-7 - Mehtab Ur Rahman, Martha A. Larson, Louis ten Bosch, Cristian Tejedor-Garcia:
Dual-Objective Adversarial Disentanglement for Protecting Speech Data used for Diagnosing Parkinson's Disease. 1-6 - Li Weng, Xizhe Wang, Qianneng Wang, Bingya Wu:

Fusion of Global and Local Features with Multi-Inverted Indices for Image Retrieval. 1-8 - Sensen Wang, Yuehu Liu, Chi Zhang:

Mitigating Shortcut Learning in Online Action Detection and Anticipation via Cross-Modal Semantic Alignment. 1-7 - Domenico D'Orsi, Fabio Carrara, Fabrizio Falchi, Nicola Tonellotto:

Breaking the 2D Dependency: What Limits 3D-Only Open-Vocabulary Scene Understanding. 1-5 - Matthieu Pelingre, Salvatore Tabbone:
Historical Postcard Stamp Content Understanding. 1-7 - Marc Gallofré Ocaña, Balázs Mosolygó, Bahareh Fatemi:

Evaluating the Recognisability of AI-Generated Familiar Images in a Closed Environment with a Gamified Approach. 1-7 - Charalampos Saitis, Ben Heyderman, Vjosa Preniqi, Kyriaki Kalimeri, Johan Pauwels:

Predicting Moral Values in Lyrics Through Audio. 1-7 - Mary Ogbuka Kenneth, Foaad Khosmood, Abbas Edalat:

MultiHuSE: A Multimodal Dataset for Humour Styles and Emotions. 1-7 - Duc-Hung Nguyen, Huu-Phuc Huynh, Minh-Triet Tran, Trung-Nghia Le:

GenFlow: Interactive Modular System for Image Generation. 1-7 - Andrea Asperti, Leonardo Dessì, Maria Chiara Tonetti, Nico Wu:

Does CLIP Perceive Art the Same Way We Do? 1-8 - Cameron Baird, Ke Li, Dan Lin:

TrueEar: A Lightweight and Accurate Fake Voice Detector for Mobile Devices. 1-8 - Luís Vilaça, Paula Viana, Yi Yu:

Dialogue-AV: A Dialogue-Attended Audiovisual Dataset. 1-8 - Abhisek Ray, Lukas Esterle:
Towards Graph-Based Federated Learning: ModelNet - A ResNet-based Model Classification Dataset. 1-7 - Mingliang Liang, Martha A. Larson:

Enhancing Vision-Language Model Pre-Training with Image-Text Pair Pruning Based on Word Frequency. 1-7 - Junyi Zou, Riccardo Bovo, Ali Hamza, Georgios Loukas:

A Mixed-Methods Investigation of XR Security Warnings - Lessons Learned. 1-8 - Antoine Hanna-Asaad, Decky Aspandi-Latif, Titus Zaharia:

MI-Cap: A Multi-Modal Interpretable Model for Video Captioning. 1-8 - Quang-Linh Tran, Ly-Duyen Tran, Binh T. Nguyen, Gareth J. F. Jones, Cathal Gurrin:

Multi-modal Context Reranking for Lifelog Question Answering. 1-8
