


WACV 2025: Tucson, AZ, USA
- IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025, Tucson, AZ, USA, February 26 - March 6, 2025. IEEE 2025, ISBN 979-8-3315-1083-1

- Joanna Kaleta, Kacper Kania, Tomasz Trzcinski, Marek Kowalski:

LumiGauss: Relightable Gaussian Splatting in the Wild. 1-10 - Junjie Wang, Tomas Nordström:
Latency Robust Cooperative Perception Using Asynchronous Feature Fusion. 1-10 - Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, David Harwath:

Temporally Streaming Audio-Visual Synchronization for Real-World Videos. 1-9 - Seul-Ki Yeom, Julian von Klitzing:

U-MixFormer: UNet-Like Transformer with Mix-Attention for Efficient Semantic Segmentation. 1-10 - Seong Jong Yoo, Snehesh Shrestha, Irina Muresanu, Cornelia Fermüller:

VioPose: Violin Performance 4D Pose Estimation by Hierarchical Audiovisual Inference. 1-12 - Jane Wu, Diego Thomas, Ronald Fedkiw:

Sparse-View 3D Reconstruction of Clothed Humans via Normal Maps. 11-22 - Théo Morales, Omid Taheri, Gerard Lacey:

A Versatile and Differentiable Hand-Object Interaction Representation. 23-33 - Kohei Matsuzaki, Keisuke Nonaka:

Point Cloud Color Upsampling with Attention-Based Coarse Colorization and Refinement. 34-43 - Vincenzo Polizzi, Marco Cannici, Davide Scaramuzza, Jonathan Kelly:

FaVoR: Features via Voxel Rendering for Camera Relocalization. 44-53 - Pallabjyoti Deka, Saumik Bhattacharya, Debashis Sen, Prabir Kumar Biswas:

3D Shape Completion using Multi-resolution Spectral Encoding. 54-63 - Alexander H. Berger, Laurin Lux, Suprosanna Shit, Ivan Ezhov, Georgios Kaissis, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold:

Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers. 64-74 - Hossein Askari, Fred Roosta, Hongfu Sun:

Training-free Medical Image Inverses via Bi-level Guided Diffusion Models. 75-84 - Suhyun Ahn, Wonjung Park, Jihoon Cho, Jinah Park:

Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images. 85-95 - Trong-Thang Pham, Tien-Phat Nguyen, Yuki Ikebe, Akash Awasthi, Zhigang Deng, Carol C. Wu, Hien Nguyen, Ngan Le:

GazeSearch: Radiology Findings Search Benchmark. 96-106 - Yitong Li, Morteza Ghahremani, Youssef Wally, Christian Wachinger:

DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET. 107-116 - Mevan Ekanayake, Zhifeng Chen, Gary F. Egan, Mehrtash Harandi, Zhaolin Chen:
SeCo-INR: Semantically Conditioned Implicit Neural Representations for Improved Medical Image Super-Resolution. 117-126 - Majed El Helou, Doruk Cetin, Petar Stamenkovic, Niko Benjamin Huber, Fabio Zünd:

VerA: Versatile Anonymization Applicable to Clinical Facial Photographs. 127-138 - Dixi Yao:

Towards Privacy-Preserving Split Learning for ControlNet. 139-148 - Stefan Smeu, Elisabeta Oneata, Dan Oneata:

DeCLIP: Decoding CLIP Representations for Deepfake Localization. 149-159 - Maciej Chrabaszcz, Hubert Baniecki, Piotr Komorowski, Szymon Plotka, Przemyslaw Biecek:

Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models. 160-171 - Xin Hu, Janet Wang, Jihun Hamm, Rie Roselyne Yotsu, Zhengming Ding:

Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM. 172-181 - Hanxiao Tan:
Evaluating Sensitivity Consistency of Explanations. 182-191 - Pengxiao Wang, Tzu-Heng Lin, Chunyu Wang, Yizhou Wang:

Shift Equivariant Pose Network. 192-201 - Yunfei Li, Yuezun Li, Xin Wang, Baoyuan Wu, Jiaran Zhou, Junyu Dong:

Texture, Shape and Order Matter: A New Transformer Design for Sequential DeepFake Detection. 202-211 - Hai Wang, Jing-Hao Xue:
360PanT: Training-Free Text-Driven 360-Degree Panorama-to-Panorama Translation. 212-221 - Enis Simsar, Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari:

LIME: Localized Image Editing via Attention Regularization in Diffusion Models. 222-231 - Rohit Jena, Ali Taghibakhshi, Sahil Jain, Gerald Shen, Nima Tajbakhsh, Arash Vahdat:

Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models. 232-242 - Qinpeng Cui, Xinyi Zhang, Qiqi Bao, Qingmin Liao:

Elucidating the Solution Space of Extended Reverse-Time SDE for Diffusion Models. 243-252 - Xiaofei Huang, Elaheh Hatamimajoumerd, Amal Mathew, Sarah Ostadabbas:

Infant Action Generative Modeling. 253-265 - Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori, Sahar Dastani, Milad Cheraghalikhani, David Osowiechi, Farzad Beizaee, Gustavo Adolfo Vargas Hakim, Ismail Ben Ayed, Christian Desrosiers:

Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging. 266-275 - Peizhi Yan, Rabab Ward, Qiang Tang, Shan Du:

Gaussian Déjà-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities. 276-286 - Kazuto Ichimaru, Diego Thomas, Takafumi Iwaguchi, Hiroshi Kawasaki:

Neural SDF for Shadow-Aware Unsupervised Structured Light. 287-296 - Mateusz Poleski, Jacek Tabor, Przemyslaw Spurek:

GeoGuide: Geometric Guidance of Diffusion Models. 297-305 - Simon Boeder, Benjamin Risse:

OccFlowNet: Occupancy Estimation via Differentiable Rendering and Occupancy Flow. 306-316 - Boyuan Zhang, Zhenliang He, Meina Kan, Shiguang Shan:
Precise Integral in NeRFs: Overcoming the Approximation Errors of Numerical Quadrature. 317-326 - Cagla Deniz Bahadir, Gozde Bozdagi Akar, Mert R. Sabuncu:

LLM-Generated Rewrite and Context Modulation for Enhanced Vision Language Models in Digital Pathology. 327-336 - Yiying Wang, Abhirup Banerjee, Robin P. Choudhury, Vicente Grau:

DeepCA: Deep Learning-Based 3D Coronary Artery Tree Reconstruction from Two 2D Non-Simultaneous X-Ray Angiography Projections. 337-346 - Daniel Kim, Mohammed A. Al-masni, Jaehun Lee, Dong-Hyun Kim, Kanghyun Ryu:

Improving Pelvic MR-CT Image Alignment with Self-Supervised Reference-Augmented Pseudo-CT Generation Framework. 347-356 - Felix Wagner, Wentian Xu, Pramit Saha, Ziyun Liang, Daniel Whitehouse, David K. Menon, Virginia F. J. Newcombe, Natalie Voets, J. Alison Noble, Konstantinos Kamnitsas:
Feasibility of Federated Learning from Client Databases with Different Brain Diseases and MRI Modalities. 357-367 - Shumpei Takezaki, Kiyohito Tanaka, Seiichi Uchida:

Self-Relaxed Joint Training: Sample Selection for Severity Estimation with Ordinal Noisy Labels. 368-377 - Tiancheng Gu, Kaicheng Yang, Xiang An, Ziyong Feng, Dongnan Liu, Tom Weidong Cai:

ORID: Organ-Regional Information Driven Framework for Radiology Report Generation. 378-387 - Snehashis Majhi, Mohammed Guermal, Antitza Dantcheva, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, François Brémond:
Guess Future Anomalies from Normalcy: Forecasting Abnormal Behavior in Real-World Videos. 388-398 - Seoyeon Gye, Junwon Ko, Hyounguk Shon, Minchan Kwon, Junmo Kim:

Reducing the Content Bias for AI-generated Image Detection. 399-408 - Jaehyeong Park, Juncheol Ye, Seungkook Lee, Hyun W. Ka, Dongsu Han:
NarrAD: Automatic Generation of Audio Descriptions for Movies with Rich Narrative Context. 409-419 - Tung Luu, Nam Le, Duc Le, Bac Le:

From Visual Explanations to Counterfactual Explanations with Latent Diffusion. 420-429 - Yu-Yun Tseng, Tanusree Sharma, Lotus Zhang, Abigale Stangl, Leah Findlater, Yang Wang, Danna Gurari:

BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments. 430-440 - Gayoon Choi, Taejin Jeong, Sujung Hong, Seong Jae Hwang:

Dragtext: Rethinking Text Embedding in Point-Based Image Editing. 441-450 - Brian B. Moser, Stanislav Frolov, Federico Raue, Sebastian Palacio, Andreas Dengel:

Dynamic Attention-Guided Diffusion for Image Super-Resolution. 451-460 - Shuang Chen, Haozheng Zhang, Amir Atapour-Abarghouei, Hubert P. H. Shum:
SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM. 461-471 - Rahul Sajnani, Jeroen van Baar, Jie Min, Kapil Katyal, Srinath Sridhar:

GeoDiffuser: Geometry-Based Image Editing with Diffusion Models. 472-482 - Zitian Zhang, Frédéric Fortier-Chouinard, Mathieu Garon, Anand Bhattad, Jean-François Lalonde:

Zerocomp: Zero-Shot Object Compositing from Image Intrinsics via Diffusion. 483-494 - Diego Thomas, Briac Toussaint, Jean-Sébastien Franco, Edmond Boyer:

VortSDF: 3D Modeling with Centroidal Voronoi Tessellation on Signed Distance Field. 495-504 - Markus Plack, Hannah Dröge, Leif Van Holland, Matthias B. Hullin:

VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors. 505-514 - Ren Matsumoto, Takahiro Okabe, Ryo Kawahara:

Polarization as Texture: Microscale 3D Shape from Polarized Light Focus. 515-524 - Yuxin Huang, Andong Yang, Yuantao Chen, Runyi Yang, Zhenxin Zhu, Chao Hou, Hao Zhao, Guyue Zhou:

Self-Aligning Depth-Regularized Radiance Fields for Asynchronous RGB-D Sequences. 525-534 - Henrique Piñeiro Monteagudo, Leonardo Taccari, Aurel Pjetri, Francesco Sambo, Samuele Salti:
RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation. 535-544 - Yujing Sun, Caiyi Sun, Yuan Liu, Yuexin Ma, Siu Ming Yiu:

Generalizable Single-View Object Pose Estimation by Two-Side Generating and Matching. 545-556 - Chetan Madan, Mayuna Gupta, Soumen Basu, Pankaj Gupta, Chetan Arora:

LQ-Adapter: ViT-Adapter with Learnable Queries for Gallbladder Cancer Detection from Ultrasound Images. 557-567 - Xiwei Liu, Mohamad Kassab, Min Xu, Qirong Ho:
J-Invariant Volume Shuffle for Self-Supervised Cryo-Electron Tomogram Denoising on Single Noisy Volume. 568-577 - Daniel Khalil, Christina Liu, Pietro Perona, Jennifer J. Sun, Markus Marks:

Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision. 578-588 - Nirhoshan Sivaroopan, Chamuditha Jayanga Galappaththige, Chalani Ekanayake, Hasindri Watawana, Ranga Rodrigo, Chamira U. S. Edussooriya, Dushan N. Wadduwage:
Uncertainty Awareness Enables Efficient Labeling for Cancer Subtyping in Digital Pathology. 589-598 - Hyeongmin Park, Sungrae Hong, Chanjae Song, Jongwoo Kim, Mun Yong Yi:

Uncertainty-based Data-wise Label Smoothing for Calibrating Multiple Instance Learning in Histopathology Image Classification. 599-608 - Huimin Zeng, Jiacheng Li, Ziqiang Zheng, Zhiwei Xiong:

All-in-One Image Compression and Restoration. 609-619 - Sourajit Saha, Tejas Gokhale:

Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling. 620-629 - Pritam Karmokar, Quan H. Nguyen, William J. Beksi:

Secrets of Edge-Informed Contrast Maximization for Event-Based Vision. 630-639 - Sangwon Lee, Myungsub Choi, Nagyeong Lee, Hyong-Euk Lee:

Stable Autofocus with Focal Consistency Loss. 640-649 - Ashish Tiwari, Mihirkumar Sutariya, Shanmuganathan Raman:

LIPIDS: Learning-based Illumination Planning In Discretized (Light) Space for Photometric Stereo. 650-659 - Jiancheng Huang, Yi Huang, Jianzhuang Liu, Donghao Zhou, Yifan Liu, Shifeng Chen:

Dual-Schedule Inversion: Training- and Tuning-Free Inversion for Real Image Editing. 660-669 - Sanuwani Dayarathna, Kh Tohidul Islam, Bohan Zhuang, Guang Yang, Jianfei Cai, Meng Law, Zhaolin Chen:
McCaD: Multi-Contrast MRI Conditioned, Adaptive Adversarial Diffusion Model for High-Fidelity MRI Synthesis. 670-679 - Kyungri Park, Woohwan Jung:

Improving Detail in Pluralistic Image Inpainting with Feature Dequantization. 680-689 - Kyungmin Jo, Jaegul Choo:

Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects. 690-699 - Arya Bangun, Zhuo Cao, Alessio Quercia, Hanno Scharr, Elisabeth Pfaehler:
MRI Reconstruction with Regularized 3D Diffusion Model (R3DM). 700-710 - Chengjie Huang, Vahdat Abdelzad, Sean Sedwards, Krzysztof Czarnecki:

VADet: Multi-Frame LiDAR 3D Object Detection Using Variable Aggregation. 711-720 - Gursimran Singh, Tianxi Hu, Mohammad Akbari, Qiang Tang, Yong Zhang:

Towards Secure and Usable 3D Assets: A Novel Framework for Automatic Visible Watermarking. 721-730 - Xinyue Wei, Fanbo Xiang, Sai Bi, Anpei Chen, Kalyan Sunkavalli, Zexiang Xu, Hao Su:

NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support. 731-741 - Decai Chen, Brianne Oberson, Ingo Feldmann, Oliver Schreer, Anna Hilsmann, Peter Eisert:

Adaptive and Temporally Consistent Gaussian Surfels for Multi-View Dynamic Reconstruction. 742-752 - Gonzalo Martin Garcia, Karim Abou Zeid, Christian Schmidt, Daan de Geus, Alexander Hermans, Bastian Leibe:

Fine-Tuning Image-Conditional Diffusion Models is Easier than you Think. 753-762 - Mahdi Alehdaghi, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger:

Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification. 763-773 - Jiahao Luo, Jing Liu, James Davis:

SplatFace: Gaussian Splat Face Reconstruction Leveraging an Optimizable Surface. 774-783 - Jui-Che Chiang, Hou-Ning Hu, Bo-Syuan Hou, Chia-Yu Tseng, Yu-Lun Liu, Min-Hung Chen, Yen-Yu Lin:

ORFormer: Occlusion-Robust Transformer for Accurate Facial Landmark Detection. 784-793 - Rohit Lal, Saketh Bachu, Yash Garg, Arindam Dutta, Calvin-Khang Ta, Hannah Dela Cruz, Dripta S. Raychaudhuri, M. Salman Asif, Amit K. Roy-Chowdhury:

STRIDE: Single-Video Based Temporally Continuous Occlusion-Robust 3D Pose Estimation. 794-803 - Kartik Narayan, Nithin Gopalakrishnan Nair, Jennifer Xu, Rama Chellappa, Vishal M. Patel:

PETALface: Parameter Efficient Transfer Learning for Low-Resolution Face Recognition. 804-814 - Zengqun Zhao, Yu Cao, Shaogang Gong, Ioannis Patras:

Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer. 815-824 - Elaine Sui, Xiaohan Wang, Serena Yeung-Levy:

Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models. 825-835 - Leon Sick, Dominik Engel, Pedro Hermosilla, Timo Ropinski:
Attention-Guided Masked Autoencoders for Learning Image Representations. 836-846 - Donghyeon Kwon, Inho Kim, Suha Kwak:

Boosting Semi-Supervised Video Action Detection with Temporal Context. 847-858 - Gabriele Spadaro, Marco Grangetto, Attilio Fiandrotti, Enzo Tartaglione, Jhony H. Giraldo:

WiGNet: Windowed Vision Graph Neural Network. 859-868 - Fei Wu, Pablo Márquez-Neila, Hedyeh Rafii-Tari, Raphael Sznitman:

Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation. 869-878 - Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu:

DeepMIM: Deep Supervision for Masked Image Modeling. 879-888 - Surojit Saha, Sarang C. Joshi, Ross T. Whitaker:

ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders. 889-898 - Wonjun Kang, Kevin Galim, Hyung Il Koo, Nam Ik Cho:
Counting Guidance for High Fidelity Text-to-Image Synthesis. 899-908 - Rui Xu, Mengya Hu, Deren Lei, Yaxi Li, David Lowe, Alex Gorevski, Mingyu Wang, Emily Ching, Alex Deng:

InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance. 909-918 - Zhiyuan Xu, Yinhe Chen, Huan-ang Gao, Weiyan Zhao, Guiyu Zhang, Hao Zhao:

Diffusion-based Visual Anagram as Multi-task Learning. 919-928 - Ashutosh Srivastava, Tarun Ram Menta, Abhinav Java, Avadhoot Jadhav, Silky Singh, Surgan Jandial, Balaji Krishnamurthy:

REEDIT: Multimodal Exemplar-Based Image Editing. 929-939 - Tanvir Mahmud, Mustafa Munir, Radu Marculescu, Diana Marculescu:
Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior. 940-949 - Bo Lang, Mooi Choo Chuah:

Event-Guided Fusion-Mamba for Context-Aware 3D Human Pose Estimation. 950-960 - Luchao Qi, Jiaye Wu, Annie N. Wang, Shengze Wang, Roni Sengupta:
My3DGen: A Scalable Personalized 3D Generative Model. 961-972 - Ashkan Ganj, Hang Su, Tian Guo:

HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors. 973-982 - Keon Moradi, Ethan Haque, Jasmeen Kaur, Alexandra B. Bentz, Eli S. Bridge, Golnaz Habibi:
Context-Aware Outlier Rejection for Robust Multi-View 3D Tracking of Similar Small Birds in An Outdoor Aviary. 983-991 - Stathis Galanakis, Alexandros Lattas, Stylianos Moschoglou, Stefanos Zafeiriou:

FitDiff: Robust Monocular 3D Facial Shape and Reflectance Estimation using Diffusion Models. 992-1004 - Junyi Cao, Chao Ma:

Towards Generalized Face Anti-Spoofing from a Frequency Shortcut View. 1005-1015 - Marco Huber, Naser Damer:

Beyond Spatial Explanations: Explainable Face Recognition in the Frequency Domain. 1016-1026 - Diana Voth, Leonidas Dane, Jonas Grebe, Sebastian Peitz, Philipp Terhörst:

Effective Backdoor Learning on Open-Set Face Recognition Systems. 1027-1039 - Han-Wei Kung, Tuomas Varanka, Sanjay Saha, Terence Sim, Nicu Sebe:
Face Anonymization Made Simple. 1040-1050 - Yuxiang Guo, Anshul Shah, Jiang Liu, Ayush Gupta, Rama Chellappa, Cheng Peng:

GaitContour: Efficient Gait Recognition Based on a Contour-Pose Representation. 1051-1061 - Sanoojan Baliah, Qinliang Lin, Shengcai Liao, Xiaodan Liang, Muhammad Haris Khan:

Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models. 1062-1071 - Rui Li, Martin Trapp, Marcus Klasson, Arno Solin:
Flatness Improves Backbone Generalisation in Few-Shot Classification. 1072-1089 - Andrea Alfarano, Alberto Alfarano, Linda Friso, Andrea Bacciu, Irene Amerini, Fabrizio Silvestri:

STLight: A Fully Convolutional Approach for Efficient Predictive Learning by Spatio-Temporal Joint Processing. 1090-1100 - Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray L. Buntine, Mohammed Bennamoun:
HOPE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts. 1101-1110 - Kiran Kokilepersaud, Seulgi Kim, Mohit Prabhushankar, Ghassan AlRegib:

HEX: Hierarchical Emergence Exploitation in Self-Supervised Algorithms. 1111-1121 - Yunbei Zhang, Akshay Mehra, Jihun Hamm:

OT-VP: Optimal Transport-Guided Visual Prompting for Test-Time Adaptation. 1122-1132 - Marcelo Sanchez, Gil Triginer, Coloma Ballester, Ignacio Sarasua, Lara Raad:

A New Benchmark and Baseline for Real-Time High-Resolution Image Inpainting on Edge Devices. 1133-1143 - Mrinal Verghese, Brian Chen, Hamid Eghbalzadeh, Tushar Nagarajan, Ruta Desai:

User-in-the-Loop Evaluation of Multimodal LLMs for Activity Assistance. 1144-1154 - Yan-Bo Lin, Yu Tian, Linjie Yang, Gedas Bertasius, Heng Wang:

VMAs: Video-to-Music Generation via Semantic Alignment in Web Music Videos. 1155-1165 - Daeyoung Roh, Donghee Han, Jihyun Nam, Jungsoo Oh, Youngbin You, Jeongheon Park, Mun Yong Yi:

CTIP: Towards Accurate Tabular-to-Image Generation for Tire Footprint Generation. 1166-1175 - Younghyun Cho, Changhun Lee, Seonggon Kim, Eunhyeok Park:

PTQ4VM: Post-Training Quantization for Visual Mamba. 1176-1185 - Jay N. Paranjape, Celso de Melo, Vishal M. Patel:

A Mamba-Based Siamese Network for Remote Sensing Change Detection. 1186-1196 - Julian D. Santamaria, Claudia Isaza, Jhony H. Giraldo:

CATALOG: A Camera Trap Language-Guided Contrastive Learning Model. 1197-1206 - Faith M. Johnson, Ryan Meegan, Jack Lowry, Peter Oudemans, Kristin J. Dana:
Agtech Framework for Cranberry-Ripening Analysis Using Vision Foundation Models. 1207-1216 - Feng Chen, Sotirios A. Tsaftaris, Mario Valerio Giuffrida:

GMT: Guided Mask Transformer for Leaf Instance Segmentation. 1217-1226 - Shirin Qiam, Saipraneeth Devunuri, Lewis J. Lehe:

A Pipeline and NIR-Enhanced Dataset for Parking Lot Segmentation. 1227-1236 - Shao-Hao Lu, Ren Wang, Ching-Chun Huang, Wei-Chen Chiu:
Boosting Diffusion Guidance via Learning Degradation-Aware Models for Blind Super Resolution. 1237-1246 - Antoine Mercier, Ramin Nakhli, Mahesh Reddy, Rajeev Yasarla, Hong Cai, Fatih Porikli, Guillaume Berger:

HexaGen3D: StableDiffusion is One Step Away from Fast and Diverse Text-to-3D Generation. 1247-1257 - Ali Mollaahmadi Dehaghi, Reza Razavi, Mohammad Moshirpour:

Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression. 1258-1267 - Jianyi Zhang, Yufan Zhou, Jiuxiang Gu, Curtis Wigington, Tong Yu, Yiran Chen, Tong Sun, Ruiyi Zhang:

ARTIST: Improving the Generation of Text-Rich Images with Disentangled Diffusion Models and Large Language Models. 1268-1278 - Lorenzo Mandelli, Stefano Berretti:

Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models. 1279-1288 - S. Divakar Bhat, Amit More, Mudit Soni, Surbhi Agrawal:

Prior2Posterior: Model Prior Correction for Long-Tailed Learning. 1289-1298 - Prafful Kumar Khoba, Zijian Wang, Chetan Arora, Mahsa Baktashmotlagh:
Feature Space Perturbation: A Panacea to Enhanced Transferability Estimation. 1299-1308 - Hayeong Yu, Seungjae Han, Young-Gyu Yoon:

Design Principles of Multi-Scale J-Invariant Networks for Self-Supervised Image Denoising. 1309-1318 - Simon Damm, Mike Laszkiewicz, Johannes Lederer, Asja Fischer:

AnomalyDINO: Boosting Patch-based Few-Shot Anomaly Detection with DINOv2. 1319-1329 - Abu Zahid Bin Aziz, Mokshagna Sai Teja Karanam, Tushar Kataria, Shireen Y. Elhabian:

EFFICIENTMORPH: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration. 1330-1341 - Wenxin Ma, Qingsong Yao, Xiang Zhang, Zhelong Huang, Zihang Jiang, S. Kevin Zhou:

Towards Accurate Unified Anomaly Segmentation. 1342-1352 - Junhyeong Go, Jongbin Ryu:

Channel Propagation Networks for Refreshable Vision Transformer. 1353-1362 - Muhammad Ali, Mamoona Javaid, Mubashir Noman, Mustansar Fiaz, Salman H. Khan:
COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes. 1363-1372 - Lucas Deregnaucourt, Hind Laghmara, Alexis Lechervy, Samia Ainouz:

A Conflict-Guided Evidential Multimodal Fusion for Semantic Segmentation. 1373-1382 - Monika Kwiatkowski, Simon Matern, Olaf Hellwich:

Swin-∇: Gradient-Based Image Restoration from Image Sequences using Video Swin-Transformers. 1383-1391 - Gautier Evennou, Antoine Chaffin, Vivien Chappelier, Ewa Kijak:

Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation. 1392-1402 - Mohammad Reza Taesiri, Cor-Paul Bezemer:

Videogamebunny: Towards Vision Assistants for Video Games. 1403-1413 - Aiyu Cui, Jay Mahajan, Viraj Shah, Preeti Gomathinayagam, Chang Liu, Svetlana Lazebnik:

Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images. 1414-1423 - Jeya Maria Jose Valanarasu, Rahul Garg, Andeep Toor, Xin Tong, Weijuan Xi, Andreas Lugmayr, Vishal M. Patel, Anne Menini:

ReBotNet: Fast Real-Time Video Enhancement. 1424-1435 - Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickaël Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós:

GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification. 1436-1446 - Yizhou Wang, Kuan-Chuan Peng, Yun Fu:

Towards Zero-shot 3D Anomaly Localization. 1447-1456 - Seoungyoon Kang, Youngsun Lim, Hyunjung Shim:

Label-Augmented Dataset Distillation. 1457-1466 - Surbhi Madan, Shreya Ghosh, Lownish Rai Sookha, M. A. Ganaie, Ramanathan Subramanian, Abhinav Dhall, Tom Gedeon:

MIP-GAF: A MLLM-Annotated Benchmark for Most Important Person Localization and Group Context Understanding. 1467-1476 - Paritosh Parmar, Eric Peh, Basura Fernando:
Learning to Visually Connect Actions and Their Effects. 1477-1487 - Keren Ganon, Morris Alper, Rachel Mikulinsky, Hadar Averbuch-Elor:

WAFFLE: Multimodal Floorplan Understanding in the Wild. 1488-1497 - Dac Thai Nguyen, Trung Thanh Nguyen, Huu Tien Nguyen, Thanh Trung Nguyen, Huy Hieu Pham, Thanh Hung Nguyen, Truong Thao Nguyen, Phi Le Nguyen:

CT to PET Translation: A Large-Scale Dataset and Domain-Knowledge-Guided Diffusion Approach. 1498-1507 - Jiahao Xu, Zikai Zhang, Rui Hu:

Achieving Byzantine-Resilient Federated Learning via Layer-Adaptive Sparsified Model Aggregation. 1508-1517 - Jung Im Choi, Qizhen Lan, Qing Tian:
Improving Deep Detector Robustness via Detection-Related Discriminant Maximization and Reorganization. 1518-1527 - Rambod Azimi, Yijian Kong, Dusan Gostimirovic, James J. Clark, Odile Liboiron-Ladouceur:

SEMU-Net: A Segmentation-Based Corrector for Fabrication Process Variations of Nanophotonics with Microscopic Images. 1528-1536 - Seonguk Seo, Mustafa Gökhan Uzunbas, Bohyung Han, Sara Cao, Ser-Nam Lim:

Metric Compatible Training for Online Backfilling in Large-Scale Retrieval. 1537-1545 - Dimitrios Sinodinos, Narges Armanfard:
Cross-Task Affinity Learning for Multitask Dense Scene Predictions. 1546-1555 - Sourasekhar Banerjee, Debaditya Roy, Vigneshwaran Subbaraju, Monowar Bhuyan:

Predicting Event Memorability Using Personalized Federated Learning. 1556-1565 - Hamidreza Dastmalchi, Aijun An, Ali Cheraghian, Shafin Rahman, Sameera Ramasinghe:

Test-Time Adaptation of 3D Point Clouds via Denoising Diffusion Models. 1566-1576 - Dan-Sebastian Bacea, Florin Oniga:

ECF-YOLOv7-Tiny: Improving Feature Fusion and the Receptive Field for Lightweight Object Detectors. 1577-1586 - Giulia Rizzoli, Matteo Caligiuri, Donald Shenaj, Francesco Barbato, Pietro Zanuttigh:
When Cars Meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather. 1587-1596 - Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati:

SUM: Saliency Unification Through Mamba for Visual Attention Modeling. 1597-1607 - Nyle Siddiqui, Florinel-Alin Croitoru, Gaurav Kumar Nayak, Radu Tudor Ionescu, Mubarak Shah:

DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-Id. 1608-1617 - Kuan-Hung Liu, Cheng-Kun Yang, Min-Hung Chen, Yu-Lun Liu, Yen-Yu Lin:

CorrFill: Enhancing Faithfulness in Reference-Based Inpainting with Correspondence Guidance in Diffusion Models. 1618-1627 - Shuo Wang, Chunlong Xia, Feng Lv, Yifeng Shi:
RT-DETRv3: Real-Time End-to-End Object Detection with Hierarchical Dense Positive Supervision. 1628-1636 - Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamín Béjar, Luc Van Gool:

Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation. 1637-1648 - Saheli Hazra, Sudip Das, Rohit Choudhary, Arindam Das, Ganesh Sistu, Ciarán Eising, Ujjwal Bhattacharya:

Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure. 1649-1659 - Maciej K. Wozniak, Hariprasath Govindarajan, Marvin Klingner, Camille Maurice, Ravi Kiran, Senthil Kumar Yogamani:

S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving. 1660-1670 - Rémy Sun, Li Yang, Diane Lingrand, Frédéric Precioso:

Mind the Map! Accounting for Existing Maps When Estimating Online HDMaps from Sensors. 1671-1681 - Adrien Lafage, Mathieu Barbier, Gianni Franchi, David Filliat:

Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting. 1682-1691 - Chaesong Park, Eunbin Seo, Jongwoo Lim:

HeightLane: BEV Heightmap Guided 3D Lane Detection. 1692-1701 - Roy Uziel, Oded Bialer:

Optimizing Vision-Language Model for Road Crossing Intention Estimation. 1702-1712 - Xiaoyu Zhang, Ziwei Wang, Hai Dong, Zhifeng Bao, Jiajun Liu:
On-the-Fly Object-aware Representative Point Selection in Point Cloud. 1713-1722 - Nikos Efthymiadis, Bill Psomas, Zakaria Laskar, Konstantinos Karantzalos, Yannis Avrithis, Ondrej Chum, Giorgos Tolias:

Composed Image Retrieval for Training-FREE DOMain Conversion. 1723-1733 - Zifu Wan, Pingping Zhang, Yuhao Wang, Silong Yong, Simon Stepputtis, Katia P. Sycara, Yaqi Xie:
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation. 1734-1744 - Hanoona Abdul Rasheed, Muhammad Maaz, Abdelrahman M. Shaker, Salman H. Khan, Hisham Cholakkal, Rao Muhammad Anwer, Tim Baldwin, Michael Felsberg, Fahad Shahbaz Khan:

Palo: A Polyglot Large Multimodal Model for 5B People. 1745-1754 - Anjishnu Mukherjee, Ziwei Zhu, Antonios Anastasopoulos:

Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models. 1755-1764 - Srikumar Sastry, Subash Khanal, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs:

TaxaBind: A Unified Embedding Space for Ecological Applications. 1765-1774 - Qianyi Liu, Siqi Zhang, Yanyuan Qiao, Junyou Zhu, Xiang Li, Longteng Guo, Qunbo Wang, Xingjian He, Qi Wu, Jing Liu:

GroundingMate: Aiding Object Grounding for Goal-Oriented Vision-and-Language Navigation. 1775-1784 - Florian Hofherr, Bjoern Haefner, Daniel Cremers:

On Neural BRDFs: A Thorough Comparison of State-of-the-Art Approaches. 1785-1794 - Leif Van Holland, Michael Weinmann, Jan U. Müller, Patrick Stotko, Reinhard Klein:
NeRFs are Mirror Detectors: Using Structural Similarity for Multi-View Mirror Scene Reconstruction with 3D Surface Primitives. 1795-1807 - Hugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic:

RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis. 1808-1817 - Konstantinos Tzevelekakis, Shutong Zhang, Luc Van Gool, Christos Sakaridis:

Sun Off, Lights on: Photorealistic Monocular Nighttime Simulation for Robust Semantic Perception. 1818-1828 - Kengo Matsufuji, Lin Shi, Ryo Kawahara, Takahiro Okabe:

Separating Direct and Global Components from Novel Viewpoints. 1829-1838 - Tianshu Kuai, Sina Honari, Igor Gilitschenski, Alex Levinshtein:

Towards Unsupervised Blind Face Restoration Using Diffusion Prior. 1839-1849 - Naeun Ko, Yonghyun Jeong, Jong Chul Ye:

Text-to-Image Synthesis for Domain Generalization in Face Anti-Spoofing. 1850-1860 - Huawei Sun, Zixu Wang, Hao Feng, Julius Ott, Lorenzo Servadei, Robert Wille:

GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling. 1850-1860 - Bin Yan, Martin Sundermeyer, David Joseph Tan, Huchuan Lu, Federico Tombari:

Towards Real-Time Open-Vocabulary Video Instance Segmentation. 1861-1871 - Hakjin Lee, Minki Song, Jamyoung Koo, Junghoon Seo:
Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer. 1872-1882 - Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum:

Crafting Distribution Shifts for Validation and Training in Single Source Domain Generalization. 1883-1892 - Abbas Khan, Muhammad Asad, Martin Benning, Caroline H. Roney, Gregory G. Slabaugh:

CAMS: Convolution and Attention-Free Mamba-Based Cardiac Image Segmentation. 1893-1903 - Wenhao Gu, Li Gu, Ziqiang Wang, Ching Yee Suen, Yang Wang:

DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning. 1904-1913 - Wulin Xie, Lian Zhao, Jiang Long, Xiaohuan Lu, Bingyan Nie:

Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification. 1914-1923 - Shahriar Rifat, Jonathan D. Ashdown, Francesco Restuccia:

DARDA: Domain-Aware Real-Time Dynamic Neural Network Adaptation. 1924-1932 - Hidehisa Arai, Keita Miwa, Kento Sasaki, Kohei Watanabe, Yu Yamaguchi, Shunsuke Aoki, Issei Yamamoto:

CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving. 1933-1943 - Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, Saumik Bhattacharya:

FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework. 1944-1954 - Sreetama Sarkar, Gourav Datta, Souvik Kundu, Kai Zheng, Chirayata Bhattacharyya, Peter A. Beerel:

MaskVD: Region Masking for Efficient Video Object Detection. 1955-1964 - Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu:

Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding. 1965-1978 - Zhonghua Yi, Hao Shi, Qi Jiang, Kailun Yang, Ze Wang, Diyang Gu, Yufan Zhang, Kaiwei Wang:

EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data. 1979-1988 - Michael Schwingshackl, Fabio Francisco Oberweger, Markus Murschitz:
Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks. 1989-1998 - Jiange Yang, Wenhui Tan, Chuhao Jin, Keling Yao, Bei Liu, Jianlong Fu, Ruihua Song, Gangshan Wu, Limin Wang:

Transferring Foundation Models for Generalizable Robotic Manipulation. 1999-2010 - Raktim Gautam Goswami, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami:

FlashMix: Fast Map-Free LiDAR Localization via Feature Mixing and Contrastive-Constrained Accelerated Training. 2011-2020 - Jianhao Zheng, Gábor Valasek, Daniel Barath, Iro Armeni:

Multi-HexPlanes: A Lightweight Map Representation for Rendering and 3D Reconstruction. 2021-2031 - Lin Shi, Kengo Matsufuji, Ryo Kawahara, Takahiro Okabe:

FluoNeRF: Fluorescent Novel-View Synthesis Under Novel Light Source Colors. 2032-2041 - Eito Ikuta, Yohan Lee, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka:

Harmonizing Attention: Training-free Texture-aware Geometry Transfer. 2042-2051 - Chengyang Yan, Donald G. Dansereau:

TaCOS: Task-Specific Camera Optimization with Simulation. 2052-2062 - Daiki Miyake, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka:

Negative-Prompt Inversion: Fast Image Inversion for Editing with Text-Guided Diffusion Models. 2063-2072 - Stanislav Frolov, Brian B. Moser, Andreas Dengel:

SpotDiffusion: A Fast Approach for Seamless Panorama Generation Over Time. 2073-2081 - Prajneya Kumar, Eshika Khandelwal, Makarand Tapaswi, Vishnu Sreekumar:

Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability. 2082-2091 - Akshita Gupta, Gaurav Mittal, Ahmed Magooda, Ye Yu, Graham W. Taylor, Mei Chen:

LoSA: Long-Short-Range Adapter for Scaling End-to-End Temporal Action Localization. 2092-2102 - Minghui Lin, Shu Wang, Xiang Wang, Jianhua Tang, Longbin Fu, Zhengrong Zuo, Nong Sang:

DMPT: Decoupled Modality-Aware Prompt Tuning for Multi-Modal Object Re-Identification. 2103-2112 - Rita Pucci, Niki Martinel:

CE-VAE: Capsule Enhanced Variational AutoEncoder for Underwater Image Enhancement. 2113-2123 - Logan Servant, Michaël Clément, Laurent Wendling, Camille Kurtz:

Contrastive Learning of Image Representations Guided by Spatial Relations. 2124-2133 - Katharina Prasse, Isaac Bravo, Stefanie Walter, Margret Keuper:

I Spy with My Little Eye a Minimum Cost Multicut Investigation of Dataset Frames. 2134-2143 - Jingbo Zeng, Zaiwang Gu, Weide Liu, Lile Cai, Jun Cheng:

Uncertainty Aware Interest Point Detection and Description. 2144-2153 - Jiawei Yao, Jusheng Zhang, Xiaochao Pan, Tong Wu, Canran Xiao:

DepthSSC: Monocular 3D Semantic Scene Completion via Depth-Spatial Alignment and Voxel Adaptation. 2154-2163 - Yongkang Cheng, Mingjiang Liang, Shaoli Huang, Gaoge Han, Jifeng Ning, Wei Liu:

Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios. 2164-2173 - Chen Zhao, Mengyuan Yu, Fan Yang, Peiguang Jing:

VIIS: Visible and Infrared Information Synthesis for Severe Low-Light Image Enhancement. 2174-2184 - Saad Lahlali, Nicolas Granger, Hervé Le Borgne, Quoc-Cuong Pham:

ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only. 2185-2194 - Rao Fu, Jingyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong:

Scene-LLM: Extending Language Model for 3D Visual Reasoning. 2195-2206 - Tai D. Nguyen, Matthew C. Stamm:

MVFNet: Multipurpose Video Forensics Network using Multiple Forms of Forensic Evidence. 2207-2217 - Gaoge Han, Mingjiang Liang, Jinglei Tang, Yongkang Cheng, Wei Liu, Shaoli Huang:

ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model. 2218-2227 - Shaoxiang Wang, Yaxu Xie, Chun-Peng Chang, Christen Millerdurai, Alain Pagani, Didier Stricker:

Uni-SLAM: Uncertainty-Aware Neural Implicit SLAM for Real-Time Dense Indoor Scene Reconstruction. 2228-2239 - Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Jürgen Gall, Amirhossein Habibian:

Valid: Variable-Length Input Diffusion for Novel View Synthesis. 2240-2249 - Florian Chabot, Nicolas Granger, Guillaume Lapouge:

GaussianBeV: 3D Gaussian Representation meets Perception Models for BeV Segmentation. 2250-2259 - Longwei Li, Huajian Huang, Sai-Kit Yeung, Hui Cheng:

OmniGS: Fast Radiance Field Reconstruction Using Omnidirectional Gaussian Splatting. 2260-2268 - Tao Tu, Ming-Feng Li, Chieh Hubert Lin, Yen-Chi Cheng, Min Sun, Ming-Hsuan Yang:

DreaMo: Articulated 3D Reconstruction from a Single Casual Video. 2269-2279 - Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Fiona R. Kolbinger, Stefanie Speidel:

Data Augmentation for Surgical Scene Segmentation with Anatomy-Aware Diffusion Models. 2280-2290 - Fotios Logothetis, Ignas Budvytis, Roberto Cipolla:

NPL-MVPS: Neural Point-Light Multi-View Photometric Stereo. 2291-2300 - Wenzhao Li, Tianhao Wu, Fangcheng Zhong, Cengiz Öztireli:

ARF-Plus: Controlling Perceptual Factors in Artistic Radiance Fields for 3D Scene Stylization. 2301-2310 - Sachin Raja, Ajoy Mandal, C. V. Jawahar:

Treading Towards Privacy-Preserving Table Structure Recognition. 2311-2321 - Tong Wei, Philipp Lindenberger, Jirí Matas, Daniel Barath:

Breaking the Frame: Visual Place Recognition by Overlap Prediction. 2322-2331 - G. Ujwal Sai, Arkadipta De, Vartika Sengar, Anuj Rathore, Daksh Thapar, Manohar Kaul:

Learning Semantic Part-Based Graph Structure for 3D Point Cloud Domain Generalization. 2332-2341 - Jiuxiang Gu, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song:

Differential Privacy Mechanisms in Neural Tangent Kernel Regression. 2342-2356 - Adith Boloor, Weikai Lin, Tianrui Ma, Yu Feng, Yuhao Zhu, Xuan Zhang:

PrivateEye: In-Sensor Privacy Preservation Through Optical Feature Separation. 2357-2367 - Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura:

Unsupervised Single-Image Intrinsic Image Decomposition with LiDAR Intensity Enhanced Training. 2368-2378 - Benjamin Salmon, Alexander Krull:

Unsupervised Denoising for Signal-Dependent and Row-Correlated Imaging Noise. 2379-2389 - Chen Wu, Ling Wang, Long Peng, Dianjie Lu, Zhuoran Zheng:

Dropout the High-Rate Downsampling: A Novel Design Paradigm for UHD Image Restoration. 2390-2399 - Ankit Dhiman, R. Srinath, Srinjay Sarkar, Lokesh R. Boregowda, R. Venkatesh Babu:

ChromaDistill: Colorizing Monochrome Radiance Fields with Knowledge Distillation. 2400-2410 - Chaohao Xie, Kai Han, Kwan-Yee K. Wong:

VipDiff: Towards Coherent and Diverse Video Inpainting via Training-Free Denoising Diffusion Models. 2411-2420 - Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu, Juho Kannala:

DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing. 2421-2431 - Dongwoo Park, Suk Pil Ko:

NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior. 2432-2441 - Bo Ji, Angela Yao:

High-Pass Kernel Prediction for Efficient Video Deblurring. 2442-2452 - Guoshan Liu, Hailong Yin, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yu-Gang Jiang:

Retrieval Augmented Recipe Generation. 2453-2463 - Nan Cai, Pia Bideau:

Active Event Alignment for Monocular Distance Estimation. 2464-2473 - Hojun Jang, Young Min Kim:

ReMP: Reusable Motion Prior for Multi-domain 3D Human Pose Estimation and Motion Inbetweening. 2474-2483 - Zhiyu Pan, Zhicheng Zhong, Wenxuan Guo, Yifan Chen, Jianjiang Feng, Jie Zhou:

LiCamPose: Combining Multi-View LiDAR and RGB Cameras for Robust Single-timestamp 3D Human Pose Estimation. 2484-2494 - Laura O'Mahony, Nikola S. Nikolov, David J. P. O'Sullivan:

Towards Utilising a Range of Neural Activations for Comprehending Representational Associations. 2495-2506 - Amit Giloni, Omer Hofman, Ikuya Morikawa, Toshiya Shimizu, Yuval Elovici, Asaf Shabtai:

DiL: An Explainable and Practical Metric for Abnormal Uncertainty in Object Detection. 2507-2516 - Dongyu Yan, Guanyu Huang, Fengyu Quan, Haoyao Chen:

MSI-NeRF: Linking Omni-Depth with View Synthesis Through Multi-Sphere Image Aided Generalizable Neural Radiance Field. 2517-2526 - Giacomo Capitani, Lorenzo Bonicelli, Angelo Porrello, Federico Bolelli, Simone Calderara, Elisa Ficarra:

Towards Unbiased Continual Learning: Avoiding Forgetting in the Presence of Spurious Correlations. 2527-2537 - Juhyeon Park, Seokhyeon Jeong, Taesup Moon:

TLDR: Text Based Last-Layer Retraining for Debiasing Image Classifiers. 2538-2547 - Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino:

Looking at Model Debiasing through the Lens of Anomaly Detection. 2548-2557 - Mingqi Shao, Feng Xiong, Hang Zhang, Shuang Yang, Mu Xu, Wei Bian, Xueqian Wang:

Global-Guided Focal Neural Radiance Field for Large-Scale Scene Rendering. 2558-2567 - Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong Xie:

DivAvatar: Diverse 3D Avatar Generation with a Single Prompt. 2568-2577 - Ugo Leone Cavalcanti, Matteo Poggi, Fabio Tosi, Valerio Cambareri, Vladimir Zlokolica, Stefano Mattoccia:

CabNIR: A Benchmark for In-Vehicle Infrared Monocular Depth Estimation. 2578-2590 - Muhammad Salman Ali, Sung-Ho Bae, Enzo Tartaglione:

ELMGS: Enhancing Memory and Computation Scalability Through coMpression for 3D Gaussian Splatting. 2591-2600 - Matías Mendieta, Guangyu Sun, Chen Chen:

Navigating Heterogeneity and Privacy in One-Shot Federated Learning with Diffusion Models. 2601-2610 - Feng Xu, David Ahmedt-Aristizabal, Lars Petersson, Dadong Wang, Xun Li:

Facial Expression Recognition with Controlled Privacy Preservation and Feature Compensation. 2611-2621 - Hermes McGriff, Renato Martins, Nicolas Andreff, Cédric Demonceaux:

Dense Scene Reconstruction from Light-Field Images Affected by Rolling Shutter. 2622-2630 - Shilin Hu, Hieu Le, ShahRukh Athar, Sagnik Das, Dimitris Samaras:

Shadow Removal Refinement via Material-Consistent Shadow Edges. 2631-2641 - Yiqing Liang, Numair Khan, Zhengqin Li, Thu Nguyen-Phuoc, Douglas Lanman, James Tompkin, Lei Xiao:

GauFRe: Gaussian Deformation Fields for Real-Time Dynamic Novel View Synthesis. 2642-2652 - Karam Park, Nam Ik Cho:

Partial Filter-Sharing: Improved Parameter-sharing Method for Single Image Super-Resolution Networks. 2653-2663 - Si-Yu Lu, Yung-Yao Chen, Yi-Tong Wu, Hsin-Chun Lin, Sin-Ye Jhong, Wen-Huang Cheng:

Radiance Field-Based Pose Estimation via Decoupled Optimization Under Challenging Initial Conditions. 2664-2673 - Yimu Wang, Krzysztof Czarnecki:

AiDe: Improving 3D Open-Vocabulary Semantic Segmentation by Aligned Vision-Language Learning. 2674-2685 - Yongjae Lee, Li Yang, Deliang Fan:

MFNeRF: Memory Efficient NeRF with Mixed-Feature Hash Table. 2686-2695 - Tu Vo, Chan Y. Park:

Deep Joint Unrolling for Deblurring and Low-Light Image Enhancement (JUDE). 2696-2705 - Hirunima Jayasekara, Khoi Pham, Nirat Saini, Abhinav Shrivastava:

Unified Framework for Open-World Compositional Zero-Shot Learning. 2706-2714 - Manh Duong Nguyen, Tuan Nghia Nguyen, Xuan Truong Nguyen:

ENAF: A Multi-Exit Network with an Adaptive Patch Fusion for Large Image Super Resolution. 2706-2714 - Yahan Chen, Wenzheng Liu, Xiaowei Luo:

Semantic Segmentation Method for Automated Indoor 3D Reconstruction based on Architectural-Knowledge-Aware Features. 2715-2724 - Asen Nachkov, Danda Pani Paudel, Martin Danelljan, Luc Van Gool:

Diffusion-Based Particle-DETR for BEV Perception. 2725-2735 - Aditya Dixit, Nischit Hosamani, Puneet Gupta, Ankur Garg:

VISIONARY: Novel Spatial-Spectral Attention Mechanism for Hyperspectral Image Denoising. 2736-2745 - Yujing Xue, Jiaxiang Liu, Jiawei Du, Joey Tianyi Zhou:

PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction. 2746-2755 - Han Zou, Masanori Suganuma, Takayuki Okatani:

RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution. 2756-2765 - Aimon Rahman, Malsha V. Perera, Vishal M. Patel:

Frame by Familiar Frame: Understanding Replication in Video Diffusion Models. 2766-2776 - Gasser Elazab, Torben Gräber, Michael Unterreiner, Olaf Hellwich:

MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications. 2777-2787 - Alexandre Fournier-Montgieux, Michaël Soumm, Adrian Popescu, Bertrand Luvison, Hervé Le Borgne:

Fairer Analysis and Demographically Balanced Face Generation for Fairer Face Verification. 2788-2798 - Ziqiang Shi, Rujie Liu, Jun Takahashi, Takuma Yamamoto:

Bayesian Optimal Latent Projection for Noisy Image Restoration. 2799-2807 - Amartya Bhattacharya, Debarshi Brahma, Suraj Nagaje Mahadev, Anmol Asati, Vikas Verma, Soma Biswas:

Can Out-of-Domain Data Help to Learn Domain-Specific Prompts for Multimodal Misinformation Detection? 2808-2817 - Jiahui Li, Pourya Shamsolmoali, Yue Lu, Masoumeh Zareapoor:

ShapeMorph: 3D Shape Completion via Blockwise Discrete Diffusion. 2818-2827 - Inpyo Song, Sanghyeon Lee, Minjun Joo, Jangwon Lee:

Anomaly Detection for People with Visual Impairments Using an Egocentric 360-Degree Camera. 2828-2837 - Green Rosh K. S, Meghana Shankar, Prateek Kukreja, Anmol Namdev, B. H. Pawan Prasad:

XPose: Towards Extreme Low Light Hand Pose Estimation. 2838-2848 - Shaoxiong Zhang, Hiromitsu Awano, Takashi Sato:

Gaitcloud: Leveraging Spatial-Temporal Information for Lidar-Base Gait Recognition With a True-3D Gait Representation. 2849-2858 - Federico Nocentini, Claudio Ferrari, Stefano Berretti:

EmoVOCA: Speech-Driven Emotional 3D Talking Heads. 2859-2868 - Hugo Porta, Emanuele Dalsasso, Diego Marcos, Devis Tuia:

Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation. 2869-2880 - Aleksandr Matsun, Numan Saeed, Fadillah Adamsyah Maani, Mohammad Yaqub:

ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization. 2881-2889 - Vivek Madhavaram, Shivangana Rawat, Chaitanya Devaguptapu, Charu Sharma, Manohar Kaul:

Towards a Training Free Approach for 3D Scene Editing. 2890-2899 - Leonard Bruns, Jun Zhang, Patric Jensfelt:

Neural Graph Map: Dense Mapping with Efficient Loop Closure Integration. 2900-2909 - Julian Kaltheuner, Patrick Stotko, Reinhard Klein:

ROSA: Reconstructing Object Shape and Appearance Textures by Adaptive Detail Transfer. 2910-2920 - Hossein Resani, Behrooz Nasihatkon, Mohammadreza Alimoradi Jazi:

Continual Learning in 3D Point Clouds: Employing Spectral Techniques for Exemplar Selection. 2921-2931 - Sanjay S J, Akash J, Sreehari Rajan, Dimple A. Shajahan, Charu Sharma:

Adversarial Learning Based Knowledge Distillation on 3D Point Clouds. 2932-2941 - Annie N. Wang, Luchao Qi, Roni Sengupta:

Continual Learning of Personalized Generative Face Models with Experience Replay. 2942-2951 - Jae Joong Lee, Bedrich Benes:

RGB2Point: 3D Point Cloud Generation from Single RGB Images. 2952-2962 - Thomas Walker, Octave Mariotti, Amir Vaxman, Hakan Bilen:

Spatially-Adaptive Hash Encodings for Neural Surface Reconstruction. 2963-2972 - Esmat Ghasemi Saghand, Susana K. Lai-Yuen:

MONAS-ESNN: Multi-Objective Neural Architecture Search for Efficient Spiking Neural Networks. 2963-2972 - Mingjiang Liang, Yongkang Cheng, Hualin Liang, Shaoli Huang, Wei Liu:

RopeTP: Global Human Motion Recovery via Integrating Robust Pose Estimation with Diffusion Trajectory Prior. 2973-2982 - Jiawei Liu, Wayne Lam, Zhigang Zhu, Hao Tang:

SMDAF: A Scalable Sidewalk Material Data Acquisition Framework with Bidirectional Cross-Modal Knowledge Distillation. 2983-2992 - Anvita A. Srinivas, Tuomas P. Oikarinen, Divyansh Srivastava, Wei-Hung Weng, Tsui-Wei Weng:

SAND: Enhancing Open-Set Neuron Descriptions through Spatial Awareness. 2993-3002 - Shreya Saha, Zekai Liang, Shan Lin, Jingpei Lu, Michael C. Yip, Sainan Liu:

BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction Using Neural Radiance Fields. 3003-3012 - Chuanmao Fan, Chenxi Zhao, Ye Duan:

PVT: An Implicit Surface Reconstruction Framework via Point Voxel Geometric-Aware Transformer. 3013-3023 - Katherine Xu, Lingzhi Zhang, Jianbo Shi:

Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models. 3024-3034 - Naga Venkata Sai Raviteja Chappa, Khoa Luu:

LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition. 3035-3044 - Zhiyuan Gao, Wenbin Teng, Gonglin Chen, Jinsen Wu, Ningli Xu, Rongjun Qin, Andrew Feng, Yajie Zhao:

Skyeyes: Ground Roaming using Aerial View Images. 3045-3054 - Tingting Zhao, Chenguang Liu, Kamal Jnawali, Chang Su:

eLIR-Net: an Efficient AI Solution for Image Retouching. 3055-3063 - Haojie Cai, Dongfu Yin, Fei Richard Yu, Siting Xiong:

DSTR: Dual Scenes Transformer for Cross-Modal Fusion in 3D Object Detection. 3064-3073 - Wangduo Xie, Richard Schoonhoven, Tristan van Leeuwen, Matthew B. Blaschko:

AC-IND: Sparse CT Reconstruction Based on Attenuation Coefficient Estimation and Implicit Neural Distribution. 3074-3083 - Ziqi Gao, Wendi Yang, Yujia Li, Lei Xing, S. Kevin Zhou:

MS-Glance: Bio-Inspired Non-Semantic Context Vectors and Their Applications in Supervising Image Reconstruction. 3084-3095 - Ji Zhang, Yiran Ding, Zixin Liu:

OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction. 3096-3106 - Tung-Yu Wu, Sheng-Yu Huang, Yu-Chiang Frank Wang:

Data-Efficient 3D Visual Grounding via Order-Aware Referring. 3107-3117 - Brent Zoomers, Maarten Wijnants, Ivan Molenaers, Joni Vanherck, Jeroen Put, Nick Michiels:

PRoGS: Progressive Rendering of Gaussian Splats. 3118-3127 - Junjie Oscar Yin, Ting Li, Jiahao Wang, Yi Zhang, Alan L. Yuille:

EasyRet3D: Uncalibrated Multi-View Multi-Human 3D Reconstruction and Tracking. 3128-3137 - Jingtong Yue, Xin Lin, Zijiu Yang, Chao Ren:

Dual-Representation Interaction Driven Image Quality Assessment with Restoration Assistance. 3138-3147 - Chen Feng, Duolikun Danier, Fan Zhang, Alex Mackin, Andrew Collins, David Bull:

MVAD: A Multiple Visual Artifact Detector for Video Streaming. 3148-3158 - Katharina Bendig, René Schuster, Nicole Thiemer, Karen Joisten, Didier Stricker:

Supplementary Material AnonyNoise: Anonymizing Event Data with Smart Noise to Outsmart Re-Identification and Preserve Privacy. 3159-3161 - Jiahuan Li, Xiaoyu Dong, Wei He, Naoto Yokoya:

Wavelength- and Depth-Aware Deep Image Prior for Blind Hyperspectral Imagery Deblurring with Coarse Depth Guidance. 3162-3171 - Md Motiur Rahman, Mohamed Trabelsi, Hüseyin Uzunalioglu, Aidan Boyd:

Personalized Mixture of Experts for Multi-Site Medical Image Segmentation. 3172-3184 - Maor Dikter, Tsachi Blau, Chaim Baskin:

Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency. 3185-3195 - Yilin Zheng, Chiang-Heng Chien, Ricardo Fabbri, Benjamin B. Kimia:

3D Edge Sketch from Multiview Images. 3196-3205 - Seonguk Seo, Dongwan Kim, Bohyung Han:

Revisiting Machine Unlearning with Dimensional Alignment. 3206-3215 - Arkadipta De, Vartika Sengar, Daksh Thapar, Mahesh Chandran, Manohar Kaul:

Elemental Composite Prototypical Network: Few-Shot Object Detection on Outdoor 3D Point Cloud Scenes. 3216-3226 - Nourhan Bayasi, Jamil Fayyad, Ghassan Hamarneh, Rafeef Garbi, Homayoun Najjaran:

Debiasify: Self-Distillation for Unsupervised Bias Mitigation. 3227-3236 - Haidong Wu, Snehal Bhayani, Janne Heikkilä:

A Conic Transformation Approach for Solving the Perspective-Three-Point Problem. 3237-3245 - Kunal Kathare, Ankit Dhiman, Vikas K. Gowda, Siddharth Aravindan, Shubham Monga, Basavaraja Shanthappa Vandrotti, Lokesh R. Boregowda:

Instructive3D: Editing Large Reconstruction Models with Text Instructions. 3246-3256 - Marco Garosi, Riccardo Tedoldi, Davide Boscaini, Massimiliano Mancini, Nicu Sebe, Fabio Poiesi:

3D Part Segmentation via Geometric Aggregation of 2D Visual Features. 3257-3267 - Kunal Chelani, Assia Benbihi, Torsten Sattler, Fredrik Kahl:

EdgeGaussians - 3D Edge Mapping via Gaussian Splatting. 3268-3279 - Haoran Wang, Nantheera Anantrasirichai, Fan Zhang, David Bull:

UW-GS: Distractor-Aware 3D Gaussian Splatting for Enhanced Underwater Scene Reconstruction. 3280-3289 - Mohammad Farazi, Yalin Wang:

A Recipe for Geometry-Aware 3D Mesh Transformers. 3290-3300 - Kurt H. W. Stolle:

Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation. 3301-3309 - Michal Byra, Henrik Skibbe:

Generating Visual Explanations from Deep Networks Using Implicit Neural Representations. 3310-3319 - Youpeng Wen, Yi Zhu, Zhihao Zhan, Pengzhen Ren, Jianhua Han, Hang Xu, Shen Zhao, Xiaodan Liang:

DisCo: Discovering Common Affordance from Large Models for Actionable Part Perception. 3320-3329 - Zhen Yao, Mooi Choo Chuah:

Event-Guided Low-Light Video Semantic Segmentation. 3330-3341 - Masahiro Yamaguchi, Takashi Shibata, Shoji Yachida, Keiko Yokoyama, Toshinori Hosoi:

MDCN-PS: Monocular-Depth-Guided Coarse Normal Attention for Robust Photometric Stereo. 3342-3351 - Eui Jun Hwang, Sukmin Cho, Huije Lee, Youngwoo Yoon, Jong C. Park:

A Spatio-Temporal Representation Learning as an Alternative to Traditional Glosses in Sign Language Translation and Production. 3352-3362 - Devendra Patel, Vikas Verma, Shreyas Kumar Tah, Shwetabh Biswas, Soma Biswas:

FRAUD-Net: Fraud News Detection Using Sample Uncertainty & Domain Aware Generalized Network. 3363-3371 - Priyanka Mishra, Nancy Mehta, Santosh Kumar Vipparthi, Subrahmanyam Murala:

USWformer: Efficient Sparse Wavelet Transformer for Underwater Image Enhancement. 3372-3382 - Arturo Miguel Russell Bernal, Jane Cleland-Huang, Walter J. Scheirer:

Psych-Occlusion: Using Visual Psychophysics for Aerial Detection of Occluded Persons During Search and Rescue. 3383-3395 - Yi Yang, Lei Zhong, Huiping Zhuang:

ReFu: Recursive Fusion for Exemplar-Free 3D Class-Incremental Learning. 3396-3405 - Juheon Son, Jang-Hwan Choi:

FMD: Comprehensive Data Compression in Medical Domain via Fused Matching Distillation. 3406-3415 - Rouqaiah Al-Refai, Philipp Hempel, Clara Biagi, Philipp Terhörst:

FALCON: Fair Face Recognition via Local Optimal Feature Normalization. 3416-3426 - Minh-Quan Le, Minh-Triet Tran, Trung-Nghia Le, Tam V. Nguyen, Thanh-Toan Do:

CamoFA: A Learnable Fourier-Based Augmentation for Camouflage Segmentation. 3427-3436 - Gianluca D'Amico, Federico Nesti, Giulio Rossolini, Mauro Marinoni, Salvatore Sabina, Giorgio C. Buttazzo:

SynDRA: Synthetic Dataset for Railway Applications. 3437-3446 - Abdul Mohaimen Al Radi, Prothito Shovon Majumder, Md. Mosaddek Khan:

Blind Image Deblurring with FFT-ReLU Sparsity Prior. 3447-3456 - Benjamin Coupry, Baptiste Brument, Antoine Laurent, Jean Mélou, Yvain Quéau, Jean-Denis Durou:

Assessing the Quality of 3D Reconstruction in the Absence of Ground Truth: Application to a Multimodal Archaeological Dataset. 3457-3466 - Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig:

On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process. 3467-3476 - Sebastian Janampa, Marios Pattichis:

DT-LSD: Deformable Transformer-Based Line Segment Detection. 3477-3486 - Marzieh Mohammadi, Amir Salarpour:

Point-GN: A Non-Parametric Network Using Gaussian Positional Encoding for Point Cloud Classification. 3487-3496 - Rohan Chacko, Nicolai Häni, Eldar Khaliullin, Lin Sun, Douglas Lee:

Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation. 3497-3507 - Victor Rong, Jingxiang Chen, Sherwin Bahmani, Kiriakos N. Kutulakos, David B. Lindell:

GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling. 3508-3518 - Silvan Weder, Francis Engelmann, Johannes L. Schönberger, Akihito Seki, Marc Pollefeys, Martin R. Oswald:

ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction. 3519-3528 - Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen:

DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models. 3529-3538 - Pengxiang Li, Kai Chen, Zhili Liu, Ruiyuan Gao, Lanqing Hong, Dit-Yan Yeung, Huchuan Lu, Xu Jia:

TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models. 3539-3548 - Yuan Zhang, Yutong Xie, Hu Wang, Jodie C. Avery, M. Louise Hull, Gustavo Carneiro:

A Novel Perspective for Multi-Modal Multi-Label Skin Lesion Classification. 3549-3558 - Youngjun Jun, Jiwoo Park, Kyobin Choo, Tae Eun Choi, Seong Jae Hwang:

Disentangling Disentangled Representations: Towards Improved Latent Units via Diffusion Models. 3559-3569 - Chengyin Li, Rafi Ibn Sultan, Prashant Khanduri, Yao Qiang, Chetty J. Indrin, Dongxiao Zhu:

AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation. 3570-3580 - Chengyin Li, Hui Zhu, Rafi Ibn Sultan, Hassan Bagher-Ebadian, Prashant Khanduri, Chetty J. Indrin, Kundan Thind, Dongxiao Zhu:

MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training. 3581-3591 - Boqi Chen, Yuanzhi Zhu, Yunke Ao, Sebastiano Caprara, Reto Sutter, Gunnar Rätsch, Ender Konukoglu, Anna Susmelj:

Generalizable Single-Source Cross-Modality Medical Image Segmentation via Invariant Causal Mechanisms. 3592-3602 - Nikolas Adaloglou, Tim Kaiser, Felix Michels, Markus Kollmann:

Rethinking Cluster-Conditioned Diffusion Models for Label-Free Image Synthesis. 3603-3613 - Shwetha Ram, Tal Neiman, Qianli Feng, Andrew Stuart, Son Tran, Trishul Chilimbi:

DreamBlend: Advancing Personalized Fine-Tuning of Text-to-Image Diffusion Models. 3614-3623 - Delin An, Pengfei Gu, Milan Sonka, Chaoli Wang, Danny Z. Chen:

Sli2Vol+: Segmenting 3D Medical Images Based on an Object Estimation Guided Correspondence Flow Network. 3624-3634 - Jonghun Kim, Inye Na, Eun Sook Ko, Hyunjin Park:

Tumor Synthesis Conditioned on Radiomics. 3635-3646 - Nahid Ul Islam, Dongao Ma, Jiaxuan Pang, Shivasakthi Senthil Velan, Michael B. Gotway, Jianming Liang:

Foundation X: Integrating Classification, Localization, and Segmentation Through Lock-Release Pretraining Strategy for Chest X-Ray Analysis. 3647-3656 - Youyuan Zhang, Xuan Ju, James J. Clark:

FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing. 3657-3666 - Sharon Chokuwa, Muhammad Haris Khan:

Divergent Domains, Convergent Grading: Enhancing Generalization in Diabetic Retinopathy Grading. 3667-3677 - Zhi Xu, Shaozhe Hao, Kai Han:

CusConcept: Customized Visual Concept Decomposition with Diffusion Models. 3678-3687 - Benito Buchheim, Max Reimann, Jürgen Döllner:

Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation. 3688-3697 - Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang:

Fine-grained Controllable Video Generation via Object Appearance and Context. 3698-3708 - Hsin-Ping Huang, Yu-Chuan Su, Ming-Hsuan Yang:

Generating Long-Take Videos via Effective Keyframes and Guidance. 3709-3720 - Rishubh Parihar, Prasanna Balaji, Raghav Magazine, Sarthak Vora, Varun Jampani, R. Venkatesh Babu:

Attribute Diffusion: Diffusion Driven Diverse Attribute Editing. 3721-3731 - Ming Kang, Fung Fung Ting, Raphaël C.-W. Phan, Chee-Ming Ting:

PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices. 3732-3741 - Taewoo Kim, Geonsu Lee, Hyukgi Lee, Seongtae Kim, Younggun Lee:

PixSwap: High-Resolution Face Swapping for Effective Reflection of Identity via Pixel-Level Supervision with Synthetic Paired Dataset. 3742-3751 - Niklas Babendererde, Haozhe Zhu, Moritz Fuchs, Jonathan Stieber, Anirban Mukhopadhyay:

Federated-Continual Dynamic Segmentation of Histopathology Guided by Barlow Continuity. 3752-3761 - Yannik Frisch, Christina Bornberg, Moritz Fuchs, Anirban Mukhopadhyay:

GAUDA: Generative Adaptive Uncertainty-Guided Diffusion-Based Augmentation for Surgical Segmentation. 3762-3771 - Zhongpai Gao, Abhishek Sharma, Meng Zheng, Benjamin Planche, Terrence Chen, Ziyan Wu:

Automated Patient Positioning with Learned 3D Hand Gestures. 3772-3781 - Xingzhe He, Zhiwen Cao, Nicholas I. Kolkin, Lantao Yu, Kun Wan, Helge Rhodin, Ratheesh Kalarot:

A Data Perspective on Enhanced Identity Preservation for Diffusion Personalization. 3782-3791 - Kangfu Mei, Nithin Gopalakrishnan Nair, Vishal M. Patel:

Improving Conditional Diffusion Models through Re-Noising from Unconditional Diffusion Priors. 3792-3801 - Mario Wieser, Daniel Siegismund, Stephan Steigele:

Revisiting Deep Archetypal Analysis for Phenotype Discovery in High Content Imaging. 3802-3811 - Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Jiale Cao, Zhong Ji, Mingming Sun:

SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior. 3812-3822 - Ziyu Zhou, Haozhe Luo, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Xiaowei Ding, Michael B. Gotway, Jianming Liang:

ACE: Anatomically Consistent Embeddings in Composition and Decomposition. 3823-3833 - Amin Ranem, John Kalkhof, Anirban Mukhopadhyay:

NCAdapt: Dynamic Adaptation with Domain-Specific Neural Cellular Automata for Continual Hippocampus Segmentation. 3834-3843 - Michele De Vita, Vasileios Belagiannis:

Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation. 3844-3854 - Ruyu Wang, Xuefeng Hou, Sabrina Schmedding, Marco F. Huber:

STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation. 3855-3865 - Abdullah Al Rahat, Hemanth Venkateswara:

Dataset Augmentation by Mixing Visual Concepts. 3866-3875 - Chentianye Xu, Xueying Zhan, Min Xu:

CryoMAE: Few-Shot Cryo-EM Particle Picking with Masked Autoencoders. 3876-3885 - Roberto Di Via, Francesca Odone, Vito Paolo Pastore:

Self-Supervised Pre-Training with Diffusion Model for Few-Shot Landmark Detection in X-Ray Images. 3886-3896 - Ziyang Zheng, Ruiyuan Gao, Qiang Xu:

Non-Cross Diffusion for Semantic Consistency. 3897-3906 - Aiman Farooq, Deepak Mishra, Santanu Chaudhury:

Survival Prediction in Lung Cancer through Multi-Modal Representation Learning. 3907-3915 - Zakaria Patel, Kirill Serkh:

Enhancing Image Layout Control with Loss-Guided Diffusion Models. 3916-3924 - Zhenyue Qin, Yiqun Zhang, Yang Liu, Dylan Campbell:

HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images. 3925-3933 - Xin Jiang, Junwei Zheng, Ruiping Liu, Jiahang Li, Jiaming Zhang, Sven Matthiesen, Rainer Stiefelhagen:

@BENCH: Benchmarking Vision-Language Models for Human-centered Assistive Technology. 3934-3943 - Haoning Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang:

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning. 3944-3953 - Abhishek Kumar Sinha, S. Manthira Moorthi:

CharDiff: Improving Sampling Convergence via Characteristic Function Consistency in Diffusion Models. 3955-3964 - Anuja Vats, Ivar Farup, Marius Pedersen, Kiran B. Raja:

Uncertainty-Aware Regularization for Image-to-Image Translation. 3965-3974 - Hongsuk Choi, Isaac Kasahara, Selim Engin, Moritz A. Graule, Nikhil Chavan Dafle, Volkan Isler:

FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection. 3975-3984 - Souhaib Attaiki, Paul Guerrero, Duygu Ceylan, Niloy J. Mitra, Maks Ovsjanikov:

GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space. 3985-3995 - Lucas N. Kirsten, Angelo Angonezi, Jose Marques, Fernanda Oliveira, Juliano Faccioni, Camila Cassel, Débora Santos de Sousa, Samlai Vedovatto, Guido Lenz, Cláudio R. Jung:

Oriented Cell Dataset: A Dataset and Benchmark for Oriented Cell Detection and Applications. 3996-4005 - Jinlin Xiang, Hillol Sarker, Bozhao Qi, Ruisu Zhang, Roger Trullo, Salvatore Badalamenti, Maria Wiekowski, Annie Kruger, Etienne Pochet, Qi Tang, Wei Zhao:

Endoscopic Scoring and Localization in Unconstrained Clinical Trial Videos. 4006-4015 - Vamsi Krishna Vasa, Peijie Qiu, Wenhui Zhu, Yujian Xiong, Oana M. Dumitrascu, Yalin Wang:

Context-Aware Optimal Transport Learning for Retinal Fundus Image Enhancement. 4016-4025 - Libing Zeng, Nima Khademi Kalantari:

Analyzing and Improving the Skin Tone Consistency and Bias in Implicit 3D Relightable Face Generators. 4026-4035 - Sheng Zhang, Jinge Wu, Junzhi Ning, Guang Yang:

DMRN: A Dynamical Multi-Order Response Network for the Robust Lung Airway Segmentation. 4036-4045 - Shahzad Ahmad, Sania Bano, Sukalpa Chanda, Santosh Kumar Vipparthi, Subrahmanyam Murala:

TRUST: Time-Domain Residual Unsupervised Stability Technique for Improved Heart Rate Estimation. 4046-4055 - Yoni Gozlan, Antoine Falisse, Scott D. Uhlrich, Anthony A. Gatti, Michael J. Black, Jennifer L. Hicks, Scott L. Delp, Akshay Chaudhari:

OpenCapBench: A Benchmark to Bridge Pose Estimation and Biomechanics. 4056-4065 - Jianyi Zhang, Hao Yang, Ang Li, Xin Guo, Pu Wang, Haiming Wang, Yiran Chen, Hai Li:

MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning. 4066-4076 - Danfeng Guo, Sanchit Agarwal, Yu-Hsiang Lin, Jiun-Yu Kao, Tagyoung Chung, Nanyun Peng, Mohit Bansal:

Improving Faithfulness of Text-to-Image Diffusion Models through Inference Intervention. 4077-4086 - Idan Kligvasser, Regev Cohen, George Leifman, Ehud Rivlin, Michael Elad:

Anchored Diffusion for Video Face Reenactment. 4087-4097 - Youssof Nawar, Nouran Soliman, Moustafa Wassel, Mohamed ElHabebe, Noha Adly, Marwan Torki, Ahmed Elmassry, Islam Ahmed:

DiffuPT: Class Imbalance Mitigation for Glaucoma Detection via Diffusion Based Generation and Model Pretraining. 4098-4107 - Zoltán Ádám Milacski, Koichiro Niinuma, Ryosuke Kawamura, Fernando De la Torre, László A. Jeni:

GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts. 4108-4118 - Naveen Karunanayake, Suranga Seneviratne, Sanjay Chawla:

CRAFT: Class Ranking Aware Fine-Tuning for Enhanced Out-of-Distribution Detection. 4119-4128 - Fatemeh Haghighi, Michael B. Gotway, Jianming Liang:

Learning Anatomy-Disease Entangled Representation. 4129-4141 - Yilmaz Korkmaz, Vishal M. Patel:

MambaRecon: MRI Reconstruction with Structured State Space Models. 4142-4152 - Sai Bharath Chandra Gutha, Ricardo Vinuesa, Hossein Azizpour:

Inverse Problems with Diffusion Models: A MAP Estimation Perspective. 4153-4162 - Steven Hogue, Chenxu Zhang, Yapeng Tian, Xiaohu Guo:

Joint Co-Speech Gesture and Expressive Talking Face Generation Using Diffusion with Adapters. 4163-4172 - Fazle Rahat, M. Shifat Hossain, Md Rubel Ahmed, Sumit Kumar Jha, Rickard Ewetz:

Data Augmentation for Image Classification Using Generative AI. 4173-4182 - Qianwen Lu, Xingchao Yang, Takafumi Taketomi:

BeautyBank: Encoding Facial Makeup in Latent Space. 4183-4193 - Trung Dinh Quoc Dang, Huy Hoang Nguyen, Aleksei Tiulpin:

Image-Level Regression for Uncertainty-Aware Retinal Image Segmentation. 4194-4204 - Remi Chierchia, Léo Lebrat, David Ahmedt-Aristizabal, Olivier Salvado, Clinton Fookes, Rodrigo Santa Cruz:

SALVE: A 3D Reconstruction Benchmark of Wounds from Consumer-Grade Videos. 4205-4214 - Haeil Lee, Hansang Lee, Seoyeon Gye, Junmo Kim:

Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models Using Stepwise Spectral Analysis. 4215-4224 - Chun-Hong Cheng, Jing Wei Chin, Kwan Long Wong, Tsz Tai Chan, Hau Ching Lo, Kwan Lok Pang, Richard Hau Yue So, Bryan Yan:

Remote Blood Pressure Estimation from Facial Videos Using Transfer Learning: Leveraging PPG to rPPG Conversion. 4225-4236 - Ali Karami, Thi Kieu Khanh Ho, Narges Armanfard:

Graph-Jigsaw Conditioned Diffusion Model for Skeleton-Based Video Anomaly Detection. 4237-4247 - Tawsifur Rahman, Alexander S. Baras, Rama Chellappa:

CEMIL: Contextual Attention Based Efficient Weakly Supervised Approach for Histopathology Image Classification. 4248-4257 - Rasel Ahmed Bhuiyan, Adam Czajka:

Forensic Iris Image-Based Post-Mortem Interval Estimation. 4258-4267 - Sabina Martyniak, Joanna Kaleta, Diego Dall'Alba, Michal Naskret, Szymon Plotka, Przemyslaw Korzeniowski:

SimuScope: Realistic Endoscopic Synthetic Dataset Generation Through Surgical Simulation and Diffusion Models. 4268-4278 - Tonmoy Hossain, Jing Ma, Jundong Li, Miaomiao Zhang:

Invariant Shape Representation Learning for Image Classification. 4279-4289 - Kaito Shiku, Kazuya Nishimura, Daiki Suehiro, Kiyohito Tanaka, Ryoma Bise:

Ordinal Multiple-instance Learning for Ulcerative Colitis Severity Estimation with Selective Aggregated Transformer. 4290-4299 - Koushik Biswas, Amit Reza, Meghana Karri, Debesh Jha, Hongyi Pan, Nikhil Kumar Tomar, Aliza Subedi, Smriti Regmi, Ulas Bagci:

Optimizing Neural Network Effectiveness via Non-monotonicity Refinement. 4300-4309 - Justin Theiss, Norman Müller, Daeil Kim, Aayush Prakash:

Multi-View Image Diffusion via Coordinate Noise and Fourier Attention. 4310-4319 - Pamela Osuna-Vargas, Maren H. Wehrheim, Lucas Zinz, Johanna V. Rahm, Ashwin Balakrishnan, Alexandra Kaminer, Mike Heilemann, Matthias Kaschube:

Denoising Diffusion Models for High-Resolution Microscopy Image Restoration. 4320-4330 - Utkarsh Nath, Rajeev Goel, Eun Som Jeon, Changhoon Kim, Kyle Min, Yezhou Yang, Yingzhen Yang, Pavan K. Turaga:

Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation. 4331-4341 - Juhyung Ha, Jong Sung Park, David Crandall, Eleftherios Garyfallidis, Xuhong Zhang:

Multi-Resolution Guided 3D GANs for Medical Image Translation. 4342-4351 - Muhammad Sohaib, Siyavash Shabani, Sahar A. Mohammed, Garrett Winkelmaier, Bahram Parvin:

Multi-Aperture Transformers for 3D (MAT3D) Segmentation of Clinical and Microscopic Images. 4352-4361 - Joy Dhar, Nayyar Zaidi, Maryam Haghighat, Sudipta Roy, Puneet Goyal, Azadeh Alavi, Vikas Kumar:

Multimodal Fusion Learning with Dual Attention for Medical Imaging. 4362-4371 - Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo:

Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation. 4372-4381 - Man Minh Ho, Shikha Dubey, Yosep Chong, Beatrice Knudsen, Tolga Tasdizen:

F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation. 4382-4391 - Vaibhav Ganatra, Siddhartha Gairola, Pallavi Joshi, Anand Balasubramaniam, Kaushik Murali, Arivunithi Varadharajan, Bellamkonda Mallikarjuna, Nipun Kwatra, Mohit Jain:

SmartKC++: Improving Performance of Smartphone-Based Corneal Topographers. 4392-4399 - Kai Wang, Fei Yang, Bogdan Raducanu, Joost van de Weijer:

Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier. 4400-4409 - Antoine P. Sanner, Jonathan Stieber, Nils F. Grauhan, Suam Kim, Marc A. Brockmann, Ahmed E. Othman, Anirban Mukhopadhyay:

Federated Voxel Scene Graph for Intracranial Hemorrhage. 4410-4419 - Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen:

Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing. 4420-4429 - Pengfei Guo, Can Zhao, Dong Yang, Ziyue Xu, Vishwesh Nath, Yucheng Tang, Benjamin Simon, Mason Belue, Stephanie A. Harmon, Baris Turkbey, Daguang Xu:

MAISI: Medical AI for Synthetic Imaging. 4430-4441 - Sebastian Thiele, Jacqueline Kockwelp, Joachim Wistuba, Sabine Kliesch, Jörg Gromoll, Benjamin Risse:

Investigating Imaging, Annotation and Self-Supervision for the Classification of Continuously Developing Cells in Histological Whole Slide Images. 4442-4451 - Qiwen Deng, Yangcen Liu:

Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network. 4452-4461 - Xiaoyang Wei, Camille Kurtz, Florence Cloppet:

Relaxing Binary Constraints in Contrastive Vision-Language Medical Representation Learning. 4462-4471 - Hyunsoo Lee, Minsoo Kang, Bohyung Han:

Diffusion-Based Conditional Image Editing Through Optimized Inference with Guidance. 4472-4480 - Ciprian A. Corneanu, Qianli Feng, Aleix M. Martínez:

Structured Human Assessment of Text-to-Image Generative Models. 4481-4490 - Raman Dutt, Ondrej Bohdal, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy M. Hospedales:

MemControl: Mitigating Memorization in Diffusion Models via Automated Parameter Selection. 4491-4501 - Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu, Peijie Qiu, Xiwen Chen, Yi Su, Yujian Xiong, Zhangsihao Yang, Yanxi Chen, Yalin Wang:

CUNSB-RFIE: Context-Aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement. 4502-4511 - Shuhan Xiao, Lukas Klein, Jens Petersen, Philipp Vollmuth, Paul F. Jaeger, Klaus H. Maier-Hein:

Enhancing Predictive Imaging Biomarker Discovery Through Treatment Effect Analysis. 4512-4522 - Yan Zeng, Masanori Suganuma, Takayuki Okatani:

Inverting the Generation Process of Denoising Diffusion Implicit Models: Empirical Evaluation and a Novel Method. 4516-4524 - Chaewon Kim, Seung Jun Moon, Gyeong-Moon Park:

WINE: Wavelet-Guided GAN Inversion and Editing for High-Fidelity Refinement. 4523-4532 - Mingyu Sheng, Jianan Fan, Dongnan Liu, Ron Kikinis, Weidong Cai:

AMNCutter: Affinity-Attention-Guided Multi-View Normalized Cutter for Unsupervised Surgical Instrument Segmentation. 4533-4544 - Kenta Horikawa, Mariko Isogawa, Hideo Saito, Shohei Mori:

Dense Depth from Event Focal Stack. 4545-4553 - Xulin Fan, Heting Gao, Ziyi Chen, Peng Chang, Mei Han, Mark Hasegawa-Johnson:

SyncDiff: Diffusion-Based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization. 4554-4563 - Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla:

Morag - Multi-Fusion Retrieval Augmented Generation for Human Motion. 4564-4573 - Shahzad Ahmad, Sania Bano, Sachin Verma, Yogesh Singh Rawat, Sukalpa Chanda, Santosh Kumar Vipparthi, Subrahmanyam Murala:

PULSE: Physiological Understanding with Liquid Signal Extraction. 4574-4584 - Xindi Wu, Uriel Singer, Zhaojiang Lin, Andrea Madotto, Xide Xia, Yifan Xu, Paul A. Crook, Xin Luna Dong, Seungwhan Moon:

Corgi: Cached Memory Guided Video Generation. 4585-4594 - Sungkyu Yang, Woohyun Park, Kwangil Yim, Mansu Kim:

MFTrans: A Multi-Resolution Fusion Transformer for Robust Tumor Segmentation in Whole Slide Images. 4595-4605 - Zhenyuan Dong, Sai Qian Zhang:

DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing. 4606-4615 - Zhuoyi Yang, Liyue Shen:

TempA-VLP: Temporal-Aware Vision-Language Pretraining for Longitudinal Exploration in Chest X-Ray Image. 4625-4634 - Fang-Yi Su, Tzu-Hung Chang, Jung-Hsien Chiang:

DiffuCE: Expert-Level CBCT Image Enhancement Using a Novel Conditional Denoising Diffusion Model with Latent Alignment. 4635-4644 - Vasco Ramos, Yonatan Bitton, Michal Yarom, Idan Szpektor, João Magalhães:

Contrastive Sequential-Diffusion Learning: Non-Linear and Multi-Scene Instructional Video Synthesis. 4645-4654 - Tapas Kumar Dutta, Snehashis Majhi, Deepak Ranjan Nayak, Debesh Jha:

SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation. 4655-4664 - Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, Junzhou Huang, Vladimir Pavlovic, Hao Wang, Molei Tao, Dimitris N. Metaxas:

SODA: Spectral Orthogonal Decomposition Adaptation for Diffusion Models. 4665-4682 - Gurucharan Marthi Krishna Kumar, Janine D. Mendola, Amir Shmuel:

NestedMorph: Enhancing Deformable Medical Image Registration With Nested Attention Mechanisms. 4683-4692 - Yaoxin Zhuo, Zachary Bessinger, Lichen Wang, Naji Khosravan, Baoxin Li, Sing Bing Kang:

TFM2: Training-Free Mask Matching for Open-Vocabulary Semantic Segmentation. 4693-4703 - Marvin Burges, Sebastian Zambanini, Robert Sablatnig:

Interactive Object Detection for Tiny Objects in Large Remotely Sensed Images. 4704-4713 - Jingchen Sun, Rohan Sharma, Vishnu Suresh Lokhande, Changyou Chen:

Cross-Modal Feature Alignment and MMD Improve Robustness of Prompt Tuning. 4714-4724 - Yicheng Wang, Zhikang Zhang, Jue Wang, David Fan, Zhenlin Xu, Linda Liu, Xiang Hao, Vimal Bhat, Xinyu Li:

GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning. 4725-4735 - Sanggeon Yun, Ryozo Masukawa, Minhyoung Na, Mohsen Imani:

MissionGNN: Hierarchical Multimodal GNN-Based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation. 4736-4745 - Wendi Yang, Zihang Jiang, Shang Zhao, S. Kevin Zhou:

PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery. 4746-4756 - Ayush Gupta, Rama Chellappa:

MimicGait: A Model Agnostic approach for Occluded Gait Recognition Using Correlational Knowledge Distillation. 4757-4766 - Ekin Celikkan, Timo Kunzmann, Yertay Yeskaliyev, Sibylle Itzerott, Nadja Klein, Martin Herold:

WeedsGalore: A Multispectral and Multitemporal UAV-Based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields. 4767-4777 - Percy Lam, Sooyong Park, Weiwei Chen, Lavindra de Silva, Ioannis K. Brilakis:

CRAAC: Consistency Regularised Active Learning with Automatic Corrections for Real-Life Road Image Annotations. 4778-4787 - Sina Malakouti, Aysan Aghazadeh, Ashmit Khandelwal, Adriana Kovashka:

Benchmarking VLMs' Reasoning About Persuasive Atypical Images. 4788-4798 - I-Ting Tsai, Bharath Hariharan:

3D Synthesis for Architectural Design. 4799-4809 - Yan Yang, Utpal Bose, James Broadbent, Sally Stockwell, Keren Byrne, Md. Zakir Hossain, Eric A. Stone, Shannon Dillon:

Flowering Time Prediction of Wheat From DIA-MS Data. 4810-4820 - Xingjian Diao, Ming Cheng, Wayner Barrios, SouYoung Jin:

FT2TF: First-Person Statement Text-to-Talking Face Generation. 4821-4830 - Mayssa Zaier, Hazem Wannous, Hassen Drira:

Geometry-Aware Deep Learning for 3D Skeleton-Based Motion Prediction. 4831-4840 - Sanjana Sinha, Brojeshwar Bhowmick, Lokender Tiwari, Sushovan Chanda:

DisFlowEm: One-Shot Emotional Talking Head Generation Using Disentangled Pose and Expression Flow-Guidance. 4841-4851 - Sombit Dey, Ozan Unal, Christos Sakaridis, Luc Van Gool:

Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding. 4852-4861 - Xiaoyu Xiang, Liat Sless Gorelik, Yuchen Fan, Omri Armstrong, Forrest N. Iandola, Yilei Li, Ita Lifshitz, Rakesh Ranjan:

Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds. 4872-4881 - Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan:

AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models. 4882-4890 - Ce Zheng, Xianpeng Liu, Qucheng Peng, Tianfu Wu, Pu Wang, Chen Chen:

DiffMesh: A Motion-Aware Diffusion Framework for Human Mesh Recovery from Videos. 4891-4901 - Bardia Safaei, Vishal M. Patel:

Active Learning for Vision-Language Models. 4902-4912 - Yoshitomo Matsubara, Matteo Mendula, Marco Levorato:

A Multi-Task Supervised Compression Model for Split Computing. 4913-4922 - Aashish Rai, Srinath Sridhar:

EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos. 4935-4946 - Risako Tanigawa, Kenji Ishikawa, Noboru Harada, Yasuhiro Oikawa:

SoundSil-DS: Deep Denoising and Segmentation of Sound-field Images with Silhouettes. 4947-4956 - Bingqing Zhang, Zhuo Cao, Heming Du, Xin Yu, Xue Li, Jiajun Liu, Sen Wang:

TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm. 4957-4967 - Vittorio Pipoli, Federico Bolelli, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Costantino Grana, Rita Cucchiara, Elisa Ficarra:

Semantically Conditioned Prompts for Visual Recognition Under Missing Modality Scenarios. 4968-4977 - Shubham Agarwal, Raz Birman, Ofer Hadar:

WARLearn: Weather-Adaptive Representation Learning. 4978-4987 - Hai Nguyen-Truong, E-Ro Nguyen, Tuan-Anh Vu, Minh-Triet Tran, Binh-Son Hua, Sai-Kit Yeung:

Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding. 4988-4998 - Tetsushi Yamada, Simone Di Santo:

Partial Texture VAE: Color and Texture Encoder for Rock Particle Images. 4999-5008 - Pramook Khungurn:

Talking Head Anime 4: Distillation for Real-Time Performance. 5018-5029 - Anh-Quan Cao, Maximilian Jaritz, Matthieu Guillaumin, Raoul de Charette, Loris Bazzani:

LATTECLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts. 5030-5040 - Denys Rozumnyi, Nadine Bertsch, Othman Sbai, Filippo Arcadu, Yuhua Chen, Artsiom Sanakoyeu, Manoj Kumar, Catherine Herold, Robin Kips:

XR-MBT: Multi-Modal Full Body Tracking for XR Through Self-Supervision with Learned Depth Point Cloud Registration. 5041-5050 - Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis:

Similarity Over Factuality: Are we Making Progress on Multimodal Out-of-Context Misinformation Detection? 5041-5050 - Yanan Niu, Roy Sarkis, Demetri Psaltis, Mario Paolone, Christophe Moser, Luisa Lambertini:

Solar Multimodal Transformer: Intraday Solar Irradiance Predictor Using Public Cameras and Time Series. 5051-5060 - Sina Hajimiri, Ismail Ben Ayed, Jose Dolz:

Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation. 5061-5071 - Xin Ye, Feng Tao, Abhirup Mallik, Burhaneddin Yaman, Liu Ren:

LORD: Large Models Based Opposite Reward Design for Autonomous Driving. 5072-5081 - Md Mahedi Hasan, Shoaib Meraj Sami, Nasser M. Nasrabadi:

CLFace: A Scalable and Resource-Efficient Continual Learning Framework for Lifelong Face Recognition. 5082-5091 - Dingkun Yan, Liang Yuan, Erwin Wu, Yuma Nishioka, Issei Fujishiro, Suguru Saito:

ColorizeDiffusion: Improving Reference-Based Sketch Colorization with Latent Diffusion Model. 5092-5102 - Lior Dikstein, Ariel Lapid, Arnon Netzer, Hai Victor Habi:

Data Generation for Hardware-Friendly Post-Training Quantization. 5103-5113 - Bo Lang, Mooi Choo Chuah:

Event-Guided Video Transformer for End-to-End 3D Human Pose Estimation. 5114-5124 - Wele Gedara Chaminda Bandara, Vishal M. Patel:

Deep Metric Learning for Unsupervised Remote Sensing Change Detection. 5125-5135 - Xuanchen Wang, Heng Wang, Dongnan Liu, Weidong Cai:

Dance any Beat: Blending Beats with Visuals in Dance Video Generation. 5136-5146 - Valentin Bieri, Marco Zamboni, Nicolas S. Blumer, Qingxuan Chen, Francis Engelmann:

OpenCity3D: What do Vision-Language Models Know About Urban Environments? 5147-5155 - Abid Ali, Rui Dai, Ashish Marisetty, Guillaume Astruc, Monique Thonnat, Jean-Marc Odobez, Susanne Thümmler, François Brémond:

Loose Social-Interaction Recognition in Real-World Therapy Scenarios. 5156-5165 - Julius Pesonen, Teemu Hakala, Väinö Karjalainen, Niko Koivumäki, Lauri Markelin, Anna-Maria Raita-Hakola, Juha Suomalainen, Ilkka Pölönen, Eija Honkavaara:

Detecting Wildfires on UAVs with Real-Time Segmentation Trained by Larger Teacher Models. 5166-5176 - Ying Shen, Daniel Bis, Cynthia Lu, Ismini Lourentzou:

ELBA: Learning by Asking for Embodied Visual Navigation and Task Completion. 5177-5186 - Tim Dieter Eberhardt, Tim Brühl, Robin Schwager, Tin Stribor Sohn, Wilhelm Stork:

Clarity Amidst Blur: A Deterministic Method for Synthetic Generation of Water Droplets on Camera Lenses. 5187-5196 - Siddharth Seth, Rishabh Dabral, Diogo C. Luvizon, Marc Habermann, Ming-Hsuan Yang, Christian Theobalt, Adam Kortylewski:

PocoLoco: A Point Cloud Diffusion Model of Human Shape in Loose Clothing. 5197-5206 - Hanyuan Xiao, Yingshu Chen, Huajian Huang, Haolin Xiong, Jing Yang, Pratusha Prasad, Yajie Zhao:

Localized Gaussian Splatting Editing with Contextual Awareness. 5207-5217 - Doyoung Park, Naresh Reddy Yarram, Sunjin Kim, Minkyu Kim, Seongho Cho, Taehee Lee:

Text Change Detection in Multilingual Documents Using Image Comparison. 5218-5227 - Zihao Zou, Jiaming Liu, Shirin Shoushtari, Yubo Wang, Ulugbek S. Kamilov:

FLAIR: A Conditional Diffusion Framework with Applications to Face Video Restoration. 5228-5238 - Wenjun Huang, Yang Ni, Arghavan Rezvani, Sungheon Jeong, Hanning Chen, Yezi Liu, Fei Wen, Mohsen Imani:

Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach. 5239-5249 - Wele Gedara Chaminda Bandara, Nithin Gopalakrishnan Nair, Vishal M. Patel:

DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Remote Sensing Change Detection. 5250-5262 - Yusuke Akamatsu, Terumi Umematsu, Hitoshi Imaoka, Shizuko Gomi, Hideo Tsurushima:

ComFace: Facial Representation Learning with Synthetic Data for Comparing Faces. 5263-5273 - Hankyeol Lee, Gawon Seo, Wonseok Choi, Geunyoung Jung, Kyungwoo Song, Jiyoung Jung:

Enhancing Visual Classification Using Comparative Descriptors. 5274-5283 - Luca Collorone, Stefano D'Arrigo, Massimiliano Pappa, Guido Maria D'Amely di Melendugno, Giovanni Ficarra, Fabio Galasso:

ANTHROPOS-V: Benchmarking the Novel Task of Crowd Volume Estimation. 5284-5294 - Raquel Panadero, Dominik Schörkhuber, Margrit Gelautz:

Importance-Guided Interpretability and Pruning for Video Transformers in Driver Action Recognition. 5295-5304 - Puneet Kumar, Shreshtha Misra, Zhuhong Shao, Bin Zhu, Balasubramanian Raman, Xiaobai Li:

Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio and Textual Data. 5305-5315 - Anudeep Vurity, Emanuela Marasco, Raghavendra Ramachandra, Jongwoo Park:

ColFigPhotoAttnNet: Reliable Finger Photo Presentation Attack Detection Leveraging Window-Attention on Color Spaces. 5316-5325 - Zhao-Yang Wang, Jiang Liu, Jieneng Chen, Rama Chellappa:

VM-Gait: Multi-Modal 3D Representation Based on Virtual Marker for Gait Recognition. 5326-5335 - Kevin Flanagan, Dima Damen, Michael Wray:

Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval. 5336-5345 - Hao Fu, Naman Patel, Prashanth Krishnamurthy, Farshad Khorrami:

CLIPScope: Enhancing Zero-Shot OOD Detection with Bayesian Scoring. 5346-5355 - Ahmad Arrabi, Xiaohan Zhang, Waqas Sultani, Chen Chen, Safwan Wshah:

Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance. 5356-5366 - Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Eric Granger:

A Realistic Protocol for Evaluation of Weakly Supervised Object Localization. 5367-5376 - Mu Cai, Zeyi Huang, Yuheng Li, Utkarsh Ojha, Haohan Wang, Yong Jae Lee:

An Investigation on LLMs' Visual Understanding Ability Using SVG for Image-Text Bridging. 5377-5386 - Deepti Rawat, Keshav Gupta, Aryamaan Basu Roy, Ravi Kiran Sarvadevabhatla:

DashCop: Automated E-Ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos. 5387-5397 - Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Yonghoon Jung, Sanghyun Seo:

Make VLM Recognize Visual Hallucination on Cartoon Character Image with Pose Information. 5398-5407 - Yuhang He, Sangyun Shin, Anoop Cherian, Niki Trigoni, Andrew Markham:

SoundLoc3D: Invisible 3D Sound Source Localization and Classification Using a Multimodal RGB-D Acoustic Camera. 5408-5418 - Hiroki Nishizawa, Keitaro Tanaka, Asuka Hirata, Shugo Yamaguchi, Qi Feng, Masatoshi Hamanaka, Shigeo Morishima:

SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering. 5419-5428 - Haoyu Jiang, Zhi-Qi Cheng, Gabriel Moreira, Jiawen Zhu, Jingdong Sun, Bukun Ren, Jun-Yan He, Qi Dai, Xian-Sheng Hua:

UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval. 5429-5438 - Jash Dalvi, Ali Dabouei, Gunjan Dhanuka, Min Xu:

Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection. 5439-5448 - Raza Imam, Hanan Gani, Muhammad Huzaifa, Karthik Nandakumar:

Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models. 5449-5459 - Evelyn A. Stump, Francesco Luzi, Leslie M. Collins, Jordan M. Malof:

Meta-Learning for Color-to-Infrared Cross-Modal Style Transfer. 5460-5469 - Tevin Moodley, Dustin van der Haar:

I3D-AE-LSTM: A 2-Stream Autoencoder for Action Quality Assessment Using a Newly Created Cricket Batsman Video Dataset. 5470-5478 - Junno Yun, Mehmet Akçakaya:

Generative Model-Based Fusion for Improved Few-Shot Semantic Segmentation of Infrared Images. 5479-5488 - Pinrui Yu, Zhenglun Kong, Pu Zhao, Peiyan Dong, Hao Tang, Fei Sun, Xue Lin, Yanzhi Wang:

Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation. 5489-5499 - Debolena Basak, Soham Bhatt, Sahith Kanduri, Maunendra Sankar Desarkar:

Aerial Mirage: Unmasking Hallucinations in Large Vision Language Models. 5500-5508 - Bhavin Jawade, João V. B. Soares, Kapil Thadani, Deen Dayal Mohan, Amir Erfan Eshratifar, Benjamin Culpepper, Paloma de Juan, Srirangaraj Setlur, Venu Govindaraju:

SCOT: Self-Supervised Contrastive Pretraining for Zero-Shot Compositional Retrieval. 5509-5519 - Dipu Manandhar, Paul Guerrero, Zhaowen Wang, John P. Collomosse:

CLASS: Conditional Latent Architecture for Search and Synthesis of Design Layouts. 5520-5529 - Seon-Ho Lee, Jue Wang, David Fan, Zhikang Zhang, Linda Liu, Xiang Hao, Vimal Bhat, Xinyu Li:

Now you see Me: Context-Aware Automatic Audio Description. 5530-5539 - Niharika Hegde, Shishir Muralidhara, René Schuster, Didier Stricker:

Modality-Incremental Learning with Disjoint Relevance Mapping Networks for Image-Based Semantic Segmentation. 5540-5549 - Donggeun Kim, Yujin Jo, Myungjoo Lee, Taesup Kim:

Retaining and Enhancing Pre-trained Knowledge in Vision-Language Models with Prompt Ensembling. 5550-5559 - Junha Lee, Sojung An, Sujeong You, Nam Ik Cho:

Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation. 5560-5569 - Maksim Golyadkin, Ianis Plevokas, Ilya Makarov:

Closing the Domain Gap in Manga Colorization via Aligned Paired Dataset. 5580-5590 - Anurag Deo, Savita Bhat, Shirish S. Karande:

VisualFusion: Enhancing Blog Content with Advanced Infographic Pipeline. 5591-5600 - Daniel Steininger, Julia Simon, Andreas Trondl, Markus Murschitz:

TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations. 5601-5610 - Idris Zakariyya, Linda Tran, Kaushik Bhargav Sivangi, Paul Henderson, Fani Deligianni:

Differentially Private Integrated Decision Gradients (IDG-DP) for Radar-Based Human Activity Recognition. 5611-5622 - Suguru Onda, Ryan Farrell:

The FineView Dataset: A 3D Scanned Multi-View Object Dataset of Fine-Grained Category Instances. 5623-5634 - Deepayan Das, Davide Talon, Massimiliano Mancini, Yiming Wang, Elisa Ricci:

One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering. 5635-5645 - Tom Gillooly, Jean-Baptiste Thomas, Jon Yngve Hardeberg, Giuseppe Claudio Guarnera:

Image Adaptation for Colour Vision Deficient Viewers Using Vision Transformers. 5646-5655 - Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos:

SenCLIP: Enhancing Zero-Shot Land-Use Mapping for Sentinel-2 with Ground-Level Prompting. 5656-5665 - Zhuowen Zou, Prathyush Poduval, Narayan Srinivasa, Mohsen Imani:

Hyperdimensional Representation for Adaptive Information Association and Memorization. 5666-5675 - Sahil Goyal, Abhinav Mahajan, Swasti Mishra, Prateksha Udhayanan, Tripti Shukla, K. J. Joseph, Balaji Vasan Srinivasan:

Design-O-Meter: Towards Evaluating and Refining Graphic Designs. 5676-5686 - Muhammad Awais, Ali Husain Salem Abdulla Alharthi, Amandeep Kumar, Hisham Cholakkal, Rao Muhammad Anwer:

AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning. 5687-5696 - Zhuo Xu, Xiang Xiang:

Learning Visual-Semantic Hierarchical Attribute Space for Interpretable Open-Set Recognition. 5697-5706 - Harini S. I, Somesh Singh, Yaman Kumar Singla, Aanisha Bhattacharyya, Veeky Baths, Changyou Chen, Rajiv Ratn Shah, Balaji Krishnamurthy:

Long-Term Ad Memorability: Understanding & Generating Memorable Ads. 5707-5718 - Debasmita Pal, Redwan Sony, Arun Ross:

A Parametric Approach to Adversarial Augmentation for Cross-Domain Iris Presentation Attack Detection. 5719-5729 - Abhishek Rajora, Shubham Gupta, Suman Kundu:

Cross-Aligned Fusion For Multimodal Understanding. 5730-5740 - Hanwen Zheng, Sijia Wang, Chris Thomas, Lifu Huang:

Advancing Chart Question Answering with Robust Chart Component Recognition. 5741-5750 - Moyuru Yamada, Nimish Dharamshi, Ayushi Kohli, Prasad Kasu, Ainulla Khan, Manu Ghulyani:

Unleashing Potentials of Vision-Language Models for Zero-Shot HOI Detection. 5751-5760 - Zi-Xiang Xia, Sudeep Fadadu, Yi Shi, Louis Foucard:

Robust Long-Range Perception Against Sensor Misalignment in Autonomous Vehicles. 5761-5770 - Felix Hertlein, Alexander Naumann, York Sure-Vetter:

DocMatcher: Document Image Dewarping via Structural and Textual Line Matching. 5771-5780 - Dulanga Weerakoon, Vigneshwaran Subbaraju, Joo Hwee Lim, Archan Misra:

NeuroViG - Integrating Event Cameras for Resource-Efficient Video Grounding. 5781-5790 - Haiyu Wu, Sicong Tian, Huayu Li, Kevin W. Bowyer:

LogicNet: A Logical Consistency Embedded Face Attribute Learning Network. 5791-5800 - Hasnat Md Abdullah, Tian Liu, Kangda Wei, Shu Kong, Ruihong Huang:

UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark. 5801-5811 - Neha Choudhary, Poonam Goyal, Devashish Siwatch, Atharva Chandak, Harsh Mahajan, Varun Khurana, Yaman Kumar:

AdQuestA: Knowledge-Guided Visual Question Answer Framework for Advertisements. 5812-5821 - Raymond Yu, Paul Han, Piper Wolters, Favyen Bastani:

OPTIMUS: Observing Persistent Transformations in Multi-Temporal Unlabeled Satellite-Data. 5822-5830 - María Escobar, Juanita Puentes, Cristhian Forigua, Jordi Pont-Tuset, Kevis-Kokitsi Maninis, Pablo Arbeláez:

EgoCast: Forecasting Egocentric Human Pose in the Wild. 5831-5841 - Cheng-En Wu, Jinhong Lin, Yu Hen Hu, Pedro Morgado:

Patch Ranking: Token Pruning as Ranking Prediction for Efficient CLIP. 5842-5851 - Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt:

Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery. 5852-5862 - Samyak Rawlekar, Shubhang Bhatnagar, Narendra Ahuja:

PositiveCoOp: Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations. 5863-5872 - Alexander Ponomarchuk, Ivan Kruzhilov, Gleb Mazanov, Ruslan Utegenov, Artem Shadrin, Galina Zubkova, Ivan Bessonov, Pavel Blinov:

CardioSyntax: End-to-End SYNTAX Score Prediction - Dataset, Benchmark and Method. 5873-5883 - Ziqiang Dang, Jianfang Li, Lin Liu:

Cascaded Dual Vision Transformer for Accurate Facial Landmark Detection. 5884-5894 - Charles Gaydon, Floryne Roche:

PureForest: A Large-Scale Aerial Lidar and Aerial Imagery Dataset for Tree Species Classification in Monospecific Forests. 5895-5904 - Ce Zhang, Simon Stepputtis, Katia P. Sycara, Yaqi Xie:

Enhancing Vision-Language Few-Shot Adaptation with Negative Learning. 5905-5915 - Jia-Wei Liao, Winston Wang, Tzu-Sian Wang, Li-Xuan Peng, Ju-Hsuan Weng, Cheng-Fu Chou, Jun-Cheng Chen:

DiffQRCoder: Diffusion-Based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement. 5916-5925 - Jingjiao Zhao, Jiaju Li, Dongze Lian, Liguo Sun, Pin Lv:

DualCIR: Enhancing Training-Free Composed Image Retrieval via Dual-Directional Descriptions. 5926-5936 - Ee Yeo Keat, Hao Zhang, Alexander Matyasko, Basura Fernando:

Deduce and Select Evidences with Language Models for Training-Free Video Goal Inference. 5937-5947 - Luca Scofano, Alessio Sampieri, Edoardo De Matteis, Indro Spinelli, Fabio Galasso:

Social EgoMesh Estimation. 5948-5958 - Xiang Huang, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Baigui Sun:

DyRoNet: Dynamic Routing and Low-Rank Adapters for Autonomous Driving Streaming Perception. 5959-5968 - Siyuan Huang, Ram Prabhakar, Yuxiang Guo, Rama Chellappa, Cheng Peng:

VILLS: Video-Image Learning to Learn Semantics for Person Re-Identification. 5969-5979 - Sumin Hu, Youngmin Yoo, Jeeseong Kim, Changsoo Lim, Doohyun Cho, Bongnam Kang:

A Generic Vehicle-to-Sensor Calibration Framework. 5980-5989 - Christian Benz, Volker Rodehorst:

Crackstructures and Crackensembles: The Power of Multi-View for 2.5D Crack Detection. 5990-5999 - Shuo Chen, Zhen Han, Bailan He, Jianzhe Liu, Mark Buckley, Yao Qin, Philip Torr, Volker Tresp, Jindong Gu:

Can Multimodal Large Language Models Truly Perform Multimodal In-Context Learning? 6000-6010 - Rupanjali Kukal, Jay Patravali, Fuxun Yu, Simranjit Singh, Nikolaos Karianakis, Rishi Madhok:

Click&Describe: Multimodal Grounding and Tracking for Aerial Objects. 6011-6021 - Wenzhao Qiu, Shanmin Pang, Hao Zhang, Jianwu Fang, Jianru Xue:

HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning. 6022-6031 - Jinnan Chen, Chen Li, Gim Hee Lee:

DiHuR: Diffusion-Guided Generalizable Human Reconstruction. 6032-6041 - Ashutosh Chaubey, Anoubhav Agrawal, Sartaki Sinha Roy, Aayush Agrawal, Susmita Ghose:

ContextIQ: A Multimodal Expert-Based Video Retrieval System for Contextual Advertising. 6042-6052 - Aishwarya Agarwal, Srikrishna Karanam, Tripti Shukla, Balaji Vasan Srinivasan:

An Image is Worth Multiple Words: Multi-Attribute Inversion for Constrained Text-To-Image Synthesis. 6053-6062 - Xinhao Zhou, Tong Wang, Zhaodong Liu, Hao Wei, Guangyuan Pan:

A Regional-Level Resource-Saving Model for Winter Road Surface Snow Detection in Extreme Weathers. 6063-6072 - Nicola Fanelli, Gennaro Vessio, Giovanna Castellano:

I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting. 6073-6082 - Eman Ali, Sathira Silva, Muhammad Haris Khan:

DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models. 6083-6093 - Zijiao Yang, Xiangxi Shi, Eric Slyman, Stefan Lee:

Hijacking Vision-and-Language Navigation Agents with Adversarial Environmental Attacks. 6094-6103 - Ruoyu Wang, Yangfan He, Tengjiao Sun, Xiang Li, Tianyu Shi:

UniTMGE: Uniform Text-Motion Generation and Editing Model via Diffusion. 6104-6114 - Yehun Song, Sunyoung Cho:

Leveraging CLIP Encoder for Multimodal Emotion Recognition. 6115-6124 - Po-Hsuan Huang, Jeng-Lin Li, Chin-Po Chen, Ming-Ching Chang, Wei-Chao Chen:

Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis. 6125-6135 - Daniel Panangian, Ksenia Bittner:

Can Location Embeddings Enhance Super-Resolution of Satellite Imagery? 6136-6145 - Dinghao Jin, Yuan Zeng, Yi Gong:

Bandwidth-Efficient Communication Modelling for Autonomous Vehicle Collaborative Perception. 6146-6155 - Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan:

ConvMixFormer- A Resource-Efficient Convolution Mixer for Transformer-Based Dynamic Hand Gesture Recognition. 6156-6166 - Mathieu Cocheteux, Julien Moreau, Franck Davoine:

Uncertainty-Aware Online Extrinsic Calibration: A Conformal Prediction Approach. 6167-6176 - Floriane Magera, Thomas Hoyoux, Olivier Barnich, Marc Van Droogenbroeck:

BroadTrack: Broadcast Camera Tracking for Soccer. 6177-6187 - Hah Min Lew, Sahng-Min Yoo, Hyunwoo Kang, Gyeong-Moon Park:

Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications. 6188-6196 - Xiang Li, Yangfan He, Shuaishuai Zu, Zhengyang Li, Tianyu Shi, Yiting Xie, Kevin Zhang:

Multi-Modal Large Language Model with RAG Strategies in Soccer Commentary Generation. 6197-6206 - Niloufar Alipour Talemi, Hossein Kashiani, Fatemeh Afghah:

Style-Pro: Style-Guided Prompt Learning for Generalizable Vision-Language Models. 6207-6216 - Hung-Shuo Chang, Chien-Yao Wang, Richard Robert Wang, Gene Chou, Hong-Yuan Mark Liao:

Generalist YOLO: Towards Real-Time End-to-End Multi-Task Visual Language Models. 6217-6227 - Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia:

No Annotations for Object Detection in Art Through Stable Diffusion. 6228-6237 - Manju R. A, Atul Kumar, Akshay Agarwal:

On Which Data Distribution (Synthetic or Real) We Should Rely for Soft Biometric Classification. 6238-6247 - Weixi Weng, Rui Zhang, Xiaojun Meng, Jieming Zhu, Qun Liu, Chun Yuan:

Unsupervised Domain Adaptive Visual Question Answering in the Era of Multi-Modal Large Language Models. 6248-6258 - Cole Hill, Florence Yellin, Krishna Regmi, Dawei Du, Scott McCloskey:

Re-identifying People in Video via Learned Temporal Attention and Multi-modal Foundation Models. 6259-6268 - Yao Zhang, Haokun Chen, Ahmed Frikha, Denis Krompass, Gengyuan Zhang, Jindong Gu, Volker Tresp:

CL-Cross VQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering. 6269-6278 - Abid Hasan Zim, Aquib Iqbal, Zaid Al-Huda, Asad Malik, Minoru Kuribayashi:

EfficientCrackNet: A Lightweight Model for Crack Segmentation. 6279-6289 - Shir Bar, Or Hirschorn, Roi Holzman, Shai Avidan:

Sifting Through the Haystack - Efficiently Finding Rare Animal Behaviors in Large-Scale Datasets. 6290-6299 - Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik:

Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance. 6300-6310 - Simone Fobi Nsutezo, Amrita Gupta, Duncan Kebut, Seema Iyer, Luana Marotti, Rahul Dodhia, Juan M. Lavista Ferres, Anthony Ortiz:

PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery. 6311-6319 - Muhammad Arbab Arshad, Talukder Zaki Jubery, Tirtho Roy, Rim Nassiri, Asheesh K. Singh, Arti Singh, Chinmay Hegde, Baskar Ganapathysubramanian, Aditya Balu, Adarsh Krishnamurthy, Soumik Sarkar:

Leveraging Vision Language Models for Specialized Agricultural Tasks. 6320-6329 - Farnoosh Koleini, Muhammad Usama Saleem, Pu Wang, Hongfei Xue, Ahmed Helmy, Abbey Fenwick:

BioPose: Biomechanically-Accurate 3D Pose Estimation from Monocular Videos. 6330-6339 - Aniket Bhattacharyya, Anurag Tripathi:

Information Extraction from Heterogeneous Documents Without Ground Truth Labels Using Synthetic Label Generation and Knowledge Distillation. 6351-6361 - Mingjie Xu, Mengyang Wu, Yuzhi Zhao, Jason Chun Lok Li, Weifeng Ou:

LLaVA-SpaceSGG: Visual Instruct Tuning for Open-Vocabulary Scene Graph Generation with Enhanced Spatial Relations. 6362-6372 - Bishoy Galoaa, Somaieh Amraee, Sarah Ostadabbas:

DragonTrack: Transformer-Enhanced Graphical Multi-Person Tracking in Complex Scenarios. 6373-6382 - Qianying Liu, Paul Henderson, Xiao Gu, Hang Dai, Fani Deligianni:

Learning Semi-Supervised Medical Image Segmentation from Spatial Registration. 6383-6393 - Parinita Nema, Vinod K. Kurmi:

Strategic Base Representation Learning via Feature Augmentations for Few-Shot Class Incremental Learning. 6394-6403 - Bare Luka Zagar, Mingyu Liu, Tim Hertel, Ekim Yurtsever, Alois Knoll:

3D Understanding of Deformable Linear Objects: Datasets and Transferability Benchmark. 6404-6414 - Ryozo Masukwa, Sanggeon Yun, Yoshiki Yamaguchi, Mohsen Imani:

PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation. 6415-6424 - Pascal Schlachter, Simon Wagner, Bin Yang:

Memory-Efficient Pseudo-Labeling for Online Source-Free Universal Domain Adaptation using a Gaussian Mixture Model. 6425-6434 - Alessio Quercia, Erenus Yildiz, Zhuo Cao, Kai Krajsek, Abigail Morrison, Ira Assent, Hanno Scharr:

Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks. 6435-6445 - Shivangi Rai, Rini Smita Thakur, Kunal Jangid, Vinod K. Kurmi:

Label Calibration in Source Free Domain Adaptation. 6446-6455 - Alexey Kravets, Vinay P. Namboodiri:

Zero-Shot Class Unlearning in CLIP with Synthetic Samples. 6456-6464 - Cheng-Yi Lee, Ching-Chia Kao, Cheng-Han Yeh, Chun-Shien Lu, Chia-Mu Yu, Chu-Song Chen:

Defending Against Repetitive Backdoor Attacks on Semi-Supervised Learning Through Lens of Rate-Distortion-Perception Trade-Off. 6465-6474 - Ahmet Serdar Karadeniz, Dimitrios Mallis, Nesryne Mejri, Kseniya Cherenkova, Anis Kacem, Djamila Aouada:

PICASSO: A Feed-Forward Framework for Parametric Inference of CAD Sketches via Rendering Self-Supervision. 6475-6484 - Favour Ekong, Jun Zhou, Kwabena Sarpong, Yongsheng Gao:

Pixel-Wise Shuffling with Collaborative Sparsity for Melanoma Hyperspectral Image Classification. 6485-6494 - Chamuditha Jayanga Galappaththige, Zachary Izzo, Xilin He, Honglu Zhou, Muhammad Haris Khan:

Domain-Guided Weight Modulation for Semi-Supervised Domain Generalization. 6495-6505 - Dario Serez, Marco Cristani, Alessio Del Bue, Vittorio Murino, Pietro Morerio:

Pre-trained Multiple Latent Variable Generative Models are Good Defenders Against Adversarial Attacks. 6506-6516 - Xiaoyu Liu, Beitong Zhou, Zuogong Yue, Cheng Cheng:

PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning. 6517-6527 - Thomas Westfechtel, Dexuan Zhang, Tatsuya Harada:

Combining Inherent Knowledge of Vision-Language Models with Unsupervised Domain Adaptation Through Strong-Weak Guidance. 6528-6537 - Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex M. Bronstein:

Class-Conditioned Transformation for Enhanced Robust Image Classification. 6538-6547 - Sameer Ambekar, Zehao Xiao, Xiantong Zhen, Cees G. M. Snoek:

GeneralizeFormer: Layer-Adaptive Model Generation Across Test-Time Distribution Shifts. 6548-6558 - Shutong Jin, Ruiyu Wang, Kuangyi Chen, Florian T. Pokorny:

PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement. 6559-6569 - Zeyu Shangguan, Daniel Seita, Mohammad Rostami:

Cross-Domain Multi-Modal Few-Shot Object Detection via Rich Text. 6570-6580 - Lei Zhu, Yanyu Xu, Yong Liu, Rick Siow Mong Goh, Xinxing Xu:

Ad2Mix: Adversarial and Adaptive Mixup for Unsupervised Domain Adaptation. 6581-6590 - Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran:

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval. 6591-6601 - Jia Fu, Xiao Zhang, Sepideh Pashami, Fatemeh Rahimian, Anders Holst:

DiffPAD: Denoising Diffusion-Based Adversarial Patch Decontamination. 6602-6611 - Romain Hermary, Vincent Gaudillière, Abd El Rahman Shabayek, Djamila Aouada:

Removing Geometric Bias in One-Class Anomaly Detection with Adaptive Feature Perturbation. 6612-6622 - Md Farhan Ishmam, Ishmam Tashdeed, Talukder Asir Saadat, Md. Hamjajul Ashmafee, Abu Raihan Mostofa Kamal, Md. Azam Hossain:

Visual Robustness Benchmark for Visual Question Answering (VQA). 6623-6633 - Xiwen Wei, Guihong Li, Radu Marculescu:

Online-LoRA: Task-Free Online Continual Learning via Low Rank Adaptation. 6634-6645 - Benjamin Bauchwitz, Mary L. Cummings:

Task Configuration Impacts Annotation Quality and Model Training Performance in Crowdsourced Image Segmentation. 6646-6656 - Youcef Djenouri, Ahmed Nabil Belbachir, Asma Belhadi, Nassim Belmecheri, Tomasz P. Michalak:

Shapley Consensus Deep Learning for Ensemble Pruning. 6657-6666 - Jiuhong Xiao, Gao Zhu, Giuseppe Loianno:

VG-SSL: Benchmarking Self-Supervised Representation Learning Approaches for Visual Geo-Localization. 6667-6677 - Chenyu Wang, Weixin Luo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, Shenghua Gao:

MLLM-Tool: A Multimodal Large Language Model for Tool Agent Learning. 6678-6687 - Nishq Poorav Desai, Ali Etemad, Michael A. Greenspan:

CycleCrash: A Dataset of Bicycle Collision Videos for Collision Prediction and Analysis. 6688-6698 - Marco Colussi, Sergio Mascetti, Jose Dolz, Christian Desrosiers:

ReC-TTT: Contrastive Feature Reconstruction for Test-Time Training. 6699-6708 - Manojna Sistla, Yu Wen, Aamir Bader Shah, Chenpei Huang, Lening Wang, Xuqing Wu, Jiefu Chen, Miao Pan, Xin Fu:

Bit-Flip Induced Latency Attacks in Object Detection. 6709-6718 - Jiarui Sun, M. Ugur Akcal, Girish Chowdhary, Wei Zhang:

MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning. 6719-6729 - Amit Baras, Alon Zolfi, Yuval Elovici, Asaf Shabtai:

QuantAttack: Exploiting Quantization Techniques to Attack Vision Transformers. 6730-6740 - Fumioki Sato, Hideaki Hayashi, Hajime Nagahara:

Multi-task Learning of Classification and Generation for Set-structured Data. 6741-6751 - Xiaowei Yu, Zhe Huang, Zao Zhang:

Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation. 6752-6761 - Frank Fundel, Johannes Schusterbauer, Vincent Tao Hu, Björn Ommer:

Distillation of Diffusion Features for Semantic Correspondence. 6762-6774 - Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez-Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman:

SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation. 6775-6784 - Neeresh Kumar Perla, Md. Iqbal Hossain, Afia Sajeeda, Ming Shao:

Are Exemplar-Based Class Incremental Learning Models Victim of Black-Box Poison Attacks? 6785-6794 - Hoin Jung, Xiaoqian Wang:

Towards On-the-Fly Novel Category Discovery in Dynamic Long-Tailed Distributions. 6795-6804 - Feng Cheng, Ziyang Wang, Yi-Lin Sung, Yan-Bo Lin, Mohit Bansal, Gedas Bertasius:

DAM: Dynamic Adapter Merging for Continual Video QA Learning. 6805-6817 - Son Minh Nguyen, Tran Duy Linh, Duc Viet Le, Paul J. M. Havinga:

Multi-Surrogate-Teacher Assistance for Representation Alignment in Fingerprint-Based Indoor Localization. 6818-6827 - Yu-Shan Tai, An-Yeu Andy Wu:

AMP-ViT: Optimizing Vision Transformer Efficiency with Adaptive Mixed-Precision Post-Training Quantization. 6828-6837 - Younggeol Cho, Youngrae Kim, Junho Yoon, Seunghoon Hong, Dongman Lee:

Feature Augmentation Based Test-Time Adaptation. 6838-6847 - David Tschirschwitz, Volker Rodehorst:

Label Convergence: Defining an Upper Performance Bound in Object Recognition Through Contradictory Annotations. 6848-6857 - Tavis Shore, Oscar Mendez, Simon Hadfield:

SpaGBOL: Spatial-Graph-Based Orientated Localisation. 6858-6867 - Sethupathy Parameswaran, Yuan Fang, Chandan Gautam, Savitha Ramasamy, Xiaoli Li:

Learning to Identify Seen, Unseen and Unknown in the Open World: A Practical Setting for Zero-Shot Learning. 6868-6878 - Junki Mori, Kosuke Kihara, Taiki Miyagawa, Akinori F. Ebihara, Isamu Teranishi, Hisashi Kashima:

Federated Source-Free Domain Adaptation for Classification: Weighted Cluster Aggregation for Unlabeled Data. 6879-6889 - Alin Dondera, Anuj Singh, Hadi Jamali Rad:

MAGMA: Manifold Regularization for MAEs. 6890-6899 - Fardad Dadboud, Hamid Azad, Varun Mehta, Miodrag Bolic, Iraj Mantegh:

DrIFT: Autonomous Drone Dataset with Integrated Real and Synthetic Data, Flexible Views, and Transformed Domains. 6900-6910 - Xiwen Chen, Huayu Li, Peijie Qiu, Wenhui Zhu, Rahul Amin, Abolfazl Razi:

RD-DPP: Rate-Distortion Theory Meets Determinantal Point Process to Diversify Learning Data Samples. 6911-6920 - Nicolas Harvey Chapman, Christopher F. Lehnert, Will N. Browne, Feras Dayoub:

Enhancing Embodied Object Detection with Spatial Feature Memory. 6921-6931 - Tom Pégeot, Eva Feillet, Adrian Popescu, Inna Kucher, Bertrand Delezoide:

Temporal Dynamics in Visual Data: Analyzing the Impact of Time on Classification Accuracy. 6932-6943 - Jayateja Kalla, Rohit Kumar, Soma Biswas:

TACLE: Task and Class-Aware Exemplar-Free Semi-Supervised Class Incremental Learning. 6944-6954 - Tobias Christian Nauen, Sebastian Palacio, Federico Raue, Andreas Dengel:

Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers. 6955-6966 - Zorana Dozdor, Tomislav Hrkac, Zoran Kalafatic:

SV-data2vec: Guiding Video Representation Learning with Latent Skeleton Targets. 6967-6976 - Eric Yang Yu, Christopher Liao, Sathvik Ravi, Theodoros Tsiligkaridis, Brian Kulis:

Image-Caption Encoding for Improving Zero-Shot Generalization. 6977-6986 - Jisu Han, Jaemin Na, Wonjun Hwang:

Semantic Prompting with Image Token for Continual Learning. 6987-6997 - Henry Hölzemann, Torsten Fiolka:

Semantic Clustering of Image Retrieval Databases used for Visual Localization. 6998-7007 - Moritz Nottebaum, Matteo Dunnhofer, Christian Micheloni:

LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones. 7008-7018 - Maksim Zhdanov, Stanislav Dereka, Sergey Kolesnikov:

Identity Curvature Laplace Approximation for Improved Out-of-Distribution Detection. 7019-7028 - Dhanunjaya Varma Devalraju, C. Chandra Sekhar:

Uncertainty-Guided Metric Learning Without Labels. 7029-7038 - Meghana Karri, Amit Soni Arya, Koushik Biswas, Nicolo Gennaro, Vedat Cicek, Gorkem Durak, Yuri S. Velichko, Ulas Bagci:

Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-Supervised Medical Image Segmentation. 7039-7048 - Yuguang Yao, Jiancheng Liu, Yifan Gong, Xiaoming Liu, Yanzhi Wang, Xue Lin, Sijia Liu:

Can Adversarial Examples be Parsed to Reveal Victim Model Information? 7049-7061 - Li-Ying Hung, Cooper Cheng-Yuan Ku:

Knockoff Branch: Model Stealing Attack via Adding Neurons in the Pre-Trained Model. 7062-7070 - Mikhail Papkov, Pavel Chizhov, Leopold Parts:

SwinIA: Self-Supervised Blind-Spot Image Denoising Without Convolutions. 7071-7080 - Anton Frolov, Florian Kleiner, Christiane Rößler, Volker Rodehorst:

Needles & Haystacks: Dataset and Benchmark for Domain-Agnostic Image-Based Rigid Slice-to-Volume Registration. 7081-7091 - Gustavo Adolfo Vargas Hakim, David Osowiechi, Mehrdad Noori, Milad Cheraghalikhani, Ali Bahri, Moslem Yazdanpanah, Ismail Ben Ayed, Christian Desrosiers:

CLIPArTT: Adaptation of CLIP to New Domains at Test Time. 7092-7101 - Kamalakar Vijay Thakare, Lalit Lohani, Kamakshya Prasad Nayak, Debi Prosad Dogra, Heeseung Choi, Hyungjoo Jung, Ig-Jae Kim:

CLIPping Imbalances: A Novel Evaluation Baseline and PEARL Dataset for Pedestrian Attribute Recognition. 7102-7111 - Sahar Rahimi Malakshan, Mohammad Saeed Ebrahimi Saadabadi, Ali Dabouei, Nasser M. Nasrabadi:

Decomposed Distribution Matching in Dataset Condensation. 7112-7122 - Quazi Mishkatul Alam, Bilel Tarchoun, Ihsen Alouani, Nael B. Abu-Ghazaleh:

Adversarial Attention Deficit: Fooling Deformable Vision Transformers with Collaborative Adversarial Patches. 7123-7132 - Adrian Iordache, Bogdan Alexe, Radu Tudor Ionescu:

Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets. 7133-7142 - Vandan Gorade, Azad Singh, Deepak Mishra:

OTCXR: Rethinking Self-supervised Alignment using Optimal Transport for Chest X-ray Analysis. 7143-7152 - Nicholas John Eliopoulos, Purvish Jajal, James C. Davis, Gaowen Liu, George K. Thiravathukal, Yung-Hsiang Lu:

Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge. 7153-7162 - Marina Ceccon, Davide Dalle Pezze, Alessandro Fabris, Gian Antonio Susto:

Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark. 7163-7172 - S. Divakar Bhat, Amit More, Mudit Soni, Bhuvan Aggarwal:

PC-GZSL: Prior Correction for Generalized Zero Shot Learning. 7173-7183 - Daehwan Kim, Hyungmin Kim, Daun Jeong, Sungho Suh, Hansang Cho:

SPACE: SPAtial-Aware Consistency rEgularization for Anomaly Detection in Industrial Applications. 7184-7194 - Sarmistha Das, Basha Mujavarsheik, R. E. Zera Lyngkhoi, Sriparna Saha, Alka Maurya:

Deciphering the Complaint Aspects: Towards an Aspect-Based Complaint Identification Model with Video Complaint Dataset in Finance. 7195-7204 - Shubhi Shukla, Subhadeep Dalui, Manaar Alam, Shubhajit Datta, Arijit Mondal, Debdeep Mukhopadhyay, Partha Pratim Chakrabarti:

Guardian of the Ensembles: Introducing Pairwise Adversarially Robust Loss for Resisting Adversarial Attacks in DNN Ensembles. 7205-7214 - Michelle Guo, Mia Tang, Hannah Cha, Ruohan Zhang, C. Karen Liu, Jiajun Wu:

CRAFT: Designing Creative and Functional 3D Objects. 7215-7224 - Chaitanya Animesh, Manmohan Chandraker:

Tuned Contrastive Learning. 7225-7234 - Flavien Armangeon, Thibaud Ehret, Enric Meinhardt-Llopis, Rafael Grompone von Gioi, Guillaume Thibault, Marc Petit, Gabriele Facciolo:

IRIS-VIS: A New Dataset for Visibility Estimation in an Industrial Environment. 7235-7243 - Xinglong Sun, Maying Shen, Hongxu Yin, Lei Mao, Pavlo Molchanov, José M. Álvarez:

Advancing Weight and Channel Sparsification with Enhanced Saliency. 7244-7255 - Tushar Kadam, Utkarsh Mishra, Aakarsh Malhotra:

SHIP: Structural Hierarchies for Instance-Dependent Partial Labels. 7256-7265 - Juan Pablo Lagos, Haider Ali, Adnan Faroque, Esa Rahtu:

Heterogeneous Datasets for Unsupervised Image Anomaly Detection. 7266-7276 - Chandan Kumar Singh, Devesh Kumar, Vipul Sanap, Rajesh Sinha:

LLM-RSPF: Large Language Model-Based Robotic System Planning Framework for Domain Specific Use-cases. 7277-7286 - Mehran Hosseini, Peyman Hosseini:

GeoPos: A Minimal Positional Encoding for Enhanced Fine-Grained Details in Image Synthesis Using Convolutional Neural Networks. 7287-7297 - Sayanta Adhikari, Dupati Srikar Chandra, P. K. Srijith, Pankaj Wasnik, Naoyuki Onoe:

AdaPrefix++: Integrating Adapters, Prefixes and Hypernetwork for Continual Learning. 7298-7307 - Viti Mario, Nadiya Shvai, Arcadi Llanza, Amir Nakib:

A 0-Shot Self-Attention Mechanism for Accelerated Diagonal Attention. 7308-7315 - Juntae Kim, Sungwon Woo, Jongho Nang:

Relational Self-Supervised Distillation with Compact Descriptors for Image Copy Detection. 7316-7325 - Diogo Lavado, Ricardo Santos, André Coelho, João Santos, Alessandra Micheletti, Cláudia Soares:

Learning Under Noisy Labels, Spurious Points, and Diverse Structures: TS40K, a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission Systems. 7326-7336 - Minxia Xu, Han Yang, Bo Song, Weida Hu, Jinshui Miao, Erkang Cheng:

Cross Image Feature Perturbation with Pseudo Label Fusion for Semi-Supervised Medical Image Segmentation. 7337-7347 - Ayumu Saito, Prachi Kudeshia, Jiju Poovvancheri:

Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud. 7348-7357 - Camille Garcin, Maximilien Servajean, Alexis Joly, Joseph Salmon:

A Two-Head Loss Function for Deep Average-K Classification. 7358-7367 - Diana-Nicoleta Grigore, Mariana-Iuliana Georgescu, Jon Álvarez Justo, Tor Arne Johansen, Andreea Iuliana Ionescu, Radu Tudor Ionescu:

Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers. 7368-7378 - Tanay Agrawal, Mohammed Guermal, Michal Balazia, François Brémond:

CM3T: Framework for Efficient Multimodal Learning for Inhomogeneous Interaction Datasets. 7379-7388 - Ragja Palakkadavath, Hung Le, Thanh Nguyen-Tang, Sunil Gupta, Svetha Venkatesh:

Fair Domain Generalization with Heterogeneous Sensitive Attributes Across Domains. 7389-7398 - Divya Saxena, Jiannong Cao, Jiahao Xu, Tarun Kulshrestha:

Data-Efficient Alignment in Medical Imaging via Reconfigurable Generative Networks. 7399-7408 - Stefan Smeu, Elena Burceanu, Emanuela Haller, Andrei Liviu Nicolicioiu:

Robust Novelty Detection Through Style-Conscious Feature Ranking. 7409-7418 - Sankalp Nagaonkar, Achyut Mani Tripathi, Ashish Mishra:

When Visual State Space Model Meets Backdoor Attacks. 7419-7428 - Arnisha Khondaker, Nilanjan Ray:

Learning Instance-Specific Parameters of Black-Box Models Using Differentiable Surrogates. 7429-7438 - Daniela Ivanova, Marco Aversa, Paul Henderson, John Williamson:

ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage. 7439-7449 - Göksel Mert Çökmez, Yang Zhang, Christopher Schroers, Tunç Ozan Aydin:

CLIP-Fusion: A Spatio-Temporal Quality Metric for Frame Interpolation. 7450-7459 - Haleh Damirchi, Ali Etemad, Michael A. Greenspan:

Socially-Informed Reconstruction for Pedestrian Trajectory Forecasting. 7460-7469 - Colton R. Crum, Adam Czajka:

MENTOR: Human Perception-Guided Pretraining for Increased Generalization. 7470-7479 - Savitha Sam Abraham, Sourav Garg, Feras Dayoub:

To Ask or Not to Ask? Detecting Absence of Information in Vision and Language Navigation. 7480-7489 - Xuhui Kang, Yen-Ling Kuo:

Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits. 7490-7499 - Reza Akbarian Bafghi, Nidhin Harilal, Claire Monteleoni, Maziar Raissi:

MixDiff: Mixing Natural and Synthetic Images for Robust Self-Supervised Representations. 7500-7511 - Zhongyao Cheng, Fang Wu, Peisheng Qian, Ziyuan Zhao, Xulei Yang:

AIC3DOD: Advancing Indoor Class-Incremental 3D Object Detection with Point Transformer Architecture and Room Layout Constraints. 7512-7521 - Maheswar Bora, Saurabh Atreya, Aritra Mukherjee, Abhijit Das:

KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder. 7522-7532 - Vishnuprasadh Kumaravelu, P. K. Srijith, Sunil Gupta:

EvoCL: Continual Learning over Evolving Domains. 7533-7541 - Cheeun Hong, Sungyong Baik, Junghun Oh, Kyoung Mu Lee:

Difficulty, Diversity, and Plausibility: Dynamic Data-Free Quantization. 7542-7551 - Marco Blanchini, Giovanna Maria Dimitri, Lydia Abady, Benedetta Tondi, Tarcisio Lancioni, Mauro Barni:

Semiotic-Based Construction of a Large Emotional Image Dataset with Neutral Samples. 7552-7561 - Wojciech Lapacz, Daniel Marczak, Filip Szatkowski, Tomasz Trzcinski:

Exploring the Stability Gap in Continual Learning: The Role of the Classification Head. 7562-7571 - Aniana Cruz, Guilherme G. Schardong, Luiz Schirmer, João Marcos, Farhad Shadmand, Nuno Gonçalves:

RiemStega: Covariance-Based Loss for Print-Proof Transmission of Data in Images. 7572-7581 - Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang:

Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm. 7582-7592 - Krishna Kanth Nakka, Alexandre Alahi:

NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability. 7593-7604 - David Tschirschwitz, Volker Rodehorst:

CISOL: An Open and Extensible Dataset for Table Structure Recognition in the Construction Industry. 7605-7613 - Mingxian Li, Hao Sun, Yingtie Lei, Xiaofeng Zhang, Yihang Dong, Yilin Zhou, Zimeng Li, Xuhang Chen:

High-Fidelity Document Stain Removal via A Large-Scale Real-World Dataset and A Memory-Augmented Transformer. 7614-7624 - Eva Feillet, Adrian Popescu, Céline Hudelot:

A Reality Check on Pre-training for Exemplar-free Class-Incremental Learning. 7625-7636 - Tamara R. Lenhard, Andreas Weinmann, Kai Franke, Tobias Koch:

SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection. 7637-7647 - Olaf Wysocki, Yue Tan, Thomas Froech, Yan Xia, Magdalena Wysocki, Ludwig Hoegner, Daniel Cremers, Christoph Holst:

ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset. 7648-7658 - Yuwei Chen, Ming-Ching Chang, Matthias Kirchner, Zhenfei Zhang, Xin Li, Arslan Basharat, Anthony Hoogs:

A Semantically Impactful Image Manipulation Dataset: Characterizing Image Manipulations Using Semantic Significance. 7659-7668 - Varun Burde, Assia Benbihi, Pavel Burget, Torsten Sattler:

Comparative Evaluation of 3D Reconstruction Methods for Object Pose Estimation. 7669-7681 - Yachuan Li, Xavier Soria Poma, Yun Bai, Qian Xiao, Chaozhi Yang, Guanlin Li, Zongmin Li:

EDMB: Edge Detector with Mamba. 7682-7691 - Alberto Presta, Enzo Tartaglione, Attilio Fiandrotti, Marco Grangetto, Pamela C. Cosman:

Efficient Progressive Image Compression with Variance-Aware Masking. 7692-7700 - Alex Tianyi Xu, Alex Wilf, Paul Pu Liang, Alexander Obolenskiv, Daniel Fried, Louis-Philippe Morency:

Comparative Knowledge Distillation. 7701-7710 - Evgenii Kruzhkov, Sven Behnke:

LiLMaps: Learnable Implicit Language Maps. 7711-7720 - Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy:

SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data. 7731-7741 - Manuel Knott, Ignacio Serna, Ethan Mann, Pietro Perona:

A Rapid Test for Accuracy and Bias of Face Recognition Technology. 7742-7751 - Marius Kästingschäfer, Théo Gieruc, Sebastian Bernhard, Dylan Campbell, Eldar Insafutdinov, Eyvaz Najafli, Thomas Brox:

SEED4D: A Synthetic Ego-Exo Dynamic 4D Data Generator, Driving Dataset and Benchmark. 7752-7764 - Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric A. Stone, Warren Conaty, Lars Petersson, Vivien Rolland:

BioNet and NeFF: Crop Biomass Prediction from Point Clouds to Drone Imagery. 7765-7775 - Peyman Rostami, Nilotpal Sinha, Nidhal Eddine Chenni, Anis Kacem, Abd El Rahman Shabayek, Carl Shneider, Djamila Aouada:

Information Theoretic Pruning of Coupled Channels in Deep Neural Networks. 7776-7786 - Yibo Zhong, Yao Zhou:

Rethinking Low-Rank Adaptation in Vision: Exploring Head-Level Responsiveness across Diverse Tasks. 7787-7796 - Filippo Botti, Alex Ergasti, Leonardo Rossi, Tomaso Fontanini, Claudio Ferrari, Massimo Bertozzi, Andrea Prati:

Mamba-ST: State Space Model for Efficient Style Transfer. 7797-7806 - Marco Paul E. Apolinario, Arani Roy, Kaushik Roy:

LLS: Local Learning Rule for Deep Neural Networks Inspired by Neural Activity Synchronization. 7807-7816 - Kai Chen, Yanze Li, Wenhua Zhang, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia:

Automated Evaluation of Large Vision-Language Models on Self-Driving Corner Cases. 7817-7826 - Tejaswini Medi, Steffen Jung, Margret Keuper:

FAIR-TAT: Improving Model Fairness Using Targeted Adversarial Training. 7827-7836 - Xilin He, Cheng Luo, Qinliang Lin, Weicheng Xie, Muhammad Haris Khan, Siyang Song, Linlin Shen:

Towards Robust Training via Gradient-Diversified Backpropagation. 7847-7856 - Arjun Sridhar, Yiran Chen:

Delta-NAS: Difference of Architecture Encoding for Predictor-Based Evolutionary Neural Architecture Search. 7857-7865 - Sagar M. Waghmare, Kimberly Wilber, Dave Hawkey, Xuan Yang, Matthew Wilson, Stephanie Debats, Cattalyya Nuengsigkapian, Astuti Sharma, Lars Pandikow, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko:

SANPO: A Scene Understanding, Accessibility and Human Navigation Dataset. 7866-7875 - Hrishav Bakul Barua, Kalin Stefanov, KokSheik Wong, Abhinav Dhall, Ganesh Krishnasamy:

GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction. 7876-7886 - Nguyen Son Dinh, Tuan Dung Nguyen, Duc Tri Tran, Nguyen Dang Huy Pham, Thuan Hieu Tran, Ngoc Anh Tong, Quang Huy Hoang, Phi Le Nguyen:

Sign Language Recognition: A Large-scale Multi-view Dataset and Comprehensive Evaluation. 7887-7897 - Purvish Jajal, Nick John Eliopoulos, Benjamin Shiue-Hal Chou, George K. Thiravathukal, James C. Davis, Yung-Hsiang Lu:

Token Turing Machines are Efficient Vision Models. 7898-7907 - Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah:

ROADS: Robust Prompt-Driven Multi-Class Anomaly Detection Under Domain Shift. 7908-7917 - Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen:

PivotAlign: Improve Semi-Supervised Learning by Learning Intra-Class Heterogeneity and Aligning with Pivots. 7918-7927 - Komal Kumar, Snehashis Chakraborty, Dwarikanath Mahapatra, Behzad Bozorgtabar, Sudipta Roy:

Self-Supervised Anomaly Segmentation via Diffusion Models with Dynamic Transformer UNet. 7928-7938 - Moshe Kimhi, David Vainshtein, Chaim Baskin, Dotan Di Castro:

Robot Instance Segmentation with Few Annotations for Grasping. 7939-7949 - Trung-Anh Dang, Vincent Nguyen, Ngoc-Son Vu, Christel Vrain:

Memory-efficient Continual Learning with Neural Collapse Contrastive. 7950-7959 - Jiahao Xu, Zikai Zhang, Rui Hu:

Identify Backdoored Model in Federated Learning via Individual Unlearning. 7960-7969 - Luca Ciampi, Nicola Messina, Matteo Pierucci, Giuseppe Amato, Marco Avvenuti, Fabrizio Falchi:

Mind the Prompt: A Novel Benchmark for Prompt-Based Class-Agnostic Counting. 7970-7979 - Spencer Carmichael, Manohar Bhat, Mani Ramanagopal, Austin Buchan, Ram Vasudevan, Katherine A. Skinner:

TRNeRF: Restoring Blurry, Rolling Shutter, and Noisy Thermal Images with Neural Radiance Fields. 7980-7990 - Sergey Korchagin, Ekaterina Zaychenkova, Aleksei Khalin, Aleksandr Yugay, Alexey Zaytsev, Egor I. Ershov:

Improving Uncertainty Estimation with Confidence-Aware Training Data. 7991-8001 - Jaisidh Singh, Ishaan Shrivastava, Mayank Vatsa, Richa Singh, Aparna Bharati:

Learning the Power of "No": Foundation Models with Negations. 8002-8012 - Guiqiu Liao, Matjaz Jogan, Sai Koushik, Eric Eaton, Daniel A. Hashimoto:

Disentangling Spatio-Temporal Knowledge for Weakly Supervised Object Detection and Segmentation in Surgical Video. 8013-8023 - Tianyi Ma, Maoying Qiao:

Disentangle Source and Target Knowledge for Continual Test-Time Adaptation. 8024-8034 - Rini Smita Thakur, Vinod K. Kurmi:

Uncertainty and Energy based Loss Guided Semi-Supervised Semantic Segmentation. 8035-8045 - Jun Chen, Faizan Farooq Khan, Ming Hu, Ammar Sherif, Zongyuan Ge, Boyang Li, Mohamed Elhoseiny:

Local Masked Reconstruction for Efficient Self-Supervised Learning on High-Resolution Images. 8046-8056 - Shijie Wang, Dahun Kim, Ali Taalimi, Chen Sun, Weicheng Kuo:

Learning Visual Grounding from Generative Vision and Language Model. 8057-8067 - Payal Mohadikar, Ye Duan:

OmniDiffusion: Reformulating 360 Monocular Depth Estimation Using Semantic and Surface Normal Conditioned Diffusion. 8068-8078 - Jonás Serých, Michal Neoral, Jiri Matas:

MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation. 8079-8089 - Yuwen Heng, Yihong Wu, Srinandan Dasmahapatra, Hansung Kim:

MatSpectNet: Material Segmentation Network with Domain-Aware and Physically-Constrained Hyperspectral Reconstruction. 8090-8100 - Jiahao Zhang, Frederic Z. Zhang, Cristian Rodriguez, Yizhak Ben-Shabat, Anoop Cherian, Stephen Gould:

Temporally Grounding Instructional Diagrams in Unconstrained Videos. 8101-8111 - Shangbo Mao, Deepu Rajan:

An Encoder-Agnostic Weakly Supervised Method For Describing Textures. 8112-8121 - Khurram Azeem Hashmi, Talha Uddin Sheikh, Didier Stricker, Muhammad Zeshan Afzal:

Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection. 8122-8133 - Nan Peng, Xun Zhou, Mingming Wang, Xiaojun Yang, Songming Chen, Guisong Chen:

PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction. 8134-8143 - George Leotescu, Alin-Ionut Popa, Diana Grigore, Daniel Voinea, Pietro Perona:

Self-Supervised Incremental Learning of Object Representations from Arbitrary Image Sets. 8144-8154 - Yuhao Lin, Haiming Xu, Lingqiao Liu, Javen Qinfeng Shi:

A Simple-but-Effective Baseline for Training-Free Class-Agnostic Counting. 8155-8164 - Xinpeng Liu, Hiroaki Santo, Yosuke Toda, Fumio Okura:

TreeFormer: Single-View Plant Skeleton Estimation via Tree-Constrained Graph Generation. 8165-8175 - Yanan Gu, Muli Yang, Xu Yang, Kun Wei, Hongyuan Zhu, Gabriel James Goenawan, Cheng Deng:

Dynamic Adapter Tuning for Long-Tailed Class-Incremental Learning. 8176-8185 - Timur Z. Mamedov, Anton Konushin, Vadim Konushin:

ReMix: Training Generalized Person Re-Identification on a Mixture of Data. 8186-8196 - Shen Zheng, Anurag Ghosh, Srinivasa G. Narasimhan:

Instance-Warp: Saliency Guided Image Warping for Unsupervised Domain Adaptation. 8197-8206 - Riku Inoue, Masamitsu Tsuchiya, Yuji Yasui:

Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection. 8207-8216 - Philipp Allgeuer, Kyra Ahrens, Stefan Wermter:

Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion. 8217-8228 - Ryo Fujii, Ryo Hachiuma, Hideo Saito:

CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting. 8229-8238 - Hong Liu, Yuta Nakashima, Noboru Babaguchi:

Paladin: Understanding Video Intentions in Political Advertisement Videos. 8239-8248 - Clement Tan, Chai Kiat Yeo, Cheston Tan, Basura Fernando:

Inferring Past Human Actions in Homes with Abductive Reasoning. 8249-8258 - Gyuseong Lee, Wooseok Jang, Jinhyeon Kim, Jaewoo Jung, Seungryong Kim:

Domain Generalization using Large Pretrained Models with Mixture-of-Adapters. 8259-8269 - Hankyul Kang, Jongbin Ryu:

Enriching Local Patterns with Multi-Token Attention for Broad-Sight Neural Networks. 8270-8279 - Jaehyun Choi, Junwon Ko, Dong-Jae Lee, Junmo Kim:

AH-OCDA: Amplitude-Based Curriculum Learning and Hopfield Segmentation Model for Open Compound Domain Adaptation. 8280-8290 - Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong:

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection. 8291-8301 - Mustafa Munir, Md Mostafijur Rahman, Radu Marculescu:

RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone. 8302-8312 - Avi Gupta, Koteswar Rao Jerripothula, Tammam Tillo:

CIRCOD: Co-Saliency Inspired Referring Camouflaged Object Discovery. 8313-8323 - Takehiko Ohkawa, Takuma Yagi, Taichi Nishimura, Ryosuke Furuta, Atsushi Hashimoto, Yoshitaka Ushiku, Yoichi Sato:

Exo2EgoDVC: Dense Video Captioning of Egocentric Procedural Activities Using Web Instructional Videos. 8324-8335 - Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal:

TAM-VT: Transformation-Aware Multi-Scale Video Transformer for Segmentation and Tracking. 8336-8345 - Bokyeung Lee, Jonghwan Hong, Hyunuk Shin, Bonhwa Ku, Hanseok Ko:

Dropout Connects Transformers and CNNs: Transfer General Knowledge for Knowledge Distillation. 8346-8355 - Tsung-Yu Chen, Luyu Yang, Tzu-Yu Chuang, Shang-Hong Lai:

CACE: Sim-to-Real Indoor 3D Semantic Segmentation via Context-Aware Augmentation and Consistency Enforcement. 8356-8367 - Koichiro Ito:

Feature Design for Bridging SAM and CLIP Toward Referring Image Segmentation. 8368-8378 - Seonguk Seo, Bohyung Han:

Re-Evaluating Group Robustness via Adaptive Class-Specific Scaling. 8379-8388 - Teppei Kurita, Yuhi Kondo, Legong Sun, Takayuki Sasaki, Sho Nitta, Yasuhiro Hashimoto, Yoshinori Muramatsu, Yusuke Moriuchi:

Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation. 8389-8399 - Jiacheng Li, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan, Zhiwei Xiong:

Multi-Spectral Image Color Reproduction. 8400-8409 - Jonathan Lee, Bolivar Solarte, Chin-Hsuan Wu, Jin-Cheng Jhang, Fu-En Wang, Yi-Hsuan Tsai, Min Sun:

uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images. 8410-8419 - Thanh-Son Nguyen, Hong Yang, Basura Fernando:

Effective Scene Graph Generation by Statistical Relation Distillation. 8420-8430 - Christoph Reinders, Radu Berdan, Beril Besbinar, Junji Otsuka, Daisuke Iso:

RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation. 8431-8443 - Aleyna Kütük, Tevfik Metin Sezgin:

Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation. 8444-8453 - Jiyang Yu, Tianhao Zhang, Fuhao Shi, Lei He, Chia-Kai Liang:

SensorFlow: Sensor and Image Fused Video Stabilization. 8454-8463 - Jaehyun Park, Nam Ik Cho:

Explicit Guidance for Robust Video Frame Interpolation Against Discontinuous Motions. 8464-8473 - Poulami Sinhamahapatra, Franziska Schwaiger, Shirsha Bose, Huiyu Wang, Karsten Roscher, Stephan Günnemann:

Finding Dino: A Plug-and-Play Framework for Zero-Shot Detection of Out-of-Distribution Objects Using Prototypes. 8474-8483 - Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou:

Pix2Poly: A Sequence Prediction Method for End-to-End Polygonal Building Footprint Extraction from Remote Sensing Imagery. 8484-8493 - Md. Alimoor Reza, Eric Manley, Sean Chen, Sameer Chaudhary, Jacob Elafros:

SegBuilder: A Semi-Automatic Annotation Tool for Segmentation. 8494-8503 - Ali Bahri, Mehrdad Noori, Gustavo Adolfo Vargas Hakim, Ismail Ben Ayed, Milad Cheraghalikhani, David Osowiechi, Christian Desrosiers, Moslem Yazdanpanah:

FDS: Feedback-Guided Domain Synthesis with Multi-Source Conditional Diffusion Models for Domain Generalization. 8504-8514 - Grégoire Petit, Nathan Palluau, Axel Bauer, Clemens Dlaska:

EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation Using Synthetic Data. 8515-8524 - Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis, Kate Saenko, Bryan A. Plummer:

ERM++: An Improved Baseline for Domain Generalization. 8525-8535 - Adam Pardyl, Grzegorz Kurzejamski, Jan Olszewski, Tomasz Trzcinski, Bartosz Zielinski:

Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers. 8536-8545 - Nimeshika Udayangani, Hadi M. Dolatabadi, Sarah M. Erfani, Christopher Leckie:

Exploiting Inter-Sample Information for Long-Tailed Out-of-Distribution Detection. 8546-8555 - Ahmad Darkhalil, Rhodri Guerrier, Adam W. Harley, Dima Damen:

EgoPoints: Advancing Point Tracking for Egocentric Videos. 8556-8565 - Atif Belal, Akhil Meethal, Francisco Perdigon Romero, Marco Pedersoli, Eric Granger:

Attention-Based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors. 8566-8575 - Ripon Kumar Saha, Scott McCloskey, Suren Jayasuriya:

MetaVIn: Meteorological and Visual Integration for Atmospheric Turbulence Strength Estimation. 8576-8585 - Minjoon Jung, Youwon Jang, Seongho Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang:

Background-Aware Moment Detection for Video Moment Retrieval. 8586-8596 - Yuka Ogino, Yuho Shoji, Takahiro Toizumi, Atsushi Ito:

ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing. 8597-8605 - Jan Olszewski, Dawid Rymarczyk, Piotr Wójcik, Mateusz Pach, Bartosz Zielinski:

TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration. 8606-8616 - Li Sun, Chaitanya Ahuja, Peng Chen, Matt D'Zmura, Kayhan Batmanghelich, Philip Bontrager:

Multi-Modal Large Language Models are Effective Vision Learners. 8617-8626 - Haojie Mu, Burhan Ul Tayyab, Nicholas Chua:

SpiralMLP: A Lightweight Vision MLP Architecture. 8627-8637 - Jinpeng He, Biyuan Liu, Huaixin Chen:

HDPNet: Hourglass Vision Transformer with Dual-Path Feature Pyramid for Camouflaged Object Detection. 8638-8647 - Filippos Gouidis, Konstantinos E. Papoutsakis, Theodore Patkos, Antonis A. Argyros, Dimitris Plexousakis:

Recognizing Unseen States of Unknown Objects by Leveraging Knowledge Graphs. 8648-8659 - Christian Witte, Jens Behley, Cyrill Stachniss, Marvin Raaijmakers:

Epipolar Attention Field Transformers for Bird's Eye View Semantic Segmentation. 8660-8669 - Akshaya Athwale, Ichrak Shili, Émile Bergeron, Ola Ahmad, Jean-François Lalonde:

DarSwin-Unet: Distortion Aware Architecture. 8670-8680 - Abdelrahman M. Shaker, Syed Talal Wasim, Martin Danelljan, Salman H. Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan:

Efficient Video Object Segmentation via Modulated Cross-Attention Memory. 8681-8690 - André Sacilotti, Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida:

Transferable-Guided Attention Is All You Need for Video Domain Adaptation. 8691-8701 - Zahidul Islam, Sujoy Paul, Mrigank Rochan:

Unsupervised Video Highlight Detection by Learning from Audio and Visual Recurrence. 8702-8711 - Junsu Choi, Jin-Seop Lee, Noo-Ri Kim, SuHyun Yoon, Jee-Hyong Lee:

Feature-Level and Spatial-Level Activation Expansion for Weakly-Supervised Semantic Segmentation. 8712-8722 - Weihan Luo, Anagh Malik, David B. Lindell:

Transientangelo: Few-Viewpoint Surface Reconstruction Using Single-Photon Lidar. 8723-8733 - Narongthat Thanyawet, Photchara Ratsamee, Yuki Uranishi, Haruo Takemura:

Detective Networks: Enhancing Disaster Recognition in Images Through Attention Shifting Using Optimal Masking. 8734-8743 - Yaxin Feng, Yuan Lan, Luchan Zhang, Yang Xiang:

ElasticLaneNet: An Efficient Geometry-Flexible Lane Detection Framework. 8744-8753 - Jingyi Xu, Hieu Le, Dimitris Samaras:

Learning to Count from Pseudo-Labeled Segmentation. 8754-8763 - Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen:

Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation. 8764-8774 - Katherine Xu, Lingzhi Zhang, Jianbo Shi:

Detecting Origin Attribution for Text-to-Image Diffusion Models. 8775-8785 - Alessandro D'Amelio, Giuseppe Cartella, Vittorio Cuculo, Manuele Lucchi, Marcella Cornia, Rita Cucchiara, Giuseppe Boccignone:

TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes. 8786-8795 - Tianlong Tan, Bin Chen, Hongliang Cao, Chenggang Yan, Yike Ma, Feng Dai:

DASC-SPT: Towards Self-Supervised Panoramic Semantic Segmentation. 8796-8805 - Peter Hönig, Stefan Thalhammer, Jean-Baptiste Weibel, Matthias Hirschmanner, Markus Vincze:

Shape-Biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation. 8806-8815 - Lei Shi, Paul C. Bürkner, Andreas Bulling:

ActionDiffusion: An Action-Aware Diffusion Model for Procedure Planning in Instructional Videos. 8816-8825 - Hung Huy Nguyen, Pooyan Rahmanzadehgervi, Long Mai, Anh Totti Nguyen:

Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence. 8826-8833 - Calvin Glisson, Qiuxiao Chen:

HSDA: High-Frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation. 8834-8843 - Pongsakorn Jirachanchaisiri, Nam Tuan Ly, Atsuhiro Takasu:

TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images. 8844-8852 - Roberto Amoroso, Gengyuan Zhang, Rajat Koner, Lorenzo Baraldi, Rita Cucchiara, Volker Tresp:

Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries. 8853-8862 - Anindya Sundar Das, Guansong Pang, Monowar Bhuyan:

Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination. 8863-8872 - Yizhe Ruan, Lin Gu, Yusuke Kurose, Junichi Iho, Youji Tokunaga, Makoto Horie, Yusaku Hayashi, Keisuke Nishizawa, Yasushi Koyama, Tatsuya Harada:

Physiology-Aware PolySnake for Coronary Vessel Segmentation. 8873-8882 - Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Camillo J. Taylor:

Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge. 8883-8894 - Anis Amziane:

Learning Deep Illumination-Robust Features from Multispectral Filter Array Images. 8895-8904 - Farhad G. Zanjani, Hong Cai, Hanno Ackermann, Leyla Mirvakhabova, Fatih Porikli:

Planar Gaussian Splatting. 8905-8914 - Takuya Asakura, Nakamasa Inoue, Koichi Shinoda:

Diffusion-Based Generative Regularization for Supervised Discriminative Learning. 8915-8926 - Skanda Bharadwaj, Robert T. Collins, Yanxi Liu:

Recurrence-Based Vanishing Point Detection. 8927-8936 - Geonu Lee, Yonghyun Jeong, Haneol Jang, Youngjoon Yoo:

Domain-Generalized Object Anti-Spoofing: Bridging Gaps and Patch Selection for Robust Detection Across Domains. 8937-8946 - Juho Jung, Migyeong Yang, Hyunseon Won, Jiwon Kim, Jeong Mo Han, Joon Seo Hwang, Daniel Duck-Jin Hwang, Jinyoung Han:

CAMEL: Confidence-Aware Multi-Task Ensemble Learning with Spatial Information for Retina OCT Image Classification and Segmentation. 8947-8957 - Sangyeon Kim, Sangkuk Lee, Jeesoo Kim, Nojun Kwak:

TPD-STR: Text Polygon Detection with Split Transformers. 8958-8967 - Hila Levi, Guy Heller, Dan Levi:

FOR: Finetuning for Object Level Open Vocabulary Image Retrieval. 8968-8979 - Simon Thomine, Hichem Snoussi:

Single-Layer Distillation with Fourier Convolutions for Texture Anomaly Detection. 8980-8989 - Kha Nhat Le, Hoang-Tuan Nguyen, Hung Tien Tran, Thanh Duc Ngo:

Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition. 8990-9000 - Suhas Srinath, Aditya Chandrasekar, Hemang Jamadagni, Rajiv Soundararajan, Prathosh AP:

UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors. 9001-9012 - Maxime Fontana, Michael W. Spratling, Miaojing Shi:

Optimizing Dense Visual Predictions Through Multi-Task Coherence and Prioritization. 9013-9022 - Heitor Rapela Medeiros, David Latortue, Eric Granger, Marco Pedersoli:

Mixed Patch Visible-Infrared Modality Agnostic Object Detection. 9023-9032 - Ioannis Sarridis, Christos Koutlis, Giorgos Kordopatis-Zilos, Ioannis Kompatsiaris, Symeon Papadopoulos:

InDistill: Information flow-preserving knowledge distillation for model compression. 9033-9042 - Rohit K. Bharadwaj, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan:

Enhancing Novel Object Detection via Cooperative Foundational Models. 9043-9052 - Yoko Sogabe, Shiori Sugimoto, Ayumi Matsumoto, Masaki Kitahara:

Pre-capture Privacy via Adaptive Single-Pixel Imaging. 9053-9062 - Floyd Hepburn-Dickins, Mark W. Jones, Mike Edwards, Jay Paul Morgan, Steve Bell:

SIGNN - Star Identification Using Graph Neural Networks. 9063-9072 - Seunghwan Choi, Jooyeol Yun, Jeonghoon Park, Jaegul Choo:

Disentangling Subject-Irrelevant Elements in Personalized Text-to-Image Diffusion via Filtered Self-Distillation. 9073-9082 - Wei-Jhe Huang, Min-Hung Chen, Shang-Hong Lai:

Spatio-Temporal Context Prompting for Zero-Shot Action Detection. 9083-9092 - Sachin Verma, Frank Lindseth, Gabriel Kiss:

SegDesicNet: Lightweight Semantic Segmentation in Remote Sensing with Geo-Coordinate Embeddings for Domain Adaptation. 9093-9104 - Arushi Rai, Adriana Kovashka:

Rubric-Constrained Figure Skating Scoring. 9105-9113 - Junyoung Hong, Hyeri Yang, Ye Ju Kim, Haerim Kim, Shinwoong Kim, Euna Shim, Kyungjae Lee:

D2FP: Learning Implicit Prior for Human Parsing. 9114-9124 - Aniket Roy, Anshul Shah, Ketul Shah, Anirban Roy, Rama Chellappa:

Cap2Aug: Caption Guided Image data Augmentation. 9125-9135 - Zijun He, Lishun Wang, Ziyi Meng, Xin Yuan:

Self-supervised Learning with Spectral Low-Rank Prior for Hyperspectral Image Reconstruction. 9136-9145 - Roman Colman, Minh Vu, Manish Bhattarai, Martin Ma, Hari S. Viswanathan, Daniel O'Malley, Javier E. Santos:

PatchFinder: Leveraging Visual Language Models for Accurate Information Retrieval Using Model Uncertainty. 9146-9155 - Debanjan Goswami, Shayok Chakraborty:

Active Learning for Image Segmentation with Binary User Feedback. 9156-9165 - Shin Ishihara, Imari Sato:

Per-Pixel Solution of Multispectral Photometric Stereo. 9166-9175 - Sanaz Karimijafarbigloo, Sina Ghorbani Kolahi, Reza Azad, Ulas Bagci, Dorit Merhof:

Frequency-Domain Refinement of Vision Transformers for Robust Medical Image Segmentation Under Degradation. 9176-9185 - Anirban Roy, Adam D. Cobb, Ramneet Kaur, Sumit Jha, Nathaniel D. Bastian, Alexander M. Berenbeim, Robert H. Thomson, Iain Cruickshank, Alvaro Velasquez, Susmit Jha:

Zero-Shot Detection of Out-of-Context Objects Using Foundation Models. 9186-9195 - Mujing Li, Guanjie Wang, Xingguang Zhang, Qifeng Liao, Chenxi Xiao:

D-LUT: Photorealistic Style Transfer via Diffusion Process. 9206-9214 - Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando:

Situational Scene Graph for Structured Human-Centric Situation Understanding. 9215-9225 - Zhuo Cao, Bingqing Zhang, Heming Du, Xin Yu, Xue Li, Sen Wang:

FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. 9226-9236 - David Pujol-Perich, Albert Clapés, Sergio Escalera:

SADA: Semantic Adversarial Unsupervised Domain Adaptation for Temporal Action Localization. 9237-9247 - Jingyu Song, Xudong Chen, Liupei Lu, Jie Li, Katherine A. Skinner:

MemFusionMap: Working Memory Fusion for Online Vectorized HD Map Construction. 9248-9257 - Tz-Ying Wu, Kyle Min, Subarna Tripathi, Nuno Vasconcelos:

Ego-VPA: Egocentric Video Understanding with Parameter-Efficient Adaptation. 9258-9268 - Sai Bhargav Rongali, Mohamad Hassan N C, Ankit Jha, Neha Bhargava, Saurabh Prasad, Biplab Banerjee:

Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering. 9269-9279 - Liang Chen, Weihua Chen, Xin Zhao, Junyan Wang, Lijun Cao, Junge Zhang:

Distribution Optimization Under Gaussian Hypothesis for Domain Adaptive Semantic Segmentation. 9280-9290 - Lucas Jaffe, Avideh Zakhor:

Swap Path Network for Robust Person Search Pre-training. 9291-9301 - Frano Rajic, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu:

Segment Anything Meets Point Tracking. 9302-9311 - Raghavendra Ramachandra, Sushma Venkatesh, Guoqiang Li:

PoolAtnRes: Towards Generalisable Differential Morphing Attack Detection. 9312-9321 - Jeffri Murrugarra-Llerena, Cláudio R. Jung:

Noise-Aware Evaluation of Object Detectors. 9322-9331 - Yidan Shen, Yu Wen, Chen Zhang, Xin Fu, Renjie Hu:

MVMD: A Multi-View Approach for Enhanced Mirror Detection. 9332-9341 - Jiaoyang Yin, Bin Fan, Chao Xu, Tiejun Huang, Boxin Shi:

Spk2ImgMamba: Spiking Camera Image Reconstruction with Multi-Scale State Space Models. 9342-9352 - Hanning Chen, Yang Ni, Wenjun Huang, Yezi Liu, Sungheon Jeong, Fei Wen, Nathaniel D. Bastian, Hugo Latapie, Mohsen Imani:

VLTP: Vision-Language Guided Token Pruning for Task-Oriented Segmentation. 9353-9363 - Jimut B. Pal, Shantanu Welling, Himali Saini, Suyash P. Awate:

Reviving Poor Object Segmentations in OOD Medical Images using Variational-Deep-PCA Modeling on Segmentation Maps with Sampling-Free Learning. 9364-9373 - Sungyeon Kim, Donghyun Kim, Suha Kwak:

Learning Unified Distance Metric Across Diverse Data Distributions with Parameter-Efficient Transfer Learning. 9374-9384 - Yue Ma, Xiaodong Cun, Sen Liang, Jinbo Xing, Yingqing He, Chenyang Qi, Siran Chen, Qifeng Chen:

MagicStick: Controllable Video Editing via Control Handle Transformations. 9385-9395 - Zehua Cheng, Di Yuan, Wenhu Zhang, Thomas Lukasiewicz:

Effective and Efficient Medical Image Segmentation with Hierarchical Context Interaction. 9396-9405 - Jeongseok Hyun, Su Ho Han, Hyolim Kang, Joon-Young Lee, Seon Joo Kim:

Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization. 9406-9415 - Junwen Chen, Yingcheng Wang, Keiji Yanai:

Focusing on what to Decode and what to Train: SOV Decoding with Specific Target Guided DeNoising and Vision Language Advisor. 9416-9425 - Hayoung Park, Choongsang Cho, Guisik Kim:

On the Importance of Dual-Space Augmentation for Domain Generalized Object Detection. 9426-9436 - Junbo Jang, Chanyeong Park, Heegwang Kim, Jiyoon Lee, Joonki Paik:

Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules. 9437-9446 - Jiin Im, Yongho Son, Je Hyeong Hong:

FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data. 9447-9456 - Subhajit Paul, Sahil Kumawat, Ashutosh Gupta, Deepak Mishra:

F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring. 9457-9467 - Ram J. Zaveri, Shivang Patel, Yu Gu, Gianfranco Doretto:

Improving Accuracy and Generalization for Efficient Visual Tracking. 9468-9478 - Jiwon Yoo, Dami Ko, Gyeonghwan Kim:

CCASeg: Decoding Multi-Scale Context with Convolutional Cross-Attention for Semantic Segmentation. 9479-9488 - Abbas Khan, Muhammad Asad, Martin Benning, Caroline H. Roney, Gregory G. Slabaugh:

Compositional Segmentation of Cardiac Images Leveraging Metadata. 9489-9498 - Rakesh Raj Madavan, Akshat Kaimal, Badhrinarayanan K. V, Vinayak Gupta, Rohit Choudhary, Chandrakala Shanmuganathan, Kaushik Mitra:

GANESH: Generalizable NeRF for Lensless Imaging. 9499-9508 - Hoonhee Cho, Jae-Young Kang, Taewoo Kim, Yuhwan Jeong, Kuk-Jin Yoon:

Unifying Low-Resolution and High-Resolution Alignment by Event Cameras for Space-Time Video Super-Resolution. 9509-9520 - Reza Ghoddoosian, Nakul Agarwal, Isht Dwivedi, Behzad Dariush:

ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos. 9521-9531 - Meng Ye, Bingyu Xin, Leon Axel, Dimitris N. Metaxas:

Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation. 9532-9542 - Badri N. Patro, Vinay P. Namboodiri, Vijay Srinivas Agneeswaran:

SpectFormer: Frequency and Attention is what you need in a Vision Transformer. 9543-9554 - Masatoshi Tateno, Takuma Yagi, Ryosuke Furuta, Yoichi Sato:

Learning Multiple Object States from Actions via Large Language Models. 9555-9565 - Yijie Hu, Guanyu Yang, Zhaorui Tan, Xiaowei Huang, Kaizhu Huang, Qiufeng Wang:

Covariance-Based Space Regularization for Few-Shot Class Incremental Learning. 9566-9576 - Jin-Cheng Jhang, Tao Tu, Fu-En Wang, Ke Zhang, Min Sun, Cheng-Hao Kuo:

V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations. 9577-9586 - Chen Xu, Chunguo Li, Hongjie Xing:

Discriminative Score Suppression for Weakly Supervised Video Anomaly Detection. 9587-9596 - Amartya Roy Chowdhury, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Achyut Mani Tripathi:

Bandit-based Attention Mechanism in Vision Transformers. 9597-9606 - Zhefan Rao, Tianjia Zhang, Yuen-Fui Lau, Qifeng Chen:

Robust Portrait Image Matting and Depth-of-field Synthesis via Multiplane Images. 9607-9617 - Md Raqib Khan, Anshul Negi, Ashutosh Kulkarni, Shruti S. Phutke, Santosh Kumar Vipparthi, Subrahmanyam Murala:

Phaseformer: Phase-Based Attention Mechanism for Underwater Image Restoration and Beyond. 9618-9629 - Vaibhav Vavilala, Faaris Shaik, David A. Forsyth:

Dequantization and Color Transfer with Diffusion Models. 9630-9639 - Mingchen Xu, Peter Herbert, Yu-Kun Lai, Ze Ji, Jing Wu:

RGB-D Video Mirror Detection. 9640-9649 - Liuyue Xie, Jiancong Guo, László A. Jeni, Zhiheng Jia, Mingyang Li, Yunwen Zhou, Chao Guo:

Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field. 9650-9659 - Min Jin Chong, Dejia Xu, Yi Zhang, Zhangyang Wang, David A. Forsyth, Gurunandan Krishnan, Yicheng Wu, Jian Wang:

Copy or Not? Reference-Based Face Image Restoration with Fine Details. 9660-9669 - A S. M. Iftekhar, Raphael Ruschel, Satish Kumar, Suya You, B. S. Manjunath:

DDS: Decoupled Dynamic Scene-Graph Generation Network. 9670-9680 - João P. K. Ferreira, João P. L. Pinto, Júlia S. Moura, Yi Li, Cristiano Leite Castro, Plamen Angelov:

Vision-Based Landing Guidance Through Tracking and Orientation Estimation. 9681-9689 - Abhisek Ray, Ayush Raj, Maheshkumar H. Kolekar:

Autoregressive Adaptive Hypergraph Transformer for Skeleton-Based Activity Recognition. 9690-9699 - Minje Kim, Minjun Kim, Xu Yang:

DTA: Dual Temporal-channel-wise Attention for Spiking Neural Networks. 9700-9710
