


18th ECCV 2024: Milan, Italy - Part X
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol:
  Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part X. Lecture Notes in Computer Science 15068, Springer 2025, ISBN 978-3-031-72683-5

- Chao Huang, Dejan Markovic, Chenliang Xu, Alexander Richard:
  Modeling and Driving Human Body Soundfields Through Acoustic Primitives. 1-17
- Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna:
  m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks. 18-34
- Jinxing Zhou, Dan Guo, Yuxin Mao, Yiran Zhong, Xiaojun Chang, Meng Wang:
  Label-Anticipated Event Disentanglement for Audio-Visual Video Parsing. 35-51
- Qi Zuo, Xiaodong Gu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Lingteng Qiu, Liefeng Bo, Zilong Dong:
  High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding. 52-69
- Hongtao Wu, Yijun Yang, Angelica I. Avilés-Rivero, Jingjing Ren, Sixiang Chen, Haoyu Chen, Lei Zhu:
  Semi-supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization. 70-89
- Xiaobao Wei, Jiajun Cao, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang:
  I-MedSAM: Implicit Medical Image Segmentation with Segment Anything. 90-107
- Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang:
  ReMamber: Referring Image Segmentation with Mamba Twister. 108-126
- Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, Lin Gu:
  TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting. 127-145
- Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao:
  CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios. 146-164
- Hengyu Zhou, Hui Zhang, Bin Wang:
  Segmentation-Guided Layer-Wise Image Vectorization with Gradient Fills. 165-180
- Yarden Frenkel, Yael Vinker, Ariel Shamir, Daniel Cohen-Or:
  Implicit Style-Content Separation Using B-LoRA. 181-198
- Zijian Zhou, Zheng Zhu, Holger Caesar, Miaojing Shi:
  OpenPSG: Open-Set Panoptic Scene Graph Generation via Large Multimodal Models. 199-215
- Liangyang Ouyang, Ruicong Liu, Yifei Huang, Ryosuke Furuta, Yoichi Sato:
  ActionVOS: Actions as Prompts for Video Object Segmentation. 216-235
- Jiedong Zhuang, Jiaqi Hu, Lianrui Mu, Rui Hu, Xiaoyu Liang, Jiangnan Ye, Haoji Hu:
  FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance. 236-253
- Li Zhang, Weiqing Meng, Yan Zhong, Bin Kong, Mingliang Xu, Jianming Du, Xue Wang, Rujing Wang, Liu Liu:
  U-COPE: Taking a Further Step to Universal 9D Category-Level Object Pose Estimation. 254-270
- Naiyu Yin, Hanjing Wang, Yue Yu, Tian Gao, Amit Dhurandhar, Qiang Ji:
  Integrating Markov Blanket Discovery Into Causal Representation Learning for Domain Generalization. 271-288
- Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun:
  Rotary Position Embedding for Vision Transformer. 289-305
- Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee:
  Local All-Pair Correspondence for Point Tracking. 306-325
- Youngmin Oh, Hyung-Il Kim, Seong Tae Kim, Jung Uk Kim:
  MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection. 326-345
- Taewoong Kim, Cheolhong Min, Byeonghwi Kim, Jinyeon Kim, Wonje Jeung, Jonghyun Choi:
  ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments. 346-364
- Dongze Li, Kang Zhao, Wei Wang, Yifeng Ma, Bo Peng, Yingya Zhang, Jing Dong:
  S3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis. 365-382
- Hyolim Kang, Jeongseok Hyun, Joungbin An, Youngjae Yu, Seon Joo Kim:
  ActionSwitch: Class-Agnostic Detection of Simultaneous Actions in Streaming Videos. 383-400
- Subin Jeon, In Cho, Minsu Kim, Woong Oh Cho, Seon Joo Kim:
  Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos. 401-419
- Xiaoyu Liu, Xin Ding, Lei Yu, Yuanyuan Xi, Wei Li, Zhijun Tu, Jie Hu, Hanting Chen, Baoqun Yin, Zhiwei Xiong:
  PQ-SAM: Post-training Quantization for Segment Anything Model. 420-437
- Yuanhong Chen, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro:
  CPM: Class-Conditional Prompting Machine for Audio-Visual Segmentation. 438-456
- Shreyank N. Gowda, Anurag Arnab, Jonathan Huang:
  Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition. 457-474
- Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang:
  DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-directional Structure Alignment. 475-493















