


31st ACM Multimedia 2023: Ottawa, ON, Canada
- Abdulmotaleb El Saddik, Tao Mei, Rita Cucchiara, Marco Bertini, Diana Patricia Tobon Vallejo, Pradeep K. Atrey, M. Shamim Hossain: Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, Ottawa, ON, Canada, 29 October 2023 - 3 November 2023. ACM 2023
Keynote Talks
- Chang Wen Chen: Internet of Video Things: Technical Challenges and Emerging Applications. 1-2
- Alejandro Jaimes: Multimodal AI & LLMs for Peacekeeping and Emergency Response. 3-4
- Ralf Steinmetz: Transition and Adaptability: The Cornerstone of Resilience in Future Networked Multimedia Systems and Beyond. 5-6
Oral Session I: Understanding Multimedia Content -- Media Interpretation
- Hao Shen, Zhong-Qiu Zhao, Yulun Zhang, Zhao Zhang: Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing. 7-16
- Yang Jiao, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang: Suspected Objects Matter: Rethinking Model's Prediction for One-stage Visual Grounding. 17-26
- Sophyani Banaamwini Yussif, Ning Xie, Yang Yang, Heng Tao Shen: Self-Relational Graph Convolution Network for Skeleton-Based Action Recognition. 27-36
- Qian Ning, Fangfang Wu, Weisheng Dong, Xin Li, Guangming Shi: Exploring Correlations in Degraded Spatial Identity Features for Blind Face Restoration. 37-45
- Chuhao Zhou, Jinxing Li, Huafeng Li, Guangming Lu, Yong Xu, Min Zhang: Video-based Visible-Infrared Person Re-Identification via Style Disturbance Defense and Dual Interaction. 46-55
- Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Xianjing Han, Yifang Yin, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann: PetalView: Fine-grained Location and Orientation Extraction of Street-view Images via Cross-view Local Search. 56-66
- Haorui Wang, Yibo Hu, Yangfu Zhu, Jinsheng Qi, Bin Wu: Shifted GCN-GAT and Cumulative-Transformer based Social Relation Recognition for Long Videos. 67-76
- Jilong Wang, Saihui Hou, Yan Huang, Chunshui Cao, Xu Liu, Yongzhen Huang, Liang Wang: Causal Intervention for Sparse-View Gait Recognition. 77-85
- Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan: MM-AU: Towards Multimodal Understanding of Advertisement Videos. 86-95
- Huiwei Lin, Shanshan Feng, Baoquan Zhang, Hongliang Qiao, Xutao Li, Yunming Ye: UER: A Heuristic Bias Addressing Approach for Online Continual Learning. 96-104
- Peng Wu, Xiankai Lu, Jianbing Shen, Yilong Yin: Clip Fusion with Bi-level Optimization for Human Mesh Reconstruction from Monocular Videos. 105-115
- Jinkai Zheng, Xinchen Liu, Shuai Wang, Lihao Wang, Chenggang Yan, Wu Liu: Parsing is All You Need for Accurate Gait Recognition in the Wild. 116-124
- Dingyi Zhang, Yingming Li, Zhongfei Zhang: Multi-Scale Similarity Aggregation for Dynamic Metric Learning. 125-134
- Yue Feng, Zhengye Zhang, Rong Quan, Limin Wang, Jie Qin: RefineTAD: Learning Proposal-free Refinement for Temporal Action Detection. 135-143
- Zhenguang Liu, Xinyang Yu, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang: Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization. 144-152
- Dongbao Yang, Yu Zhou, Xiaopeng Hong, Aoting Zhang, Xin Wei, Linchengxi Zeng, Zhi Qiao, Weiping Wang: Pseudo Object Replay and Mining for Incremental Object Detection. 153-162
- Shiqin Wang, Xin Xu, Xianzheng Ma, Kui Jiang, Zheng Wang: Informative Classes Matter: Towards Unsupervised Domain Adaptive Nighttime Semantic Segmentation. 163-172
- Ye Tian, Mengyu Yang, Lanshan Zhang, Zhizhen Zhang, Yang Liu, Xiaohui Xie, Xirong Que, Wendong Wang: View while Moving: Efficient Video Recognition in Long-untrimmed Videos. 173-183
- Yimin Deng, Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao: PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion. 184-192
- Gege Shi, Xueyang Fu, Chengzhi Cao, Zheng-Jun Zha: Alleviating Spatial Misalignment and Motion Interference for UAV-based Video Recognition. 193-202
- Yang Liu, Zhaoyang Xia, Mengyang Zhao, Donglai Wei, Yuzheng Wang, Siao Liu, Bobo Ju, Gaoyun Fang, Jing Liu, Liang Song: Learning Causality-inspired Representation Consistency for Video Anomaly Detection. 203-212
- Dongyue Guo, Yi Lin, Xuehang You, Zhongping Yang, Jizhe Zhou, Bo Yang, Jianwei Zhang, Han Shi, Shasha Hu, Zheng Zhang: M2ATS: A Real-world Multimodal Air Traffic Situation Benchmark Dataset and Beyond. 213-221
- Jianghu Lu, Shikun Li, Kexin Bao, Pengju Wang, Zhenxing Qian, Shiming Ge: Federated Learning with Label-Masking Distillation. 222-232
- Lingxiao Lu, Jiangtong Li, Junyan Cao, Li Niu, Liqing Zhang: Painterly Image Harmonization using Diffusion Model. 233-241
- Xingran Xie, Ting Jin, Boxiang Yun, Qingli Li, Yan Wang: Exploring Hyperspectral Histopathology Image Segmentation from a Deformable Perspective. 242-251
- Runhua Jiang, Yahong Han: Uncertainty-Aware Variate Decomposition for Self-supervised Blind Image Deblurring. 252-260
Oral Session II: Understanding Multimedia Content -- Multimodal Fusion and Embedding
- Chao Sun, Min Chen, Jialiang Cheng, Han Liang, Chuanbo Zhu, Jincai Chen: SCLAV: Supervised Cross-modal Contrastive Learning for Audio-Visual Coding. 261-270
- Feng Lin, Kaiqiang Fu, Hao Luo, Ziyue Zhan, Zhibo Wang, Zhenguang Liu, Lorenzo Cavallaro, Kui Ren: Cross-Modal and Multi-Attribute Face Recognition: A Benchmark. 271-279
- Ye Wang, Junyang Chen, Mengzhu Wang, Hao Li, Wei Wang, Houcheng Su, Zhihui Lai, Wei Wang, Zhenghan Chen: A Closer Look at Classifier in Adversarial Domain Generalization. 280-289
- Mengzhu Wang, Jianlong Yuan, Zhibin Wang: Mixture-of-Experts Learner for Single Long-Tailed Domain Generalization. 290-299
- Chao Zhang, Jingwen Wei, Bo Wang, Zechao Li, Chunlin Chen, Huaxiong Li: Robust Spectral Embedding Completion Based Incomplete Multi-view Clustering. 300-308
- Jinhui Pang, Zixuan Wang, Jiliang Tang, Mingyan Xiao, Nan Yin: SA-GDA: Spectral Augmentation for Graph Domain Adaptation. 309-318
- Xihong Yang, Cheng Tan, Yue Liu, Ke Liang, Siwei Wang, Sihang Zhou, Jun Xia, Stan Z. Li, Xinwang Liu, En Zhu: CONVERT: Contrastive Graph Clustering with Reliable Augmentation. 319-327
- Jintian Ji, Songhe Feng: High-order Complementarity Induced Fast Multi-View Clustering with Enhanced Tensor Rank Minimization. 328-336
- Xihong Yang, Jiaqi Jin, Siwei Wang, Ke Liang, Yue Liu, Yi Wen, Suyuan Liu, Sihang Zhou, Xinwang Liu, En Zhu: DealMVC: Dual Contrastive Calibration for Multi-view Clustering. 337-346
- Junming Hou, Qi Cao, Ran Ran, Che Liu, Junling Li, Liang-Jian Deng: Bidomain Modeling Paradigm for Pansharpening. 347-357
- Yingying Wang, Yunlong Lin, Ge Meng, Zhenqi Fu, Yuhang Dong, Linyu Fan, Hedeng Yu, Xinghao Ding, Yue Huang: Learning High-frequency Feature Enhancement and Alignment for Pan-sharpening. 358-367
- Xingfeng Li, Yinghui Sun, Quansen Sun, Jia Dai, Zhenwen Ren: Distribution Consistency based Fast Anchor Imputation for Incomplete Multi-view Clustering. 368-376
- Yushen Wei, Yang Liu, Hong Yan, Guanbin Li, Liang Lin: Visual Causal Scene Refinement for Video Question Answering. 377-386
- Hongye Liu, Xianhai Xie, Yang Gao, Zhou Yu: Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks. 387-396
- Xi Chen, Yun Xiong, Siqi Wang, Haofen Wang, Tao Sheng, Yao Zhang, Yu Ye: ReCo: A Dataset for Residential Community Layout Planning. 397-405
- Runmin Cong, Hongyu Liu, Chen Zhang, Wei Zhang, Feng Zheng, Ran Song, Sam Kwong: Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection. 406-416
- Jinrong Cui, Yuting Li, Yulu Fu, Jie Wen: Multi-view Self-Expressive Subspace Clustering Network. 417-425
- Jian Huang, Yanli Ji, Yang Yang, Heng Tao Shen: Cross-modality Representation Interactive Learning for Multimodal Sentiment Analysis. 426-434
- Yixuan Ma, Xiaolin Zhang, Peng Zhang, Kun Zhan: Entropy Neural Estimation for Graph Contrastive Learning. 435-443
- Liguo Zhang, Zilin Tian, Yunfei Long, Sizhao Li, Guisheng Yin: Cross-modal and Cross-medium Adversarial Attack for Audio. 444-453
- Liang Peng, Xin Wang, Xiaofeng Zhu: Unsupervised Multiplex Graph learning with Complementary and Consistent Information. 454-462
- Yixuan Wu, Jintai Chen, Jiahuan Yan, Yiheng Zhu, Danny Z. Chen, Jian Wu: GCL: Gradient-Guided Contrastive Learning for Medical Image Segmentation with Multi-Perspective Meta Labels. 463-471
- Zhiying Jiang, Zengxi Zhang, Jinyuan Liu, Xin Fan, Risheng Liu: Multi-Spectral Image Stitching via Spatial Graph Reasoning. 472-480
- Jiaming Zhuo, Can Cui, Kun Fu, Bingxin Niu, Dongxiao He, Yuanfang Guo, Zhen Wang, Chuan Wang, Xiaochun Cao, Liang Yang: Propagation is All You Need: A New Framework for Representation Learning and Classifier Training on Graphs. 481-489
- Yao Wu, Mingwei Xing, Yachao Zhang, Yuan Xie, Jianping Fan, Zhongchao Shi, Yanyun Qu: Cross-modal Unsupervised Domain Adaptation for 3D Semantic Segmentation via Bidirectional Fusion-then-Distillation. 490-498
Oral Session III: Understanding Multimedia Content -- Vision and Language
- Yinjie Zhao, Lichen Zhao, Qian Yu, Lu Sheng, Jing Zhang, Dong Xu: Distortion-aware Transformer in 360° Salient Object Detection. 499-508
- Zixiao Wang, Hongtao Xie, Yuxin Wang, Jianjun Xu, Boqiang Zhang, Yongdong Zhang: Symmetrical Linguistic Feature Distillation with CLIP for Scene Text Recognition. 509-518
- Bo Zou, Chao Yang, Chengbin Quan, Youjian Zhao: SpaceCLIP: A Vision-Language Pretraining Framework With Spatial Reconstruction On Text. 519-528
- Xu Huang, Jin Liu, Zhizhong Zhang, Yuan Xie: Improving Cross-Modal Recipe Retrieval with Component-Aware Prompted CLIP Embedding. 529-537
- Shuhan Kong, Liang Li, Beichen Zhang, Wenyu Wang, Bin Jiang, Chenggang Yan, Changhao Xu: Dynamic Contrastive Learning with Pseudo-samples Intervention for Weakly Supervised Joint Video MR and HD. 538-546
- Zheng Yuan, Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang: RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training. 547-556
- Xiao Wang, Yaoyu Li, Tian Gan, Zheng Zhang, Jingjing Lv, Liqiang Nie: RTQ: Rethinking Video-language Understanding Based on Image-text Model. 557-566
- Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin: SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models. 567-578
- Xin Dong, Rui Wang, Siyuan Liang, Aishan Liu, Lihua Jing: Face Encryption via Frequency-Restricted Identity-Agnostic Attacks. 579-588
- Peipei Song, Dan Guo, Xun Yang, Shengeng Tang, Erkun Yang, Meng Wang: Emotion-Prior Awareness Network for Emotional Video Captioning. 589-600
- Dong Liu, Qirong Mao, Lijian Gao, Qinghua Ren, Zhenghan Chen, Ming Dong: TE-KWS: Text-Informed Speech Enhancement for Noise-Robust Keyword Spotting. 601-610
- Jiancheng Pan, Qing Ma, Cong Bai: A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval. 611-620
- Nirmalendu Prakash, Han Wang, Nguyen-Khoi Hoang, Ming Shan Hee, Roy Ka-Wei Lee: PromptMTopic: Unsupervised Multimodal Topic Modeling of Memes using Large Language Models. 621-631
- Yue Lv, Jinxi Xiang, Jun Zhang, Wenming Yang, Xiao Han, Wei Yang: Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression. 632-642
- Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua: LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation. 643-654
- Yue Zhang, Suchen Wang, Shichao Kan, Zhenyu Weng, Yigang Cen, Yap-Peng Tan: POAR: Towards Open Vocabulary Pedestrian Attribute Recognition. 655-665
- Shengshan Hu, Wei Liu, Minghui Li, Yechao Zhang, Xiaogeng Liu, Xianlong Wang, Leo Yu Zhang, Junhui Hou: PointCRT: Detecting Backdoor in 3D Point Cloud via Corruption Robustness. 666-675
- Rui Qin, Ming Sun, Fangyuan Zhang, Xing Wen, Bin Wang: Blind Image Super-resolution with Rich Texture-Aware Codebook. 676-687
- Zizhang Wu, Zhuozheng Li, Zhi-Gang Fan, Yunzhe Wu, Jian Pu, Xianzhi Li: V2Depth: Monocular Depth Estimation via Feature-Level Virtual-View Simulation and Refinement. 688-697
- Kai Chen, Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang: GCMA: Generative Cross-Modal Transferable Adversarial Attacks from Images to Videos. 698-708
- Lianyu Hu, Liqing Gao, Zekang Liu, Chi-Man Pun, Wei Feng: AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition. 709-718
- Lingfeng Li, Gangming Zhao, Yizhou Yu, Jinpeng Li: Dynamic Triple Reweighting Network for Automatic Femoral Head Necrosis Diagnosis from Computed Tomography. 719-727
- Liu Liu, Jianming Du, Hao Wu, Xun Yang, Zhenguang Liu, Richang Hong, Meng Wang: Category-Level Articulated Object 9D Pose Estimation via Reinforcement Learning. 728-736
- Qichao Ying, Jiaxin Liu, Sheng Li, Haisheng Xu, Zhenxing Qian, Xinpeng Zhang: RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection. 737-746
- Xueyi Zhang, Chengwei Zhang, Tao Wang, Jun Tang, Songyang Lao, Haizhou Li: Slow-Fast Time Parameter Aggregation Network for Class-Incremental Lip Reading. 747-756
- Yang Bai, Jingyao Wang, Min Cao, Chen Chen, Ziqiang Cao, Liqiang Nie, Min Zhang: Text-based Person Search without Parallel Image-Text Data. 757-767
- Jiawei Liang, Siyuan Liang, Aishan Liu, Ke Ma, Jingzhi Li, Xiaochun Cao: Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation. 768-778
- Sun'ao Liu, Yiheng Zhang, Zhaofan Qiu, Hongtao Xie, Yongdong Zhang, Ting Yao: CARIS: Context-Aware Referring Image Segmentation. 779-788
- Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang: Ground-to-Aerial Person Search: Benchmark Dataset and Approach. 789-799
- Fan Jiang, Zilei Wang: Sparse Sharing Relation Network for Panoptic Driving Perception. 800-808
Oral Session IV: Engaging Users with Multimedia -- Emotional and Social Signals
- Daoming Zong, Chaoyue Ding, Baoxiang Li, Jiakui Li, Ken Zheng, Qunyan Zhou: AcFormer: An Aligned and Compact Transformer for Multimodal Sentiment Analysis. 833-842
- Zeng Tao, Yan Wang, Zhaoyu Chen, Boyang Wang, Shaoqi Yan, Kaixun Jiang, Shuyong Gao, Wenqiang Zhang: Freq-HD: An Interpretable Frequency-based High-Dynamics Affective Clip Selection Method for in-the-Wild Facial Expression Recognition in Videos. 843-852
- Peiguang Jing, Xianyi Liu, Ji Wang, Yinwei Wei, Liqiang Nie, Yuting Su: StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning. 853-861
- Junjie Zhu, Bingjun Luo, Ao Sun, Jinghang Tan, Xibin Zhao, Yue Gao: Variance-Aware Bi-Attention Expression Transformer for Open-Set Facial Expression Recognition in the Wild. 862-870
- Zixin Zhang, Fan Qi, Shuai Li, Changsheng Xu: AffectFAL: Federated Active Affective Computing with Non-IID Data. 871-882
- Peiliang Gong, Ziyu Jia, Pengpai Wang, Yueying Zhou, Daoqiang Zhang: ASTDF-Net: Attention-Based Spatial-Temporal Dual-Stream Fusion Network for EEG-Based Emotion Recognition. 883-892
Oral Session V: Engaging Users with Multimedia -- Multimedia Search and Recommendation
- Yishu Liu, Qingpeng Wu, Zheng Zhang, Jingyi Zhang, Guangming Lu: Multi-Granularity Interactive Transformer Hashing for Cross-modal Retrieval. 893-902
- Wenjie Wang, Xinyu Lin, Liuhui Wang, Fuli Feng, Yinwei Wei, Tat-Seng Chua: Equivariant Learning for Out-of-Distribution Cold-start Recommendation. 903-914
- Haokun Wen, Xian Zhang, Xuemeng Song, Yinwei Wei, Liqiang Nie: Target-Guided Composed Image Retrieval. 915-923
- Haoxuan Li, Yi Bin, Junrong Liao, Yang Yang, Heng Tao Shen: Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination. 924-934
- Xin Zhou, Zhiqi Shen: A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation. 935-943
- Guiwei Zhang, Yongfei Zhang, Zichang Tan: ProtoHPE: Prototype-guided High-frequency Patch Enhancement for Visible-Infrared Person Re-identification. 944-954
- Wei Ji, Xiangyan Liu, An Zhang, Yinwei Wei, Yongxin Ni, Xiang Wang: Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation. 955-965
- Junyang Chen, Jialong Wang, Zhijiang Dai, Huisi Wu, Mengzhu Wang, Qin Zhang, Huan Wang: Zero-shot Micro-video Classification with Neural Variational Inference in Graph Prototype Network. 966-974
- Zhiguo Chen, Xun Jiang, Xing Xu, Zuo Cao, Yijun Mo, Heng Tao Shen: Joint Searching and Grounding: Multi-Granularity Video Content Retrieval. 975-983
- Yuyuan Li, Chaochao Chen, Xiaolin Zheng, Yizhao Zhang, Zhongxuan Han, Dan Meng, Jun Wang: Making Users Indistinguishable: Attribute-wise Unlearning in Recommender Systems. 984-994
- Dugang Liu, Yang Qiao, Xing Tang, Liang Chen, Xiuqiang He, Zhong Ming: Prior-Guided Accuracy-Bias Tradeoff Learning for CTR Prediction in Multimedia Recommendation. 995-1003
- Haoyue Bai, Min Hou, Le Wu, Yonghui Yang, Kun Zhang, Richang Hong, Meng Wang: GoRec: A Generative Cold-start Recommendation Framework. 1004-1012
- Jingzhi Li, Fengling Li, Lei Zhu, Hui Cui, Jingjing Li: Prototype-guided Knowledge Transfer for Federated Unsupervised Cross-modal Hashing. 1013-1022
Oral Session VI: Engaging Users with Multimedia -- Interactions and Quality of Experience
- Shuai He, Anlong Ming, Shuntian Zheng, Haobin Zhong, Huadong Ma: EAT: An Enhancer for Aesthetics-Oriented Transformers. 1023-1032
- Sicheng Yang, Zilin Wang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Qiaochu Huang, Lei Hao, Songcen Xu, Xiaofei Wu, Changpeng Yang, Zonghong Dai: UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons. 1033-1044
- Haoning Wu, Erli Zhang, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin: Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach. 1045-1054
- Guangming Zhu, Siyuan Wang, Qing Cheng, Kelong Wu, Hao Li, Liang Zhang: Sketch Input Method Editor: A Comprehensive Dataset and Methodology for Systematic Input Recognition. 1055-1065
- Tengchuan Kou, Xiaohong Liu, Wei Sun, Jun Jia, Xiongkuo Min, Guangtao Zhai, Ning Liu: StableVQA: A Deep No-Reference Quality Assessment Model for Video Stability. 1066-1076
- Jianjun Xiang, Yuanjie Dang, Peng Chen, Ronghua Liang, Ruohong Huan, Zhengyu Zhang: Spatial-angular Quality-aware Representation Learning for Blind Light Field Image Quality Assessment. 1077-1087
- Yunlong Dong, Xiaohong Liu, Yixuan Gao, Xunchu Zhou, Tao Tan, Guangtao Zhai: Light-VQA: A Multi-Dimensional Quality Assessment Model for Low-Light Video Enhancement. 1088-1097
- Kun Yuan, Zishang Kong, Chuanchuan Zheng, Ming Sun, Xing Wen: Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment. 1098-1107
- Kaiyuan Hu, Haowen Yang, Yili Jin, Junhua Liu, Yongting Chen, Miao Zhang, Fangxin Wang: Understanding User Behavior in Volumetric Video Watching: Dataset, Analysis and Prediction. 1108-1116
- Xiangfei Sheng, Leida Li, Pengfei Chen, Jinjian Wu, Weisheng Dong, Yuzhe Yang, Liwu Xu, Yaqian Li, Guangming Shi: AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment. 1117-1126
Oral Session VII: Engaging Users with Multimedia -- Metaverse, Art and Culture
- Zheng Wei, Xian Xu, Lik-Hang Lee, Wai Tong, Huamin Qu, Pan Hui: Feeling Present! From Physical to Virtual Cinematography Lighting Education with Metashadow. 1127-1136
- Shao-Kui Zhang, Jia-Hong Liu, Yike Li, Tianyi Xiong, Ke-Xin Ren, Hongbo Fu, Song-Hai Zhang: Automatic Generation of Commercial Scenes. 1137-1147
- Yang Chen, Yingwei Pan, Yehao Li, Ting Yao, Tao Mei: Control3D: Towards Controllable Text-to-3D Generation. 1148-1156
- Yuqing Zhang, Zhou Fang, Xinyu Yang, Shengyu Zhang, Baoyi He, Huaiyong Dou, Junchi Yan, Yongquan Zhang, Fei Wu: Reconnecting the Broken Civilization: Patchwork Integration of Fragments from Ancient Manuscripts. 1157-1166
Oral Session VIII: Engaging Users with Multimedia -- Multimedia Applications
- Zixin Wang, Yadan Luo, Zhi Chen, Sen Wang, Zi Huang: Cal-SFDA: Source-Free Domain-adaptive Semantic Segmentation with Differentiable Expected Calibration Error. 1167-1178
- Runmin Cong, Mengyao Sun, Sanyi Zhang, Xiaofei Zhou, Wei Zhang, Yao Zhao: Frequency Perception Network for Camouflaged Object Detection. 1179-1189
- Xiaoshuai Wu, Xin Liao, Bo Ou: SepMark: Deep Separable Watermarking for Unified Source Tracing and Deepfake Detection. 1190-1201
- Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei Zhang, Yao Zhao, Sam Kwong: SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection. 1202-1211
- Hao Tan, Weichao Kong, Feng Zhang, Wenjin Qin, Jianjun Wang: High-Order Tensor Recovery Coupling Multilayer Subspace Priori with Application in Video Restoration. 1212-1220
- Chen Wang, Jiadai Sun, Lina Liu, Chenming Wu, Zhelun Shen, Dayan Wu, Yuchao Dai, Liangjun Zhang: Digging into Depth Priors for Outdoor Neural Radiance Fields. 1221-1230
- Fanrui Zhang, Jiawei Liu, Qiang Zhang, Esther Sun, Jingyi Xie, Zheng-Jun Zha: ECENet: Explainable and Context-Enhanced Network for Muti-modal Fact verification. 1231-1240
- Baochen Xiong, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu: Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning. 1241-1249
- Jinpeng Lin, Min Zhou, Ye Ma, Yifan Gao, Chenxi Fei, Yangjian Chen, Zhang Yu, Tiezheng Ge: AutoPoster: A Highly Automatic and Content-aware Design System for Advertising Poster Generation. 1250-1260
- Gangyan Zeng, Yuan Zhang, Yu Zhou, Bo Fang, Guoqing Zhao, Xin Wei, Weiping Wang: Filling in the Blank: Rationale-Augmented Prompt Tuning for TextVQA. 1261-1272
- Liuhan Chen, Yirou Wang, Yongyong Chen: End-to-end XY Separation for Single Image Blind Deblurring. 1273-1282
- Junxian Chen, Ying Liu, Yiqi Liang, Dandan Long, Xiaolin He, Ruihui Li: SD-Net: Spatially-Disentangled Point Cloud Completion Network. 1283-1293
- Jiawei Jiang, Yuchao Feng, Jiacheng Chen, Dongyan Guo, Jianwei Zheng: Latent-space Unfolding for MRI Reconstruction. 1294-1302
- Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu: TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World. 1303-1313
- Pengteng Li


