


Dinesh Manocha
Person information
- affiliation: University of Maryland at College Park, MD, USA
- affiliation (former): University of North Carolina at Chapel Hill, NC, USA
2020 – today
- 2026
[i388]Vishnu Sashank Dorbala, Dinesh Manocha:
MemCtrl: Using MLLMs as Active Memory Controllers on Embodied Agents. CoRR abs/2601.20831 (2026)
[i387]Dongki Jung, Jaehoon Choi, Adil Qureshi, Somi Jeong, Dinesh Manocha, Suyong Yeon:
Wid3R: Wide Field-of-View 3D Reconstruction via Camera Model Conditioning. CoRR abs/2602.05321 (2026)
- 2025
[j254]Daeun Song, Jing Liang, Amirreza Payandeh, Amir Hossain Raj, Xuesu Xiao, Dinesh Manocha:
VLM-Social-Nav: Socially Aware Robot Navigation Through Scoring Using Vision-Language Models. IEEE Robotics Autom. Lett. 10(1): 508-515 (2025)
[j253]Jing Liang, Zhuo Deng, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Arnie Sen, Dinesh Manocha:
CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition. IEEE Robotics Autom. Lett. 10(5): 4628-4635 (2025)
[j252]James F. Mullen Jr., Dhruva Kumar, Xuewei Qi, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha, Richard Kim:
HomeEmergency - Using Audio to Find and Respond to Emergencies in the Home. IEEE Robotics Autom. Lett. 10(6): 5649-5656 (2025)
[j251]Daeun Song, Jing Liang, Xuesu Xiao, Dinesh Manocha:
VL-TGS: Trajectory Generation and Selection Using Vision Language Models in Mapless Outdoor Environments. IEEE Robotics Autom. Lett. 10(6): 5791-5798 (2025)
[j250]Gokul S. Krishnan, Sarala Padi, Craig S. Greenberg, Balaraman Ravindran, Dinesh Manocha, Ram D. Sriram:
LineConGraphs: Line Conversation Graphs for Effective Emotion Recognition Using Graph Neural Networks. IEEE Trans. Affect. Comput. 16(3): 1747-1759 (2025)
[j249]Peihong Yu, Manav Mishra, Alec Koppel, Carl E. Busart, Priya Narayan, Dinesh Manocha, Amrit Singh Bedi, Pratap Tokekar:
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning. Trans. Mach. Learn. Res. 2025 (2025)
[j248]Niall L. Williams, Logan Stevens, Aniket Bera, Dinesh Manocha:
Sensitivity to Redirected Walking Considering Gaze, Posture, and Luminance. IEEE Trans. Vis. Comput. Graph. 31(5): 3223-3234 (2025)
[c531]Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li:
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation. ACL (Findings) 2025: 2466-2482
[c530]Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha:
ChartLens: Fine-grained Visual Attribution in Charts. ACL (1) 2025: 22447-22462
[c529]Geonsun Lee, Yue Yang, Jennifer Healey, Dinesh Manocha:
Since U Been Gone: Augmenting Context-Aware Transcriptions for Re-Engaging in Immersive VR Meetings. CHI 2025: 785:1-785:20
[c528]Divya Kothandaraman, Kuldeep Kulkarni, Sumit Shekhar, Balaji Vasan Srinivasan, Dinesh Manocha:
Imposter: Text and Frequency Guidance for Subject Driven Action Personalization using Diffusion Models. COLING 2025: 11013-11028
[c527]Dongki Jung, Jaehoon Choi, Yonghan Lee, Somi Jeong, Taejae Lee, Dinesh Manocha, Suyong Yeon:
EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching. CVPR 2025: 6337-6347
[c526]Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh, Tianrui Guan, Mengdi Wang, Ahmad Beirami, Furong Huang, Alvaro Velasquez, Dinesh Manocha, Amrit Singh Bedi:
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment. CVPR 2025: 25038-25049
[c525]Soumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha, Dinesh Manocha:
RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples. EMNLP (Findings) 2025: 1502-1517
[c524]Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Vivek Gupta, Dinesh Manocha:
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents. EMNLP 2025: 22485-22508
[c523]Ashish Seth, Utkarsh Tyagi, Ramaneswaran Selvakumar, Nishit Anand, Sonal Kumar, Sreyan Ghosh, Ramani Duraiswami, Chirag Agarwal, Dinesh Manocha:
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding. EMNLP 2025: 28461-28480
[c522]Ramaneswaran Selvakumar, Ashish Seth, Nishit Anand, Utkarsh Tyagi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
MULTIVOX: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions. EMNLP 2025: 28481-28493
[c521]Nishit Anand, Ashish Seth, Ramani Duraiswami, Dinesh Manocha:
TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification. ICASSP Workshops 2025: 1-5
[c520]Jae-Sung Bae, Anastasia Kuznetsova, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement. ICASSP Workshops 2025: 1-5
[c519]Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds. ICASSP 2025: 1-5
[c518]Jackie Lin, Georg Götz, Hermes Sampedro Llopis, Haukur Hafsteinsson, Steinar Guðjónsson, Daniel Gert Nielsen, Finnur Pind, Paris Smaragdis, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation. ICASSP Workshops 2025: 1-5
[c517]Mohamed Elmoghany, Ryan A. Rossi, Seunghyun Yoon, Subhojyoti Mukherjee, Eslam Mohamed Bakr, Puneet Mathur, Gang Wu, Viet Dac Lai, Nedim Lipka, Ruiyi Zhang, Varun Manjunatha, Chien Nguyen, Daksh Dangi, Abel Salinas, Hongjie Chen, Xiaolei Huang, Joe Barrow, Nesreen K. Ahmed, Hoda Eldardiry, Namyong Park, Yu Wang, Zhengzhong Tu, Thien Huu Nguyen, Dinesh Manocha, Mohamed Elhoseiny, Franck Dernoncourt:
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality. ICCVW 2025: 7082-7094
[c516]Samuel Audia, Soheil Feizi, Matthias Zwicker, Dinesh Manocha:
How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings. ICLR 2025
[c515]Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh:
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment. ICLR 2025
[c514]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha:
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs. ICLR 2025
[c513]Sreyan Ghosh, Sonal Kumar, Zhifeng Kong, Rafael Valle, Bryan Catanzaro, Dinesh Manocha:
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data. ICLR 2025
[c512]Zhengmian Hu, Tong Zheng, Vignesh Viswanathan, Ziyi Chen, Ryan A. Rossi, Yihan Wu, Dinesh Manocha, Heng Huang:
Towards Optimal Multi-draft Speculative Decoding. ICLR 2025
[c511]S. Sakshi, Utkarsh Tyagi, Sonal Kumar, Ashish Seth, Ramaneswaran Selvakumar, Oriol Nieto, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha:
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark. ICLR 2025
[c510]Mohamad Fares El Hajj Chehade, Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Dinesh Manocha, Hao Zhu, Amrit Singh Bedi:
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time. ICML 2025
[c509]Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S. Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro:
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities. ICML 2025
[c508]Jing Liang, Dibyendu Das, Daeun Song, Md Nahid Hasan Shuvo, Mohammad Durrani, Karthik Taranath, Ivan Penskiy, Dinesh Manocha, Xuesu Xiao:
Gnd: Global Navigation Dataset With Multi-Modal Perception and Multi-Category Traversability in Outdoor Campus Environments. ICRA 2025: 2383-2390
[c507]Mohamed Elnoor, Kasun Weerakoon, Gershom Seneviratne, Ruiqi Xian, Tianrui Guan, Mohamed Khalid M. Jaffar, Vignesh Rajagopal, Dinesh Manocha:
VLM-GroNav: Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments. ICRA 2025: 2391-2398
[c506]Kasun Weerakoon, Mohamed Elnoor, Gershom Seneviratne, Vignesh Rajagopal, Senthil Hariharan Arul, Jing Liang, Mohamed Khalid M. Jaffar, Dinesh Manocha:
Behav: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes. ICRA 2025: 7044-7051
[c505]Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha:
ZSORN: Language-Driven Object-Centric Zero-Shot Object Retrieval and Navigation. ICRA 2025: 10922-10928
[c504]Vishnu Sashank Dorbala, Vishnu Dutt Sharma, Pratap Tokekar, Dinesh Manocha:
Improving Zero-Shot ObjectNav with Generative Communication. ICRA 2025: 12818-12825
[c503]Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian M. Sadler, Dinesh Manocha, Amrit Singh Bedi:
On the Vulnerability of LLM/VLM-Controlled Robotics. IROS 2025: 1914-1921
[c502]Jing Liang, He Yin, XueweiTony Qi, Jong Jin Park, Min Sun, Rajasimman Madhivanan, Dinesh Manocha:
ET-Former: Efficient Triplane Deformable Attention for 3D Semantic Scene Completion From Monocular Camera. IROS 2025: 2313-2320
[c501]Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Dinesh Manocha, Amrit Singh Bedi:
Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation. IROS 2025: 2337-2344
[c500]Christopher Maxey, Jaehoon Choi, Yonghan Lee, Hyungtae Lee, Dinesh Manocha, Heesung Kwon:
TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes. IROS 2025: 3503-3510
[c499]Gershom Seneviratne, Kasun Weerakoon, Mohamed Elnoor, Vignesh Rajgopal, Harshavarthan Varatharajan, Mohamed Khalid M. Jaffar, Jason L. Pusey, Dinesh Manocha:
CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains. IROS 2025: 6079-6086
[c498]Yangzhe Kong, Daeun Song, Jing Liang, Dinesh Manocha, Ziyu Yao, Xuesu Xiao:
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning. IROS 2025: 11298-11304
[c497]Amirreza Payandeh, Daeun Song, Mohammad Nazeri, Jing Liang, Praneel Mukherjee, Amir Hossain Raj, Yangzhe Kong, Dinesh Manocha, Xuesu Xiao:
Social-LLaVA: Enhancing Social Robot Navigation through Human-Language Reasoning. IROS 2025: 17192-17198
[c496]Vishnu Sashank Dorbala, Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Reza Ghanadan, Dinesh Manocha:
Is the House Ready For Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering. IROS 2025: 18430-18437
[c495]James F. Mullen Jr., Dinesh Manocha:
LBAP: Improved Uncertainty Alignment of LLM Planners using Bayesian Inference. IROS 2025: 18716-18723
[c494]Divya Kothandaraman, Ming Lin, Dinesh Manocha:
Financial Models meets Generative Art: Black-Scholes-Inspired Concept Blending in Text-to-Image Diffusion. ACM Multimedia 2025: 12209-12217
[c493]Soumya Suvra Ghosal, Soumyabrata Pal, Koyel Mukherjee, Dinesh Manocha:
PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from related Example Banks. NAACL (Long Papers) 2025: 351-365
[c492]Ramaneswaran Selvakumar, Sonal Kumar, Hemant Kumar Giri, Nishit Anand, Ashish Seth, Sreyan Ghosh, Dinesh Manocha:
Do Audio-Language Models Understand Linguistic Variations? NAACL (Short Papers) 2025: 899-913
[c491]Manan Suri, Puneet Mathur, Franck Dernoncourt, Kanika Goswami, Ryan A. Rossi, Dinesh Manocha:
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation. NAACL (Long Papers) 2025: 6088-6109
[c490]Ashish Seth, Ramaneswaran Selvakumar, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification. NAACL (Long Papers) 2025: 12376-12394
[c489]Sonal Kumar, Sreyan Ghosh, Utkarsh Tyagi, Anton Jeran Ratnarajah, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
ProSE: Diffusion Priors for Speech Enhancement. NAACL (Long Papers) 2025: 12470-12483
[c488]Soumya Suvra Ghosal, Ramya Hebbalaguppe, Dinesh Manocha:
Better Features, Better Calibration: A Simple Fix for Overconfident Networks. ECML/PKDD (1) 2025: 231-247
[c487]Geonsun Lee, Min Xia, Nels Numan, Xun Qian, David Li, Yanhe Chen, Achin Kulshrestha, Ishan Chatterjee, Yinda Zhang, Dinesh Manocha, David Kim, Ruofei Du:
Sensible Agent: A Framework for Unobtrusive Interaction with Proactive AR Agents. UIST 2025: 119:1-119:22
[c486]Sonal Kumar, Prem Seetharaman, Justin Salamon, Dinesh Manocha, Oriol Nieto:
SILA: Signal-to-Language Augmentation for Enhanced Control in Text-to-Audio Generation. WASPAA 2025: 1-5
[d3]Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, Fnu Sakshi, Vaibhavi Lokegaonkar, Ramani Duraiswami, Dinesh Manocha, Jun Du, Rafeal Valle:
Multi-Domain Audio Question Answering in the DCASE 2025 Challenge. Version 1. Zenodo, 2025
[d2]Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, Fnu Sakshi, Vaibhavi Lokegaonkar, Ramani Duraiswami, Dinesh Manocha, Jun Du, Rafeal Valle:
Multi-Domain Audio Question Answering in the DCASE 2025 Challenge. Version 2. Zenodo, 2025
[i386]Nishit Anand, Ashish Seth, Ramani Duraiswami, Dinesh Manocha:
TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification. CoRR abs/2501.00398 (2025)
[i385]Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Yaoting Wang, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha:
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs. CoRR abs/2501.02135 (2025)
[i384]Pooja Guhan, Tsung-Wei Huang, Guan-Ming Su, Subhadra Gopalakrishnan, Dinesh Manocha:
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation. CoRR abs/2501.07983 (2025)
[i383]Amirreza Payandeh, Daeun Song, Mohammad Nazeri, Jing Liang, Praneel Mukherjee, Amir Hossain Raj, Yangzhe Kong, Dinesh Manocha, Xuesu Xiao:
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces. CoRR abs/2501.09024 (2025)
[i382]Jackie Lin, Georg Götz, Hermes Sampedro Llopis, Haukur Hafsteinsson, Steinar Guðjónsson, Daniel Gert Nielsen, Finnur Pind, Paris Smaragdis, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation. CoRR abs/2501.13250 (2025)
[i381]Jae-Sung Bae, Anastasia Kuznetsova, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement. CoRR abs/2501.13372 (2025)
[i380]Dongki Jung, Jaehoon Choi, Yonghan Lee, Dinesh Manocha:
IM360: Textured Mesh Reconstruction for Large-scale Indoor Mapping with 360° Cameras. CoRR abs/2502.12545 (2025)
[i379]Zhengmian Hu, Tong Zheng, Vignesh Viswanathan, Ziyi Chen, Ryan A. Rossi, Yihan Wu, Dinesh Manocha, Heng Huang:
Towards Optimal Multi-draft Speculative Decoding. CoRR abs/2502.18779 (2025)
[i378]Dongki Jung, Jaehoon Choi, Yonghan Lee, Somi Jeong, Taejae Lee, Dinesh Manocha, Suyong Yeon:
EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching. CoRR abs/2502.20685 (2025)
[i377]Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S. Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro:
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities. CoRR abs/2503.03983 (2025)
[i376]Yangzhe Kong, Daeun Song, Jing Liang, Dinesh Manocha, Ziyu Yao, Xuesu Xiao:
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning. CoRR abs/2503.07557 (2025)
[i375]Mohamed Elnoor, Kasun Weerakoon, Gershom Seneviratne, Jing Liang, Vignesh Rajagopal, Dinesh Manocha:
Vi-LAD: Vision-Language Attention Distillation for Socially-Aware Robot Navigation in Dynamic Environments. CoRR abs/2503.09820 (2025)
[i374]Niall L. Williams, Logan Stevens, Aniket Bera, Dinesh Manocha:
Sensitivity to Redirected Walking Considering Gaze, Posture, and Luminance. CoRR abs/2503.15505 (2025)
[i373]Geonsun Lee, Yue Yang, Jennifer Healey, Dinesh Manocha:
Since U Been Gone: Augmenting Context-Aware Transcriptions for Re-engaging in Immersive VR Meetings. CoRR abs/2503.16739 (2025)
[i372]Chak Lam Shek, Amrit Singh Bedi, Anjon Basak, Ellen R. Novoseller, Nicholas R. Waytowich, Priya Narayanan, Dinesh Manocha, Pratap Tokekar:
Learning Multi-Robot Coordination through Locality-Based Factorized Multi-Agent Actor-Critic Algorithm. CoRR abs/2503.18816 (2025)
[i371]Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh:
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment. CoRR abs/2503.21720 (2025)
[i370]Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed Elhoseiny, Salman Khan, Dinesh Manocha:
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs. CoRR abs/2503.23219 (2025)
[i369]James F. Mullen Jr., Dhruva Kumar, Xuewei Qi, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha, Richard Kim:
HomeEmergency - Using Audio to Find and Respond to Emergencies in the Home. CoRR abs/2504.01089 (2025)
[i368]Jaehoon Choi, Dongki Jung, Yonghan Lee, Sungmin Eum, Dinesh Manocha, Heesung Kwon:
UAVTwin: Neural Digital Twins for UAVs using Gaussian Splatting. CoRR abs/2504.02158 (2025)
[i367]Pooja Guhan, Divya Kothandaraman, Tsung-Wei Huang, Guan-Ming Su, Dinesh Manocha:
CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models. CoRR abs/2504.09472 (2025)
[i366]Samuel Audia, Soheil Feizi, Matthias Zwicker, Dinesh Manocha:
How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings. CoRR abs/2504.13412 (2025)
[i365]Zongxia Li, Xiyang Wu, Guangyao Shi, Yubin Qin, Hongyang Du, Tianyi Zhou, Dinesh Manocha, Jordan Lee Boyd-Graber:
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding. CoRR abs/2505.01481 (2025)
[i364]Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, Sakshi Singh, Vaibhavi Lokegaonkar, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Gunhee Kim, Jun Du, Rafael Valle, Bryan Catanzaro:
Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge. CoRR abs/2505.07365 (2025)
[i363]Prakhar Mishra, Amir Hossain Raj, Xuesu Xiao, Dinesh Manocha:
McARL: Morphology-Control-Aware Reinforcement Learning for Generalizable Quadrupedal Locomotion. CoRR abs/2505.18418 (2025)
[i362]Prakhar Mishra, Amir Hossain Raj, Xuesu Xiao, Dinesh Manocha:
HACL: History-Aware Curriculum Learning for Fast Locomotion. CoRR abs/2505.18429 (2025)
[i361]Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha:
ChartLens: Fine-grained Visual Attribution in Charts. CoRR abs/2505.19360 (2025)
[i360]Mohamad Fares El Hajj Chehade, Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Dinesh Manocha, Hao Zhu, Amrit Singh Bedi:
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time. CoRR abs/2505.23729 (2025)
[i359]Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Vivek Gupta, Dinesh Manocha:
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents. CoRR abs/2506.01344 (2025)
[i358]Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Yifu Lu, Mengdi Wang, Dinesh Manocha, Furong Huang, Mohammad Ghavamzadeh, Amrit Singh Bedi:
Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models. CoRR abs/2506.04210 (2025)
[i357]Jaehoon Choi, Dongki Jung, Christopher Maxey, Yonghan Lee, Sungmin Eum, Dinesh Manocha, Heesung Kwon:
UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting. CoRR abs/2506.05011 (2025)
[i356]Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe, Junjie Fei, Sayan Nag, Salman Khan, Mohamed Elhoseiny, Dinesh Manocha:
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks. CoRR abs/2506.07016 (2025)
[i355]Soumya Suvra Ghosal, Vaibhav Singh, Akash Ghosh, Soumyabrata Pal, Subhadip Baidya, Sriparna Saha, Dinesh Manocha:
Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples. CoRR abs/2506.16502 (2025)
[i354]Sanjoy Chowdhury, Subrata Biswas, Sayan Nag, Tushar Nagarajan, Calvin Murdock, Ishwarya Ananthabhotla, Yijun Qian, Vamsi Krishna Ithapu, Dinesh Manocha, Ruohan Gao:
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception. CoRR abs/2506.21080 (2025)
[i353]Jing Liang, Kasun Weerakoon, Daeun Song, Senthurbavan Kirubaharan, Xuesu Xiao, Dinesh Manocha:
MOSU: Autonomous Long-range Robot Navigation with Multi-modal Scene Understanding. CoRR abs/2507.04686 (2025)
[i352]Mohamed Elmoghany, Ryan A. Rossi, Seunghyun Yoon, Subhojyoti Mukherjee, Eslam Mohamed Bakr, Puneet Mathur, Gang Wu, Viet Dac Lai, Nedim Lipka, Ruiyi Zhang, Varun Manjunatha, Chien Van Nguyen, Daksh Dangi, Abel Salinas, Mohammad Reza Taesiri, Hongjie Chen, Xiaolei Huang, Joe Barrow, Nesreen K. Ahmed, Hoda Eldardiry, Namyong Park, Yu Wang, Jaemin Cho, Anh Totti Nguyen, Zhengzhong Tu, Thien Huu Nguyen, Dinesh Manocha, Mohamed Elhoseiny, Franck Dernoncourt:
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality. CoRR abs/2507.07202 (2025)
[i351]Arushi Goel, Sreyan Ghosh, Jaehyeon Kim, Sonal Kumar, Zhifeng Kong, Sang-gil Lee, Chao-Han Huck Yang, Ramani Duraiswami, Dinesh Manocha, Rafael Valle, Bryan Catanzaro:
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models. CoRR abs/2507.08128 (2025)
[i350]Ramaneswaran Selvakumar, Ashish Seth, Nishit Anand, Utkarsh Tyagi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
MultiVox: Benchmarking Voice Assistants for Multimodal Interactions. CoRR abs/2507.10859 (2025)
[i349]Gershom Seneviratne, Jianyu An, Sahire Ellahy, Kasun Weerakoon, Mohamed Elnoor, Jonathan Deepak Kannan, Amogha Thalihalla Sunil, Dinesh Manocha:
HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation. CoRR abs/2508.01539 (2025)
[i348]Ashish Seth, Utkarsh Tyagi, Ramaneswaran Selvakumar, Nishit Anand, Sonal Kumar, Sreyan Ghosh, Ramani Duraiswami, Chirag Agarwal, Dinesh Manocha:
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding. CoRR abs/2508.12687 (2025)
[i347]Sonal Kumar, Simon Sedlácek, Vaibhavi Lokegaonkar, Fernando López, Wenyi Yu, Nishit Anand, Hyeonggon Ryu, Lichang Chen, Maxim Plicka, Miroslav Hlavácek, William Fineas Ellingwood, Sathvik Udupa, Siyuan Hou, Allison Ferner, Sara Barahona, Cecilia Bolaños, Satish Rahi, Laura Herrera-Alarcón, Satvik Dixit, Rupali S. Patil, Soham Deshmukh, Lasha Koroshinadze, Yao Liu, Leibny Paola García-Perera, Eleni Zanou, Themos Stafylakis, Joon Son Chung, David Harwath, Chao Zhang, Dinesh Manocha, Alicia Lozano-Diez, Santosh Kesiraju, Sreyan Ghosh, Ramani Duraiswami:
MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence. CoRR abs/2508.13992 (2025)
[i346]Geonsun Lee, Min Xia, Nels Numan, Xun Qian, David Li, Yanhe Chen, Achin Kulshrestha, Ishan Chatterjee, Yinda Zhang, Dinesh Manocha, David Kim, Ruofei Du:
Sensible Agent: A Framework for Unobtrusive Interaction with Proactive AR Agents. CoRR abs/2509.09255 (2025)
[i345]Botao He, Amir-Hossein Shahidzadeh, Yu Chen, Jiayi Wu, Tianrui Guan, Guofei Chen, Howie Choset, Dinesh Manocha, Glen Chou, Cornelia Fermüller, Yiannis Aloimonos:
NavMoE: Hybrid Model- and Learning-based Traversability Estimation for Local Navigation via Mixture of Experts. CoRR abs/2509.12747 (2025)
[i344]Xijun Wang, Junyun Huang, Rayyan Abdalla, Chengyuan Zhang, Ruiqi Xian, Dinesh Manocha:
Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models. CoRR abs/2509.18763 (2025)
[i343]Dongki Jung, Jaehoon Choi, Yonghan Lee, Dinesh Manocha:
RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization. CoRR abs/2509.23991 (2025)
[i342]Dongki Jung, Jaehoon Choi, Yonghan Lee, Sungmin Eum, Heesung Kwon, Dinesh Manocha:
MoRe: Monocular Geometry Refinement via Graph Optimization for Cross-View Consistency. CoRR abs/2510.07119 (2025)
[i341]Xijun Wang, Tanay Sharma, Achin Kulshrestha, Abhimitra Meka, Aveek Purohit, Dinesh Manocha:
EgoSocial: Benchmarking Proactive Intervention Ability of Omnimodal LLMs via Egocentric Social Interaction Perception. CoRR abs/2510.13105 (2025)
[i340]Atharvan Dogra, Soumya Suvra Ghosal, Ameet Deshpande, Ashwin Kalyan, Dinesh Manocha:
Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models. CoRR abs/2510.18454 (2025)
[i339]Sakshi Singh, Vaibhavi Lokegaonkar, Neil Zhang, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha, Lie Lu:
SPUR: A Plug-and-Play Framework for Integrating Spatial Audio Understanding and Reasoning into Large Audio-Language Models. CoRR abs/2511.06606 (2025)
[i338]Samuel Audia, Dinesh Manocha, Matthias Zwicker:
Accelerated, Memory-Efficient Far-Field Scattering Computation with Monte Carlo SBR. CoRR abs/2511.07586 (2025)
[i337]Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha:
Structured Uncertainty guided Clarification for LLM Agents. CoRR abs/2511.08798 (2025)
[i336]Sreyan Ghosh, Arushi Goel, Lasha Koroshinadze, Sang-gil Lee, Zhifeng Kong, João Felipe Santos, Ramani Duraiswami, Dinesh Manocha, Weiping Ding, Mohammad Shoeybi, Bryan Catanzaro:
Music Flamingo: Scaling Music Understanding in Audio Language Models. CoRR abs/2511.10289 (2025)
[i335]Vignesh Rajagopal, Kasun Weerakoon Kulathun Mudiyanselage, Gershom Seneviratne, Pon Aswin Sankaralingam, Mohamed Elnoor, Jing Liang, Rohan Chandra, Dinesh Manocha:
DR. Nav: Semantic-Geometric Representations for Proactive Dead-End Recovery and Navigation. CoRR abs/2511.12778 (2025)
[i334]Kristy Sakano, Jianyu An, Dinesh Manocha, Huan Xu:
SAFE-SMART: Safety Analysis and Formal Evaluation using STL Metrics for Autonomous RoboTs. CoRR abs/2511.17781 (2025)
[i333]Xiyang Wu, Zongxia Li, Jihui Jin, Guangyao Shi, Gouthaman KV, Vishnu Raj, Nilotpal Sinha, Jingxi Chen, Fan Du, Dinesh Manocha:
MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models. CoRR abs/2511.18373 (2025)
[i332]Samarth Chopra, Jing Liang, Gershom Seneviratne, Yonghan Lee, Jaehoon Choi, Jianyu An, Stephen Cheng, Dinesh Manocha:
Splatblox: Traversability-Aware Gaussian Splatting for Outdoor Robot Navigation. CoRR abs/2511.18525 (2025)
[i331]Samarth Chopra, Jing Liang, Gershom Seneviratne, Dinesh Manocha:
PhysGS: Bayesian-Inferred Gaussian Splatting for Physical Property Estimation. CoRR abs/2511.18570 (2025)
[i330]Yonghan Lee, Tsung-Wei Huang, Shiv Gehlot, Jaehoon Choi, Guan-Ming Su, Dinesh Manocha:
SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting. CoRR abs/2512.04315 (2025)
[i329]Sanjoy Chowdhury, Karren D. Yang, Xudong Liu, Fartash Faghri, Pavan Kumar Anasosalu Vasu, Oncel Tuzel, Dinesh Manocha, Chun-Liang Li, Raviteja Vemulapalli:
AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding. CoRR abs/2512.16250 (2025)
- 2024
[j247]Vishnu Sashank Dorbala, James F. Mullen Jr., Dinesh Manocha:
Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Based Zero-Shot Object Navigation. IEEE Robotics Autom. Lett. 9(5): 4083-4090 (2024)
[j246]Mohamed Elnoor, Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Dinesh Manocha:
ProNav: Proprioceptive Traversability Estimation for Legged Robot Navigation in Outdoor Environments. IEEE Robotics Autom. Lett. 9(8): 7190-7197 (2024)
[j245]James F. Mullen Jr., Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Dinesh Manocha, Reza Ghanadan:
"Don't Forget to Put the Milk Back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations. IEEE Robotics Autom. Lett. 9(10): 9087-9094 (2024)
[j244]Geonsun Lee, Dae Yeol Lee, Guan-Ming Su, Dinesh Manocha:
"May I Speak?": Multi-Modal Attention Guidance in Social VR Group Conversations. IEEE Trans. Vis. Comput. Graph. 30(5): 2287-2297 (2024)
[j243]Mohammad R. Saeedpour-Parizi, Niall L. Williams, Tim Wong, Phillip Guan, Dinesh Manocha, Ian M. Erkelens:
Perceptual Thresholds for Radial Optic Flow Distortion in Near-Eye Stereoscopic Displays. IEEE Trans. Vis. Comput. Graph. 30(5): 2570-2579 (2024)
[j242]Elizabeth Childs, Ferzam Mohammad, Logan Stevens, Hugo Burbelo, Amanuel Awoke, Nicholas Rewkowski, Dinesh Manocha:
An Overview of Enhancing Distance Learning Through Emerging Augmented and Virtual Reality Technologies. IEEE Trans. Vis. Comput. Graph. 30(8): 4480-4496 (2024)
[c485]Jaehoon Choi, Yonghan Lee, Hyungtae Lee, Heesung Kwon, Dinesh Manocha:
MeshGS: Adaptive Mesh-Aligned Gaussian Splatting for High-Quality Rendering. ACCV (9) 2024: 262-279
[c484]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, S. Sakshi, Sanjoy Chowdhury, Dinesh Manocha:
ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations. ACL (Findings) 2024: 386-406
[c483]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramaneswaran S., S. Sakshi, Dinesh Manocha:
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions. ACL (1) 2024: 726-748
[c482]Pooja Guhan, Uttaran Bhattacharya, Somdeb Sarkhel, Vahid Azizi, Xiang Chen, Saayan Mitra, Aniket Bera, Dinesh Manocha:
TAME-RD: Text Assisted Replication of Image Multi-Adjustments for Reverse Designing. ACL (Findings) 2024: 10710-10727
[c481]Puneet Mathur, Zhe Liu, Ke Li, Yingyi Ma, Gil Keren, Zeeshan Ahmed, Dinesh Manocha, Xuedong Zhang:
DOC-RAG: ASR Language Model Personalization with Domain-Distributed Co-occurrence Retrieval Augmentation. LREC/COLING 2024: 5132-5139
[c480]Puneet Mathur, Vlad I. Morariu, Aparna Garimella, Franck Dernoncourt, Jiuxiang Gu, Ramit Sawhney, Preslav Nakov, Dinesh Manocha, Rajiv Jain:
DocScript: Document-level Script Event Prediction. LREC/COLING 2024: 5140-5155
[c479]Samyak Jain, Parth Chhabra, Atula Tejaswi Neerkaje, Puneet Mathur, Ramit Sawhney, Shivam Agarwal, Preslav Nakov, Sudheer Chava, Dinesh Manocha:
Saliency-Aware Interpolative Augmentation for Multimodal Financial Prediction. LREC/COLING 2024: 14285-14297
[c478]Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha:
Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs. CVPR Workshops 2024: 1877-1887
[c477]Jaehoon Choi, Rajvi Shah, Qinbo Li, Yipeng Wang, Ayush Saraf, Changil Kim, Jia-Bin Huang, Dinesh Manocha, Suhib Alsisan,

