


Dan Roth 0001
Person information
- affiliation (since 2017): University of Pennsylvania, Department of Computer and Information Science, Philadelphia, PA, USA
- affiliation (since 1997): University of Illinois Urbana-Champaign, IL, USA
- affiliation (1995 - 1997): Weizmann Institute, Rehovot, Israel
- affiliation (PhD 1995): Harvard University, Cambridge, MA, USA
2020 – today
- 2026
[j68]Tomer Wolfson, Harsh Trivedi, Mor Geva, Yoav Goldberg, Dan Roth, Tushar Khot, Ashish Sabharwal, Reut Tsarfaty:
MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents. Trans. Assoc. Comput. Linguistics 14: 23-46 (2026)
[j67]Yahan Yang, Soham Dan, Dan Roth, Insup Lee:
On Calibration of Multilingual Question Answering LLMs. Trans. Mach. Learn. Res. 2026 (2026)
[i264]Aparna Elangovan, Lei Xu, Mahsa Elyasi, Ismail Akdulum, Mehmet Aksakal, Enes Gurun, Brian Hur, Saab Mansour, Ravid Shwartz Ziv, Karin Verspoor, Dan Roth:
The Illusion of Human AI Parity Under Uncertainty: Navigating Elusive Ground Truth via a Probabilistic Paradigm. CoRR abs/2601.05500 (2026)
[i263]Xuanming Zhang, Shwan Ashrafi, Aziza Mirsaidova, Amir Rezaeian, Miguel Ballesteros, Lydia B. Chilton, Zhou Yu, Dan Roth:
Budget-Aware Anytime Reasoning with LLM-Synthesized Preference Data. CoRR abs/2601.11038 (2026)
[i262]Hassan Soliman, Vivek Gupta, Dan Roth, Iryna Gurevych:
CORE-T: COherent REtrieval of Tables for Text-to-SQL. CoRR abs/2601.13111 (2026)
[i261]Dhruv Madhwal, Lyuxin David Zhang, Dan Roth, Tomer Wolfson, Vivek Gupta:
Decomposed Prompting Does Not Fix Knowledge Gaps, But Helps Models Say "I Don't Know". CoRR abs/2602.04853 (2026)
[i260]Aaditya Naik, Efthymia Tsamoura, Shibo Jin, Mayur Naik, Dan Roth:
On Improving Neurosymbolic Learning by Exploiting the Representation Space. CoRR abs/2602.07973 (2026)
- 2025
[j66]Vinay K. Chaudhri, Chaitan Baru, Brandon Bennett, Mehul Bhatt, Darion Cassel, Anthony G. Cohn, Rina Dechter, Esra Erdem, David A. Ferrucci, Kenneth D. Forbus, Gregory Gelfond, Michael R. Genesereth, Andrew S. Gordon, Benjamin N. Grosof, Gopal Gupta, Jim Hendler, Sharat Israni, Tyler R. Josephson, Patrick C. Kyllonen, Yuliya Lierler, Vladimir Lifschitz, Clifton James McFate, Hande K. McGinty, Leora Morgenstern, Alessandro Oltramari, Praveen K. Paritosh, Dan Roth, Blake Shepard, Cogan Shimizu, Denny Vrandecic, Mark Whiting, Michael Witbrock:
A community-driven vision for a new knowledge resource for AI. AI Mag. 46(4) (2025)
[j65]William K. S. Ojemann, Kevin Xie, Kevin Liu, Ellie Chang, Dan Roth, Brian Litt, Colin A. Ellis:
Zero-Shot Extraction of Seizure Outcomes from Clinical Notes Using Generative Pretrained Transformers. J. Heal. Informatics Res. 9(3): 380-400 (2025)
[j64]Chaitanya Malaviya, Joseph Chee Chang, Dan Roth, Mohit Iyyer, Mark Yatskar, Kyle Lo:
Contextualized Evaluations: Judging Language Model Responses to Underspecified Queries. Trans. Assoc. Comput. Linguistics 13: 878-900 (2025)
[c451]Nishanth Sridhar Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser:
Talking Point based Ideological Discourse Analysis in News Events. ACL (Findings) 2025: 575-594
[c450]Adnan Qidwai, Srija Mukhopadhyay, Prerana Khatiwada, Dan Roth, Vivek Gupta:
PRAISE: Enhancing Product Descriptions with LLM-Driven Structured Insights. ACL (3) 2025: 644-652
[c449]Atharv Kulkarni, Kushagra Dixit, Vivek Srikumar, Dan Roth, Vivek Gupta:
LLM-Symbolic Integration for Robust Temporal Tabular Reasoning. ACL (Findings) 2025: 19914-19940
[c448]James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth:
DeAL: Decoding-time Alignment for Large Language Models. ACL (1) 2025: 26280-26300
[c447]Peter Baile Chen, Yi Zhang, Mike Cafarella, Dan Roth:
Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method. ACL (1) 2025: 30298-30317
[c446]Amit Agarwal, Hansa Meghwani, Hitesh Laxmichand Patel, Tao Sheng, Sujith Ravi, Dan Roth:
Aligning LLMs for Multilingual Consistency in Enterprise Applications. EMNLP (Industry Track) 2025: 117-137
[c445]Amit Agarwal, Hitesh Laxmichand Patel, Srikant Panda, Hansa Meghwani, Jyotika Singh, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth:
RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks. EMNLP (Industry Track) 2025: 138-157
[c444]Hitesh Laxmichand Patel, Amit Agarwal, Srikant Panda, Hansa Meghwani, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth:
PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications. EMNLP (Industry Track) 2025: 195-214
[c443]Jyotika Singh, Weiyi Sun, Amit Agarwal, Viji Krishnamurthy, Yassine Benajiba, Sujith Ravi, Dan Roth:
Can LLMs Narrate Tabular Data? An Evaluation Framework for Natural Language Representations of Text-to-SQL System Outputs. EMNLP (Industry Track) 2025: 883-902
[c442]Siyi Liu, Dan Roth:
Conflicts in Texts: Data, Implications and Challenges. EMNLP (Findings) 2025: 10073-10091
[c441]Yanzhen Shen, Sihao Chen, Xueqiang Xu, Yunyi Zhang, Chaitanya Malaviya, Dan Roth:
LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval. EMNLP 2025: 12114-12125
[c440]Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, Dan Roth:
Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty. EMNLP (Findings) 2025: 12349-12375
[c439]Mohammadtaher Safarzadeh, Afshin Oroojlooy, Dan Roth:
Evaluating NL2SQL via SQL2NL. EMNLP (Findings) 2025: 18954-18968
[c438]Yahan Yang, Soham Dan, Shuo Li, Dan Roth, Insup Lee:
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety. EMNLP 2025: 27377-27396
[c437]Rohit Khoja, Devanshu Gupta, Yanjie Fu, Dan Roth, Vivek Gupta:
Weaver: Interweaving SQL and LLM for Table Reasoning. EMNLP 2025: 28282-28308
[c436]Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew Lo:
AutoCT: Automating Interpretable Clinical Trial Prediction with LLM Agents. EMNLP 2025: 30945-30970
[c435]Yu Feng, Ben Zhou, Weidong Lin, Dan Roth:
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models. ICLR 2025
[c434]Aparna Elangovan, Lei Xu, Jongwoo Ko, Mahsa Elyasi, Ling Liu, Sravan Babu Bodapati, Dan Roth:
Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge. ICLR 2025
[c433]Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen:
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding. ICLR 2025
[c432]Xingyu Fu, Minqian Liu, Zhengyuan Yang, John Corring, Yijuan Lu, Jianwei Yang, Dan Roth, Dinei A. F. Florêncio, Cha Zhang:
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding. ICML 2025
[c431]Jiashu He, Mingyu Derek Ma, Jinxuan Fan, Dan Roth, Wei Wang, Alejandro Ribeiro:
GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation. ICML 2025
[c430]Anirudh Iyengar Kaniyar Narayana Iyengar, Srija Mukhopadhyay, Adnan Qidwai, Shubhankar Singh, Dan Roth, Vivek Gupta:
INTERCHART: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information. IJCNLP-AACL (long papers) 2025: 2046-2067
[c429]Abhishek Rajgaria, Kushagra Dixit, Mayank Vyas, Harshavardhan Kalalbandi, Dan Roth, Vivek Gupta:
No Universal Prompt: Unifying Reasoning through Adaptive Prompting for Temporal Table Reasoning. IJCNLP-AACL (long papers) 2025: 2800-2821
[c428]Dan Roth:
On Reasoning LLMs: Myths, Merits, and How to Move Forward. KDD (2) 2025: 2
[c427]Siyi Liu, Qiang Ning, Kishaloy Halder, Zheng Qi, Wei Xiao, Phu Mon Htut, Yi Zhang, Neha Anna John, Bonan Min, Yassine Benajiba, Dan Roth:
Open Domain Question Answering with Conflicting Contexts. NAACL (Findings) 2025: 1838-1854
[c426]Pranshu Pandya, Vatsal Gupta, Agney S. Talwarr, Tushar Kataria, Dan Roth, Vivek Gupta:
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models. NAACL (Findings) 2025: 3680-3708
[c425]Irwin Deng, Kushagra Dixit, Dan Roth, Vivek Gupta:
Enhancing Temporal Understanding in LLMs for Semi-structured Tables. NAACL (Findings) 2025: 4936-4955
[c424]Fei Wang, Chao Shang, Shuai Wang, Sarthak Jain, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth:
Aligning to Constraints for Data-Efficient Language Model Customization. NAACL (Findings) 2025: 5310-5325
[c423]Siddharth Khincha, Tushar Kataria, Ankita Anand, Dan Roth, Vivek Gupta:
Leveraging LLM For Synchronizing Information Across Multilingual Tables. NAACL (Long Papers) 2025: 6474-6492
[c422]Abhilash Reddy Shankarampeta, Harsh Mahajan, Tushar Kataria, Dan Roth, Vivek Gupta:
TRANSIENTTABLES: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables. NAACL (Long Papers) 2025: 6526-6544
[c421]Siyi Liu, Kishaloy Halder, Zheng Qi, Wei Xiao, Nikolaos Pappas, Phu Mon Htut, Neha Anna John, Yassine Benajiba, Dan Roth:
Towards Long Context Hallucination Detection. NAACL (Findings) 2025: 7827-7835
[c420]Sihao Chen, Chaitanya Malaviya, Alex Fabrikant, Hagai Taitelbaum, Tal Schuster, Senaka Buthpitiya, Dan Roth:
On Reference (In-)Determinacy in Natural Language Inference. NAACL (Findings) 2025: 8066-8078
[c419]Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy:
H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables. NAACL (Long Papers) 2025: 8841-8863
[c418]Srija Mukhopadhyay, Abhishek Rajgaria, Prerana Khatiwada, Manish Shrivastava, Dan Roth, Vivek Gupta:
MAPWise: Evaluating Vision-Language Models for Advanced Map Queries. NAACL (Long Papers) 2025: 9348-9378
[i259]Xingyu Fu, Minqian Liu, Zhengyuan Yang, John Corring, Yijuan Lu, Jianwei Yang, Dan Roth, Dinei A. F. Florêncio, Cha Zhang:
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding. CoRR abs/2501.05452 (2025)
[i258]Peter Baile Chen, Yi Zhang, Michael J. Cafarella, Dan Roth:
Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method. CoRR abs/2501.18539 (2025)
[i257]Ben Zhou, Sarthak Jain, Yi Zhang, Qiang Ning, Shuai Wang, Yassine Benajiba, Dan Roth:
Self-supervised Analogical Learning using Language Models. CoRR abs/2502.00996 (2025)
[i256]Sihao Chen, Chaitanya Malaviya, Alex Fabrikant, Hagai Taitelbaum, Tal Schuster, Senaka Buthpitiya, Dan Roth:
On Reference (In-)Determinacy in Natural Language Inference. CoRR abs/2502.05793 (2025)
[i255]Abhilash Reddy Shankarampeta, Harsh Mahajan, Tushar Kataria, Dan Roth, Vivek Gupta:
TransientTables: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables. CoRR abs/2504.01879 (2025)
[i254]Siddharth Khincha, Tushar Kataria, Ankita Anand, Dan Roth, Vivek Gupta:
Leveraging LLM For Synchronizing Information Across Multilingual Tables. CoRR abs/2504.02559 (2025)
[i253]Peter Baile Chen, Tomer Wolfson, Michael J. Cafarella, Dan Roth:
EnrichIndex: Using LLMs to Enrich Retrieval Indices Offline. CoRR abs/2504.03598 (2025)
[i252]Nishanth Sridhar Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser:
Talking Point based Ideological Discourse Analysis in News Events. CoRR abs/2504.07400 (2025)
[i251]Bowen Jiang, Zhuoqun Hao, Young-Min Cho, Bryan Li, Yuan Yuan, Sihao Chen, Lyle H. Ungar, Camillo J. Taylor, Dan Roth:
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale. CoRR abs/2504.14225 (2025)
[i250]Yahan Yang, Soham Dan, Shuo Li, Dan Roth, Insup Lee:
MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning. CoRR abs/2504.15241 (2025)
[i249]Siyi Liu, Kishaloy Halder, Zheng Qi, Wei Xiao, Nikolaos Pappas, Phu Mon Htut, Neha Anna John, Yassine Benajiba, Dan Roth:
Towards Long Context Hallucination Detection. CoRR abs/2504.19457 (2025)
[i248]Siyi Liu, Dan Roth:
Conflicts in Texts: Data, Implications and Challenges. CoRR abs/2504.19472 (2025)
[i247]Peter Baile Chen, Yi Zhang, Dan Roth, Samuel Madden, Jacob Andreas, Michael J. Cafarella:
Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation. CoRR abs/2505.14398 (2025)
[i246]Jiashu He, Jinxuan Fan, Bowen Jiang, Ignacio Houine, Dan Roth, Alejandro Ribeiro:
Self-GIVE: Associative Thinking from Limited Structured Knowledge for Enhanced Large Language Model Reasoning. CoRR abs/2505.15062 (2025)
[i245]Poojah Ganesan, Rajat Aayush Jha, Dan Roth, Vivek Gupta:
UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification. CoRR abs/2505.18122 (2025)
[i244]Rohit Khoja, Devanshu Gupta, Yanjie Fu, Dan Roth, Vivek Gupta:
Weaver: Interweaving SQL and LLM for Table Reasoning. CoRR abs/2505.18961 (2025)
[i243]Yanzhen Shen, Sihao Chen, Xueqiang Xu, Yunyi Zhang, Chaitanya Malaviya, Dan Roth:
LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval. CoRR abs/2505.19588 (2025)
[i242]Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew W. Lo:
AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents. CoRR abs/2506.04293 (2025)
[i241]Atharv Kulkarni, Kushagra Dixit, Vivek Srikumar, Dan Roth, Vivek Gupta:
LLM-Symbolic Integration for Robust Temporal Tabular Reasoning. CoRR abs/2506.05746 (2025)
[i240]Kushagra Dixit, Abhishek Rajgaria, Harshavardhan Kalalbandi, Dan Roth, Vivek Gupta:
No Universal Prompt: Unifying Reasoning through Adaptive Prompting for Temporal Table Reasoning. CoRR abs/2506.11246 (2025)
[i239]Vinay K. Chaudhri, Chaitan Baru, Brandon Bennett, Mehul Bhatt, Darion Cassel, Anthony G. Cohn, Rina Dechter, Esra Erdem, David A. Ferrucci, Kenneth D. Forbus, Gregory Gelfond, Michael R. Genesereth, Andrew S. Gordon, Benjamin N. Grosof, Gopal Gupta, Jim Hendler, Sharat Israni, Tyler R. Josephson, Patrick C. Kyllonen, Yuliya Lierler, Vladimir Lifschitz, Clifton James McFate, Hande K. McGinty, Leora Morgenstern, Alessandro Oltramari, Praveen K. Paritosh, Dan Roth, Blake Shepard, Cogan Shimizu, Denny Vrandecic, Mark Whiting, Michael Witbrock:
A Community-driven vision for a new Knowledge Resource for AI. CoRR abs/2506.16596 (2025)
[i238]Adnan Qidwai, Srija Mukhopadhyay, Prerana Khatiwada, Dan Roth, Vivek Gupta:
PRAISE: Enhancing Product Descriptions with LLM-Driven Structured Insights. CoRR abs/2506.17314 (2025)
[i237]Anirudh Iyengar Kaniyar Narayana Iyengar, Srija Mukhopadhyay, Adnan Qidwai, Shubhankar Singh, Dan Roth, Vivek Gupta:
InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information. CoRR abs/2508.07630 (2025)
[i236]Tomer Wolfson, Harsh Trivedi, Mor Geva, Yoav Goldberg, Dan Roth, Tushar Khot, Ashish Sabharwal, Reut Tsarfaty:
MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents. CoRR abs/2508.11133 (2025)
[i235]Krunal Shah, Dan Roth:
Reasoning is about giving reasons. CoRR abs/2508.14488 (2025)
[i234]Mohammadtaher Safarzadeh, Afshin Oroojlooyjadid, Dan Roth:
Evaluating NL2SQL via SQL2NL. CoRR abs/2509.04657 (2025)
[i233]Haoyu Wang, Fengze Liu, Jiayao Zhang, Dan Roth, Kyle Richardson:
Event Causality Identification with Synthetic Control. CoRR abs/2509.18156 (2025)
[i232]Xingyu Fu, Siyi Liu, Yinuo Xu, Pan Lu, Guangqiuse Hu, Tianbo Yang, Taran Anantasagar, Christopher Shen, Yikai Mao, Yuanzhe Liu, Keyush Shah, Chung Un Lee, Yejin Choi, James Zou, Dan Roth, Chris Callison-Burch:
Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs. CoRR abs/2509.22646 (2025)
[i231]Amit Agarwal, Hansa Meghwani, Hitesh Laxmichand Patel, Tao Sheng, Sujith Ravi, Dan Roth:
Aligning LLMs for Multilingual Consistency in Enterprise Applications. CoRR abs/2509.23659 (2025)
[i230]Amit Agarwal, Hitesh Laxmichand Patel, Srikant Panda, Hansa Meghwani, Jyotika Singh, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth:
RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks. CoRR abs/2509.23673 (2025)
[i229]Hitesh Laxmichand Patel, Amit Agarwal, Srikant Panda, Hansa Meghwani, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth:
PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications. CoRR abs/2509.23879 (2025)
[i228]Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar, Miguel Ballesteros, Alan Ritter, Dan Roth:
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks. CoRR abs/2510.02286 (2025)
[i227]Yining She, Daniel W. Peterson, Marianne Menglin Liu, Vikas Upadhyay, Mohammad Hossein Chaghazardi, Eunsuk Kang, Dan Roth:
RAG Makes Guardrails Unsafe? Investigating Robustness of Guardrails under RAG-style Contexts. CoRR abs/2510.05310 (2025)
[i226]Zhivar Sourati, Zheng Wang, Marianne Menglin Liu, Yazhe Hu, Mengqing Guo, Sujeeth Bharadwaj, Kyu J. Han, Tao Sheng, Sujith Ravi, Morteza Dehghani, Dan Roth:
LAD-RAG: Layout-aware Dynamic RAG for Visually-Rich Document Understanding. CoRR abs/2510.07233 (2025)
[i225]Marianne Menglin Liu, Daniel Garcia, Fjona Parllaku, Vikas Upadhyay, Syed Fahad Allam Shah, Dan Roth:
ToolScope: Enhancing LLM Agent Tool Use through Tool Merging and Context-Aware Filtering. CoRR abs/2510.20036 (2025)
[i224]Jyotika Singh, Weiyi Sun, Amit Agarwal, Viji Krishnamurthy, Yassine Benajiba, Sujith Ravi, Dan Roth:
Can LLMs Narrate Tabular Data? An Evaluation Framework for Natural Language Representations of Text-to-SQL System Outputs. CoRR abs/2510.23854 (2025)
[i223]Marianne Menglin Liu, Sai Ashish Somayajula, Syed Fahad Allam Shah, Sujith Ravi, Dan Roth:
OraPlan-SQL: A Planning-Centric Framework for Complex Bilingual NL2SQL Reasoning. CoRR abs/2510.23870 (2025)
[i222]Rishita Agarwal, Himanshu Singhal, Peter Baile Chen, Manan Roy Choudhury, Dan Roth, Vivek Gupta:
REaR: Retrieve, Expand and Refine for Effective Multitable Retrieval. CoRR abs/2511.00805 (2025)
[i221]Adam Storek, Vikas Upadhyay, Marianne Menglin Liu, Daniel W. Peterson, Anshul Mittal, Sujeeth Bharadwaj, Fahad Shah, Dan Roth:
Routesplain: Towards Faithful and Intervenable Routing for Software-related Tasks. CoRR abs/2511.09373 (2025)
[i220]Bowen Jiang, Yuan Yuan, Maohao Shen, Zhuoqun Hao, Zhangchen Xu, Zichen Chen, Ziyi Liu, Anvesh Rao Vijjini, Jiashu He, Hanchao Yu, Radha Poovendran, Gregory Wornell, Lyle H. Ungar, Dan Roth, Sihao Chen, Camillo Jose Taylor:
PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory. CoRR abs/2512.06688 (2025)
[i219]Peter Baile Chen, Weiyue Li, Dan Roth, Michael J. Cafarella, Samuel Madden, Jacob Andreas:
CONCUR: A Framework for Continual Constrained and Unconstrained Routing. CoRR abs/2512.09386 (2025)
- 2024
[j63]Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth:
Recent Advances in Natural Language Processing via Large Pre-trained Language Models: A Survey. ACM Comput. Surv. 56(2): 30:1-30:40 (2024)
[j62]Kevin Xie, William K. S. Ojemann, Ryan S. Gallagher, Russell T. Shinohara, Alfredo Lucas, Chloe E. Hill, Roy H. Hamilton, Kevin B. Johnson, Dan Roth, Brian Litt, Colin A. Ellis:
Disparities in seizure outcomes revealed by large language models. J. Am. Medical Informatics Assoc. 31(6): 1348-1355 (2024)
[c417]Aparna Elangovan, Ling Liu, Lei Xu, Sravan Babu Bodapati, Dan Roth:
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models. ACL (1) 2024: 1137-1160
[c416]Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth:
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts. ACL (Findings) 2024: 1330-1350
[c415]Peter Baile Chen, Yi Zhang, Dan Roth:
Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval. ACL (1) 2024: 2687-2699
[c414]Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth:
Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering. ACL (Findings) 2024: 3853-3878
[c413]Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang:
CoCoMIC: Code Completion by Jointly Modeling In-file and Cross-file Context. LREC/COLING 2024: 3433-3445
[c412]Haoyu Wang, Hongming Zhang, Kaiqiang Song, Dong Yu, Dan Roth:
Event Semantic Classification in Context. EACL (Findings) 2024: 1395-1407
[c411]Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna:
BLINK: Multimodal Large Language Models Can See but Not Perceive. ECCV (23) 2024: 148-166
[c410]Wenpeng Yin, Muhao Chen, Rui Zhang, Ben Zhou, Fei Wang, Dan Roth:
Enhancing LLM Capabilities Beyond Scaling Up. EMNLP (Tutorial Abstracts) 2024: 1-10
[c409]Haoyu Wang, Tao Li, Zhiwei Deng, Dan Roth, Yang Li:
Devil's Advocate: Anticipatory Reflection for LLM Agents. EMNLP (Findings) 2024: 966-978
[c408]Haoyu Wang, Fengze Liu, Jiayao Zhang, Dan Roth, Kyle Richardson:
Event Causality Identification with Synthetic Control. EMNLP 2024: 1725-1737
[c407]Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie Su, Camillo J. Taylor, Dan Roth:
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners. EMNLP 2024: 4722-4756
[c406]Suyash Vardhan Mathur, Jainit Sushil Bafna, Kunal Kartik, Harshita Khandelwal, Manish Shrivastava, Vivek Gupta, Mohit Bansal, Dan Roth:
Knowledge-Aware Reasoning over Multimodal Semi-structured Tables. EMNLP (Findings) 2024: 14054-14073
[c405]Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth:
Unraveling the Truth: Do VLMs really Understand Charts? A Deep Dive into Consistency and Robustness. EMNLP (Findings) 2024: 16696-16717
[c404]Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth:
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets. EMNLP 2024: 22162-22184
[c403]Dejiao Zhang, Wasi Uddin Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang:
Code Representation Learning at Scale. ICLR 2024
[c402]Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, Stefano Soatto:
Fewer Truncations Improve Language Modeling. ICML 2024: 11030-11048
[c401]Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao:
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks. NAACL-HLT (Findings) 2024: 1333-1351
[c400]Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu:
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations. NAACL-HLT 2024: 1596-1609
[c399]Hangfeng He, Hongming Zhang, Dan Roth:
SocREval: Large Language Models with the Socratic Method for Reference-free Reasoning Evaluation. NAACL-HLT (Findings) 2024: 2736-2764
[c398]Chaitanya Malaviya, Subin Lee, Sihao Chen, Elizabeth Sieber, Mark Yatskar, Dan Roth:
ExpertQA: Expert-Curated Questions and Attributed Answers. NAACL-HLT 2024: 3025-3045
[c397]Chaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar:
What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception. NAACL-HLT 2024: 3046-3065
[c396]Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, Muhao Chen:
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? NAACL-HLT 2024: 7675-7688
[c395]Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Ranjay Krishna:
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models. NeurIPS 2024
[c394]Tianyue Ou, Frank F. Xu, Aman Madaan, Jiarui Liu, Robert Lo, Abishek Sridhar, Sudipta Sengupta, Dan Roth, Graham Neubig, Shuyan Zhou:
Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale. NeurIPS 2024
[i218]Dejiao Zhang, Wasi Uddin Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang:
Code Representation Learning At Scale. CoRR abs/2402.01935 (2024)
[i217]James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi'an Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth:
DeAL: Decoding-time Alignment for Large Language Models. CoRR abs/2402.06147 (2024)
[i216]Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth:
Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering. CoRR abs/2402.11194 (2024)
[i215]Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth:
From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification. CoRR abs/2403.06326 (2024)
[i214]Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Dan Roth, Camillo J. Taylor:
Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering. CoRR abs/2403.14783 (2024)
[i213]Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu:
Conceptual and Unbiased Reasoning in Language Models. CoRR abs/2404.00205 (2024)
[i212]Peter Baile Chen, Yi Zhang, Dan Roth:
Is Table Retrieval a Solved Problem? Join-Aware Multi-Table Retrieval. CoRR abs/2404.09889 (2024)
[i211]Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, Stefano Soatto:
Fewer Truncations Improve Language Modeling. CoRR abs/2404.10830 (2024)
[i210]Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna:
BLINK: Multimodal Large Language Models Can See but Not Perceive. CoRR abs/2404.12390 (2024)
[i209]Yu Feng, Ben Zhou, Weidong Lin, Dan Roth:
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models. CoRR abs/2404.12494 (2024)
[i208]Haoyu Wang, Tao Li, Zhiwei Deng, Dan Roth, Yang Li:
Devil's Advocate: Anticipatory Reflection for LLM Agents. CoRR abs/2405.16334 (2024)
[i207]Aparna Elangovan, Ling Liu, Lei Xu, Sravan Bodapati, Dan Roth:
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models. CoRR abs/2405.18638 (2024)
[i206]Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth:
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? CoRR abs/2406.07546 (2024)
[i205]Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Ranjay Krishna:
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models. CoRR abs/2406.09403 (2024)
[i204]Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen:
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding. CoRR abs/2406.09411 (2024)
[i203]Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth:
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners. CoRR abs/2406.11050 (2024)
[i202]Bangzheng Li, Ben Zhou, Xingyu Fu, Fei Wang, Dan Roth, Muhao Chen:
FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation. CoRR abs/2406.11243 (2024)
[i201]Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth:
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts. CoRR abs/2406.19237 (2024)
[i200]Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy:
H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables. CoRR abs/2407.05952 (2024)
[i199]Kaifu Wang, Efthymia Tsamoura, Dan Roth:
On Characterizing and Mitigating Imbalances in Multi-Instance Partial Label Learning. CoRR abs/2407.10000 (2024)
[i198]Pranshu Pandya, Agney S. Talwarr, Vatsal Gupta, Tushar Kataria, Vivek Gupta, Dan Roth:
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models. CoRR abs/2407.10380 (2024)
[i197]Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth:
Unraveling the Truth: Do LLMs really Understand Charts? A Deep Dive into Consistency and Robustness. CoRR abs/2407.11229 (2024)
[i196]Irwin Deng, Kushagra Dixit, Vivek Gupta, Dan Roth:
Enhancing Temporal Understanding in LLMs for Semi-structured Tables. CoRR abs/2407.16030 (2024)
[i195]Suyash Vardhan Mathur, Jainit Sushil Bafna, Kunal Kartik, Harshita Khandelwal, Manish Shrivastava, Vivek Gupta, Mohit Bansal, Dan Roth:
Knowledge-Aware Reasoning over Multimodal Semi-structured Tables. CoRR abs/2408.13860 (2024)
[i194]Srija Mukhopadhyay, Abhishek Rajgaria, Prerana Khatiwada, Vivek Gupta, Dan Roth:
MAPWise: Evaluating Vision-Language Models for Advanced Map Queries. CoRR abs/2409.00255 (2024)

