


ACL 2025: Vienna, Austria
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar:

Findings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-256-5 - Frontmatter.

- Yachao Zhao, Bo Wang, Yan Wang, Dongming Zhao, Ruifang He, Yuexian Hou:

Explicit vs. Implicit: Investigating Social Bias in Large Language Models through Self-Reflection. 1-12 - Yanbei Jiang, Yihao Ding, Chao Lei, Jiayang Ao, Jey Han Lau, Krista A. Ehinger:

Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task. 13-45 - Guhao Feng, Kai Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zhenguo Li, Liwei Wang:

How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs. 46-85 - Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, Jianfeng Gao:

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts. 86-102 - Dongshuo Liu, Zhijing Wu, Dandan Song, Heyan Huang:

A Persona-Aware LLM-Enhanced Framework for Multi-Session Personalized Dialogue Generation. 103-123 - Yanzhi Tian, Zeming Liu, Zhengyang Liu, Yuhang Guo:

Exploring In-Image Machine Translation with Real-World Background. 124-137 - Wei Li, Lujun Li, Mark G. Lee, Shengjie Sun, Lei Zhang, Wei Xue, Yike Guo:

BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios. 138-152 - Lingyuan Liu, Mengxiang Zhang:

GOLFer: Smaller LMs-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval. 153-162 - Lingyuan Liu, Mengxiang Zhang:

Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion. 163-173 - Alexander Shvets:

Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification. 174-191 - Zifeng Cheng, Zhaoling Chen, Zhiwei Jiang, Yafeng Yin, Cong Wang, Shiping Ge, Qing Gu:

Multi-Prompting Decoder Helps Better Language Understanding. 192-208 - Sam O'Connor Russell, Naomi Harte:

Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction. 209-221 - Bingxiang He, Ning Ding, Cheng Qian, Jia Deng, Ganqu Cui, Lifan Yuan, Haiwen Hong, Huan-ang Gao, Longtao Huang, Hui Xue, Huimin Chen, Zhiyuan Liu, Maosong Sun:

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning. 222-243 - Jie Zhu, Junhui Li, Yalong Wen, Xiandong Li, Lifan Guo, Feng Chen:

MFinMeeting: A Multilingual, Multi-Sector, and Multi-Task Financial Meeting Understanding Evaluation Dataset. 244-266 - Yijie Zhong, Yunfan Gao, Xiaolian Zhang, Haofen Wang:

ODDA: An OODA-Driven Diverse Data Augmentation Framework for Low-Resource Relation Extraction. 267-285 - Luca Cagliero, Lorenzo Vaiani, Eliana Pastor, Alkis Koudounas, Elena Baralis, Vittorio Mazzia, Sandro Pollastrini, Thomas Gueudré, Manuel Giollo, Daniele Amberti, Yue Wu:

Detecting and Mitigating Challenges in Zero-Shot Video Summarization with Video LLMs. 286-301 - Tarek Mahmoud, Zhuohan Xie, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Purificação Silvano, Roman Yangarber, Shivam Sharma, Elisa Sartori, Nicolas Stefanovitch, Giovanni Da San Martino, Jakub Piskorski, Preslav Nakov:

Entity Framing and Role Portrayal in the News. 302-326 - Guangya Wan, Yuqi Wu, Hao Wang, Shengming Zhao, Jie Chen, Sheng Li:

Derailer-Rerailer: Adaptive Verification for Efficient and Reliable Language Model Reasoning. 327-348 - Yiming Li, Zhao Zhang:

Leveraging Large Language Models for Conversational Multi-Doc Question Answering: The First Place of WSDM Cup 2024. 349-355 - Wenyu Tao, Xiaofen Xing, Yirong Chen, Linyi Huang, Xiangmin Xu:

TreeRAG: Unleashing the Power of Hierarchical Storage for Enhanced Knowledge Retrieval in Long Documents. 356-371 - Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo:

Attention with Dependency Parsing Augmentation for Fine-Grained Attribution. 372-387 - Yikuan Hu, Chen Huang, Wenqiang Lei:

ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues. 388-408 - Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho:

Defensive Prompt Patch: A Robust and Generalizable Defense of Large Language Models against Jailbreak Attacks. 409-437 - Jessica Lin, Amir Zeldes:

GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction. 438-455 - Zacchary Sadeddine, Fabian M. Suchanek:

Verifying the Steps of Deductive Reasoning Chains. 456-475 - Pardis Sadat Zahraei, Ali Emami:

Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations. 476-501 - Benjamin C. Warner, Ziqi Xu, Simon Haroutounian, Thomas George Kannampallil, Chenyang Lu:

Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection. 502-520 - Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu:

Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs. 521-533 - Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu:

Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs. 534-550 - Kazuki Irie:

Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? A Petroglyph Revisited. 551-559 - Guofeng Cui, Pichao Wang, Yang Liu, Zemian Ke, Zhu Liu, Vimal Bhat:

CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation. 560-574 - Nishanth Sridhar Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser:

Talking Point based Ideological Discourse Analysis in News Events. 575-594 - Runheng Liu, Xingchen Xiao, Heyan Huang, Zewen Chi, Zhijing Wu:

FlashBack: Efficient Retrieval-Augmented Language Modeling for Fast Inference. 595-608 - Guangya Yu, Yanhao Li, Zongying Jiang, Yuxiong Jin, Li Dai, Yupian Lin, Ruihui Hou, Weiyan Zhang, Yongqi Fan, Qi Ye, Jingping Liu, Tong Ruan:

CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation. 609-626 - Liyu Zhang, Weiqi Wang, Tianqing Fang, Yangqiu Song:

ConKE: Conceptualization-Augmented Knowledge Editing in Large Language Models for Commonsense Reasoning. 627-635 - ChengAo Shen, Zhengzhang Chen, Dongsheng Luo, Dongkuan Xu, Haifeng Chen, Jingchao Ni:

Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery. 636-660 - Yaxun Dai, Haiqin Yang, Hao Mou, Pingfu Chao:

PARSQL: Enhancing Text-to-SQL through SQL Parsing and Reasoning. 661-681 - Yuntai Bao, Xuhong Zhang, Tianyu Du, Xinkui Zhao, Zhengwen Feng, Hao Peng, Jianwei Yin:

Probing the Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks. 682-700 - Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover:

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization. 701-723 - Junhao Yu, Yan Zhuang, Yuxuan Sun, Weibo Gao, Qi Liu, Mingyue Cheng, Zhenya Huang, Enhong Chen:

TestAgent: An Adaptive and Intelligent Expert for Human Assessment. 724-747 - Quan Ze Chen, Kevin Feng, Chan Young Park, Amy X. Zhang:

SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment. 748-765 - Kushal Jain, Moritz Miller, Niket Tandon, Kumar Shridhar:

First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning. 766-778 - Wei Xiang, Chuanhong Zhan, Qing Zhang, Bang Wang:

Evaluating Instructively Generated Statement by Large Language Models for Directional Event Causality Identification. 779-785 - Chengwei Wei, Bin Wang, Jung-Jae Kim, Guimei Liu, Nancy F. Chen:

CoinMath: Harnessing the Power of Coding Instruction for Math LLM. 786-797 - Zain Muhammad Mujahid, Dilshod Azizov, Maha Tufail Agro, Preslav Nakov:

Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts. 798-819 - Kun Zhang, Oana Balalau, Ioana Manolescu:

Structured Discourse Representation for Factual Consistency Verification. 820-838 - Chuyi Kong, Ziyang Luo, Hongzhan Lin, Zhiyuan Fan, Yaxin Fan, Yuxi Sun, Jing Ma:

SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing LLMs. 839-866 - Luke Gessler, Alexis Palmer, Katharina von der Wense:

Understanding the Gap: an Analysis of Research Collaborations in NLP and Language Documentation. 867-877 - Juntao Tan, Liangwei Yang, Zuxin Liu, Zhiwei Liu, Rithesh R. N., Tulika Manoj Awalgaonkar, Jianguo Zhang, Weiran Yao, Ming Zhu, Shirley Kokane, Silvio Savarese, Huan Wang, Caiming Xiong, Shelby Heinecke:

PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data. 878-893 - Simret Araya Gebreegziabher, Kuangshi Ai, Zheng Zhang, Elena L. Glassman, Toby Jia-Jun Li:

Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning. 894-906 - Eric Modesitt, Ke Yang, Spencer Hulsey, Xin Liu, ChengXiang Zhai, Volodymyr V. Kindratenko:

ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study. 907-926 - Xiaobo Guo, Soroush Vosoughi:

Serial Position Effects of Large Language Models. 927-953 - Zhiyin Yu, Chao Zheng, Chong Chen, Xian-Sheng Hua, Xiao Luo:

scRAG: Hybrid Retrieval-Augmented Generation for LLM-based Cross-Tissue Single-Cell Annotation. 954-970 - Abu Ubaida Akash, Ahmed Fahmy, Amine Trabelsi:

Can Large Language Models Address Open-Target Stance Detection? 971-985 - Congchi Yin, Yongpeng Zhang, Xuyun Wen, Piji Li:

Improve Language Model and Brain Alignment via Associative Memory. 986-999 - Ziyang Ma, Xiquan Li, Yakun Song, Wenxi Chen, Chenpeng Du, Jian Wu, Yuanzhe Chen, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen:

Towards Reliable Large Audio Language Model. 1000-1014 - Sho Takase, Ryokan Ri, Shun Kiyono, Takuya Kato:

Large Vocabulary Size Improves Large Language Models. 1015-1026 - Zihan Wang, Xiaocui Yang, Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang:

MUSE: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles. 1027-1053 - Michelle Wastl, Jannis Vamvas, Rico Sennrich:

Machine Translation Models are Zero-Shot Detectors of Translation Direction. 1054-1074 - Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar:

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination. 1075-1096 - Jie He, Jennifer Neville, Mengting Wan, Longqi Yang, Hui Liu, Xiaofeng Xu, Xia Song, Jeff Z. Pan, Pei Zhou:

GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation. 1097-1122 - Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen:

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution. 1123-1139 - Zixuan Wu, Yoolim Kim, Carolyn Jane Anderson:

GlyphPattern: An Abstract Pattern Recognition for Vision-Language Models. 1140-1175 - Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt:

FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation. 1176-1191 - Guocong Li, Weize Liu, Yihang Wu, Ping Wang, Shuaihan Huang, Hongxia Xu, Jian Wu:

From Misleading Queries to Accurate Answers: A Three-Stage Fine-Tuning Method for LLMs. 1192-1209 - Di Wu, Xin Lu, Yanyan Zhao, Bing Qin:

Separate the Wheat from the Chaff: A Post-Hoc Approach to Safety Re-Alignment for Fine-Tuned Language Models. 1210-1225 - Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu:

Nuclear Deployed!: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents. 1226-1310 - Dacao Zhang, Kun Zhang, Shimao Chu, Le Wu, Xin Li, Si Wei:

MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning. 1311-1324 - Xin-Yu Xiao, Yalei Liu, Xiangyu Liu, Zengrui Li, Erwei Yin, Qianchen Xia:

Lunar Twins: We Choose to Go to the Moon with Large Language Models. 1325-1339 - Dora Zhao, Qianou Ma, Xinran Zhao, Chenglei Si, Chenyang Yang, Ryan Louie, Ehud Reiter, Diyi Yang, Tongshuang Wu:

SPHERE: An Evaluation Card for Human-AI Systems. 1340-1365 - Maximillian Chen, Ruoxi Sun, Sercan Ö. Arik:

Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling. 1366-1387 - Haochen Liu, Song Wang, Chen Chen, Jundong Li:

Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models. 1388-1400 - Huaizhi Qu, Xinyu Zhao, Jie Peng, Kwonjoon Lee, Behzad Dariush, Tianlong Chen:

UQ-Merge: Uncertainty Guided Multimodal Large Language Model Merging. 1401-1417 - Korbinian Q. Weidinger, T. Y. S. S. Santosh, Oana Ichim, Matthias Grabmair:

AQuAECHR: Attributed Question Answering for European Court of Human Rights. 1418-1447 - Yuhao Zhang, Xiangnan Ma, Kaiqi Kou, Peizhuo Liu, Weiqiao Shan, Benyou Wang, Tong Xiao, Yuxin Huang, Zhengtao Yu, JingBo Zhu:

Leveraging Unit Language Guidance to Advance Speech Modeling in Textless Speech-to-Speech Translation. 1448-1460 - Yiqin Wang, Haoji Zhang, Jingqi Tian, Yansong Tang:

Ponder & Press: Advancing Visual GUI Agent towards General Computer Control. 1461-1473 - Jiayi Gui, Yiming Liu, Jiale Cheng, Xiaotao Gu, Xiao Liu, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang:

LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models. 1474-1491 - Jiarui Ji, Runlin Lei, Jialing Bi, Zhewei Wei, Xu Chen, Yankai Lin, Xuchen Pan, Yaliang Li, Bolin Ding:

LLM-Based Multi-Agent Systems are Scalable Graph Generative Models. 1492-1523 - Tiankai Yang, Yi Nian, Li Li, Ruiyao Xu, Yuangang Li, Jiaqi Li, Zhuo Xiao, Xiyang Hu, Ryan A. Rossi, Kaize Ding, Xia Hu, Yue Zhao:

AD-LLM: Benchmarking Large Language Models for Anomaly Detection. 1524-1547 - Jie Liu, Guohua Wang, Ronghui Yang, Jiajie Zeng, Mengchen Zhao, Yi Cai:

RTADev: Intention Aligned Multi-Agent Framework for Software Development. 1548-1581 - Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle, Saravan Rajmohan:

TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning. 1582-1597 - Kyeongman Park, Minbeom Kim, Kyomin Jung:

A Character-Centric Creative Story Generation via Imagination. 1598-1645 - Minghan Wang, Viet-Thanh Pham, Farhad Moghimifar, Thuy-Trang Vu:

Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model. 1646-1662 - Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li:

Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration. 1663-1699 - Chuanyuan Tan, Wenbiao Shao, Hao Xiong, Tong Zhu, Zhenhua Liu, Kai Shi, Wenliang Chen:

UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions. 1700-1715 - Minjie Qiang, Zhongqing Wang, Xiaoyi Bao, Haoyuan Ma, Shoushan Li, Guodong Zhou:

Exploring Knowledge Filtering for Retrieval-Augmented Discriminative Tasks. 1716-1729 - Chong Li, Yingzhuo Deng, Jiajun Zhang, Chengqing Zong:

Group then Scale: Dynamic Mixture-of-Experts Multilingual Language Model. 1730-1754 - Fangxu Yu, Junjie Guo, Zhen Wu, Xinyu Dai:

Beyond Verbal Cues: Emotional Contagion Graph Network for Causal Emotion Entailment. 1755-1767 - Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun:

Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic. 1768-1806 - Sondre Wold, Lucas Georges Gabriel Charpentier, Étienne Simon:

Systematic Generalization in Language Models Scales with Information Entropy. 1807-1819 - Byung-Doh Oh, Hongao Zhu, William Schuler:

The Inverse Scaling Effect of Pre-Trained Language Model Surprisal Is Not Due to Data Leakage. 1820-1827 - Ganlin Xu, Zhoujia Zhang, Wangyi Mei, Jiaqing Liang, Weijia Lu, Xiaodong Zhang, Zhifei Yang, Xiaofeng Ma, Yanghua Xiao, Deqing Yang:

Logical Consistency is Vital: Neural-Symbolic Information Retrieval for Negative-Constraint Queries. 1828-1847 - Rena Wei Gao, Xuetong Wu, Siwen Luo, Caren Han, Feng Liu:

'No' Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue. 1848-1864 - Qizhi Wan, Liu Tao, Changxuan Wan, Rong Hu, Keli Xiao, Yuxin Shuai:

Event Pattern-Instance Graph: A Multi-Round Role Representation Learning Strategy for Document-Level Event Argument Extraction. 1865-1877 - Lukas Edman, Helmut Schmid, Alexander Fraser:

EXECUTE: A Multilingual Benchmark for LLM Token Understanding. 1878-1887 - Wei-Fan Chen, Zhixue Zhao, Akbar Karimi, Lucie Flek:

Explainable Hallucination through Natural Language Inference Mapping. 1888-1896 - Hao Liu, Zhengren Wang, Xi Chen, Zhiyu Li, Feiyu Xiong, Qinhan Yu, Wentao Zhang:

HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation. 1897-1913 - Markus Frohmann, Gabriel Meseguer-Brocal, Markus Schedl, Elena V. Epure:

Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion. 1914-1926 - Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim:

Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models. 1927-1951 - Xiaoning Dong, Wenbo Hu, Wei Xu, Tianxing He:

SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage. 1952-1987 - Yifan Hu, Rui Liu, Yi Ren, Xiang Yin, Haizhou Li:

Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis. 1988-2003 - Aochuan Chen, Jiashun Cheng, Zijing Liu, Ziqi Gao, Fugee Tsung, Yu Li, Jia Li:

Parameter-Efficient Fine-Tuning via Circular Convolution. 2004-2019 - Jiahao Li, Zhendong Mao, Quan Wang:

Alleviating Hallucinations in Large Language Models via Truthfulness-driven Rank-adaptive LoRA. 2020-2031 - Xinye Li, Zunwen Zheng, Qian Zhang, Dekai Zhuang, Jiabao Kang, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui:

ScEdit: Script-based Assessment of Knowledge Editing. 2032-2052 - Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang:

SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models. 2053-2069 - Rena Wei Gao, Ming-Bin Chen, Lea Frermann, Jey Han Lau:

Moderation Matters: Measuring Conversational Moderation Impact in English as a Second Language Group Discussion. 2070-2095 - Katherine Atwell, Mandy Simons, Malihe Alikhani:

Measuring Bias and Agreement in Large Language Model Presupposition Judgments. 2096-2107 - Jeonghun Baek, Akiko Aizawa, Kiyoharu Aizawa:

Harnessing PDF Data for Improving Japanese Large Multimodal Models. 2108-2123 - Pranaydeep Singh, Eneko Agirre, Gorka Azkune, Orphée De Clercq, Els Lefever:

EnerGIZAr: Leveraging GIZA++ for Effective Tokenizer Initialization. 2124-2137 - Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Guozhi Wang, Dingyu Zhang, Shuai Ren, Hongsheng Li:

AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents. 2138-2156 - Houjun Liu, John Bauer, Christopher D. Manning:

Drop Dropout on Single Epoch Language Model Pretraining. 2157-2166 - Zongqi Wang, Baoyuan Wu, Jingyuan Deng, Yujiu Yang:

Robust and Minimally Invasive Watermarking for EaaS. 2167-2191 - Andrei Jarca, Florinel-Alin Croitoru, Radu Tudor Ionescu:

Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text. 2192-2201 - Taneesh Gupta, Shivam Shandilya, Xuchao Zhang, Rahul Madhavan, Supriyo Ghosh, Chetan Bansal, Huaxiu Yao, Saravan Rajmohan:

CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling. 2202-2261 - Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie Liu, Xie Chen:

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training. 2262-2282 - Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, Liwei Wang:

C²LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation. 2283-2306 - Wei Zhou, Mohsen Mesgar, Heike Adel, Annemarie Friedrich:

Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering. 2307-2318 - Keyeun Lee, Seolhee Lee, Esther Hehsun Kim, Yena Ko, Jinsu Eun, Dahee Kim, Hyewon Cho, Haiyi Zhu, Robert E. Kraut, Eunyoung Suh, Eun-mee Kim, Hajin Lim:

Adaptive-VP: A Framework for LLM-Based Virtual Patients that Adapts to Trainees' Dialogue to Facilitate Nurse Communication Training. 2319-2352 - Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Minghui Fang, Jieming Zhu, Zhenhua Dong, Sashuai Zhou, Zhou Zhao:

Enhancing Multimodal Unified Representations for Cross Modal Generalization. 2353-2366 - Da Ju, Hagen Blix, Adina Williams:

Domain Regeneration: How well do LLMs match syntactic properties of text domains? 2367-2388 - Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier:

Structural Deep Encoding for Table Question Answering. 2389-2402 - Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang:

MPL: Multiple Programming Languages with Large Language Models for Information Extraction. 2403-2414 - Zheng Chu, Huiming Fan, Jingchang Chen, Qianyu Wang, Mingda Yang, Jiafeng Liang, Zhongjie Wang, Hao Li, Guo Tang, Ming Liu, Bing Qin:

Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering. 2415-2438 - Ruizhe Li, Yanjun Gao:

Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions. 2439-2465 - Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li:

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation. 2466-2482 - Ruikang Hu, Shaoyu Lin, Yeliang Xiu, Yongmei Liu:

LTRAG: Enhancing Autoformalization and Self-refinement for Logical Reasoning with Thought-Guided RAG. 2483-2493 - Giuseppe Ruggiero, Matteo Testa, Jurgen Van de Walle, Luigi Di Caro:

Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation. 2494-2504 - Ke Wang, Junting Pan, Linda Wei, Aojun Zhou, Weikang Shi, Zimu Lu, Han Xiao, Yunqiao Yang, Houxing Ren, Mingjie Zhan, Hongsheng Li:

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning. 2505-2534 - Boyang Xue, Hongru Wang, Rui Wang, Sheng Wang, Zezhong Wang, Yiming Du, Bin Liang, Wenxuan Zhang, Kam-Fai Wong:

MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models. 2535-2556 - Keyuan Cheng, Zijian Kan, Zhuoran Zhang, Muhammad Asif Ali, Lijie Hu, Di Wang:

COMPKE: Complex Question Answering under Knowledge Editing. 2557-2576 - Junhao Hu, Wenrui Huang, Weidong Wang, Zhenwen Li, Tiancheng Hu, Zhixia Liu, Xusheng Chen, Tao Xie, Yizhou Shan:

RaaS: Reasoning-Aware Attention Sparsity for Efficient LLM Reasoning. 2577-2590 - Rongguang Ye, Ming Tang:

One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models. 2591-2604 - Shangda Wu, Zhancheng Guo, Ruibin Yuan, Junyan Jiang, Seungheon Doh, Gus Xia, Juhan Nam, Xiaobing Li, Feng Yu, Maosong Sun:

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages. 2605-2625 - Ming Zhang, Yuhui Wang, Yujiong Shen, Tingyi Yang, Changhao Jiang, Yilong Wu, Shihan Dou, Qinhao Chen, Zhiheng Xi, Zhihao Zhang, Yi Dong, Zhen Wang, Zhihui Fei, Mingyang Wan, Tao Liang, Guojun Ma, Qi Zhang, Tao Gui, Xuanjing Huang:

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts. 2626-2649 - Lang Qin, Yao Zhang, Hongru Liang, Adam Jatowt, Zhenglu Yang:

Listening to Patients: Detecting and Mitigating Patient Misreport in Medical Dialogue System. 2650-2664 - Xiaoyang Hu, Richard L. Lewis:

Do Language Models Understand the Cognitive Tasks Given to Them? Investigations with the N-Back Paradigm. 2665-2677 - Yuxia Geng, Runkai Zhu, Jiaoyan Chen, Jintai Chen, Xiang Chen, Zhuo Chen, Shuofei Qiao, Yuxiang Wang, Xiaoliang Xu, Sheng-Jun Huang:

Graph-guided Cross-composition Feature Disentanglement for Compositional Zero-shot Learning. 2678-2690 - Wenhao Li, Yuxin Zhang, Gen Luo, Daohai Yu, Rongrong Ji:

Training Long-Context LLMs Efficiently via Chunk-wise Optimization. 2691-2700 - Jiashun Cheng, Aochuan Chen, Nuo Chen, Ziqi Gao, Yuhan Li, Jia Li, Fugee Tsung:

Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps. 2701-2718 - Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang:

CODEMENV: Benchmarking Large Language Models on Code Migration. 2719-2744 - V. S. D. S. Mahesh Akavarapu, Hrishikesh Terdalkar, Pramit Bhattacharyya, Shubhangi Agarwal, Vishakha Deulgaonkar, Chaitali Dangarikar, Pralay Manna, Arnab Bhattacharya:

A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs. 2745-2761 - Jilong Li, Zhenxi Song, Jiaqi Wang, Meishan Zhang, Honghai Liu, Min Zhang, Zhiguo Zhang:

BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation. 2762-2778 - Yahan Yu, Duzhen Zhang, Yong Ren, Xuanle Zhao, Xiuyi Chen, Chenhui Chu:

Progressive LoRA for Multimodal Continual Instruction Tuning. 2779-2796 - Lukasz Borchmann:

ARC 'Challenge' Is Not That Challenging. 2797-2804 - Vera Neplenbroek, Arianna Bisazza, Raquel Fernández:

Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation. 2805-2830 - Tomás Vergara Browne, Alvaro Soto:

Tracr-Injection: Distilling Algorithms into Pre-trained Language Models. 2831-2843 - Ximing Dong, Shaowei Wang, Dayi Lin, Ahmed E. Hassan:

Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization. 2844-2859 - Wei Yao, Wenkai Yang, Ziqiao Wang, Yankai Lin, Yong Liu:

Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL. 2860-2888 - Felix Drinkall, Stefan Zohren, Michael McMahon, Janet B. Pierrehumbert:

Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups. 2889-2904 - Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Kun Wang, Qingsong Wen, Yang Wang:

NetSafe: Exploring the Topological Safety of Multi-agent System. 2905-2938 - Qiji Zhou, Yifan Gong, Guangsheng Bao, Hongjie Qiu, Jinqiang Li, Xiangrong Zhu, Huajian Zhang, Yue Zhang:

Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation. 2939-2957 - Hanlun Zhu, Yunshi Lan, Xiang Li, Weining Qian:

Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing. 2958-2971 - Jiaqi Li, Yixuan Tang, Yi Yang:

Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning. 2972-2989 - Siqi Fan, Xuezhi Fang, Xingrun Xing, Peng Han, Shuo Shang, Yequan Wang:

Position-Aware Depth Decay Decoding (D³): Boosting Large Language Model Inference Efficiency. 2990-3001 - Anirudh Maiya, Razan Alghamdi, Maria Leonor Pacheco, Ashutosh Trivedi, Fabio Somenzi:

Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku. 3002-3009 - Andrea Pedrotti, Michele Papucci, Cristiano Ciaccio, Alessio Miaschi, Giovanni Puccetti, Felice Dell'Orletta, Andrea Esuli:

Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors. 3010-3031 - Siqi Ouyang, Xi Xu, Lei Li:

InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model. 3032-3046 - Jiahui Geng, Qing Li, Zongxiong Chen, Yuxia Wang, Derui Zhu, Zhuohan Xie, Chenyang Lyu, Xiuying Chen, Preslav Nakov, Fakhri Karray:

VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration. 3047-3059 - Haozhe Wang, Long Li, Chao Qu, Weidi Xu, Fengming Zhu, Wei Chu, Fangzhen Lin:

To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization. 3060-3075 - Soo Kyung Kim, Hyunsoo Cho:

GOODLIAR: A Reinforcement Learning-Based Deceptive Agent for Disrupting LLM Beliefs on Foundational Principles. 3076-3101 - James Xu Zhao, Jimmy Z. J. Liu, Bryan Hooi, See-Kiong Ng:

How Does Response Length Affect Long-Form Factuality. 3102-3125 - Guiyang Hou, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Weiming Lu:

Scaling LLMs' Social Reasoning: Sprinkle Cognitive "Aha Moment" into Fundamental Long-thought Logical Capabilities. 3126-3138 - Yuzheng Cai, Zhenyue Guo, Yiwen Pei, Wanrui Bian, Weiguo Zheng:

SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation. 3139-3158 - Bihan Zhou, Haopeng Ren, Li Yuan, Yi Cai, Liuwen Cao, Zikun Deng:

RuleEdit: Towards Rule-Level Knowledge Generalization to Mitigate Over-Editing in Large Language Models. 3159-3175 - Yifu Qiu, Varun R. Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, Benjamin Han:

Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models. 3176-3192 - Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, Qi Zhang:

GeAR: Generation Augmented Retrieval. 3193-3207 - Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han:

A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion. 3208-3220 - Yuzhe Ding, Kang He, Bobo Li, Li Zheng, Haijun He, Fei Li, Chong Teng, Donghong Ji:

Zero-Shot Conversational Stance Detection: Dataset and Approaches. 3221-3235 - Cehao Yang, Xueyuan Lin, Chengjin Xu, Xuhui Jiang, Shengjie Ma, Aofan Liu, Hui Xiong, Jian Guo:

LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data. 3236-3256 - Rongwen Zhao, Jeffrey Flanigan:

SYNTHVERIFY: Enhancing Zero-Shot Claim Verification through Step-by-Step Synthetic Data Generation. 3257-3274 - Xu Chu, Zhijie Tan, Hanlin Xue, Guanyu Wang, Tong Mo, Weiping Li:

Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains. 3275-3293 - Zihao Wu, YongXiang Hua, Yongxin Zhu, Fang Zhang, Linli Xu:

Dynamic Prefix as Instructor for Incremental Named Entity Recognition: A Unified Seq2Seq Generation Framework. 3294-3306 - Somin Wadhwa, Chantal Shaib, Silvio Amir, Byron C. Wallace:

Who Taught You That? Tracing Teachers in Model Distillation. 3307-3315 - Grace Byun, Jinho D. Choi:

D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Models. 3316-3349 - Jun Wang, Jiamu Zhou, Xihuai Wang, Xiaoyun Mo, Haoyu Zhang, Qiqiang Lin, Jincheng Jincheng, Muning Wen, Weinan Zhang, Qiuying Peng:

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Assistant Scenarios. 3350-3376 - Do Xuan Long, Duong Ngoc Yen, Do Xuan Trong, Anh Tuan Luu, Kenji Kawaguchi, Shafiq Joty, Min-Yen Kan, Nancy F. Chen:

Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines. 3377-3411 - Gabriele Tuccio, Luana Bulla, Maria Madonia, Aldo Gangemi, Misael Mongiovì:

GRAMMAR-LLM: Grammar-Constrained Natural Language Generation. 3412-3422 - Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang:

MANBench: Is Your Multimodal Model Smarter than Human? 3423-3449 - Mahammed Kamruzzaman, Abdullah Al Monsur, Shrabon Kumar Das, Enamul Hassan, Gene Louis Kim:

BanStereoSet: A Dataset to Measure Stereotypical Social Biases in LLMs for Bangla. 3450-3460 - Matthieu Futeral, Armel Randy Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot:

mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus. 3461-3494 - Vladislav Mikhailov, Tita Ranveig Enstad, David Samuel, Hans Christian Farsethås, Andrey Kutuzov, Erik Velldal, Lilja Øvrelid:

NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark. 3495-3541 - Thang Le, Huy Huu Nguyen, Anh Tuan Luu, Thien Huu Nguyen:

Massively Multilingual Instruction-Following Information Extraction. 3542-3585 - Kang He, Yuzhe Ding, Haining Wang, Fei Li, Chong Teng, Donghong Ji:

DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning. 3586-3601 - Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li:

Large Language Models in Bioinformatics: A Survey. 3602-3615 - Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen:

ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing. 3616-3630 - Qin Liu, Chao Shang, Ling Liu, Nikolaos Pappas, Jie Ma, Neha Anna John, Srikanth Doss, Lluís Màrquez, Miguel Ballesteros, Yassine Benajiba:

Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models. 3631-3643 - Xiyue Zhu, Peng Tang, Haofu Liao, Srikar Appalaraju:

Turbocharging Web Automation: The Impact of Compressed History States. 3644-3651 - Weijie Shi, Hao Chen, Jiaming Li, Yao Zhao, Yazhong Zhang, Qijin Chen, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Xiaofang Zhou:

Making RALM Robust to Irrelevant Contexts via Layer Knowledge Guided Attention. 3652-3668 - Yuting Huang, Chengyuan Liu, Yifeng Feng, Yiquan Wu, Chao Wu, Fei Wu, Kun Kuang:

Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction. 3669-3690 - Mert Inan, Anthony Sicilia, Malihe Alikhani:

SignAlignLM: Integrating Multimodal Sign Language Processing into Large Language Models. 3691-3706 - Yuhui Zhang, Yuchang Su, Yiming Liu, Serena Yeung-Levy:

NegVQA: Can Vision Language Models Understand Negation? 3707-3716 - Debela Gemechu, Ramon Ruiz-Dolz, Henrike Beyer, Chris Reed:

Natural Language Reasoning in Large Language Models: Analysis and Evaluation. 3717-3741 - Haoran Wang, Zhenyu Hou, Yao Wei, Jie Tang, Yuxiao Dong:

SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling. 3742-3761 - Janek Bevendorff, Matti Wiegmann, Emmelie Richter, Martin Potthast, Benno Stein:

The Two Paradigms of LLM Detection: Authorship Attribution vs Authorship Verification. 3762-3787 - Yue Wan, Xiaowei Jia, Xiang Lorraine Li:

Unveiling Confirmation Bias in Chain-of-Thought Reasoning. 3788-3804 - Mufan Qiu, Xinyu Hu, Fengwei Zhan, Sukwon Yun, Jie Peng, Ruichen Zhang, Bhavya Kailkhura, Jiekun Yang, Tianlong Chen:

GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models. 3805-3819 - Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao:

RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service. 3820-3837 - Sharath Naganna, Saprativa Bhattacharjee, Biplab Banerjee, Pushpak Bhattacharyya:

"My life is miserable, have to sign 500 autographs everyday": Exposing Humblebragging, the Brags in Disguise. 3838-3858 - Xuanliang Zhang, Dingzirui Wang, Baoxin Wang, Longxu Dou, Xinyuan Lu, Keyan Xu, Dayong Wu, Qingfu Zhu:

SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types. 3859-3881 - Yingtai Xiao, Yuqing Zhu, Sirat Samyoun, Wanrong Zhang, Jiachen T. Wang, Jian Du:

TokenShapley: Token Level Context Attribution with Shapley Value. 3882-3894 - Jinghan Zhang, Xiting Wang, Fengran Mo, Yeyang Zhou, Wanfu Gao, Kunpeng Liu:

Entropy-based Exploration Conduction for Multi-step Reasoning. 3895-3906 - Emily Corvi, Hannah Washington, Stefanie Reed, Chad Atalla, Alexandra Chouldechova, P. Alex Dow, Jean Garcia-Gathright, Nicholas J. Pangakis, Emily Sheng, Dan Vann, Matthew Vogel, Hanna M. Wallach:

Taxonomizing Representational Harms using Speech Act Theory. 3907-3932 - Prafulla Kumar Choubey, Xiangyu Peng, Shilpa Bhagavath, Caiming Xiong, Shiva Kumar Pentyala, Chien-Sheng Wu:

Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents. 3933-3954 - Hayden S. Helm, Aranyak Acharyya, Youngser Park, Brandon Duderstadt, Carey E. Priebe:

Statistical inference on black-box generative models in the data kernel perspective space. 3955-3970 - Sohee Yang, Nora Kassner, Elena Gribovskaya, Sebastian Riedel, Mor Geva:

Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts? 3971-3992 - Zihan Liu, Yang Chen, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping:

AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling. 3993-4015 - Yongan Yu, Qingchen Hu, Xianda Du, Jiayin Wang, Fengran Mo, Renée Sieber:

WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models. 4016-4035 - Yun Zhang, Xue Geng, Lizi Liao, Jintong Sun, Minghe Yu, Ge Yu:

MeMoTune: A Measure and Moment-Driven Fine-Tuning Framework for Quantized Large Language Models. 4036-4050 - Sagi Shaier, George Arthur Baker, Chiranthan Sridhar, Lawrence Hunter, Katharina von der Wense:

MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset. 4051-4069 - Xiaoyi Bao, Jinghang Gu, Zhongqing Wang, Chu-Ren Huang:

Sentimental Image Generation for Aspect-based Sentiment Analysis. 4070-4081 - Zihang Liu, Jiawei Guo, Hao Zhang, Hongyang Chen, Jiajun Bu, Haishuai Wang:

Long-form Hallucination Detection with Self-elicitation. 4082-4100 - Qing Zong, Zhaowei Wang, Tianshi Zheng, Xiyu Ren, Yangqiu Song:

ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty. 4101-4117 - Rui He, Zhongqing Wang, Minjie Qiang, Hongling Wang, Yifan Zhang, Hua Xu, Shuai Fan, Guodong Zhou:

One-Dimensional Object Detection for Streaming Text Segmentation of Meeting Dialogue. 4118-4130 - Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Zhenyu Wu, Shangbin Feng, Meng Jiang:

CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts. 4131-4144 - Yuqicheng Zhu, Daniel Hernández, Yuan He, Zifeng Ding, Bo Xiong, Evgeny Kharlamov, Steffen Staab:

Predicate-Conditional Conformalized Answer Sets for Knowledge Graph Embeddings. 4145-4167 - Yifan Zhang, Yifan Luo, Yang Yuan, Andrew C. Yao:

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts. 4168-4189 - Zhuochun Li, Yuelyu Ji, Rui Meng, Daqing He:

Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review. 4190-4205 - Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera, Arun Balaji Buduru, Rajesh Sharma:

Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution. 4206-4214 - Bryan Li, Fiona Luo, Samar Haider, Adwait Agashe, Siyu Li, Runqi Liu, Miranda Muqing Miao, Shriya Ramakrishnan, Yuan Yuan, Chris Callison-Burch:

Multilingual Retrieval Augmented Generation for Culturally-Sensitive Tasks: A Benchmark for Cross-lingual Robustness. 4215-4241 - Pengyue Jia, Derong Xu, Xiaopeng Li, Zhaocheng Du, Xiangyang Li, Yichao Wang, Yuhao Wang, Qidong Liu, Maolin Wang, Huifeng Guo, Ruiming Tang, Xiangyu Zhao:

Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation. 4242-4256 - Yifei He, Alon Benhaim, Barun Patra, Praneetha Vaddamanu, Sanchit Ahuja, Parul Chopra, Vishrav Chaudhary, Han Zhao, Xia Song:

Scaling Laws for Multilingual Language Models. 4257-4273 - Jinyan Su, Preslav Nakov, Claire Cardie:

Corpus Poisoning via Approximate Greedy Gradient Descent. 4274-4294 - Huitong Pan, Qi Zhang, Mustapha Adamu, Eduard C. Dragut, Longin Jan Latecki:

Taxonomy-Driven Knowledge Graph Construction for Domain-Specific Scientific Applications. 4295-4320 - Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar:

Wanda++: Pruning Large Language Models via Regional Gradients. 4321-4333 - Vageesh Kumar Saxena, Benjamin Bashpole, Gijs van Dijck, Gerasimos Spanakis:

MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data. 4334-4373 - Shu Yang, Shenzhe Zhu, Zeyu Wu, Keyu Wang, Junchi Yao, Junchao Wu, Lijie Hu, Mengdi Li, Derek F. Wong, Di Wang:

Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements. 4374-4420 - Rafael Alberto Rivera Soto, Barry Y. Chen, Nicholas Andrews:

Mitigating Paraphrase Attacks on Machine-Text Detection via Paraphrase Inversion. 4421-4433 - Arijit Maji, Raghvendra Kumar, Akash Ghosh, Anushka, Sriparna Saha:

SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models' Knowledge of Indian Culture. 4434-4451 - Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang:

System Prompt Hijacking via Permutation Triggers in LLM Supply Chains. 4452-4473 - Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney:

Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers. 4474-4489 - Gyeongeun Lee, Zhu Wang, Sathya N. Ravi, Natalie Parde:

From Heart to Words: Generating Empathetic Responses via Integrated Figurative Language and Semantic Context Signals. 4490-4502 - Nurul Fajrin Ariyani, Zied Bouraoui, Richard Booth, Steven Schockaert:

There's No Such Thing as Simple Reasoning for LLMs. 4503-4514 - Aaron Gluck, Katharina von der Wense, Maria Leonor Pacheco:

CLIX: Cross-Lingual Explanations of Idiomatic Expressions. 4515-4529 - Dang Nguyen, Ali Payani, Baharan Mirzasoleiman:

Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity. 4530-4540 - Xiaoqiang Wang, Suyuchen Wang, Yun Zhu, Bang Liu:

R³Mem: Bridging Memory Retention and Retrieval via Reversible Compression. 4541-4557 - Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei:

Vision Language Model Helps Private Information De-Identification in Vision Data. 4558-4572 - Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei:

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges. 4573-4586 - Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu:

DeFine: Decision-Making with Analogical Reasoning over Factor Profiles. 4587-4603 - Cheng Qian, Emre Can Acikgoz, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji:

SMART: Self-Aware Agent for Tool Overuse Mitigation. 4604-4621 - Pablo Rodríguez, Silvia Paniagua Suárez, Pablo Gamallo, Susana Sotelo Docío:

Continued Pretraining and Interpretability-Based Evaluation for Low-Resource Languages: A Galician Case Study. 4622-4637 - Weixi Feng, Jiachen Li, Michael Saxon, Tsu-Jui Fu, Wenhu Chen, William Yang Wang:

TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation. 4638-4662 - Hanzhi Zhang, Heng Fan, Kewei Sha, Yan Huang, Yunhe Feng:

DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration. 4663-4676 - Bhaktipriya Radharapu, Manon Revel, Megan Ung, Sebastian Ruder, Adina Williams:

Arbiters of Ambivalence: Challenges of using LLMs in No-Consensus tasks. 4677-4731 - Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, Emma Strubell:

Beyond Text: Characterizing Domain Expert Needs in Document Research. 4732-4745 - Murong Yue, Ziyu Yao:

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack. 4746-4761 - Shih-Han Chou, Shivam Chandhok, Jim Little, Leonid Sigal:

MM-R³: On (In-)Consistency of Vision-Language Models (VLMs). 4762-4788 - Yuepei Li, Kang Zhou, Qiao Qiao, Bach Nguyen, Qing Wang, Qi Li:

Investigating Context Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style. 4789-4807 - Ziyi Yin, Muchao Ye, Yuanpu Cao, Jiaqi Wang, Aofei Chang, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma:

Shadow-Activated Backdoor Attacks on Multimodal Large Language Models. 4808-4829 - Kung-Hsiang Huang, Can Qin, Haoyi Qiu, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu:

Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding. 4830-4843 - Shihao Cai, Chongming Gao, Yang Zhang, Wentao Shi, Jizhi Zhang, Keqin Bao, Qifan Wang, Fuli Feng:

K-order Ranking Preference Optimization for Large Language Models. 4844-4859 - Xuyuan Liu, Lei Hsiung, Yaoqing Yang, Yujun Yan:

Spectral Insights into Data-Oblivious Critical Layers in Large Language Models. 4860-4877 - Xunzhu Tang, Jiechao Gao, Jin Xu, Tiezhu Sun, Yewei Song, Saad Ezzini, Wendkûuni C. Ouédraogo, Jacques Klein, Tegawendé F. Bissyandé:

SynFix: Dependency-Aware Program Repair via RelationGraph Analysis. 4878-4894 - Taeho Hwang, Sukmin Cho, Soyeong Jeong, Hoyun Song, SeungYoon Han, Jong C. Park:

EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation. 4895-4924 - Zhihu Wang, Shiwan Zhao, Yu Wang, Heyuan Huang, Sitao Xie, Yubo Zhang, Jiaxin Shi, Zhixing Wang, Hongyan Li, Junchi Yan:

Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives. 4925-4936 - Shuai Zhao, Xiaobao Wu, Cong-Duy T. Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Anh Tuan Luu:

Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation. 4937-4952 - Shuhe Wang, Guoyin Wang, Yizhong Wang, Jiwei Li, Eduard H. Hovy, Chen Guo:

Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning. 4953-4967 - Yongkang Chen, Chongyang Zhao, Jianwentian Jianwentian, Guiling Cao, Hu Li, Xiaohui Kuang:

Better Red Teaming via Searching with Large Language Model. 4968-4984 - Jiayi Han, Liang Du, Yiwen Wu, Guanming Liang, Xiangguo Zhou, Weibo Zheng, Donghong Han, Zixun Sun:

AdaV: Adaptive Text-visual Redirection for Vision-Language Models. 4985-4997 - Qian Wang, Tianyu Wang, Zhenheng Tang, Qinbin Li, Nuo Chen, Jingsheng Liang, Bingsheng He:

MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent System Without Predefined SOPs. 4998-5036 - Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu:

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment. 5037-5049 - Hongfei Xu, Zhuofei Liang, Qiuhui Liu, Lingling Mu:

A Self-Distillation Recipe for Neural Machine Translation. 5050-5064 - Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li:

BlockPruner: Fine-grained Pruning for Large Language Models. 5065-5080 - Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng:

Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective. 5081-5097 - Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li:

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-Context QA. 5098-5122 - Min Choi, Keonwoo Kim, Sungwon Chae, Sangyeob Baek:

An Empirical Study of Group Conformity in Multi-Agent Systems. 5123-5139 - Zhanglin Wu, Daimeng Wei, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Zongyao Li, Yuanchang Luo, Jinlong Yang, Zhiqiang Rao, Hao Yang:

Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation. 5140-5148 - Yeyuan Wang, Dehong Gao, Rujiao Long, Lei Yi, Linbo Jin, Libin Yang, Xiaoyan Cai:

ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning. 5149-5160 - Meihan Tong, Shuai Wang:

NovelCR: A Large-Scale Bilingual Dataset Tailored for Long-Span Coreference Resolution. 5161-5173 - Huangyw Huangyw, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao:

Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models. 5174-5193 - Weidong Wu, Qinlin Zhao, Hao Chen, Lexin Zhou, Defu Lian, Hong Xie:

Exploring the Choice Behavior of Large Language Models. 5194-5214 - Xueru Wen, Jie Lou, Xinyu Lu, Yuqiu Ji, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Debing Zhang, Le Sun:

On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation. 5215-5231 - Yurun Song, Xiangqing Shen, Rui Xia:

From Phrases to Subgraphs: Fine-Grained Semantic Parsing for Knowledge Graph Question Answering. 5232-5246 - Zhicheng Guo, Sijie Cheng, Yuchen Niu, Hao Wang, Sicheng Zhou, Wenbing Huang, Yang Liu:

StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs. 5247-5270 - Hoang Pham, Thanh-Do Nguyen, Khac-Hoai Nam Bui:

ClaimPKG: Enhancing Claim Verification via Pseudo-Subgraph Generation with Lightweight Specialized LLM. 5271-5290 - Baizhou Huang, Xiaojun Wan:

TriEmbed: Bridge the Gap between Text and Token Indices with Embedding Reparameterization. 5291-5297 - Cong Liu, Jie Wu, Weigang Wu, Xu Chen, Liang Lin, Wei-Shi Zheng:

Chain of Methodologies: Scaling Test Time Computation without Training. 5298-5312 - Jian Guan, Junfei Wu, Jia-Nan Li, Chuanqi Cheng, Wei Wu:

A Survey on Personalized Alignment - The Missing Piece for Large Language Models in Real-World Applications. 5313-5333 - Chenhao Ding, Jiangyang Li, Songlin Dong, Xinyuan Gao, Yuhang He, Yihong Gong:

SuLoRA: Subspace Low-Rank Adaptation for Parameter-Efficient Fine-Tuning. 5334-5349 - Yeong-Joon Ju, Ho-Joong Kim, Seong-Whan Lee:

MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval. 5350-5363 - Ruilin Zhao, Feng Zhao, Hong Zhang:

Correcting on Graph: Faithful Semantic Parsing over Knowledge Graphs with Large Language Models. 5364-5376 - Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Zhuo Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu:

COPR: Continual Human Preference Learning via Optimal Policy Regularization. 5377-5398 - Jie Sun, Junkang Wu, Jiancan Wu, Zhibo Zhu, Xingyu Lu, Jun Zhou, Lintao Ma, Xiang Wang:

Robust Preference Optimization via Dynamic Target Margins. 5399-5416 - Xiao Wang, Qingyi Si, Shiyu Zhu, Jianlong Wu, Li Cao, Liqiang Nie:

AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding. 5417-5432 - Hongru Wang, Wenyu Huang, Yufei Wang, Yuanhao Xi, Jianqiao Lu, Huan Zhang, Nan Hu, Zeming Liu, Jeff Z. Pan, Kam-Fai Wong:

Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges. 5433-5453 - Xiaochong Lan, Jie Feng, Yizhou Sun, Chen Gao, Jiahuan Lei, Xinleishi Xinleishi, Hengliang Luo, Yong Li:

Open-Set Living Need Prediction with Large Language Models. 5454-5472 - Ziyang Huang, Wangtao Sun, Jun Zhao, Kang Liu:

Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate. 5473-5488 - Mehdi Jafari, Yuncheng Hua, Hao Xue, Flora D. Salim:

Beyond Words: Integrating Theory of Mind into Conversational Agents for Human-Like Belief, Desire, and Intention Alignment. 5489-5508 - Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, Jieting Long, Weidong Cai:

Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities. 5509-5533 - Litu Ou, Mirella Lapata:

Context-Aware Hierarchical Merging for Long Document Summarization. 5534-5561 - Xiangqing Shen, Fanfan Wang, Siwei Wu, Rui Xia:

VCD: A Dataset for Visual Commonsense Discovery in Images. 5562-5577 - Hongru Wang, Deng Cai, Wanjun Zhong, Shijue Huang, Jeff Z. Pan, Zeming Liu, Kam-Fai Wong:

Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst. 5578-5596 - Yongsen Zheng, Mingjie Qian, Guohua Wang, Yang Liu, Ziliang Chen, Mingzhi Mao, Liang Lin, Kwok-Yan Lam:

HyperCRS: Hypergraph-Aware Multi-Grained Preference Learning to Burst Filter Bubbles in Conversational Recommendation System. 5597-5608 - Junyu Lu, Kai Ma, Kaichun Wang, Kelaiti Xiao, Roy Ka-Wei Lee, Bo Xu, Liang Yang, Hongfei Lin:

Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement. 5609-5626 - Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S. Ryoo:

Language Repository for Long Video Understanding. 5627-5646 - Jeonghyun Park, Hwanhee Lee:

Investigating Language Preference of Multilingual RAG Systems. 5647-5675 - Mei Guo, Chen Chen, Chunyan Hou, Yike Wu, Xiaojie Yuan:

FGDGNN: Fine-Grained Dynamic Graph Neural Network for Rumor Detection on Social Media. 5676-5687 - Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen M. Meng:

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching. 5688-5724 - Qingsong Zou, Jingyu Xiao, Qing Li, Zhi Yan, Yuhang Wang, Li Xu, Wenxuan Wang, Kuofeng Gao, Ruoyu Li, Yong Jiang:

QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language. 5725-5741 - Chengzhi Li, Heyan Huang, Ping Jian, Zhen Yang, Chenxu Wang, Yifan Wang:

Memory or Reasoning? Explore How LLMs Compute Mixed Arithmetic Expressions. 5742-5763 - Yunxiao Shi, Wujiang Xu, Zeqi Zhang, Xing Zi, Qiang Wu, Min Xu:

PersonaX: A Recommendation Agent-Oriented User Modeling Framework for Long Behavior Sequence. 5764-5787 - Shuliang Liu, Xinze Li, Zhenghao Liu, Yukun Yan, Cheng Yang, Zheni Zeng, Zhiyuan Liu, Maosong Sun, Ge Yu:

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models. 5788-5807 - Chiwei Zhu, Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Zhendong Mao:

Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability. 5808-5835 - Heng Yu, Junfeng Kang, Rui Li, Qi Liu, Liyang He, Zhenya Huang, Shuanghong Shen, Junyu Lu:

CA-GAR: Context-Aware Alignment of LLM Generation for Document Retrieval. 5836-5849 - Guhong Chen, Liyang Fan, Zihan Gong, Nan Xie, Zixuan Li, Ziqiang Liu, Chengming Li, Qiang Qu, Hamid Alinejad-Rokny, Shiwen Ni, Min Yang:

AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents. 5850-5865 - Jinyang Huang, Xiachong Feng, Qiguang Chen, Hanjie Zhao, Zihui Cheng, Jiesong Bai, Jingxuan Zhou, Min Li, Libo Qin:

MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios. 5866-5879 - Hui Huang, Xingyuan Bu, Hongli Zhou, Yingqi Qu, Jing Liu, Muyun Yang, Bing Xu, Tiejun Zhao:

An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4. 5880-5895 - Xueyang Feng, Jingsen Zhang, Jiakai Tang, Wei Li, Guohao Cai, Xu Chen, Quanyu Dai, Yue Zhu, Zhenhua Dong:

Expectation Confirmation Preference Optimization for Multi-Turn Conversational Recommendation Agent. 5896-5914 - Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, Wei Bi, Richard Yi Da Xu, Guo Li, Xian Yang:

ProMedTS: A Self-Supervised, Prompt-Guided Multimodal Approach for Integrating Medical Text and Time Series. 5915-5928 - Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He, Lijun Wu:

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge. 5929-5965 - Hwan Chang, Hwanhee Lee:

Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning. 5966-5982 - Wenhao Liu, Siyu An, Junru Lu, Muling Wu, Tianlong Li, Xiaohua Wang, Changze Lv, Xiaoqing Zheng, Di Yin, Xing Sun, Xuanjing Huang:

Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing. 5983-6005 - Jianghao Chen, Zhenlin Wei, Zhenjiang Ren, Ziyong Li, Jiajun Zhang:

LR²Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems. 6006-6032 - Tian Lan, Xiangdong Su, Xu Liu, Ruirui Wang, Ke Chang, Jiang Li, Guanglai Gao:

McBE: A Multi-task Chinese Bias Evaluation Benchmark for Large Language Models. 6033-6056 - Yiwei Fu, Yuxing Zhang, Chunchun Chen, JianwenMa JianwenMa, Quan Yuan, Rong-Cheng Tu, Xinli Huang, Wei Ye, Xiao Luo, Minghua Deng:

MARK: Multi-agent Collaboration with Ranking Guidance for Text-attributed Graph Clustering. 6057-6072 - Jingbao Luo, Ming Liu, Ran Liu, Yongpan Sheng, Xin Hu, Gang Li, Peng Wu:

Can Language Models Capture Human Writing Preferences for Domain-Specific Text Summarization? 6073-6091 - Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, Lili Qiu:

Mitigate Position Bias in LLMs via Scaling a Single Hidden States Channel. 6092-6111 - Ruiqiao Bai, Xue Han, Shuo Lei, Junlan Feng, Yanyan Luo, Chao Deng:

Self-attention-based Graph-of-Thought for Math Problem Solving. 6112-6125 - Weihong Du, Wenrui Liao, Binyu Yan, Hongru Liang, Anthony G. Cohn, Wenqiang Lei:

BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks. 6126-6149 - Jiakai Tang, Shiqi Shen, Zhipeng Wang, Gong Zhi, Xueyang Feng, Zexu Sun, Haoran Tan, Xu Chen:

KAPA: A Deliberative Agent Framework with Tree-Structured Knowledge Base for Multi-Domain User Intent Understanding. 6150-6166 - Guofeng Quan, Wenfeng Feng, Chuzhan Hao, Guochao Jiang, Yuewei Zhang, Hao Henry Wang:

RASD: Retrieval-Augmented Speculative Decoding. 6167-6177 - Zengyi Gao, Yukun Cao, Hairu Wang, Ao Ke, Yuan Feng, S. Kevin Zhou, Xike Xie:

FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs. 6178-6192 - Kening Zheng, Junkai Chen, Yibo Yan, Xin Zou, Huiyu Zhou, Xuming Hu:

Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models. 6193-6212 - Yilei Tu, Andrew Xue, Freda Shi:

Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning. 6213-6248 - Lishui Fan, Mouxiang Chen, Zhongxin Liu:

SEK: Self-Explained Keywords Empower Large Language Models for Code Generation. 6249-6278 - Peng Ding, Jun Kuang, ZongYu Wang, Xuezhi Cao, Xunliang Cai, Jiajun Chen, Shujian Huang:

Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement. 6279-6299 - Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Hassan Awadallah:

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents. 6300-6323 - Zhanpeng Chen, Mingxiao Li, Ziyang Chen, Nan Du, Xiaolong Li, Yuexian Zou:

Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding. 6324-6341 - Yuhao Dan, Jie Zhou, Qin Chen, Junfeng Tian, Liang He:

P-React: Synthesizing Topic-Adaptive Reactions of Personality Traits via Mixture of Specialized LoRA Experts. 6342-6362 - Jiamin Su, Yibo Yan, Fangteng Fu, Zhang Han, Jingheng Ye, Xiang Liu, Jiahao Huo, Huiyu Zhou, Xuming Hu:

EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models. 6363-6389 - Yuanjie Lyu, Chao Zhang, Yuhao Chen, Yong Chen, Tong Xu:

Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks. 6390-6404 - Jiayi He, Hehai Lin, Qingyun Wang, Yi R. Fung, Heng Ji:

Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks. 6405-6421 - Chenkai Sun, Denghui Zhang, ChengXiang Zhai, Heng Ji:

Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation. 6422-6434 - Yunqiao Yang, Houxing Ren, Zimu Lu, Ke Wang, Weikang Shi, Aojun Zhou, Junting Pan, Mingjie Zhan, Hongsheng Li:

Probability-Consistent Preference Optimization for Enhanced LLM Reasoning. 6435-6448 - Hongcheng Guo, Wei Zhang, Junhao Chen, Yaonan Gu, Jian Yang, Junjia Du, Shaosheng Cao, Binyuan Hui, Tianyu Liu, Jianxin Ma, Chang Zhou, Zhoujun Li:

IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web. 6449-6466 - Fan Gao, Jieyang Peng, Xiaoming Tao, Youzheng Wang:

TDCSA: LLM-Guided Top-Down Approach for Robust Citation Sentiment Analysis. 6467-6484 - Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu:

DeepRTL2: A Versatile Model for RTL-Related Tasks. 6485-6500 - Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Jianwei Yin:

The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? 6501-6512 - Long Chen, Shuoyu Guan, Xiaohua Huang, Wen-Jing Wang, Cai Xu, Ziyu Guan, Wei Zhao:

Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer. 6513-6522 - Chengda Lu, Xiaoyu Fan, Yu Huang, Rongwu Xu, Jijie Li, Wei Xu:

Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking? 6523-6546 - Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Ziyu Liu, Shengyuan Ding, Shenxi Wu, Yubo Ma, Haodong Duan, Wenwei Zhang, Kai Chen, Dahua Lin, Jiaqi Wang:

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model. 6547-6563 - Junjie Li, Nan Zhang, Xiaoyang Qu, Kai Lu, Guokuan Li, Jiguang Wan, Jianzong Wang:

RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models. 6564-6574 - Zhentao Xie, Chengcheng Han, Jinxin Shi, Wenjun Cui, Xin Zhao, Xingjiao Wu, Jiabao Zhao:

RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation. 6575-6602 - Yuxin Jiang, Yufei Wang, Chuhan Wu, Xinyi Dai, Yan Xu, Weinan Gan, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang:

Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction. 6603-6618 - Lian Yan, Chen Tang, Yi Guan, Haotian Wang, Songyuan Wang, Haifeng Liu, Yang Yang, Jingchi Jiang:

RLKGF: Reinforcement Learning from Knowledge Graph Feedback Without Human Annotations. 6619-6633 - Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi:

Learning Task Representations from In-Context Learning. 6634-6663 - Xiaohu Li, Yunfeng Ning, Zepeng Bao, Mayi Xu, Jianhao Chen, Tieyun Qian:

CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations. 6664-6678 - Yubo Li, Yidi Miao, Xueying Ding, Ramayya Krishnan, Rema Padman:

Firm or Fickle? Evaluating Large Language Models Consistency in Sequential Interactions. 6679-6700 - Pengzhou Cheng, Zheng Wu, Zongru Wu, Tianjie Ju, Aston Zhang, Zhuosheng Zhang, Gongshen Liu:

OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents. 6701-6725 - Pengfei He, Yuping Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu:

Red-Teaming LLM Multi-Agent Systems via Communication Attacks. 6726-6747 - Zhihong Zhu, Yunyan Zhang, Xianwei Zhuang, Fan Zhang, Zhongwei Wan, Yuyan Chen, Qingqing Long, Yefeng Zheng, Xian Wu:

Can We Trust AI Doctors? A Survey of Medical Hallucination in Large Language and Large Vision-Language Models. 6748-6769 - Jiaan Wang, Fandong Meng, Yunlong Liang, Jie Zhou:

DRT: Deep Reasoning Translation via Long Chain-of-Thought. 6770-6782 - Fuying Wang, Feng Wu, Yihan Tang, Lequan Yu:

CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis. 6783-6799 - Dong Zhang, Haiyan Tian, Qingying Sun, Shoushan Li:

Vision-aided Unsupervised Constituency Parsing with Multi-MLLM Debating. 6800-6810 - Bingsen Chen, Shenji Wan, Xi Ye, Chen Zhao:

Inter-Passage Verification for Multi-evidence Multi-answer QA. 6811-6829 - Alan Chi-Man Lee, Wing-Sun Cheng, Calvin Chun-Kit Chan:

PROMTEC: Fast LLM Inference Decoding using Prompt Multi-Lookup with Template Database and Common Sequences. 6830-6842 - Haoqi Zheng, DongWang DongWang, Silin Yang, Yunpeng Qi, Ruochun Jin, Liyang Xu:

Logical DA: Enhancing Data Augmentation for Logical Reasoning via a Multi-Agent System. 6843-6855 - Yubai Wei, Jiale Han, Yi Yang:

Adapting General-Purpose Embedding Models to Private Datasets Using Keyword-based Retrieval. 6856-6870 - Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu:

SQL Injection Jailbreak: A Structural Disaster of Large Language Models. 6871-6891 - Jaewoo Lee, Keyang Xuan, Chanakya Ekbote, Sandeep Polisetty, Yi R. Fung, Paul Pu Liang:

TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models. 6892-6908 - Zihao Wang, Jiaxing Yu, Haoxuan Liu, Zehui Zheng, Yuhang Jin, Shuyu Li, Shulei Ji, Kejun Zhang:

Generative Music Models' Alignment with Professional and Amateur Users' Expectations. 6909-6920 - Xinrui He, Yikun Ban, Jiaru Zou, Tianxin Wei, Curtiss B. Cook, Jingrui He:

LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation. 6921-6936 - Yingjie Li, Yun Luo, Xiaotian Xie, Yue Zhang:

Task Calibration: Calibrating Large Language Models on Inference Tasks. 6937-6951 - Duy A. Nguyen, Rishi Kesav Mohan, Shimeng Yang, Pritom Saha Akash, Kevin Chen-Chuan Chang:

MiniELM: A Lightweight and Adaptive Query Rewriting Framework for E-Commerce Search Optimization. 6952-6964 - Ivory Yang, Chunhui Zhang, Yuxin Wang, Zhongyu Ouyang, Soroush Vosoughi:

Visibility as Survival: Generalizing NLP for Native Alaskan Language Identification. 6965-6979 - Zhangchen Xu, Yang Liu, Yueqin Yin, Mingyuan Zhou, Radha Poovendran:

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding. 6980-7008 - Xiaochuan Liu, Ruihua Song, Xiting Wang, Xu Chen:

Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation. 7009-7028 - Pratik Rakesh Singh, Kritarth Prasad, Mohammadi Zaki, Pankaj Wasnik:

Graph-Assisted Culturally Adaptable Idiomatic Translation for Indic languages. 7029-7044 - Vincent Nguyen, Sarvnaz Karimi, Willow Hallgren, Mahesh Prakash:

Question Answering in Climate Adaptation for Agriculture: Model Development and Evaluation with Expert Feedback. 7045-7075 - Xinfeng Wang, Jin Cui, Fumiyo Fukumoto, Yoshimi Suzuki:

AGRec: Adapting Autoregressive Decoders with Graph Reasoning for LLM-based Sequential Recommendation. 7076-7090 - Jin Cui, Xinfeng Wang, Yoshimi Suzuki, Fumiyo Fukumoto:

Causal Denoising Prototypical Network for Few-Shot Multi-label Aspect Category Detection. 7091-7104 - Pengzuo Wu, Yuhang Yang, Guangcheng Zhu, Chao Ye, Hong Gu, Xu Lu, Ruixuan Xiao, Bowen Bao, Yijing He, Liangyu Zha, Wentao Ye, Junbo Zhao, Haobo Wang:

RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis. 7105-7137 - Zhiyang Zhang, Yaping Zhang, Yupu Liang, Zhiyuan Chen, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong:

A Query-Response Framework for Whole-Page Complex-Layout Document Image Translation with Relevant Regional Concentration. 7138-7149 - Junjia Du, Yadi Liu, Hongcheng Guo, Jiawei Wang, Haojian Huang, Yunyi Ni, Zhoujun Li:

DependEval: Benchmarking LLMs for Repository Dependency Understanding. 7150-7179 - Xu Zhang, Kun Zhang, Wenxin Ma, Rongsheng Wang, Chenxu Wu, Yingtai Li, S. Kevin Zhou:

A General Knowledge Injection Framework for ICD Coding. 7180-7189 - Jiahao Huo, Yibo Yan, Xu Zheng, Yuanhuiyi Lyu, Xin Zou, Zhihua Wei, Xuming Hu:

MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models. 7190-7206 - Wenjian Ding, Yao Zhang, Jun Wang, Adam Jatowt, Zhenglu Yang:

Generating Questions, Answers, and Distractors for Videos: Exploring Semantic Uncertainty of Object Motions. 7207-7220 - Xuan Luo, Weizhi Wang, Xifeng Yan:

DiffSkip: Differential Layer Skipping in Large Language Models. 7221-7231 - Zihao Jiang, Ben Liu, Miao Peng, Wenjie Xu, Yao Xiao, Zhenyan Shan, Min Peng:

Towards Explainable Temporal Reasoning in Large Language Models: A Structure-Aware Generative Framework. 7232-7251 - Jinghui Lu, Haiyang Yu, Yanjie Wang, Yongjie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao Liu, Can Huang:

A Bounding Box is Worth One Token - Interleaving Layout and Text in a Large Language Model for Document Understanding. 7252-7273 - Mingzhe Li, Xin Lu, Yanyan Zhao:

Self-Foveate: Enhancing Diversity and Difficulty of Synthesized Instructions from Unsupervised Text via Multi-Level Foveation. 7274-7289 - Mingyu Zheng, Zhifan Feng, Jia Wang, Lanrui Wang, Zheng Lin, Hao Yang, Weiping Wang:

TableDreamer: Progressive and Weakness-guided Data Synthesis from Scratch for Table Instruction Tuning. 7290-7315 - Nagham Hamad, Mohammed Khalilia, Mustafa Jarrar:

Konooz: Multi-domain Multi-dialect Corpus for Named Entity Recognition. 7316-7331 - Hongji Yang, Yucheng Zhou, Wencheng Han, Jianbing Shen:

Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation. 7332-7349 - Linhao Zhang, Daoguang Zan, Quanshun Yang, Zhirong Huang, Dong Chen, Bo Shen, Tianyu Liu, Yongshun Gong, Pengjie Huang, Xudong Lu, Guangtai Liang, Lizhen Cui, Qianxiang Wang:

CodeV: Issue Resolving with Visual Data. 7350-7361 - Hongbin Na, Yining Hua, Zimu Wang, Tao Shen, Beibei Yu, Lilin Wang, Wei Wang, John B. Torous, Ling Chen:

A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions. 7362-7376 - Tao He, Hao Li, Jingchang Chen, Runxuan Liu, Yixin Cao, Lizi Liao, Zihao Zheng, Zheng Chu, Jiafeng Liang, Ming Liu, Bing Qin:

Breaking the Reasoning Barrier A Survey on LLM Complex Reasoning through the Lens of Self-Evolution. 7377-7417 - Zhilin Wang, Yafu Li, Xiaoye Qu, Yu Cheng:

SEE: Continual Fine-tuning with Sequential Ensemble of Experts. 7418-7432 - Chi-Min Chan, Chunpu Xu, Junqi Zhu, Jiaming Ji, Donghai Hong, Pengcheng Wen, Chunyang Jiang, Zhen Ye, Yaodong Yang, Wei Xue, Sirui Han, Yike Guo:

Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA. 7433-7451 - Rui Hu, Delai Qiu, Shuyu Wei, Jiaming Zhang, Yining Wang, Shengping Liu, Jitao Sang:

Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models. 7452-7463 - Haote Yang, Xingjian Wei, Jiang Wu, Noémi Ligeti-Nagy, Jiaxing Sun, Yinfan Wang, Zijian Gyozo Yang, Junyuan Gao, Jingchao Wang, Bowen Jiang, Shasha Wang, Nanjun Yu, Zihao Zhang, Shixin Hong, Hongwei Liu, Wei Li, Songyang Zhang, Dahua Lin, Lijun Wu, Gábor Prószéky, Conghui He:

OpenHuEval: Evaluating Large Language Model on Hungarian Specifics. 7464-7520 - Sirui Huang, Yanggan Gu, Zhonghao Li, Xuming Hu, Qing Li, Guandong Xu:

StructFact: Reasoning Factual Knowledge from Structured Data with Large Language Models. 7521-7552 - Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu:

From Imitation to Introspection: Probing Self-Consciousness in Language Models. 7553-7583 - Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang:

DocFusion: A Unified Framework for Document Parsing Tasks. 7584-7599 - Yue Li, Xin Yi, Dongsheng Shi, Gerard de Melo, Xiaoling Wang, Linlin Wang:

Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models. 7600-7612 - Bowen Ping, Jiali Zeng, Fandong Meng, Shuo Wang, Jie Zhou, Shanghang Zhang:

LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information. 7613-7632 - Quanyu Long, Jianda Chen, Zhengyuan Liu, Nancy F. Chen, Wenya Wang, Sinno Jialin Pan:

Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts. 7633-7651 - Bofei Gao, Yejie Wang, Yibo Miao, Ruoyu Wu, Feifan Song, Longhui Yu, Tianyu Liu, Baobao Chang:

Towards A Better Initial Policy Model For Scalable Long-CoT Reinforcement Learning. 7652-7665 - Tu Vu, Manh Do, Tung Nguyen, Ngo Van Linh, Sang Dinh, Thien Huu Nguyen:

Topic Modeling for Short Texts via Optimal Transport-Based Clustering. 7666-7680 - Colin Swaelens, Ilse De Vos, Els Lefever:

Lemmatisation & Morphological Analysis of Unedited Greek: Do Simple Tasks Need Complex Solutions? 7681-7689 - Chengzhang Yu, Yiming Zhang, Zhixin Liu, Zenghui Ding, Yining Sun, Zhanpeng Jin:

FRAME: Feedback-Refined Agent Methodology for Enhancing Medical Research Insights. 7690-7704 - Xi Li, Ruofan Mao, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang:

Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models. 7705-7727 - Pavel Posokhov, Sergei Masliukhin, Skrylnikov Stepan, Danil Tirskikh, Olesia Makhnytkina:

Relevance Scores Calibration for Ranked List Truncation via TMP Adapter. 7728-7734 - Chaona Kong, Jianyi Liu, Yifan Tang, Ru Zhang:

Neuron Activation Modulation for Text Style Transfer: Guiding Large Language Models. 7735-7747 - Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, An-Lan Wang, Chunhui Lin, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang:

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. 7748-7763 - Xinyan Jiang, Hang Ye, Yongxin Zhu, Xiaoying Zheng, Zikang Chen, Jun Gong:

HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models. 7764-7786 - Junchi Yao, Shu Yang, Jianhua Xu, Lijie Hu, Mengdi Li, Di Wang:

Understanding the Repeat Curse in Large Language Models from a Feature Perspective. 7787-7815 - Haneul Yoo, Cheonbok Park, Sangdoo Yun, Alice Oh, Hwaran Lee:

Code-Switching Curriculum Learning for Multilingual Transfer in LLMs. 7816-7836 - Yang Yao, Xuan Tong, Ruofan Wang, Yixu Wang, Lujundong Li, Liang Liu, Yan Teng, Yingchun Wang:

A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos. 7837-7855 - Yixuan Wang, Shiqi Zhou, Chuanzhe Guo, Qingfu Zhu:

Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection. 7856-7869 - Yao Huang, Yitong Sun, Shouwei Ruan, Yichi Zhang, Yinpeng Dong, Xingxing Wei:

Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space. 7870-7888 - Enzo Doyen, Amalia Todirascu:

GeNRe: A French Gender-Neutral Rewriting System Using Collective Nouns. 7889-7909 - Christian Jaumann, Andreas Wiedholz, Annemarie Friedrich:

LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews. 7910-7927 - Ehud Malul, Oriel Perets, Ziv Mor, Yigal Kassel, Elior Sulem:

LCHAIM - Investigating Long Context Reasoning in Hebrew. 7928-7939 - Jiayuan Li, Lei Cui, Sen Zhao, Yun Yang, Lun Li, Hongsong Zhu:

CLeVeR: Multi-modal Contrastive Learning for Vulnerability Code Representation. 7940-7951 - Zilu Dong, Xiangqing Shen, Rui Xia:

MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs. 7952-7960 - Qin Chen, Yuanyi Ren, Xiaojun Ma, Yuyang Shi:

Large Language Models for Predictive Analysis: How Far Are They? 7961-7978 - Xiaoxue Cheng, Junyi Li, Xin Zhao, Ji-Rong Wen:

Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking. 7979-7990 - Qitao Qin, Yucong Luo, Yihang Lu, Zhibo Chu, Xiaoman Liu, Xianwei Meng:

Towards Adaptive Memory-Based Optimization for Enhanced Retrieval-Augmented Generation. 7991-8004 - Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou:

Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping. 8005-8018 - Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang:

A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models. 8019-8033 - Lingxiao Wei, He Yan, Xiangju Lu, Junmin Zhu, Jun Wang, Wei Zhang:

CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels. 8034-8062 - Zhitong Wang, Cheng Gao, Chaojun Xiao, Yufei Huang, Shuzheng Si, Kangyang Luo, Yuzhuo Bai, Wenhao Li, Tangjian Duan, Chuancheng Lv, Guoshan Lu, Gang Chen, Fanchao Qi, Maosong Sun:

Document Segmentation Matters for Retrieval-Augmented Generation. 8063-8075 - Xunzhi Wang, Zhuowei Zhang, Gaonan Chen, Qiongyu Li, Bitong Luo, Zhixin Han, Haotian Wang, Zhiyu Li, Hang Gao, Mengting Hu:

UBench: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions. 8076-8107 - Yusheng Zhao, Xiao Luo, Haomin Wen, Zhiping Xiao, Wei Ju, Ming Zhang:

Embracing Large Language Models in Traffic Flow Forecasting. 8108-8123 - Mengliang He, Jiayi Zeng, Yankai Jiang, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou:

Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability. 8124-8146 - Romain Storaï, Jaeseong Lee, Seung-won Hwang:

Smarter, Not Harder: Training-Free Adaptive Computation for Transformers. 8147-8155 - Zhenhe Wu, Zhongqiu Li, Jie Zhang, Zhongjiang He, Jian Yang, Yu Zhao, Ruiyu Fang, Bing Wang, Hongyan Xie, Shuangyong Song, Zhoujun Li:

UCS-SQL: Uniting Content and Structure for Enhanced Semantic Bridging In Text-to-SQL. 8156-8168 - Qingyao Li, Xinyi Dai, Xiangyang Li, Weinan Zhang, Yasheng Wang, Ruiming Tang, Yong Yu:

CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation. 8169-8182 - Jiaru Zou, Qing Wang, Pratyush Thakur, Nickvash Kani:

STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing. 8183-8199 - Jihoon Lee, Min Song:

Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models. 8200-8219 - Pramit Bhattacharyya, Arnab Bhattacharya:

Leveraging LLMs for Bangla Grammar Error Correction: Error Categorization, Synthetic Data, and Model Evaluation. 8220-8239 - Yimiao Qiu, Yang Deng, Quanming Yao, Zhimeng Zhang, Zhiang Dong, Chang Yao, Jingyuan Chen:

Think Both Ways: Teacher-Student Bidirectional Reasoning Enhances MCQ Generation and Distractor Quality. 8240-8253 - Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou:

mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data. 8254-8275 - Jeonghwan Choi, Minjeong Ban, Minseok Kim, Hwanjun Song:

Word2Passage: Word-level Importance Re-weighting for Query Expansion. 8276-8296 - Yangbo Wei, Zhen Huang, Fangzhou Zhao, Qi Feng, Wei W. Xing:

MECoT: Markov Emotional Chain-of-Thought for Personality-Consistent Role-Playing. 8297-8314 - Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi:

FiDeLiS: Faithful Reasoning in Large Language Models for Knowledge Graph Question Answering. 8315-8330 - Jingwen Cheng, Kshitish Ghate, Wenyue Hua, William Yang Wang, Hong Shen, Fei Fang:

REALM: A Dataset of Real-World LLM Use Cases. 8331-8341 - Tommaso Green, Félix Gaschi, Fabian David Schmidt, Simone Paolo Ponzetto, Goran Glavas:

BABELEDITS: A Benchmark and a Modular Approach for Robust Cross-lingual Knowledge Editing of Large Language Models. 8342-8369 - Haokun Zhao, Jinyi Han, Jiaqing Liang, Yanghua Xiao, Xiaojun Meng, Jiansheng Wei:

CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory. 8370-8393 - Xuetao Ma, Wenbin Jiang, Hua Huang:

Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning. 8394-8412 - Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia:

BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English. 8413-8429 - Zihan Wang, Yaohui Zhu, Gim Hee Lee, Yachun Fan:

NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM. 8430-8440 - Yu Guo, Dong Jin, Shenghao Ye, Shuangwu Chen, Jianyang Jianyang, Xiaobin Tan:

SQLForge: Synthesizing Reliable and Diverse Data to Enhance Text-to-SQL Reasoning in LLMs. 8441-8452 - Jiachen Zhu, Congmin Zheng, Jianghao Lin, Kounianhua Du, Ying Wen, Yong Yu, Jun Wang, Weinan Zhang:

Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning. 8453-8468 - Maike Züfle, Jan Niehues:

Contrastive Learning for Task-Independent SpeechLLM-Pretraining. 8469-8490 - Qirui Zhou, Shaohui Peng, Weiqiang Xiong, Haixin Chen, Yuanbo Wen, Haochen Li, Ling Li, Qi Guo, Yongwei Zhao, Ke Gao, Ruizhi Chen, Yanjun Wu, Zhao Chen, Yunji Chen:

QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm. 8491-8505 - Yuechi Zhou, Chuyue Zhou, Jianxin Zhang, Juntao Li, Min Zhang:

ALW: Adaptive Layer-Wise contrastive decoding enhancing reasoning ability in Large Language Models. 8506-8524 - Xinlong Chen, Yuanxing Zhang, Qiang Liu, Junfei Wu, Fuzheng Zhang, Tieniu Tan:

Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models. 8525-8542 - Xinlong Chen, Yuanxing Zhang, Chongling Rao, Yushuo Guan, Jiaheng Liu, Fuzheng Zhang, Chengru Song, Qiang Liu, Di Zhang, Tieniu Tan:

VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation. 8543-8563 - Chuan Gou, Bangwei Li, Jianhua Dai, Xiaoyang Han, Ming Cai:

Mitigating Demonstration Bias through Global Coevolutionary Reasoning. 8564-8578 - Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis:

A Representation Level Analysis of NMT Model Robustness to Grammatical Errors. 8579-8601 - Han Lin, Xiu Tang, Huan Li, Wenxue Cao, Sai Wu, Chang Yao, Lidan Shou, Gang Chen:

T²DR: A Two-Tier Deficiency-Resistant Framework for Incomplete Multimodal Learning. 8602-8616 - Shixin Jiang, Jiafeng Liang, Jiyuan Wang, Xuan Dong, Heng Chang, Weijiang Yu, Jinhua Du, Ming Liu, Bing Qin:

From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities. 8617-8652 - Verena Blaschke, Masha Fedzechkina, Maartje ter Hoeve:

Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter. 8653-8684 - Kristina Kobrock, Xenia Ohmer, Elia Bruni, Nicole Gotzner:

Agents generalize to novel levels of abstraction by using adaptive linguistic strategies. 8685-8699 - Dan Wang, Boxi Cao, Ning Bian, Xuanang Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Le Sun, Shanshan Jiang, Bin Dong, Xianpei Han:

The Linguistic Connectivities Within Large Language Models. 8700-8714 - Zhihan Zhang, Yixin Cao, Lizi Liao:

XFinBench: Benchmarking LLMs in Complex Financial Problem Solving and Reasoning. 8715-8758 - Hongzhe Huang, Jiang Liu, Zhewen Yu, Li Cai, Dian Jiao, Wenqiao Zhang, Siliang Tang, Juncheng Li, Hao Jiang, Haoyuan Li, Yueting Zhuang:

Align²LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation. 8759-8781 - Siqing Song, Chuang Wang, Rui-Qi Wang, Yi Yang, Xu-Yao Zhang:

Achieving binary weight and activation for LLMs using Post-Training Quantization. 8782-8795 - Wei Sun, Tingyu Qu, Mingxiao Li, Jesse Davis, Marie-Francine Moens:

Mitigating Negative Interference in Multilingual Knowledge Editing through Null-Space Constraints. 8796-8810 - Wenjing Xie, Xiaobo Liang, Juntao Li, Wanfu Wang, Kehai Chen, Qiaoming Zhu, Min Zhang:

From Awareness to Adaptability: Enhancing Tool Utilization for Scientific Reasoning. 8811-8831 - Qi Liu, Jingqing Ruan, Hao Li, Haodong Zhao, Desheng Wang, Jiansong Chen, Guanglu Wan, Xunliang Cai, Zhi Zheng, Tong Xu:

AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models. 8832-8866 - Junjie Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongbin Li, Dacheng Tao:

Supervised Optimism Correction: Be Confident When LLMs Are Sure. 8867-8880 - Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu:

Offline Reinforcement Learning for LLM Multi-step Reasoning. 8881-8893 - Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki:

Sampling-based Pseudo-Likelihood for Membership Inference Attacks. 8894-8907 - Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu:

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant. 8908-8934 - Xin-Cheng Wen, Yijun Yang, Cuiyun Gao, Yang Xiao, Deheng Ye:

Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data. 8935-8949 - Yunyao Zhang, Zikai Song, Hang Zhou, Wenfeng Ren, Yi-Ping Phoebe Chen, Junqing Yu, Wei Yang:

GA-S³: Comprehensive Social Network Simulation with Group Agents. 8950-8970 - Kaijie Jiao, Quan Wang, Licheng Zhang, Zikang Guo, Zhendong Mao:

M-RangeDetector: Enhancing Generalization in Machine-Generated Text Detection through Multi-Range Attention Masks. 8971-8983 - Heeseung Kim, Che Hyun Lee, Sangkwon Park, Jiheum Yeom, Nohil Park, Sangwon Yu, Sungroh Yoon:

Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models. 8984-9014 - Wangyun Gu, Qianghua Gao, Li-Xin Zhang, Xu Shen, Jieping Ye:

NeuronMerge: Merging Models via Functional Neuron Groups. 9015-9037 - Xiaoyuan Li, Moxin Li, Rui Men, Yichang Zhang, Keqin Bao, Wenjie Wang, Fuli Feng, Dayiheng Liu, Junyang Lin:

HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning. 9038-9072 - Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou, Junyang Lin:

Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models. 9073-9085 - King Zhu, Qianbo Zang, Shian Jia, Siwei Wu, Feiteng Fang, Yizhi Li, Shuyue Guo, Tianyu Zheng, Jiawei Guo, Bo Li, Haoning Wu, Xingwei Qu, Jian Yang, Ruibo Liu, Xiang Yue, Jiaheng Liu, Chenghua Lin, Hamid Alinejad-Rokny, Min Yang, Shiwen Ni, Wenhao Huang, Ge Zhang:

LIME: Less Is More for MLLM Evaluation. 9086-9121 - Xiaofeng Zhou, Heyan Huang, Lizi Liao:

Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement. 9122-9137 - Hong Yi Lin, Chunhua Liu, Haoyu Gao, Patanamon Thongtanunam, Christoph Treude:

CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models. 9138-9166 - Yulia Otmakhova, Lea Frermann:

Narrative Media Framing in Political Discourse. 9167-9196 - Yishuo Cai, Renjie Gu, Jiaxu Li, Xuancheng Huang, Junzhe Chen, Xiaotao Gu, Minlie Huang:

MHALO: Evaluating MLLMs as Fine-grained Hallucination Detectors. 9197-9222 - Barbara Scalvini, Alireza Mashaghi:

Semantic Topology: a New Perspective for Communication Style Characterization. 9223-9233 - Xiaoyu Li, Haoran Shi, Zengyi Yu, Yukun Tu, Chanjin Zheng:

Decoding LLM Personality Measurement: Forced-Choice vs. Likert. 9234-9247 - Koki Horiguchi, Tomoyuki Kajiwara, Takashi Ninomiya, Shoko Wakamiya, Eiji Aramaki:

MultiMSD: A Corpus for Multilingual Medical Text Simplification from Online Medical References. 9248-9258 - Ruyi Zhang, Songlei Jian, Yusong Tan, Heng Gao, Haifang Zhou, Kai Lu:

BadWindtunnel: Defending Backdoor in High-noise Simulated Training with Confidence Variance. 9259-9273 - Yue Gao, Jing Zhao, Shiliang Sun, Xiaosong Qiao, Tengfei Song, Hao Yang:

Multimodal Machine Translation with Text-Image In-depth Questioning. 9274-9287 - Xiaozhuang Song, Shufei Zhang, Tianshu Yu:

ReKG-MCTS: Reinforcing LLM Reasoning on Knowledge Graphs via Training-Free Monte Carlo Tree Search. 9288-9306 - Aziguli Wulamu, Lyu Zhengyu, Kaiyuan Gong, Yu Han, Zewen Wang, Zhihong Zhu, Bowen Xing:

HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering. 9307-9321 - Jinnan Li, Jinzhe Li, Yue Wang, Yi Chang, Yuan Wu:

StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following. 9322-9341 - Fanxiao Li, Jiaying Wu, Canyuan He, Wei Zhou:

CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection. 9342-9354 - Ashutosh Dwivedi, Siddhant Singh, Ashutosh Modi:

EtiCor++: Towards Understanding Etiquettical Bias in LLMs. 9355-9376 - Yuanjian Xu, Jianing Hao, Kunsheng Tang, Jingnan Chen, Anxian Liu, Peng Liu, Guang Zhang:

FinRipple: Aligning Large Language Models with Financial Market for Event Ripple Effect Awareness. 9377-9398 - Yingfeng Luo, Tong Zheng, Yongyu Mu, Bei Li, Qinghong Zhang, Yongqi Gao, Ziqiang Xu, Peinan Feng, Xiaoqian Liu, Tong Xiao, JingBo Zhu:

Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation. 9399-9431 - Nopporn Lekuthai, Nattawit Pewngam, Supitcha Sokrai, Titipat Achakulvisut:

EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning. 9432-9444 - Elena Stringli, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou:

Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models. 9445-9469 - Tianhe Lin, Jian Xie, Siyu Yuan, Deqing Yang:

Implicit Reasoning in Transformers is Reasoning through Shortcuts. 9470-9487 - Kaishuai Xu, Tiezheng Yu, Yi Cheng, Wenjun Hou, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li:

Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework. 9488-9502 - Yiliu Sun, Zicheng Zhao, Sheng Wan, Chen Gong:

CortexDebate: Debating Sparsely and Equally for Multi-Agent Debate. 9503-9523 - Valentin Knappich, Anna Hätty, Simon Razniewski, Annemarie Friedrich:

PAP2PAT: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs. 9524-9554 - Xiaofeng Wang, Zhixin Zhang, Jin Guang Zheng, Yiming Ai, Rui Wang:

Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent. 9555-9577 - Kechi Zhang, Ge Li, Jia Li, Yihong Dong, Zhi Jin:

Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points. 9578-9591 - Elke Vandermeerschen, Miryam de Lhoneux:

Supervised and Unsupervised Probing of Shortcut Learning: Case Study on the Emergence and Evolution of Syntactic Heuristics in BERT. 9592-9604 - Florian Schneider, Carolin Holtermann, Chris Biemann, Anne Lauscher:

GIMMICK: Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking. 9605-9668 - Joonhyung Park, Peng Tang, Sagnik Das, Srikar Appalaraju, Kunwar Yashraj Singh, R. Manmatha, Shabnam Ghadar:

R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding. 9669-9685 - Xiaolong Wang, Yuanchi Zhang, Ziyue Wang, Yuzhuang Xu, Fuwen Luo, Yile Wang, Peng Li, Yang Liu:

Perspective Transition of Large Language Models for Solving Subjective Tasks. 9686-9704 - Kaimin Wang, Yuanzhe Shen, Changze Lv, Xiaoqing Zheng, Xuanjing Huang:

TripTailor: A Real-World Benchmark for Personalized Travel Planning. 9705-9723 - Florian Babl, Moritz Hennen, Jakob Murauer, Michaela Geierhos:

Random Splitting Negatively Impacts NER Evaluation: Quantifying and Eliminating the Overestimation of NER Performance. 9724-9738 - Lingwei Wei, Dou Hu, Wei Zhou, Philip S. Yu, Songlin Hu:

Structure-adaptive Adversarial Contrastive Learning for Multi-Domain Fake News Detection. 9739-9752 - Zhiting Fan, Ruizhe Chen, Zuozhu Liu:

BiasGuard: A Reasoning-Enhanced Bias Detection Tool for Large Language Models. 9753-9764 - Maiya Goloburda, Nurkhan Laiyk, Diana Turmakhan, Yuxia Wang, Mukhammed Togmanov, Jonibek Mansurov, Askhat Sametov, Nurdaulet Mukhituly, Minghan Wang, Daniil Orel, Zain Muhammad Mujahid, Fajri Koto, Timothy Baldwin, Preslav Nakov:

Qorǵau: Evaluating Safety in Kazakh-Russian Bilingual Contexts. 9765-9784 - Linjie Mu, Zhongzhen Huang, Shengqian Qin, Yakun Zhu, Shaoting Zhang, Xiaofan Zhang:

MMXU: A Multi-Modal and Multi-X-ray Understanding Dataset for Disease Progression. 9785-9803 - Ziyi Ni, Yifan Li, Ning Yang, Dou Shen, Pin Lyu, Daxiang Dong:

Tree-of-Code: A Self-Growing Tree Framework for End-to-End Code Generation and Execution in Complex Tasks. 9804-9819 - David Sasu, Zehui Wu, Ziwei Gong, Run Chen, Pengyuan Shi, Lin Ai, Julia Hirschberg, Natalie Schluter:

Akan Cinematic Emotions (ACE): A Multimodal Multi-party Dataset for Emotion Recognition in Movie Dialogues. 9820-9831 - Kaiyang Wan, Honglin Mu, Rui Hao, Haoran Luo, Tianle Gu, Xiuying Chen:

A Cognitive Writing Perspective for Constrained Long-Form Text Generation. 9832-9844 - You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li, Ruixuan Li, Maosong Sun:

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models. 9845-9867 - Shivam Adarsh, Kumar Shridhar, Caglar Gulcehre, Nicholas Monath, Mrinmaya Sachan:

SIKeD: Self-guided Iterative Knowledge Distillation for Mathematical Reasoning. 9868-9880 - Xikang Yang, Biyu Zhou, Xuehai Tang, Jizhong Han, Songlin Hu:

Chain of Attack: Hide Your Intention through Multi-Turn Interrogation. 9881-9901 - Yicheng Chen, Yining Li, Kai Hu, Zerun Ma, Haochen Ye, Kai Chen:

MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space. 9902-9915 - Yongchan Chun, Minhyuk Kim, Dongjun Kim, Chanjun Park, Heuiseok Lim:

Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval. 9916-9926 - Linhai Zhang, Ziyang Gao, Deyu Zhou, Yulan He:

Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation. 9927-9944 - Zheheng Luo, Chenhan Yuan, Qianqian Xie, Sophia Ananiadou:

EMPEC: A Comprehensive Benchmark for Evaluating Large Language Models Across Diverse Healthcare Professions. 9945-9958 - Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li:

Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents. 9959-9988 - Hyunbin Jin, Je Won Yeom, Seunghyun Bae, Taesup Kim:

"Well, Keep Thinking": Enhancing LLM Reasoning with Adaptive Injection Decoding. 9989-10018 - Xiangyu Zhang, Hexin Liu, Qiquan Zhang, Beena Ahmed, Julien Epps:

SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information. 10019-10030 - Jingxuan Han, Zhendong Mao, Yi Liu, Yexuan Che, Zheren Fu, Quan Wang:

Fine-grained Knowledge Enhancement for Retrieval-Augmented Generation. 10031-10044 - Chengkun Cai, Haoliang Liu, Xu Zhao, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, John Lee, Jenq-Neng Hwang, Lei Li:

Bayesian Optimization for Controlled Image Editing via LLMs. 10045-10056 - Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni:

SPOT: Zero-Shot Semantic Parsing Over Property Graphs. 10057-10073 - Geonhee Kim, Marco Valentino, André Freitas:

Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference. 10074-10095 - Maodong Li, Longyin Zhang, Fang Kong:

Multi-Hop Question Generation via Dual-Perspective Keyword Guidance. 10096-10112 - Harsh Bihany, Shubham Patel, Ashutosh Modi:

LoRMA: Low-Rank Multiplicative Adaptation for LLMs. 10113-10133 - Linghao Zhang, Junhao Wang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Jiaheng Wen, Chengxing Xie, Maoquan Wang, Yufan Huang, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:

DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale. 10134-10153 - Yunfan Xie, Lixin Zou, Dan Luo, Min Tang, Chenliang Li:

Weak-to-Strong Honesty Alignment via Learning-to-Rank Supervision. 10154-10168 - Mohammadamin Shafiei, Hamidreza Saffari, Nafise Sadat Moosavi:

MultiHoax: A Dataset of Multi-hop False-premise questions. 10169-10187 - Jinming Zhang, Yunfei Long:

Learning to Play Like Humans: A Framework for LLM Adaptation in Interactive Fiction Games. 10188-10205 - Zewen Bai, Liang Yang, Shengdi Yin, Junyu Lu, Jingjie Zeng, Haohao Zhu, Yuanyuan Sun, Hongfei Lin:

STATE ToxiCN: A Benchmark for Span-level Target-Aware Toxicity Extraction in Chinese Hate Speech Detection. 10206-10219 - Yifan Niu, Miao Peng, Nuo Chen, Yatao Bian, Tingyang Xu, Jia Li:

RelEdit: Evaluating Conceptual Knowledge Editing in Language Models via Relational Reasoning. 10220-10238 - Yonghua Hei, Yibo Yan, Shuliang Liu, Huiyu Zhou, Linfeng Zhang, Xuming Hu:

Unlocking Speech Instruction Data Potential with Query Rewriting. 10239-10260 - Tianle Gu, Kexin Huang, Ruilin Luo, Yuanqi Yao, Xiuying Chen, Yujiu Yang, Yan Teng, Yingchun Wang:

From Evasion to Concealment: Stealthy Knowledge Unlearning for LLMs. 10261-10279 - Baolong Bi, Shaohan Huang, Yiwei Wang, Tianchi Yang, Zihan Zhang, Haizhen Huang, Lingrui Mei, Junfeng Fang, Zehao Li, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Shenghua Liu:

Context-DPO: Aligning Language Models for Context-Faithfulness. 10280-10300 - Xiachong Feng, Longxu Dou, Lingpeng Kong:

Reasoning Does Not Necessarily Improve Role-Playing Ability. 10301-10314 - Xiaokang Zhang, Sijia Luo, Bohan Zhang, Zeyao Ma, Jing Zhang, Yang Li, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang:

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios. 10315-10344 - Wenxuan Wang, Zizhan Ma, Zheng Wang, Chenghan Wu, Jiaming Ji, Wenting Chen, Xiang Li, Yixuan Yuan:

A Survey of LLM-based Agents in Medicine: How far are we from Baymax? 10345-10359 - Haewon Park, Gyubin Choi, Minjun Kim, Yohan Jo:

Context-Robust Knowledge Editing for Language Models. 10360-10385 - Zhuoyun Du, Chen Qian, Wei Liu, Zihao Xie, Yifei Wang, Rennai Qiu, Yufan Dang, Weize Chen, Cheng Yang, Ye Tian, Xuantang Xiong, Lei Han:

Multi-Agent Collaboration via Cross-Team Orchestration. 10386-10406 - William Soto Martinez, Yannick Parmentier, Claire Gardent:

Semantic Evaluation of Multilingual Data-to-Text Generation via NLI Fine-Tuning: Precision, Recall and F1 scores. 10407-10427 - Kidist Amde Mekonnen, Yosef Worku Alemneh, Maarten de Rijke:

Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval. 10428-10445 - Yue Fang, Zhi Jin, Jie An, Hongshen Chen, Xiaohong Chen, Naijun Zhan:

Enhancing Transformation from Natural Language to Signal Temporal Logic Using LLMs with Diverse External Knowledge. 10446-10458 - Puli Chen, Cheng Yang, Xingmao Zhang, Qingbao Huang:

DAGS: A Dependency-Based Dual-Attention and Global Semantic Improvement Framework for Metaphor Recognition. 10459-10476 - Xiaofan Bai, Pingyi Hu, Xiaojing Ma, Linchen Yu, Dongmei Zhang, Qi Zhang, Bin Benjamin Zhu:

ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models. 10477-10494 - Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin:

The Lessons of Developing Process Reward Models in Mathematical Reasoning. 10495-10516 - Yongqi Fan, Yating Wang, Guandong Wang, Jie Zhai, Jingping Liu, Qi Ye, Tong Ruan:

MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs. 10517-10548 - Osman Alperen Koras, Rabi Bahnan, Jens Kleesiek, Amin Dada:

Towards Conditioning Clinical Text Generation for User Control. 10549-10569 - Daniil Orel, Dilshod Azizov, Preslav Nakov:

CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings. 10570-10593 - Tianqi Chen, Yuanteng Chen, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng:

Q-Mamba: Towards more efficient Mamba models via post-training quantization. 10594-10610 - Kaiwen Wei, Jie Yao, Jiang Zhong, Yangyang Kang, Jingyuan Zhang, Changlong Sun, Xin Zhang, Fengmao Lv, Li Jin:

P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts. 10611-10626 - Liyang He, Chenglong Liu, Rui Li, Zhenya Huang, Shulan Ruan, Jun Zhou, Enhong Chen:

Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models. 10627-10643 - Tianqi Chen, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng:

RQT: Hierarchical Residual Quantization for Multi-Model Compression. 10644-10660 - Stefanie Urchs, Veronika Thurner, Matthias Aßenmacher, Christian Heumann, Stephanie Thiemichen:

taz2024full: Analysing German Newspapers for Gender Bias and Discrimination across Decades. 10661-10671 - Marta R. Costa-jussà, Pierre Andrews, Mariano Coria Meglioli, Joy Chen, Joe Chuang, David Dale, Christophe Ropers, Alexandre Mourachko, Eduardo Sánchez, Holger Schwenk, Tuan Tran, Arina Turkatenko, Carleigh Wood:

LCFO: Long Context and Long Form Output Dataset and Benchmarking. 10672-10700 - Yang Hou, Zhenghua Li:

Span-based Semantic Role Labeling as Lexicalized Constituency Tree Parsing. 10701-10713 - Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang:

Learning from Negative Samples in Biomedical Generative Entity Linking. 10714-10730 - Tautvydas Misiunas, Hassan Mansoor, Jasper Uijlings, Oriana Riva, Victor Carbune:

Self-play through Computational Runtimes improves Chart Reasoning. 10731-10746 - Jiachun Li, Pengfei Cao, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, Jun Zhao:

Towards Better Chain-of-Thought: A Reflection on Effectiveness and Faithfulness. 10747-10765 - Sinan Kurtyigit, Diego Frassinelli, Carina Silberer, Sabine Schulte im Walde:

A Couch Potato is not a Potato on a Couch: Prompting Strategies, Image Generation, and Compositionality Prediction for Noun Compounds. 10766-10776 - Beiduo Chen, Siyao Peng, Anna Korhonen, Barbara Plank:

A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI. 10777-10802 - Angelina Parfenova, Jürgen Pfeffer:

Measuring What Matters: Evaluating Ensemble LLMs with Label Refinement in Inductive Coding. 10803-10816 - Cong Gao, Bo Zhang, Linkang Yang, Minghao Hu, Zhunchen Luo, Xiaoying Bai, Guotong Geng, Jun Zhang, Yunhua Xue:

Dynamic Evil Score-Guided Decoding: An Efficient Decoding Framework For Red-Team Model. 10817-10833 - Divyaksh Shukla, Ritesh Baviskar, Dwijesh Gohil, Aniket Tiwari, Atul Shree, Ashutosh Modi:

CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations. 10834-10849 - Chris W. Jenkins, Filip Miletic, Sabine Schulte im Walde:

Multi-word Measures: Modeling Semantic Change in Compound Nouns. 10850-10864 - Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang:

Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language. 10865-10882 - Yan Yang, Dongxu Li, Haoning Wu, Bei Chen, Liu Liu, Liyuan Pan, Junnan Li:

ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks. 10883-10892 - Marta R. Costa-jussà, Bokai Yu, Pierre Andrews, Belen Alastruey, Necati Cihan Camgöz, Joe Chuang, Jean Maillard, Christophe Ropers, Arina Turkatenko, Carleigh Wood:

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset. 10893-10904 - Naomi Baes, Raphaël Merx, Nick Haslam, Ekaterina Vylomova, Haim Dubossarsky:

LSC-Eval: A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data. 10905-10939 - Wenxuan Wang, Kuiyi Gao, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Shuai Wang, Wenxiang Jiao, Zhaopeng Tu:

Chain-of-Jailbreak Attack for Image Generation Models via Step by Step Editing. 10940-10957 - Anna Wegmann, Dong Nguyen, David Jurgens:

Tokenization is Sensitive to Language Variation. 10958-10983 - Xin Li, Mengbing Liu, Li Wei, Jiancheng An, Mérouane Abdelkader Debbah, Chau Yuen:

WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications. 10984-11009 - Moxin Li, Yuantao Zhang, Wenjie Wang, Wentao Shi, Zhuo Liu, Fuli Feng, Tat-Seng Chua:

Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment. 11010-11031 - Zhijun Wang, Jiahuan Li, Hao Zhou, Rongxiang Weng, Jingang Wang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang:

Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training. 11032-11046 - Sougata Saha, Monojit Choudhury:

User Behavior Prediction as a Generic, Robust, Scalable, and Low-Cost Evaluation Strategy for Estimating Generalization in LLMs. 11047-11065 - Yueqi Song, Frank F. Xu, Shuyan Zhou, Graham Neubig:

Beyond Browsing: API-Based Web Agents. 11066-11085 - Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng:

MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages. 11086-11102 - Maja Stahl, Timon Ziegenbein, Joonsuk Park, Henning Wachsmuth:

ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation. 11103-11127 - Yuanhe Zhang, Zhenhong Zhou, Wei Zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su:

Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings. 11128-11150 - Chenchen Yuan, Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci:

Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models. 11151-11168 - Haoke Zhang, Xiaobo Liang, Cunxiang Wang, Juntao Li, Min Zhang:

Unlocking Recursive Thinking of LLMs: Alignment via Refinement. 11169-11182 - Kepu Zhang, Weijie Yu, Sunhao Dai, Jun Xu:

CitaLaw: Enhancing LLM with Citations in Legal Domain. 11183-11196 - Jiyang Qiu, Xinbei Ma, Zhuosheng Zhang, Hai Zhao, Yun Li, Qianren Wang:

MEGen: Generative Backdoor into Large Language Models via Model Editing. 11197-11214 - Jiho Jin, Woosung Kang, Junho Myung, Alice Oh:

Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations. 11215-11228 - Junling Wang, Anna Rutkiewicz, April Yi Wang, Mrinmaya Sachan:

Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models. 11229-11257 - Baixuan Li, Yunlong Fan, Tianyi Ma, Miao Gao, Chuanqi Shi, Zhiqiang Gao:

RASPberry: Retrieval-Augmented Monte Carlo Tree Self-Play with Reasoning Consistency for Multi-Hop Question Answering. 11258-11276 - Jia Hao, Chunhong Zhang, Jiarun Liu, Haiyu Zhao, Zhiqiang Zhan, Zheng Hu:

All That Glitters is Not Gold: Improving Robust Retrieval-Augmented Language Models with Fact-Centric Preference Alignment. 11277-11292 - Yichen Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu:

FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering. 11293-11312 - Zhuofan Wen, Zheng Lian, Shun Chen, Hailiang Yao, Longjiang Yang, Bin Liu, Jianhua Tao:

Listen, Watch, and Learn to Feel: Retrieval-Augmented Emotion Reasoning for Compound Emotion Generation. 11313-11327 - Kangyang Luo, Yuzhuo Bai, Cheng Gao, Shuzheng Si, Zhu Liu, Yingli Shen, Zhitong Wang, Cunliang Kong, Wenhao Li, Yufei Huang, Ye Tian, Xuantang Xiong, Lei Han, Maosong Sun:

GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion. 11328-11344 - Zheng Zhang, Shaocheng Lan, Lei Song, Jiang Bian, Yexin Li, Kan Ren:

Learning to Select In-Context Demonstration Preferred by Large Language Model. 11345-11360 - Cristiano Ciaccio, Marta Sartor, Alessio Miaschi, Felice Dell'Orletta:

Beyond the Spelling Miracle: Investigating Substring Awareness in Character-Blind Language Models. 11361-11372 - Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li:

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling. 11373-11401 - Bowen Cao, Deng Cai, Wai Lam:

InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation. 11402-11415 

