


Dong Yu 0001
俞栋
Person information
- unicode name: 俞栋
- affiliation: Tencent AI Lab, China
- affiliation (1998 - 2017): Microsoft Research, Redmond, WA, USA
- affiliation (PhD): University of Idaho, Moscow, ID, USA
Other persons with the same name
- Dong Yu: disambiguation page
- Dong Yu 0002: Xi'an Jiaotong University, Institution of Advanced Manufacturing and Technology, China
- Dong Yu 0003: Beijing Language and Culture University, Beijing, China
- Dong Yu 0004: University of Chinese Academy of Sciences, Beijing, China (and 2 more)
2020 – today
- 2025
[j71]Duzhen Zhang, Yong Ren, Chenxing Li, Dong Yu, Tielin Zhang:
Information-theoretic complementary prompts for improved continual text classification. Neural Networks 190: 107676 (2025)
[j70]Mengzhao Jia, Wenhao Yu, Kaixin Ma, Tianqing Fang, Zhihan Zhang, Siru Ouyang, Hongming Zhang, Dong Yu, Meng Jiang:
Leopard: A Vision Language Model for Text-Rich Multi-Image Tasks. Trans. Mach. Learn. Res. 2025 (2025)
[c357]Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Dian Yu, Haitao Mi, Jinsong Su, Dong Yu:
LiteSearch: Efficient Tree Search with Dynamic Exploration Budget for Math Reasoning. AAAI 2025: 25318-25326
[c356]Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu:
DeFine: Decision-Making with Analogical Reasoning over Factor Profiles. ACL (Findings) 2025: 4587-4603
[c355]Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou:
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression. ACL (1) 2025: 4861-4879
[c354]Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, Hongming Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu:
Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models. ACL (1) 2025: 9840-9855
[c353]Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang:
LoGU: Long-form Generation with Uncertainty Expressions. ACL (1) 2025: 18947-18968
[c352]Ante Wang, Linfeng Song, Ye Tian, Dian Yu, Haitao Mi, Xiangyu Duan, Zhaopeng Tu, Jinsong Su, Dong Yu:
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls. ACL (1) 2025: 23946-23959
[c351]Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Hongming Zhang, Tianqing Fang, Zhenzhong Lan, Dong Yu:
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization. ACL (1) 2025: 27545-27564
[c350]Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu:
Low-Bit Quantization Favors Undertrained LLMs. ACL (1) 2025: 32338-32348
[c349]Yue Qiao, Vinay Kothapally, Meng Yu, Dong Yu:
Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone Array. ICASSP 2025: 1-5
[c348]Anton Ratnarajah, Shi-Xiong Zhang, Dong Yu:
BANC: Towards Efficient Binaural Audio Neural Codec for Overlapping Speech. ICASSP 2025: 1-5
[c347]Yong Ren, Chenxing Li, Manjie Xu, Wei Liang, Yu Gu, Rilin Chen, Dong Yu:
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment. ICASSP 2025: 1-5
[c346]Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu:
Preference Alignment Improves Language Model-Based TTS. ICASSP 2025: 1-5
[c345]Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu:
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis. ICASSP 2025: 1-5
[c344]Liqiang Jing, Zhehui Huang, Xiaoyang Wang, Wenlin Yao, Wenhao Yu, Kaixin Ma, Hongming Zhang, Xinya Du, Dong Yu:
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts? ICLR 2025
[c343]Siru Ouyang, Wenhao Yu, Kaixin Ma, Zilin Xiao, Zhihan Zhang, Mengzhao Jia, Jiawei Han, Hongming Zhang, Dong Yu:
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph. ICLR 2025
[c342]Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, Dong Yu:
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory. ICLR 2025
[c341]Murong Yue, Wenlin Yao, Haitao Mi, Dian Yu, Ziyu Yao, Dong Yu:
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search. ICLR 2025
[c340]Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu:
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning. ICLR 2025
[c339]Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu:
Do NOT Think That Much for 2+3=? On the Overthinking of Long Reasoning Models. ICML 2025
[c338]Tong Lei, Zhiyu Zhang, Rilin Chen, Meng Yu, Jing Lu, Chengshi Zheng, Dong Yu, Andong Li:
BridgeVoC: Neural Vocoder with Schrödinger Bridge. IJCAI 2025: 8122-8130
[c337]Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu:
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer. INTERSPEECH 2025
[c336]Yuchen Hu, Yu Gu, Chenxing Li, Rilin Chen, Dong Yu:
Video-to-Audio Generation with Fine-grained Temporal Semantics. INTERSPEECH 2025
[c335]Jiahong Li, Yiwen Shao, Jianheng Zhuo, Chenda Li, Liliang Tang, Dong Yu, Yanmin Qian:
Efficient Multilingual ASR Finetuning via LoRA Language Experts. INTERSPEECH 2025
[c334]Yong Ren, Chenxing Li, Le Xu, Hao Gu, Duzhen Zhang, Yujie Chen, Manjie Xu, Ruibo Fu, Shan Yang, Dong Yu:
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model. INTERSPEECH 2025
[c333]Zhihang Sun, Andong Li, Tong Lei, Rilin Chen, Meng Yu, Chengshi Zheng, Yi Zhou, Dong Yu:
Scaling beyond Denoising: Submitted System and Findings in URGENT Challenge 2025. INTERSPEECH 2025
[c332]Le Xu, Chenxing Li, Yong Ren, Yujie Chen, Yu Gu, Ruibo Fu, Shan Yang, Dong Yu:
Mitigating Audiovisual Mismatch in Visual-Guide Audio Captioning. INTERSPEECH 2025
[c331]Manjie Xu, Chenxing Li, Yong Ren, Xinyi Tu, Ruibo Fu, Wei Liang, Dong Yu:
Towards Diverse and Efficient Audio Captioning via Diffusion Models. INTERSPEECH 2025
[c330]Yaoxun Xu, Jianwei Yu, Hangting Chen, Zhiyong Wu, Xixin Wu, Dong Yu, Rongzhi Gu, Yi Luo:
WAKE: Watermarking Audio with Key Enrichment. INTERSPEECH 2025
[c329]Jianheng Zhuo, Yifan Yang, Yiwen Shao, Yong Xu, Dong Yu, Kai Yu, Xie Chen:
VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining. INTERSPEECH 2025
[c328]Zhaoxi Mu, Rilin Chen, Andong Li, Meng Yu, Xinyu Yang, Dong Yu:
From Continuous to Discrete: Cross-Domain Collaborative General Speech Enhancement via Hierarchical Language Models. ACM Multimedia 2025: 219-228
[i312]Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT. CoRR abs/2501.01102 (2025)
[i311]Junhao Zheng, Chengming Shi, Xidi Cai, Qiuke Li, Duzhen Zhang, Chenxing Li, Dong Yu, Qianli Ma:
Lifelong Learning of Large Language Model based Agents: A Roadmap. CoRR abs/2501.07278 (2025)
[i310]Xiaoyang Wang, Hongming Zhang, Tao Ge, Wenhao Yu, Dian Yu, Dong Yu:
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas. CoRR abs/2501.15427 (2025)
[i309]Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu:
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs. CoRR abs/2501.18585 (2025)
[i308]Ante Wang, Linfeng Song, Ye Tian, Dian Yu, Haitao Mi, Xiangyu Duan, Zhaopeng Tu, Jinsong Su, Dong Yu:
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls. CoRR abs/2502.11183 (2025)
[i307]Hao Zhang, Weiwei Li, Rilin Chen, Vinay Kothapally, Meng Yu, Dong Yu:
LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems. CoRR abs/2502.14145 (2025)
[i306]Yuheng Zhang, Dian Yu, Tao Ge, Linfeng Song, Zhichen Zeng, Haitao Mi, Nan Jiang, Dong Yu:
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent. CoRR abs/2502.16852 (2025)
[i305]Hongming Zhang, Ruixin Hong, Dong Yu:
Streaming Looking Ahead with Token-level Self-reward. CoRR abs/2503.00029 (2025)
[i304]Ke Ji, Jiahao Xu, Tian Liang, Qiuzhi Liu, Zhiwei He, Xingyu Chen, Xiaoyuan Liu, Zhijie Wang, Junying Chen, Benyou Wang, Zhaopeng Tu, Haitao Mi, Dong Yu:
The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models. CoRR abs/2503.02875 (2025)
[i303]Yansi Li, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Qiuzhi Liu, Rui Wang, Zhuosheng Zhang, Zhaopeng Tu, Haitao Mi, Dong Yu:
Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique. CoRR abs/2503.17363 (2025)
[i302]Yi Su, Dian Yu, Linfeng Song, Juntao Li, Haitao Mi, Zhaopeng Tu, Min Zhang, Dong Yu:
Crossing the Reward Bridge: Expanding RL with Verifiable Rewards Across Diverse Domains. CoRR abs/2503.23829 (2025)
[i301]Zhiwei He, Tian Liang, Jiahao Xu, Qiuzhi Liu, Xingyu Chen, Yue Wang, Linfeng Song, Dian Yu, Zhenwen Liang, Wenxuan Wang, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu:
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning. CoRR abs/2504.11456 (2025)
[i300]Zhisong Zhang, Tianqing Fang, Kaixin Ma, Wenhao Yu, Hongming Zhang, Haitao Mi, Dong Yu:
Enhancing Web Agents with Explicit Rollback Mechanisms. CoRR abs/2504.11788 (2025)
[i299]Tianqing Fang, Hongming Zhang, Zhisong Zhang, Kaixin Ma, Wenhao Yu, Haitao Mi, Dong Yu:
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model. CoRR abs/2504.21024 (2025)
[i298]Junyu Ma, Tianqing Fang, Zhisong Zhang, Hongming Zhang, Haitao Mi, Dong Yu:
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation. CoRR abs/2505.03320 (2025)
[i297]Yong Ren, Chenxing Li, Le Xu, Hao Gu, Duzhen Zhang, Yujie Chen, Manjie Xu, Ruibo Fu, Shan Yang, Dong Yu:
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model. CoRR abs/2505.13062 (2025)
[i296]Xiaoyuan Liu, Tian Liang, Zhiwei He, Jiahao Xu, Wenxuan Wang, Pinjia He, Zhaopeng Tu, Haitao Mi, Dong Yu:
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards. CoRR abs/2505.13445 (2025)
[i295]Mengru Wang, Xingyu Chen, Yue Wang, Zhiwei He, Jiahao Xu, Tian Liang, Qiuzhi Liu, Yunzhi Yao, Wenxuan Wang, Ruotian Ma, Haitao Mi, Ningyu Zhang, Zhaopeng Tu, Xiaolong Li, Dong Yu:
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training. CoRR abs/2505.14681 (2025)
[i294]Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Dong Yu, Nigel Collier, Deqing Yang:
UNCLE: Uncertainty Expressions in Long-Form Generation. CoRR abs/2505.16922 (2025)
[i293]Minda Hu, Tianqing Fang, Jianshu Zhang, Junyu Ma, Zhisong Zhang, Jingyan Zhou, Hongming Zhang, Haitao Mi, Dong Yu, Irwin King:
WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback. CoRR abs/2505.20013 (2025)
[i292]Duzhen Zhang, Yong Ren, Chenxing Li, Dong Yu, Tielin Zhang:
Information-Theoretic Complementary Prompts for Improved Continual Text Classification. CoRR abs/2505.20933 (2025)
[i291]Jianheng Zhuo, Yifan Yang, Yiwen Shao, Yong Xu, Dong Yu, Kai Yu, Xie Chen:
VietASR: Achieving Industry-level Vietnamese ASR with 50-hour labeled data and Large-Scale Speech Pretraining. CoRR abs/2505.21527 (2025)
[i290]Le Xu, Chenxing Li, Yong Ren, Yujie Chen, Yu Gu, Ruibo Fu, Shan Yang, Dong Yu:
Mitigating Audiovisual Mismatch in Visual-Guide Audio Captioning. CoRR abs/2505.22045 (2025)
[i289]Shuaiyi Li, Zhisong Zhang, Yang Deng, Chenlong Deng, Tianqing Fang, Hongming Zhang, Haitao Mi, Dong Yu, Wai Lam:
InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing. CoRR abs/2505.22156 (2025)
[i288]Ce Zhang, Kaixin Ma, Tianqing Fang, Wenhao Yu, Hongming Zhang, Zhisong Zhang, Yaqi Xie, Katia P. Sycara, Haitao Mi, Dong Yu:
VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models. CoRR abs/2505.22654 (2025)
[i287]Ziyin Zhang, Jiahao Xu, Zhiwei He, Tian Liang, Qiuzhi Liu, Yansi Li, Linfeng Song, Zhengwen Liang, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu:
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning. CoRR abs/2505.23754 (2025)
[i286]Yaoxun Xu, Jianwei Yu, Hangting Chen, Zhiyong Wu, Xixin Wu, Dong Yu, Rongzhi Gu, Yi Luo:
WAKE: Watermarking Audio with Key Enrichment. CoRR abs/2506.05891 (2025)
[i285]Hongyan Zhi, Peihao Chen, Siyuan Zhou, Dong Yu, Quanxi Wu, Lei Han, Mingkui Tan:
3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model. CoRR abs/2506.06199 (2025)
[i284]Shun Lei, Yaoxun Xu, Zhiwei Lin, Huaicheng Zhang, Wei Tan, Hangting Chen, Jianwei Yu, Yixuan Zhang, Chenyu Yang, Haina Zhu, Shuai Wang, Zhiyong Wu, Dong Yu:
LeVo: High-Quality Song Generation with Multi-Preference Alignment. CoRR abs/2506.07520 (2025)
[i283]Jiahong Li, Yiwen Shao, Jianheng Zhuo, Chenda Li, Liliang Tang, Dong Yu, Yanmin Qian:
Efficient Multilingual ASR Finetuning via LoRA Language Experts. CoRR abs/2506.21555 (2025)
[i282]Yucheng Shi, Wenhao Yu, Zaitang Li, Yonglin Wang, Hongming Zhang, Ninghao Liu, Haitao Mi, Dong Yu:
MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment. CoRR abs/2507.05720 (2025)
[i281]Zhenwen Liang, Linfeng Song, Yang Li, Tao Yang, Feng Zhang, Haitao Mi, Dong Yu:
Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving. CoRR abs/2507.06804 (2025)
[i280]Yulai Zhao, Haolin Liu, Dian Yu, S. Y. Kung, Haitao Mi, Dong Yu:
One Token to Fool LLM-as-a-Judge. CoRR abs/2507.08794 (2025)
[i279]Zhaoxi Mu, Rilin Chen, Andong Li, Meng Yu, Xinyu Yang, Dong Yu:
From Continuous to Discrete: Cross-Domain Collaborative General Speech Enhancement via Hierarchical Language Models. CoRR abs/2507.19062 (2025)
[i278]Tianqing Fang, Zhisong Zhang, Xiaoyang Wang, Rui Wang, Can Qin, Yuxuan Wan, Jun-Yu Ma, Ce Zhang, Jiaqi Chen, Xiyun Li, Hongming Zhang, Haitao Mi, Dong Yu:
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training. CoRR abs/2508.00414 (2025)
[i277]Duzhen Zhang, Chenxing Li, Jiahua Dong, Qi Liu, Dong Yu:
Exploring Stability-Plasticity Trade-offs for Continual Named Entity Recognition. CoRR abs/2508.03259 (2025)
[i276]Tianxin Xie, Shan Yang, Chenxing Li, Dong Yu, Li Liu:
EmoSteer-TTS: Fine-Grained and Training-Free Emotion-Controllable Text-to-Speech via Activation Steering. CoRR abs/2508.03543 (2025)
[i275]Bingshen Mu, Yiwen Shao, Kun Wei, Dong Yu, Lei Xie:
Efficient Scaling for LLM-based ASR. CoRR abs/2508.04096 (2025)
[i274]Chengsong Huang, Wenhao Yu, Xiaoyang Wang, Hongming Zhang, Zongxia Li, Ruosen Li, Jiaxin Huang, Haitao Mi, Dong Yu:
R-Zero: Self-Evolving Reasoning LLM from Zero Data. CoRR abs/2508.05004 (2025)
[i273]Huaicheng Zhang, Wei Tan, Guangzheng Li, Yixuan Zhang, Hangting Chen, Shun Lei, Chenyu Yang, Zhiyong Wu, Shuai Wang, Qijun Huang, Dong Yu:
Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation. CoRR abs/2508.05011 (2025)
[i272]Shu Wu, Chenxing Li, Wenfu Wang, Hao Zhang, Hualei Wang, Meng Yu, Dong Yu:
Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning. CoRR abs/2508.08039 (2025)
[i271]Yisu Liu, Chenxing Li, Wanqian Zhang, Wenfu Wang, Meng Yu, Ruibo Fu, Zheng Lin, Weiping Wang, Dong Yu:
DegDiT: Controllable Audio Generation with Dynamic Event Graph Guided Diffusion Transformer. CoRR abs/2508.13786 (2025)
[i270]Zongxia Li, Wenhao Yu, Chengsong Huang, Rui Liu, Zhenwen Liang, Fuxiao Liu, Jingxi Che, Dian Yu, Jordan L. Boyd-Graber, Haitao Mi, Dong Yu:
Self-Rewarding Vision-Language Model via Reasoning Decomposition. CoRR abs/2508.19652 (2025)
[i269]Taihui Wang, Rilin Chen, Tong Lei, Andong Li, Jinzheng Zhao, Meng Yu, Dong Yu:
Target matching based generative model for speech enhancement. CoRR abs/2509.07521 (2025)
[i268]Tong Zheng, Hongming Zhang, Wenhao Yu, Xiaoyang Wang, Runpeng Dai, Rui Liu, Huiwen Bao, Chengsong Huang, Heng Huang, Dong Yu:
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning. CoRR abs/2509.07980 (2025)
[i267]Runpeng Dai, Linfeng Song, Haolin Liu, Zhenwen Liang, Dian Yu, Haitao Mi, Zhaopeng Tu, Rui Liu, Tong Zheng, Hongtu Zhu, Dong Yu:
CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models. CoRR abs/2509.09675 (2025)
[i266]Mukai Li, Linfeng Song, Zhenwen Liang, Jiahao Xu, Shansan Gong, Qi Liu, Haitao Mi, Dong Yu:
EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving. CoRR abs/2509.12603 (2025)
[i265]Yujun Zhou, Zhenwen Liang, Haolin Liu, Wenhao Yu, Kishan Panaganti, Linfeng Song, Dian Yu, Xiangliang Zhang, Haitao Mi, Dong Yu:
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation. CoRR abs/2509.15194 (2025)
[i264]Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Tianqing Fang, Hongming Zhang, Haitao Mi, Dong Yu, Zhicheng Dou:
UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression. CoRR abs/2509.15763 (2025)
[i263]Yan Rong, Chenxing Li, Dong Yu, Li Liu:
AudioGenie-Reasoner: A Training-Free Multi-Agent Framework for Coarse-to-Fine Audio Deep Reasoning. CoRR abs/2509.16971 (2025)
[i262]Wei Tan, Shun Lei, Huaicheng Zhang, Guangzheng Li, Yixuan Zhang, Hangting Chen, Jianwei Yu, Rongzhi Gu, Dong Yu:
SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription. CoRR abs/2509.17404 (2025)
[i261]Zeyu Xie, Chenxing Li, Xuenan Xu, Mengyue Wu, Wenfu Wang, Ruibo Fu, Meng Yu, Dong Yu, Yuexian Zou:
When Audio Generators Become Good Listeners: Generative Features for Understanding Tasks. CoRR abs/2509.24635 (2025)
[i260]Rui Liu, Dian Yu, Tong Zheng, Runpeng Dai, Zongxia Li, Wenhao Yu, Zhenwen Liang, Linfeng Song, Haitao Mi, Pratap Tokekar, Dong Yu:
VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning. CoRR abs/2510.01444 (2025)
[i259]Zhenwen Liang, Ruosen Li, Yujun Zhou, Linfeng Song, Dian Yu, Xinya Du, Haitao Mi, Dong Yu:
CLUE: Non-parametric Verification from Experience via Hidden-State Clustering. CoRR abs/2510.01591 (2025)
[i258]Jiliang Hu, Wenfu Wang, Zuchao Li, Chenxing Li, Yiyang Zhao, Hanzhao Li, Liqiang Zhang, Meng Yu, Dong Yu:
VCB Bench: An Evaluation Benchmark for Audio-Grounded Large Language Model Conversational Agents. CoRR abs/2510.11098 (2025)
[i257]Rui Wang, Ce Zhang, Jun-Yu Ma, Jianshu Zhang, Hongru Wang, Yi Chen, Boyang Xue, Tianqing Fang, Zhisong Zhang, Hongming Zhang, Haitao Mi, Dong Yu, Kam-Fai Wong:
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents. CoRR abs/2510.14438 (2025)
[i256]Xusheng Yang, Long Zhou, Wenfu Wang, Kai Hu, Shulin Feng, Chenxing Li, Meng Yu, Dong Yu, Yuexian Zou:
U-Codec: Ultra Low Frame-rate Neural Speech Codec for Fast High-fidelity Speech Generation. CoRR abs/2510.16718 (2025)
[i255]Dian Yu, Yulai Zhao, Kishan Panaganti, Linfeng Song, Haitao Mi, Dong Yu:
Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values. CoRR abs/2510.20187 (2025)
[i254]Tian Liang, Wenxiang Jiao, Zhiwei He, Jiahao Xu, Haitao Mi, Dong Yu:
DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains. CoRR abs/2510.27419 (2025)
[i253]Junqi Zhao, Chenxing Li, Jinzheng Zhao, Rilin Chen, Dong Yu, Mark D. Plumbley, Wenwu Wang:
Feedback-driven Retrieval-augmented Audio Generation with Large Audio Language Models. CoRR abs/2511.01091 (2025)
[i252]Andong Li, Tong Lei, Rilin Chen, Kai Li, Meng Yu, Xiaodong Li, Dong Yu, Chengshi Zheng:
BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective. CoRR abs/2511.07116 (2025)
[i251]Mingyue Huo, Wei-Cheng Tseng, Yiwen Shao, Hao Zhang, Dong Yu:
Auden-Voice: General-Purpose Voice Encoder for Speech and Language Understanding. CoRR abs/2511.15145 (2025)
[i250]Wenhao Yu, Zhenwen Liang, Chengsong Huang, Kishan Panaganti, Tianqing Fang, Haitao Mi, Dong Yu:
Guided Self-Evolving LLMs with Minimal Human Supervision. CoRR abs/2512.02472 (2025)
[i249]Zhenwen Liang, Sidi Lu, Wenhao Yu, Kishan Panaganti, Yujun Zhou, Haitao Mi, Dong Yu:
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning. CoRR abs/2512.15687 (2025)
[i248]Rui Liu, Dian Yu, Lei Ke, Haolin Liu, Yujun Zhou, Zhenwen Liang, Haitao Mi, Pratap Tokekar, Dong Yu:
Stable and Efficient Single-Rollout RL for Multimodal Reasoning. CoRR abs/2512.18215 (2025)
[i247]Tianxin Xie, Wentao Lei, Guanjie Huang, Pengfei Zhang, Kai Jiang, Chunhui Zhang, Fengji Ma, Haoyu He, Han Zhang, Jiangshan He, Jinting Wang, Linghan Fang, Lufei Gao, Orkesh Ablet, Peihua Zhang, Ruolin Hu, Shengyu Li, Weilin Lin, Xiaoyang Feng, Xinyue Yang, Yan Rong, Yanyun Wang, Zihang Shao, Zelin Zhao, Chenxing Li, Shan Yang, Wenfu Wang, Meng Yu, Dong Yu, Li Liu:
PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation. CoRR abs/2512.23994 (2025)
- 2024
[j69]Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu:
Enhanced Acoustic Howling Suppression via Hybrid Kalman Filter and Deep Learning Models. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2828-2840 (2024)
[c327]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu:
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs. ACL (1) 2024: 267-278
[c326]Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. ACL (1) 2024: 1764-1775
[c325]Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu:
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models. ACL (1) 2024: 6864-6890
[c324]Ante Wang, Linfeng Song, Baolin Peng, Lifeng Jin, Ye Tian, Haitao Mi, Jinsong Su, Dong Yu:
Improving LLM Generations via Fine-Grained Self-Endorsement. ACL (Findings) 2024: 8424-8436
[c323]Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen:
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models. ACL (Findings) 2024: 8702-8718
[c322]Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners. ACL (1) 2024: 10929-10942
[c321]Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song:
CLOMO: Counterfactual Logical Modification with Large Language Models. ACL (1) 2024: 11012-11034
[c320]Duzhen Zhang, Yahan Yu, Jiahua Dong, Chenxing Li, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. ACL (Findings) 2024: 12401-12430
[c319]Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, Pengfei Liu, Dong Yu:
InFoBench: Evaluating Instruction Following Ability in Large Language Models. ACL (Findings) 2024: 13025-13048
[c318]Chenxing Li, Manjie Xu, Dong Yu:
SRC-gAudio: Sampling-Rate-Controlled Audio Generation. APSIPA 2024: 1-6
[c317]Xiangci Li, Linfeng Song, Lifeng Jin, Haitao Mi, Jessica Ouyang, Dong Yu:
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation. LREC/COLING 2024: 666-676
[c316]Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng, Xiangliang Zhang, Dong Yu:
MinT: Boosting Generalization in Mathematical Reasoning via Multi-view Fine-tuning. LREC/COLING 2024: 11307-11318
[c315]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu:
Inconsistent dialogue responses and how to recover from them. EACL (Findings) 2024: 220-230
[c314]Haoyu Wang, Hongming Zhang, Kaiqiang Song, Dong Yu, Dan Roth:
Event Semantic Classification in Context. EACL (Findings) 2024: 1395-1407
[c313]Ruixin Hong, Hongming Zhang, Xiaoman Pan, Dong Yu, Changshui Zhang:
Abstraction-of-Thought Makes Language Models Better Reasoners. EMNLP (Findings) 2024: 1993-2027
[c312]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Wenlin Yao, Hassan Foroosh, Dong Yu, Fei Liu:
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives. EMNLP 2024: 4293-4308
[c311]Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen:
Skills-in-Context: Unlocking Compositionality in Large Language Models. EMNLP (Findings) 2024: 13838-13890
[c310]Wenhao Yu, Hongming Zhang, Xiaoman Pan, Peixin Cao, Kaixin Ma, Jian Li, Hongwei Wang, Dong Yu:
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models. EMNLP 2024: 14672-14685
[c309]Zhihan Zhang, Tao Ge, Zhenwen Liang, Wenhao Yu, Dian Yu, Mengzhao Jia, Dong Yu, Meng Jiang:
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning. EMNLP 2024: 14720-14738
[c308]Tong Chen, Hongwei Wang, Sihao Chen, Wenhao Yu, Kaixin Ma, Xinran Zhao, Hongming Zhang, Dong Yu:
Dense X Retrieval: What Retrieval Granularity Should We Use? EMNLP 2024: 15159-15177
[c307]Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu:
SPATIALCODEC: Neural Spatial Speech Coding. ICASSP 2024: 1131-1135
[c306]Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models. ICASSP 2024: 7125-7129
[c305]Zili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
UniX-Encoder: A Universal X-Channel Speech Encoder for AD-HOC Microphone Array Speech Processing. ICASSP 2024: 11991-11995
[c304]Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu:
The Trickle-down Impact of Reward Inconsistency on RLHF. ICLR 2024
[c303]Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen:
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. ICML 2024
[c302]Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu:
Prompt-guided Precise Audio Editing with Diffusion Models. ICML 2024
[c301]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
RIR-SF: Room Impulse Response Based Spatial Feature for Target Speech Recognition in Multi-Channel Multi-Speaker Scenarios. INTERSPEECH 2024
[c300]Yiwen Shao, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Daniel Povey, Sanjeev Khudanpur:
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment. INTERSPEECH 2024
[c299]Yaoxun Xu, Shi-Xiong Zhang, Jianwei Yu, Zhiyong Wu, Dong Yu:
Comparing Discrete and Continuous Space LLMs for Speech Recognition. INTERSPEECH 2024
[c298]Ruixin Hong, Hongming Zhang, Xinyu Pang, Dong Yu, Changshui Zhang:
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning. NAACL-HLT 2024: 900-925
[c297]Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu:
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning. NAACL-HLT 2024: 1287-1310
[c296]Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu:
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations. NAACL-HLT 2024: 1596-1609
[c295]Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu:
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning. NAACL-HLT 2024: 2341-2369
[c294]Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu:
Polarity Calibration for Opinion Summarization. NAACL-HLT 2024: 5211-5224
[c293]Ye Tian, Baolin Peng, Linfeng Song, Lifeng Jin, Dian Yu, Lei Han, Haitao Mi, Dong Yu:
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing. NeurIPS 2024
[c292]Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu:
Advancing Multi-Talker ASR Performance With Large Language Models. SLT 2024: 14-21
[c291]Yiwen Shao, Yong Xu, Sanjeev Khudanpur, Dong Yu:
Spatialemb: Extract and Encode Spatial Information for 1-Stage Multi-Channel Multi-Speaker ASR on Arbitrary Microphone Arrays. SLT 2024: 56-63
[c290]Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu:
SMRU: Split-And-Merge Recurrent-Based UNet For Acoustic Echo Cancellation And Noise Suppression. SLT 2024: 317-324
[i246]Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, Pengfei Liu, Dong Yu:
InFoBench: Evaluating Instruction Following Ability in Large Language Models. CoRR abs/2401.03601 (2024)
[i245]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu:
Inconsistent dialogue responses and how to recover from them. CoRR abs/2401.10353 (2024)
[i244]Duzhen Zhang, Yahan Yu, Chenxing Li, Jiahua Dong, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. CoRR abs/2401.13601 (2024)
[i243]

