


Tong Zhang 0001
张潼
Person information
- unicode name: 张潼
- affiliation: University of Illinois Urbana-Champaign, IL, USA
- affiliation (former): Hong Kong University of Science and Technology, China
- affiliation (former): Tencent AI Lab, Shenzhen, China
- affiliation (former): Rutgers University, Department of Statistics, NJ, USA
- affiliation (former): Baidu Inc. Beijing, China
- affiliation (former): Yahoo
- affiliation (former): IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
- affiliation (PhD): Stanford University, CA, USA
Other persons with the same name
- Tong Zhang — disambiguation page
- Tong Zhang 0002 — Rensselaer Polytechnic Institute, Troy, NY, USA (and 1 more)
- Tong Zhang 0003 — Shanghai Jiaotong University, Department of Computer Science and Technology, China
- Tong Zhang 0004 — University of Southern California, Department of Electrical Engineering Systems, Los Angeles, CA, USA
- Tong Zhang 0005 — University of Illinois, Beckman Institute, Urbana, IL, USA
- Tong Zhang 0006 — SRA Corporation, Arlington, VA, USA
- Tong Zhang 0007 — Hewlett-Packard Labs, Palo Alto, CA, USA
- Tong Zhang 0008 — National University of Defense Technology, Changsha, China
- Tong Zhang 0009 — Wuhan University, State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, China (and 2 more)
- Tong Zhang 0010 — Henan Polytechnic University, School of Mathematics and Information Science, Jiaozuo, China (and 1 more)
- Tong Zhang 0011 — Wuhan University of Technology, Intelligent Transportation Systems Research Center, China
- Tong Zhang 0012 — Tsinghua University, Department of Engineering Mechanics, School of Aerospace, Beijing, China
- Tong Zhang 0013 — Kyushu University, Graduate School of Information Science and Electrical Engineering, Fukuoka, Japan
- Tong Zhang 0014 — Southeast University, School of Electronic Science and Engineering, MOE Key Laboratory of Micro-Inertial Instrument and Advanced Navigation Technology, Nanjing, China
- Tong Zhang 0015 — South China University of Technology, School of Electronics and Information, Guangzhou, China (and 1 more)
- Tong Zhang 0016 — Fuzhou University, College of Mathematics & Computer Science, China
- Tong Zhang 0017 — Peng Cheng Laboratory, Shenzhen, China (and 2 more)
- Tong Zhang 0018 — Tsinghua University, Department of Computer Science and Technology, TNList, Beijing, China
- Tong Zhang 0019 — University of Washington, Department of Electrical Engineering, Seattle, WA, USA
- Tong Zhang 0020 — University of Edinburgh, Institute for Digital Communications, Edinburgh, UK
- Tong Zhang 0021 — Nanjing University of Science and Technology, School of Computer Science and Technology, China (and 1 more)
- Tong Zhang 0023 — EPFL, Lausanne, Switzerland (and 1 more)
- Tong Zhang 0024 — Peking University, Institute of Computer Science and Technology, Beijing, China
- Tong Zhang 0025 — University of Windsor, ON, Canada
- Tong Zhang 0026 — Harbin Institute of Technology, Guangdong Provincial Key Laboratory of Aerospace Communication and Networking Technology, Shenzhen, China (and 3 more)
- Tong Zhang 0027 — Jilin University, College of Computer Science and Technology, Changchun, China
- Tong Zhang 0028 — Beijing Institute of Technology, Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, School of Information and Electronics, China
- Tong Zhang 0029 — Southern University of Science and Technology, Department of Computer Science and Engineering, Shenzhen, China
2020 – today
- 2026
[j98]Haishan Ye, Wei Xiong, Tong Zhang:
PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework With Variance Reduction. IEEE Trans. Pattern Anal. Mach. Intell. 48(1): 408-420 (2026)
[i286]Hao Bai, Alexey Taymanov, Tong Zhang, Aviral Kumar, Spencer Whitehead:
WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks. CoRR abs/2601.02439 (2026)
[i285]Jiarui Yao, Ruida Wang, Tong Zhang:
PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary. CoRR abs/2601.10201 (2026)
[i284]Hanning Zhang, Ruida Wang, Rui Pan, Wenyuan Wang, BingXu Meng, Tong Zhang:
PhysProver: Advancing Automatic Theorem Proving for Physics. CoRR abs/2601.15737 (2026)
[i283]Shivanshu Shekhar, Uttaran Bhattacharya, Raghavendra Addanki, Md. Mehrab Tanjim, Somdeb Sarkhel, Tong Zhang:
GT-SVJ: Generative-Transformer-Based Self-Supervised Video Judge For Efficient Video Reward Modeling. CoRR abs/2602.05202 (2026)
- 2025
[j97]Chuangwei Xu, Jie Liu, Shiyuan Han, Xiaoqi Duan, Lei Xiang, Tong Zhang:
FourCastLSTM: A precipitation nowcasting model integrating global and local spatiotemporal features. Comput. Geosci. 204: 105966 (2025)
[j96]Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan Yao, Tong Zhang:
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning. J. Mach. Learn. Res. 26: 128:1-128:47 (2025)
[j95]Haishan Ye, Zhichao Huang, Cong Fang, Chris Junchi Li, Tong Zhang:
Hessian-Aware Zeroth-Order Optimization. IEEE Trans. Pattern Anal. Mach. Intell. 47(6): 4869-4877 (2025)
[j94]Zicheng Zhang, Wei Ke, Yi Zhu, Xiaodan Liang, Jianzhuang Liu, Qixiang Ye, Tong Zhang:
Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation. IEEE Trans. Circuits Syst. Video Technol. 35(4): 3185-3195 (2025)
[j93]Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao:
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic. Trans. Mach. Learn. Res. 2025 (2025)
[j92]Shivanshu Shekhar, Shreyas Singh, Tong Zhang:
SEE-DPO: Self Entropy Enhanced Direct Preference Optimization. Trans. Mach. Learn. Res. 2025 (2025)
[j91]Hanning Zhang, Pengcheng Wang, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, Tong Zhang:
Entropy-Regularized Process Reward Model. Trans. Mach. Learn. Res. 2025 (2025)
[j90]Xiaoyu Wang, Xuxing Chen, Shiqian Ma, Tong Zhang:
Fully First-Order Methods for Decentralized Bilevel Optimization. IEEE Trans. Signal Process. 73: 4734-4747 (2025)
[c273]Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang:
Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language. ACL (Findings) 2025: 10865-10882
[c272]Xuanchang Zhang, Wei Xiong, Lichang Chen, Tianyi Zhou, Heng Huang, Tong Zhang:
From Lists to Emojis: How Format Bias Affects Model Alignment. ACL (1) 2025: 26940-26961
[c271]Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang:
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting. ACL (1) 2025: 31959-31982
[c270]Ruida Wang, Yuxin Li, Yi R. Fung, Tong Zhang:
Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability. EMNLP 2025: 16783-16809
[c269]Jingyan Shen, Jiarui Yao, Rui Yang, Yifan Sun, Feng Luo, Rui Pan, Tong Zhang, Han Zhao:
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning. EMNLP 2025: 17447-17463
[c268]Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu:
Building Math Agents with Multi-Turn Iterative Preference Learning. ICLR 2025
[c267]Yuxing Liu, Rui Pan, Tong Zhang:
AdaGrad under Anisotropic Smoothness. ICLR 2025
[c266]Renjie Pi, Jianshu Zhang, Tianyang Han, Jipeng Zhang, Rui Pan, Tong Zhang:
Personalized Visual Instruction Tuning. ICLR 2025
[c265]Yifan Hao, Xingyuan Pan, Hanning Zhang, Chenlu Ye, Rui Pan, Tong Zhang:
Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods. ICML 2025
[c264]Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang:
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents. ICML 2025
[c263]Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang:
MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving. ICML 2025
[c262]Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang:
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards. ICML 2025
[c261]Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang:
Logarithmic Regret for Online KL-Regularized Reinforcement Learning. ICML 2025
[c260]Jipeng Zhang, Yaxuan Qin, Renjie Pi, Weizhong Zhang, Rui Pan, Tong Zhang:
TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data. NAACL (Findings) 2025: 4671-4686
[i282]Hanning Zhang, Juntong Song, Juno Zhu, Yuanhao Wu, Tong Zhang, Cheng Niu:
RAG-Reward: Optimizing RAG with Reward Modeling and RLHF. CoRR abs/2501.13264 (2025)
[i281]Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang:
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards. CoRR abs/2502.02486 (2025)
[i280]Boyao Wang, Rui Pan, Shizhe Diao, Xingyuan Pan, Jipeng Zhang, Renjie Pi, Tong Zhang:
Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training. CoRR abs/2502.03460 (2025)
[i279]Qingyue Zhao, Kaixuan Ji, Heyang Zhao, Tong Zhang, Quanquan Gu:
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability. CoRR abs/2502.06051 (2025)
[i278]Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang:
Logarithmic Regret for Online KL-Regularized Reinforcement Learning. CoRR abs/2502.07460 (2025)
[i277]Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang:
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents. CoRR abs/2502.09560 (2025)
[i276]Xinwei Shen, Nicolai Meinshausen, Tong Zhang:
Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions. CoRR abs/2502.13747 (2025)
[i275]Wei Xiong, Hanning Zhang, Chenlu Ye, Lichang Chen, Nan Jiang, Tong Zhang:
Self-rewarding correction for mathematical reasoning. CoRR abs/2502.19613 (2025)
[i274]Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang:
MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving. CoRR abs/2503.03205 (2025)
[i273]Shivanshu Shekhar, Tong Zhang:
ROCM: RLHF on consistency models. CoRR abs/2503.06171 (2025)
[i272]Jerry Huang, Siddarth Madala, Risham Sidhu, Cheng Niu, Julia Hockenmaier, Tong Zhang:
RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning. CoRR abs/2503.12759 (2025)
[i271]Kang An, Yuxing Liu, Rui Pan, Shiqian Ma, Donald Goldfarb, Tong Zhang:
ASGO: Adaptive Structured Gradient Optimization. CoRR abs/2503.20762 (2025)
[i270]Wei Xiong, Jiarui Yao, Yuhui Xu, Bo Pang, Lei Wang, Doyen Sahoo, Junnan Li, Nan Jiang, Tong Zhang, Caiming Xiong, Hanze Dong:
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce. CoRR abs/2504.11343 (2025)
[i269]Xunpeng Huang, Yujin Han, Difan Zou, Yian Ma, Tong Zhang:
Capturing Conditional Dependence via Auto-regressive Diffusion Models. CoRR abs/2504.21314 (2025)
[i268]Nishant Jain, Xunpeng Huang, Yian Ma, Tong Zhang:
Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees. CoRR abs/2505.01049 (2025)
[i267]Xiusi Chen, Gaotang Li, Ziqi Wang, Bowen Jin, Cheng Qian, Yu Wang, Hongru Wang, Yu Zhang, Denghui Zhang, Tong Zhang, Hanghang Tong, Heng Ji:
RM-R1: Reward Modeling as Reasoning. CoRR abs/2505.02387 (2025)
[i266]Jiarui Yao, Yifan Hao, Hanning Zhang, Hanze Dong, Wei Xiong, Nan Jiang, Tong Zhang:
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL. CoRR abs/2505.02391 (2025)
[i265]Yifei He, Siqi Zeng, Yuzheng Hu, Rui Yang, Tong Zhang, Han Zhao:
MergeBench: A Benchmark for Merging Domain-Specialized LLMs. CoRR abs/2505.10833 (2025)
[i264]Xunpeng Huang, Yingyu Lin, Nikki Lijing Kuang, Hanze Dong, Difan Zou, Yian Ma, Tong Zhang:
Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion. CoRR abs/2505.21892 (2025)
[i263]Xingyuan Pan, Chenlu Ye, Joseph Melkonian, Jiaqi W. Ma, Tong Zhang:
Daunce: Data Attribution through Uncertainty Estimation. CoRR abs/2505.23223 (2025)
[i262]Ruida Wang, Yuxin Li, Yi R. (May) Fung, Tong Zhang:
Let's Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM's Math Capability. CoRR abs/2505.23703 (2025)
[i261]Jingyan Shen, Jiarui Yao, Rui Yang, Yifan Sun, Feng Luo, Rui Pan, Tong Zhang, Han Zhao:
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning. CoRR abs/2505.24846 (2025)
[i260]Tong Zhang, Juan Carlos León Alcázar, Bernard Ghanem:
Motion-Aware Concept Alignment for Consistent Video Editing. CoRR abs/2506.01004 (2025)
[i259]Yifan Hao, Xingyuan Pan, Hanning Zhang, Chenlu Ye, Rui Pan, Tong Zhang:
Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods. CoRR abs/2506.01901 (2025)
[i258]Yifan Hao, Chenlu Ye, Chi Han, Tong Zhang:
Transformers as Multi-task Learners: Decoupling Features in Hidden Markov Models. CoRR abs/2506.01919 (2025)
[i257]Qianhui Wu, Kanzhi Cheng, Rui Yang, Chaoyun Zhang, Jianwei Yang, Huiqiang Jiang, Jian Mu, Baolin Peng, Bo Qiao, Reuben Tan, Si Qin, Lars Liden, Qingwei Lin, Huan Zhang, Tong Zhang, Jianbing Zhang, Dongmei Zhang, Jianfeng Gao:
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents. CoRR abs/2506.03143 (2025)
[i256]Yifan Hao, Yanxin Lu, Xinwei Shen, Tong Zhang:
Towards Better Generalization via Distributional Input Projection Network. CoRR abs/2506.04690 (2025)
[i255]Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar:
Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction. CoRR abs/2506.07976 (2025)
[i254]Jipeng Zhang, Kehao Miao, Renjie Pi, Zhaowei Wang, Runtao Liu, Rui Pan, Tong Zhang:
VL-GenRM: Enhancing Vision-Language Verification via Vision Experts and Iterative Training. CoRR abs/2506.13888 (2025)
[i253]Yuanhao Wu, Juntong Song, Hanning Zhang, Tong Zhang, Cheng Niu:
DuaShepherd: Integrating Stepwise Correctness and Potential Rewards for Mathematical Reasoning. CoRR abs/2506.17533 (2025)
[i252]Zihan Wang, Rui Pan, Jiarui Yao, Robert Csordas, Linjie Li, Lu Yin, Jiajun Wu, Tong Zhang, Manling Li, Shiwei Liu:
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models. CoRR abs/2506.18945 (2025)
[i251]Yaowenqi Liu, BingXu Meng, Rui Pan, Jerry Huang, Tong Zhang:
GUIDE: Towards Scalable Advising for Research Ideas. CoRR abs/2507.08870 (2025)
[i250]Nishant Jain, Tong Zhang:
A Sharp KL-Convergence Analysis for Diffusion Models under Minimal Assumptions. CoRR abs/2508.16306 (2025)
[i249]Andrew Ferguson, Marisa Lafleur, Lars Ruthotto, Jesse Thaler, Yuan-Sen Ting, Pratyush Tiwary, Soledad Villar, E. Paulo Alves, Jeremy Avigad, Simon Billinge, Camille L. Bilodeau, Keith Brown, Emmanuel J. Candès, Arghya Chattopadhyay, Bingqing Cheng, Jonathan Clausen, Connor W. Coley, Andrew J. Connolly, Fred Daum, Sijia S. Dong, Chrisy Xiyu Du, Cora Dvorkin, Cristiano Fanelli, Eric B. Ford, Luis Manuel Frutos, Nicolás García Trillos, Cecilia Garraffo, Robert Ghrist, Rafael Gómez-Bombarelli, Gianluca Guadagni, Sreelekha Guggilam, Sergei Gukov, Juan B. Gutierrez, Salman Habib, Johannes Hachmann, Boris Hanin, Philip C. Harris, Murray Holland, Elizabeth Holm, Hsin-Yuan Huang, Shih-Chieh Hsu, Nick Jackson, Olexandr Isayev, Heng Ji, Aggelos K. Katsaggelos, Jeremy Kepner, Yannis G. Kevrekidis, Michelle P. Kuchera, J. Nathan Kutz, Branislava Lalic, Ann Lee, Matt LeBlanc, Josiah Lim, Rebecca Lindsey, Yongmin Liu, Peter Y. Lu, Sudhir Malik, Vuk Mandic, Vidya B. Manian, Emeka P. Mazi, Pankaj Mehta, Peter Melchior, Brice Ménard, Jennifer Ngadiuba, Stella Offner, Elsa Olivetti, Shyue Ping Ong, Christopher Rackauckas, Philippe Rigollet, Chad Risko, Philip Romero, Grant M. Rotskoff, Brett Savoie, Uros Seljak, David Shih, Gary Shiu, Dima Shlyakhtenko, Eva Silverstein, Taylor Sparks, Thomas Strohmer, Christopher Stubbs, Stephen Thomas, Suriyanarayanan Vaikuntanathan, René Vidal, Francisco Villaescusa-Navarro, Gregory Voth, Benjamin Wandelt, Rachel Ward, Melanie Weber, Risa Wechsler, Stephen Whitelam, Olaf Wiest, Mike Williams, Zhuoran Yang, Yaroslava G. Yingling, Bin Yu, Shuwen Yue, Ann Zabludoff, Huimin Zhao, Tong Zhang:
The Future of Artificial Intelligence and the Mathematical and Physical Sciences (AI+MPS). CoRR abs/2509.02661 (2025)
[i248]Chenlu Ye, Zhou Yu, Ziji Zhang, Hao Chen, Narayanan Sadagopan, Jing Huang, Tong Zhang, Anurag Beniwal:
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training. CoRR abs/2509.03403 (2025)
[i247]Yuxing Liu, Yuze Ge, Rui Pan, An Kang, Tong Zhang:
Theoretical Analysis on how Learning Rate Warmup Accelerates Convergence. CoRR abs/2509.07972 (2025)
[i246]Yue Xin, Wenyuan Wang, Rui Pan, Ruida Wang, Howard Meng, Renjie Pi, Shizhe Diao, Tong Zhang:
Generalizable Geometric Image Caption Synthesis. CoRR abs/2509.15217 (2025)
[i245]Xunpeng Huang, Yingyu Lin, Nishant Jain, Kaibo Wang, Difan Zou, Yian Ma, Tong Zhang:
On the Complexity Theory of Masked Discrete Diffusion: From poly(1/ε) to Nearly ε-Free. CoRR abs/2509.21835 (2025)
[i244]Wei Xiong, Chenlu Ye, Baohao Liao, Hanze Dong, Xinxing Xu, Christof Monz, Jiang Bian, Nan Jiang, Tong Zhang:
Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training. CoRR abs/2510.04996 (2025)
[i243]Ruida Wang, Jiarui Yao, Rui Pan, Shizhe Diao, Tong Zhang:
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving. CoRR abs/2510.11769 (2025)
[i242]Hanyang Chen, Mark Zhao, Rui Yang, Qinwei Ma, Ke Yang, Jiarui Yao, Kangrui Wang, Hao Bai, Zhenhailong Wang, Rui Pan, Mengchao Zhang, Jose A. Barreiros, Aykut Özgün Önol, ChengXiang Zhai, Heng Ji, Manling Li, Huan Zhang, Tong Zhang:
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning. CoRR abs/2510.12693 (2025)
[i241]Rui Pan, Yang Luo, Yuxing Liu, Yang You, Tong Zhang:
Unbiased Gradient Low-Rank Projection. CoRR abs/2510.17802 (2025)
[i240]Yuxin Li, Minghao Liu, Ruida Wang, Wenzhao Ji, Zhitao He, Rui Pan, Junming Huang, Tong Zhang, Yi R. Fung:
Lean4Physics: Comprehensive Reasoning Framework for College-level Physics in Lean4. CoRR abs/2510.26094 (2025)
[i239]Jerry Huang, Siddarth Madala, Cheng Niu, Julia Hockenmaier, Tong Zhang:
Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking. CoRR abs/2511.01208 (2025)
[i238]Tong Zhang, Carlos Hinojosa, Bernard Ghanem:
CAPTAIN: Semantic Feature Injection for Memorization Mitigation in Text-to-Image Diffusion Models. CoRR abs/2512.10655 (2025)
- 2024
[j89]Binnie Wai-Keung Yiu, Tong Zhang, Cheuk-Wing Lee:
Short-Term Load Forecasting Using Regularized Greedy Forest-Based Ensemble Model. IEEE Access 12: 112426-112439 (2024)
[j88]Claudio Gentile, Zhilei Wang, Tong Zhang:
Fast Rates in Pool-Based Batch Active Learning. J. Mach. Learn. Res. 25: 262:1-262:42 (2024)
[j87]Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang:
PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium. J. Mach. Learn. Res. 25: 327:1-327:48 (2024)
[j86]Shixiang Chen, Shiqian Ma, Anthony Man-Cho So, Tong Zhang:
Nonsmooth Optimization over the Stiefel Manifold and Beyond: Proximal Gradient Method and Recent Variants. SIAM Rev. 66(2): 319-352 (2024)
[j85]Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang:
RLHF Workflow: From Reward Modeling to Online RLHF. Trans. Mach. Learn. Res. 2024 (2024)
[j84]Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu, Tong Zhang:
On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data. Trans. Mach. Learn. Res. 2024 (2024)
[c259]Cheng Niu, Yang Guan, Yuanhao Wu, Juno Zhu, Juntong Song, Randy Zhong, Kaihua Zhu, Siliang Xu, Shizhe Diao, Tong Zhang:
VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning. ACL (3) 2024: 266-277
[c258]Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang:
Active Prompting with Chain-of-Thought for Large Language Models. ACL (1) 2024: 1330-1350
[c257]Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Jipeng Zhang, Renjie Pi, Tong Zhang:
Plum: Prompt Learning using Metaheuristics. ACL (Findings) 2024: 2177-2197
[c256]Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang:
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards. ACL (1) 2024: 8642-8655
[c255]Cheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, Tong Zhang:
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation. ACL (1) 2024: 8724-8741
[c254]Cheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Randy Zhong, Juntong Song, Tong Zhang:
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models. ACL (1) 2024: 10862-10878
[c253]Xunpeng Huang, Difan Zou, Hanze Dong, Yi-An Ma, Tong Zhang:
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo. COLT 2024: 2438-2493
[c252]Renjie Pi, Lewei Yao, Jiahui Gao, Jipeng Zhang, Tong Zhang:
PerceptionGPT: Effectively Fusing Visual Perception Into LLM. CVPR 2024: 27114-27123
[c251]Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen, Fugee Tsung:
An Incremental Unified Framework for Small Defect Inspection. ECCV (31) 2024: 307-324
[c250]Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang:
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization. ECCV (33) 2024: 382-398
[c249]Dimitris Stripelis, Zhaozhuo Xu, Zijian Hu, Alay Dilipbhai Shah, Han Jin, Yuhang Yao, Jipeng Zhang, Tong Zhang, Salman Avestimehr, Chaoyang He:
TensorOpera Router: A Multi-Model Router for Efficient LLM Inference. EMNLP (Industry Track) 2024: 452-462
[c248]Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang:
Mitigating the Alignment Tax of RLHF. EMNLP 2024: 580-606
[c247]Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang:
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts. EMNLP (Findings) 2024: 10582-10592
[c246]Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang:
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts. EMNLP 2024: 11953-11974
[c245]Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang:
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance. EMNLP 2024: 16012-16027
[c244]Yong Lin, Skyler Seto, Maartje ter Hoeve, Katherine Metcalf, Barry-John Theobald, Xuan Wang, Yizhe Zhang, Chen Huang, Tong Zhang:
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization. EMNLP (Findings) 2024: 16015-16026
[c243]Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang:
The Instinctive Bias: Spurious Images lead to Illusion in MLLMs. EMNLP 2024: 16163-16177
[c242]Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang:
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption. ICLR 2024
[c241]Xunpeng Huang, Hanze Dong, Yifan Hao, Yian Ma, Tong Zhang:
Reverse Diffusion Monte Carlo. ICLR 2024
[c240]Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang:
Spurious Feature Diversification Improves Out-of-distribution Generalization. ICLR 2024
[c239]Rui Pan, Yuxing Liu, Xiaoyu Wang, Tong Zhang:
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise. ICLR 2024
[c238]Alekh Agarwal, Jian Qian, Alexander Rakhlin, Tong Zhang:
The Non-linear F-Design and Applications to Interactive Learning. ICML 2024: 362-396
[c237]Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang:
Faster Sampling via Stochastic Gradient Proximal Sampler. ICML 2024: 20559-20596
[c236]Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang:
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint. ICML 2024: 54715-54754
[c235]Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang:
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption. ICML 2024: 56982-57017
[c234]Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang:
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning. ICML 2024: 59459-59489
[c233]Helbert Paat, Qing Lian, Weilong Yao, Tong Zhang:
MEDL-U: Uncertainty-aware 3D Automatic Annotation based on Evidential Deep Learning. ICRA 2024: 13976-13982
[c232]Shizhe Diao, Rui Pan, Hanze Dong, Kashun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang:
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. NAACL (Demonstrations) 2024: 116-127
[c231]Zixuan Zhang, Revanth Gangi Reddy, Kevin Small, Tong Zhang, Heng Ji:
Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization. NAACL-HLT (Findings) 2024: 742-753
[c230]Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang:
R-Tuning: Instructing Large Language Models to Say 'I Don't Know'. NAACL-HLT 2024: 7113-7139
[c229]Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yian Ma, Tong Zhang:
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference. NeurIPS 2024
[c228]Miao Lu, Han Zhong, Tong Zhang, Jose H. Blanchet:
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms. NeurIPS 2024
[c227]Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang:
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning. NeurIPS 2024
[c226]Renjie Pi, Jianshu Zhang, Jipeng Zhang, Rui Pan, Zhekai Chen, Tong Zhang:
Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions. NeurIPS 2024
[c225]Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang:
A Sober Look at the Robustness of CLIPs to Spurious Features. NeurIPS 2024
[c224]Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang:
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs. NeurIPS 2024
[c223]Chenlu Ye, Wei Xiong, Yuheng Zhang, Hanze Dong, Nan Jiang, Tong Zhang:
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model. NeurIPS 2024
[c222]Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang:
PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs. *SEM@NAACL 2024: 360-371
[i237]Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Cheng Niu, Randy Zhong, Juntong Song, Tong Zhang:
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models. CoRR abs/2401.00396 (2024)
[i236]Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciuca, UniverseTBD:
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets. CoRR abs/2401.01916 (2024)
[i235]Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang:
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance. CoRR abs/2401.02906 (2024)
[i234]Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang:
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo. CoRR abs/2401.06325 (2024)
[i233]Yifan Hao, Tong Zhang:
The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness. CoRR abs/2401.12236 (2024)
[i232]Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang:
PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs. CoRR abs/2401.17536 (2024)
[i231]Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang:
The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs. CoRR abs/2402.03757 (2024)
[i230]Chenlu Ye, Wei Xiong, Yuheng Zhang, Nan Jiang, Tong Zhang:
A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference. CoRR abs/2402.07314 (2024)
[i229]Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang:
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption. CoRR abs/2402.08991 (2024)
[i228]Ying Su, Tianqing Fang, Huiru Xiao, Weiqi Wang, Yangqiu Song, Tong Zhang, Lei Chen:
EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph Completion. CoRR abs/2402.09666 (2024)
[i227]Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang:
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards. CoRR abs/2402.18571 (2024)
[i226]Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang:
An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling. CoRR abs/2403.06183 (2024)
[i225]

