News
[Dec 2025] We released our paper on Adaptation of Agentic AI, with a public repository here. Hope you enjoy reading it!
[Dec 2025] Honored to receive the Best Paper Honorable Mention Award @ NeurIPS LAW Workshop, for our Personality Illusion paper.
[Dec 2025] I am attending NeurIPS 2025 in San Diego, CA, from Dec 2 to Dec 7. Excited to catch up with old and new friends!
[Sep 2025] 🔥 We released our work discovering The Personality Illusion: LLMs do not have personalities in the way humans do.
[Jul 2025] Our paper on LLM Reasoning Failures is accepted to ICML AI for Math Workshop. Stay tuned for our full release!
Research
My research aims to advance scientific understanding of AI (especially neural models like LLMs), and more broadly, the general principles of intelligence and intelligent behavior. I approach this goal across three interconnected levels:
- Behavioral level — analyzing how models and humans reason, generalize, and solve problems, including studies of alignment, limitations, and trustworthy reasoning.
- Mechanistic level — interpreting model internals to understand the circuits, representations, and algorithms that give rise to intelligent behavior and drive observable performance and failures.
- Social level — investigating how intelligence emerges and interacts in multi-agent systems and human–AI collaborations.
These three levels parallel psychology, neuroscience, and social science in their study of human intelligence and behavior.
If any of this resonates with your interests, feel free to reach out and let's connect/collaborate!
Selected Publications
The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
Pengrui Han*, Rafal D. Kocielnik*, Peiyang Song, Ramit Debnath, Dean Mobbs, Anima Anandkumar, and R. Michael Alvarez (* Equal Contribution)
NeurIPS LAW Workshop: Bridging Language, Agent, and World Models, 2025, Oral Presentation + Best Paper Honorable Mention
NeurIPS Workshop on LLM Persona Modeling (PersonaNLP), 2025, Oral Presentation
arXiv
/
project
/
code
LLMs say they have personalities, but they don’t act like it. Alignment today shapes language, not behavior. This linguistic–behavioral dissociation cautions against equating coherent self-reports with cognitive depth.
Large Language Model Reasoning Failures
Peiyang Song*, Pengrui Han*, and Noah Goodman (* Equal Contribution)
ICML AI for Math Workshop, 2025
preprint
/
full release coming soon
We present the first comprehensive survey dedicated to reasoning failures in LLMs. By unifying fragmented research efforts, our survey provides a structured perspective on systemic weaknesses in LLM reasoning, offering insights that guide future research toward stronger, more reliable, and more robust reasoning capabilities.
Adaptation of Agentic AI
Pengcheng Jiang*, Jiacheng Lin*, Zhiyi Shi*, Zifeng Wang, Luxi He, Yichen Wu, Ming Zhong, Peiyang Song, Qizheng Zhang, Heng Wang, Xueqiang Xu, Hanwen Xu, Pengrui Han, Dylan Zhang, Jiashuo Sun, Chaoqi Yang, Kun Qian, Tian Wang, Changran Hu, Manling Li, Quanzheng Li, Hao Peng, Sheng Wang, Jingbo Shang, Chao Zhang, Jiaxuan You, Liyuan Liu, Pan Lu, Yu Zhang, Heng Ji, Yejin Choi, Dawn Song, Jimeng Sun, Jiawei Han (* Equal Contribution)
Preprint, 2025
arXiv
Cutting-edge agentic AI systems are built on foundation models that can be adapted to plan, reason, and interact with external tools to perform increasingly complex and specialized tasks. As these systems grow in capability and scope, adaptation becomes a central mechanism for improving performance, reliability, and generalization. In this paper, we unify the rapidly expanding research landscape into a systematic framework that spans both agent adaptations and tool adaptations.
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han*, Peiyang Song*, Haofei Yu, and Jiaxuan You (* Equal Contribution)
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
code
Motivated by the crucial cognitive phenomenon of A-not-B errors, we present the first systematic evaluation on the surprisingly vulnerable inhibitory control abilities of LLMs. We reveal that this weakness undermines LLMs' trustworthy reasoning capabilities across diverse domains, and introduce various mitigations.
ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs
Pengrui Han*, Rafal Kocielnik*, Adhithya Saravanan, Roy Jiang, Or Sharir, and Anima Anandkumar (* Equal Contribution)
Conference on Language Modeling (COLM), 2024
code
We propose a light and efficient pipeline that enables both domain and non-domain experts to quickly generate synthetic debiasing data to mitigate specific or general bias in their models with parameter-efficient fine-tuning.
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Guanyu Lin*, Tao Feng*, Pengrui Han*, Ge Liu, Jiaxuan You (* Equal Contribution)
Conference on Empirical Methods in Natural Language Processing (EMNLP), System Demonstration Track, 2024
Hugging Face Live Demo: Link
Selected Awards
- NeurIPS LAW Workshop Best Paper Honorable Mention Award (2025)
- Phi Beta Kappa Honor Society (2025)
- Carleton College Chang-Lan Award (2024)
- Caltech SURF Award (2023)
- Carleton College Dean's List (2023)
Teaching
- CS 512: Data Mining Principles, Teaching Assistant @ UIUC, Fall 2025
- MATH 241: Ordinary Differential Equations, Teaching Assistant @ Carleton College, Fall 2024
- MATH 321: Real Analysis, Teaching Assistant @ Carleton College, Spring 2024
- MATH 232: Linear Algebra, Teaching Assistant @ Carleton College, Spring 2023
- MATH 232: Linear Algebra, Teaching Assistant @ Carleton College, Winter 2023
Academic Services
- Reviewer for conferences: ICLR, ICML, COLM, COLING.
- Reviewer for workshops: Re-Align, LLM-Cognition, BehaviorML, LTEDI, INTERPLAY, AI4Math, LatinX, Assessing World Models.