This repository contains the official code for the npj Artificial Intelligence paper “How Large Language Models Encode Theory-of-Mind: A Study on Sparse Parameter Patterns.”
create_gradient.py → compute squared gradients and save as model
chunk_gradient.py → split gradient into per-layer chunks
ToM_and_perplexity_evaluation.py → build masked models for each m, run ToM & perplexity eval
summarize.py → aggregate ToM results
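To make the "build masked models for each m" step more concrete, here is a minimal sketch in plain Python. It is not the repository's implementation. It assumes that m is the fraction of weights to mask, that weights are ranked by their squared-gradient score, that masking means setting a weight to zero, and that the sweep includes its end value. The helper names `m_grid` and `mask_top_fraction` are hypothetical.

```python
def m_grid(m_start, m_end, m_step):
    """Evenly spaced values from m_start to m_end (inclusive, by assumption)."""
    # Round the step count so float error does not drop the last value.
    n_steps = int(round((m_end - m_start) / m_step))
    return [m_start + i * m_step for i in range(n_steps + 1)]


def mask_top_fraction(weights, scores, m):
    """Return a copy of `weights` with the top-m fraction (by score) set to 0.0."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    k = int(m * len(weights))  # number of entries to mask, rounded down
    # Indices of the k largest scores; ties keep the earlier index.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    masked = list(weights)
    for i in top:
        masked[i] = 0.0
    return masked


if __name__ == "__main__":
    # Same range as the evaluation command: 0.0 to 5e-5 in steps of 2e-6.
    grid = m_grid(0.0, 5e-5, 2e-6)
    print(len(grid))

    # Toy example with a large m so the effect is visible on four weights.
    toy_weights = [0.5, -1.2, 0.3, 2.0]
    toy_scores = [0.01, 0.90, 0.02, 0.40]
    print(mask_top_fraction(toy_weights, toy_scores, 0.5))
```

With the values of m used in this README, only a very small share of a model's weights would be masked at each step, which is why the toy example uses a much larger m.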
Replace every bracketed placeholder, such as [MODEL_ID] or [CACHE_DIR], with your own value or path.
# TOM dataset (full sequence, last-token only supervision)
python create_gradient.py \
--model [MODEL_ID] \
--dataset tom \
--data_path [/path/to/tom_training_data.json] \
--nsamples 100 \
--seqlen 0 \
--seed 0 \
--cache_dir [CACHE_DIR] \
--out [OUT_DIR_FOR_TOM_GRAD]

# C4 dataset (random 128-token windows)
python create_gradient.py \
--model [MODEL_ID] \
--dataset c4 \
--nsamples 100 \
--seqlen 128 \
--seed 0 \
--cache_dir [CACHE_DIR] \
--out [OUT_DIR_FOR_C4_GRAD]

python chunk_gradient.py \
--model [OUT_DIR_FOR_TOM_GRAD] \
--output_path [TOM_CHUNKS_DIR] \
--cache_dir [CACHE_DIR] \
--device_map auto

python chunk_gradient.py \
--model [OUT_DIR_FOR_C4_GRAD] \
--output_path [C4_CHUNKS_DIR] \
--cache_dir [CACHE_DIR] \
--device_map auto

Parts of the gradient extraction and chunking code are adapted from SqueezeLLM: Dense-and-Sparse Quantization (ICML 2024).
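The two steps above (accumulating squared gradients, then splitting them per layer) can be illustrated with a small plain-Python sketch. It is an illustration under stated assumptions, not the code in create_gradient.py or chunk_gradient.py: gradients are represented as flat lists of floats, parameter names are assumed to contain `layers.<index>.`, and parameters outside numbered layers are assumed to be skipped. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: parameter names contain "layers.<index>."
LAYER_PATTERN = re.compile(r"layers\.(\d+)\.")


def accumulate_squared_gradients(per_sample_grads):
    """Sum element-wise squared gradients over samples, per parameter name."""
    totals = {}
    for grads in per_sample_grads:
        for name, values in grads.items():
            if name not in totals:
                totals[name] = [0.0] * len(values)
            acc = totals[name]
            for i, g in enumerate(values):
                acc[i] += g * g
    return totals


def chunk_by_layer(scores):
    """Group parameter scores by the layer index found in their names."""
    chunks = defaultdict(dict)
    for name, values in scores.items():
        match = LAYER_PATTERN.search(name)
        if match is None:
            continue  # assumption: skip parameters outside numbered layers
        chunks[int(match.group(1))][name] = values
    return dict(chunks)


if __name__ == "__main__":
    # Two toy "samples", each with gradients for two hypothetical parameters.
    grads = [
        {"layers.0.q_proj": [1.0, -2.0], "layers.1.q_proj": [0.5, 0.0]},
        {"layers.0.q_proj": [3.0, 0.0], "layers.1.q_proj": [0.5, 1.0]},
    ]
    scores = accumulate_squared_gradients(grads)
    print(chunk_by_layer(scores))
```

Keeping one chunk per layer means a later step can load the scores for a single layer at a time instead of holding the full gradient in memory.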
python ToM_and_perplexity_evaluation.py \
--model [MODEL_ID] \
--grad_tom_chunks [TOM_CHUNKS_DIR] \
--grad_c4_chunks [C4_CHUNKS_DIR] \
--tom_tasks [/path/to/tom_tasks.py] \
--out_dir [EVAL_OUT_DIR] \
--cache_dir [CACHE_DIR] \
--tensor_parallel_size 1 \
--max_model_len 1024 \
--batch_size 64 \
--reps 5 \
--m_start 0.0 --m_end 5e-5 --m_step 2e-6

python summarize.py \
--root [EVAL_OUT_DIR]/tom \
--reps 5 \
--out_csv [EVAL_OUT_DIR]/tom_summary.csv

If you find this work useful, please cite our paper:
@article{wu2025large,
title={How large language models encode theory-of-mind: a study on sparse parameter patterns},
author={Wu, Yuheng and Guo, Wentao and Liu, Zirui and Ji, Heng and Xu, Zhaozhuo and Zhang, Denghui},
journal={npj Artificial Intelligence},
volume={1},
number={1},
pages={20},
year={2025},
publisher={Nature Publishing Group UK London}
}

For questions or issues, please contact Yuheng Wu or open an issue in this repository.