
How Large Language Models Encode Theory-of-Mind: A Study on Sparse Parameter Patterns


This repository contains the official code for the npj Artificial Intelligence paper “How Large Language Models Encode Theory-of-Mind: A Study on Sparse Parameter Patterns.” The pipeline consists of four scripts, run in order:

  1. create_gradient.py → compute squared gradients over the calibration data and save them in model format
  2. chunk_gradient.py → split the saved gradients into per-layer chunks
  3. ToM_and_perplexity_evaluation.py → build a masked model for each sparsity level m and run the ToM and perplexity evaluations
  4. summarize.py → aggregate ToM results across repetitions into a summary CSV

Replace every bracketed placeholder (e.g. [MODEL_ID], [CACHE_DIR]) in the commands below with your own model ID and paths; the sketch right after this note shows how the outputs of each step feed the next.
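
The four commands below can also be chained from a single driver. The following sketch is hypothetical (it is not a script shipped with this repository) and uses example paths in place of the bracketed placeholders:

# Hypothetical driver chaining the four scripts; the model ID and every path
# are placeholders that mirror the bracketed arguments in the commands below.
import subprocess

MODEL_ID = "YOUR_MODEL_ID"
CACHE = "/path/to/cache"
TOM_GRAD, C4_GRAD = "out/tom_grad", "out/c4_grad"
TOM_CHUNKS, C4_CHUNKS = "out/tom_chunks", "out/c4_chunks"
EVAL_OUT = "out/eval"

def run(*args):
    subprocess.run([str(a) for a in args], check=True)

# 1. Squared gradients for the ToM and C4 calibration data.
run("python", "create_gradient.py", "--model", MODEL_ID, "--dataset", "tom",
    "--data_path", "/path/to/tom_training_data.json", "--nsamples", 100,
    "--seqlen", 0, "--seed", 0, "--cache_dir", CACHE, "--out", TOM_GRAD)
run("python", "create_gradient.py", "--model", MODEL_ID, "--dataset", "c4",
    "--nsamples", 100, "--seqlen", 128, "--seed", 0,
    "--cache_dir", CACHE, "--out", C4_GRAD)

# 2. Per-layer chunks of each gradient model.
for grad_dir, chunk_dir in [(TOM_GRAD, TOM_CHUNKS), (C4_GRAD, C4_CHUNKS)]:
    run("python", "chunk_gradient.py", "--model", grad_dir,
        "--output_path", chunk_dir, "--cache_dir", CACHE, "--device_map", "auto")

# 3. Masked-model ToM + perplexity evaluation over the sweep of m.
run("python", "ToM_and_perplexity_evaluation.py", "--model", MODEL_ID,
    "--grad_tom_chunks", TOM_CHUNKS, "--grad_c4_chunks", C4_CHUNKS,
    "--tom_tasks", "/path/to/tom_tasks.py", "--out_dir", EVAL_OUT,
    "--cache_dir", CACHE, "--tensor_parallel_size", 1, "--max_model_len", 1024,
    "--batch_size", 64, "--reps", 5,
    "--m_start", 0.0, "--m_end", 5e-5, "--m_step", 2e-6)

# 4. Aggregate the ToM scores into a summary CSV.
run("python", "summarize.py", "--root", f"{EVAL_OUT}/tom", "--reps", 5,
    "--out_csv", f"{EVAL_OUT}/tom_summary.csv")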

1. Create Gradients

# ToM dataset (full sequence, last-token-only supervision)
python create_gradient.py \
  --model [MODEL_ID] \
  --dataset tom \
  --data_path [/path/to/tom_training_data.json] \
  --nsamples 100 \
  --seqlen 0 \
  --seed 0 \
  --cache_dir [CACHE_DIR] \
  --out [OUT_DIR_FOR_TOM_GRAD]
# C4 dataset (random 128-token windows)
python create_gradient.py \
  --model [MODEL_ID] \
  --dataset c4 \
  --nsamples 100 \
  --seqlen 128 \
  --seed 0 \
  --cache_dir [CACHE_DIR] \
  --out [OUT_DIR_FOR_C4_GRAD]
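
Step 1 is described above as computing squared gradients and saving them in model format. For intuition only, a minimal sketch of squared-gradient accumulation is shown below, assuming a small Hugging Face model and an inline calibration text; it is not the logic of create_gradient.py, which additionally handles the --nsamples/--seqlen options and the last-token-only supervision for the ToM data.

# Minimal sketch of squared-gradient accumulation, for intuition only; the
# repository's create_gradient.py handles the real datasets, sampling, and
# last-token-only supervision described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"                      # placeholder for [MODEL_ID]
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.train()

# Tiny stand-in calibration set; the real script draws --nsamples examples
# from the ToM or C4 data.
texts = ["Sally puts the ball in the basket and leaves the room."]

# Running sum of grad**2 for every parameter, same shapes as the weights.
sq_grad = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

for text in texts:
    enc = tokenizer(text, return_tensors="pt")
    loss = model(**enc, labels=enc["input_ids"]).loss
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for n, p in model.named_parameters():
            if p.grad is not None:
                sq_grad[n] += p.grad.pow(2)

# "Save as model": keep the model's parameter names so the squared gradients
# can be reloaded like a state dict by the later steps.
torch.save(sq_grad, "squared_gradients.pt")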

2. Chunk Gradients

python chunk_gradient.py \
  --model [OUT_DIR_FOR_TOM_GRAD] \
  --output_path [TOM_CHUNKS_DIR] \
  --cache_dir [CACHE_DIR] \
  --device_map auto
python chunk_gradient.py \
  --model [OUT_DIR_FOR_C4_GRAD] \
  --output_path [C4_CHUNKS_DIR] \
  --cache_dir [CACHE_DIR] \
  --device_map auto

Parts of the gradient extraction and chunking code are adapted from SqueezeLLM: Dense-and-Sparse Quantization (ICML 2024).
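
As a rough illustration of the chunking step (not the repository's implementation, which loads the gradient model via --model and may organize its output differently), the per-layer split can be thought of as grouping the saved gradient tensors by their transformer-layer index:

# Illustrative per-layer chunking of a saved squared-gradient state dict;
# chunk_gradient.py loads the gradient "model" via --model and may organize
# its chunks differently.
import os
import re
import torch

sq_grad = torch.load("squared_gradients.pt")   # output of the step-1 sketch
out_dir = "TOM_CHUNKS_DIR"                     # placeholder chunk directory
os.makedirs(out_dir, exist_ok=True)

chunks = {}
for name, tensor in sq_grad.items():
    # Parameter names carry the layer index, e.g. "model.layers.12.self_attn..."
    # (Llama-style) or "transformer.h.12.attn..." (GPT-2-style).
    match = re.search(r"\.(?:layers|h)\.(\d+)\.", name)
    layer = int(match.group(1)) if match else -1   # -1 = embeddings / lm_head
    chunks.setdefault(layer, {})[name] = tensor

for layer, tensors in chunks.items():
    torch.save(tensors, os.path.join(out_dir, f"layer_{layer}.pt"))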

3. ToM + Perplexity Evaluation

python ToM_and_perplexity_evaluation.py \
  --model [MODEL_ID] \
  --grad_tom_chunks [TOM_CHUNKS_DIR] \
  --grad_c4_chunks [C4_CHUNKS_DIR] \
  --tom_tasks [/path/to/tom_tasks.py] \
  --out_dir [EVAL_OUT_DIR] \
  --cache_dir [CACHE_DIR] \
  --tensor_parallel_size 1 \
  --max_model_len 1024 \
  --batch_size 64 \
  --reps 5 \
  --m_start 0.0 --m_end 5e-5 --m_step 2e-6
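
The sweep over m builds one masked model per sparsity level. The sketch below shows one plausible way such a mask could be applied, assuming parameters are ranked by their ToM squared gradients and the top-m fraction is zeroed; the actual selection rule, including how the C4 gradient chunks are used, is defined in ToM_and_perplexity_evaluation.py and in the paper.

# Illustrative construction of a masked model for one sparsity level m.
# Assumption (not taken from the repository): weights are ranked by their ToM
# squared gradient and the top-m fraction is zeroed; the real script also uses
# the C4 gradient chunks, e.g. to discount parameters that are generically
# important for language modeling.
import torch
from transformers import AutoModelForCausalLM

m = 2e-6                                              # fraction of weights to mask
model = AutoModelForCausalLM.from_pretrained("gpt2")  # small placeholder model
sq_grad = torch.load("squared_gradients.pt")          # ToM gradients from step 1

# Global threshold selecting the top-m fraction of scores across all tensors.
scores = torch.cat([g.flatten() for g in sq_grad.values()])
k = max(1, int(m * scores.numel()))
threshold = torch.topk(scores, k).values.min()

with torch.no_grad():
    for name, param in model.named_parameters():
        if name in sq_grad:
            param[sq_grad[name] >= threshold] = 0.0   # zero the selected weights

# The masked model is then scored on the ToM tasks and on perplexity.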

4. Summarize ToM Scores

python summarize.py \
  --root [EVAL_OUT_DIR]/tom \
  --reps 5 \
  --out_csv [EVAL_OUT_DIR]/tom_summary.csv
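
summarize.py aggregates the per-repetition ToM scores written under [EVAL_OUT_DIR]/tom. The sketch below only illustrates the kind of aggregation involved (a mean over the --reps repetitions for each m); the directory layout and JSON field names here are hypothetical.

# Illustrative aggregation of per-repetition ToM accuracies into one CSV.
# The "m_*/rep_*.json" layout and the "accuracy" field are hypothetical;
# summarize.py defines the actual structure it expects under --root.
import csv
import glob
import json
import os
from collections import defaultdict
from statistics import mean

root = "EVAL_OUT_DIR/tom"                  # placeholder results directory
scores = defaultdict(list)                 # m value -> accuracies across reps

for path in glob.glob(os.path.join(root, "m_*", "rep_*.json")):
    with open(path) as f:
        result = json.load(f)
    m_value = os.path.basename(os.path.dirname(path)).removeprefix("m_")
    scores[m_value].append(result["accuracy"])

with open("tom_summary.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["m", "mean_accuracy", "n_reps"])
    for m_value, accs in sorted(scores.items(), key=lambda kv: float(kv[0])):
        writer.writerow([m_value, mean(accs), len(accs)])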

Citation

If you find this work useful, please cite our paper:

@article{wu2025large,
  title={How large language models encode theory-of-mind: a study on sparse parameter patterns},
  author={Wu, Yuheng and Guo, Wentao and Liu, Zirui and Ji, Heng and Xu, Zhaozhuo and Zhang, Denghui},
  journal={npj Artificial Intelligence},
  volume={1},
  number={1},
  pages={20},
  year={2025},
  publisher={Nature Publishing Group UK London}
}

For questions or issues, please contact Yuheng Wu or open an issue in this repository.
