[ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"

License

Notifications You must be signed in to change notification settings

StigLidu/TURN

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TURN: Optimizing Temperature for Language Models with Multi-Sample Inference

License: MIT

Table of Contents

  1. Overview
  2. Installation
  3. Data & Model Preparation
  4. Usage
  5. Reproducing Results
  6. License
  7. Citation
  8. Acknowledgements

Overview

TURN is an entropy-based algorithm for automatic temperature optimization in multi-sample inference strategies such as Majority Voting and Best-of-N.

Multi-sample strategies achieve state-of-the-art performance, but the role of temperature in these strategies is poorly understood. TURN fills this gap with an automatic temperature selection algorithm.
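For context, the two aggregation strategies named above can be sketched in a few lines. This is a generic illustration of Majority Voting and Best-of-N, not the repository's implementation; the scoring function in Best-of-N (e.g., a reward model) is assumed external.

```python
from collections import Counter

def majority_vote(answers):
    """Majority Voting: return the most frequent answer among sampled generations."""
    return Counter(answers).most_common(1)[0][0]

def best_of_n(answers, scores):
    """Best-of-N: return the answer with the highest external score (e.g., from a reward model)."""
    best_idx = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_idx]

samples = ["4", "4", "5", "4"]
print(majority_vote(samples))                          # -> 4
print(best_of_n(samples, [0.2, 0.9, 0.1, 0.5]))        # -> 4
```

Sampling temperature controls how diverse these N generations are, which is why it directly affects both strategies' accuracy.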

This repository contains the official implementation of our paper:

Weihua Du, Yiming Yang, & Sean Welleck
“Optimizing Temperature for Language Models with Multi-Sample Inference.” (2025)

Highlights

  • High Correlation: TURN’s predicted temperature closely matches the best temperature from grid search in terms of accuracy.
  • No Labels Needed: The approach is purely entropy-driven, removing reliance on labeled validation sets.
[Figure] Accuracies at TURN-predicted temperatures correlate strongly with accuracies at the best grid-search temperatures.


Installation

  1. Clone the Repository:

    git clone https://github.com/StigLidu/TURN.git
    cd TURN
  2. (Optional) Create a Conda Environment:

    conda create -n TURN python=3.11
    conda activate TURN
  3. Install Dependencies:

    pip install -r requirements.txt

    Note: For GPU-based inference, ensure the necessary CUDA libraries and drivers are installed.


Data & Model Preparation

Prepare your test data in JSONL format, with one entry per line. For instance:

{"problem": "What is 1+1? Provide the answer in detail."}
{"problem": "Explain the concept of derivatives in calculus."}
{"problem": "Prove the Pythagorean theorem."}
  • Each JSON object must include a "problem" key.

Our implementation works with Hugging Face models or local checkpoints.
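A small loader can validate the format before running inference. This is an illustrative helper (the function name `load_problems` is not part of the repository); it simply enforces the one-JSON-object-per-line format with a required "problem" key described above.

```python
import json

def load_problems(path):
    """Load JSONL test data, checking each line has a 'problem' key."""
    problems = []
    with open(path) as f:
        for line_no, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if "problem" not in record:
                raise ValueError(f"line {line_no}: missing required 'problem' key")
            problems.append(record["problem"])
    return problems
```

Running this on your data file before invoking the main script surfaces formatting mistakes early.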


Usage

Run the main script predict.py to automatically infer an optimal temperature for a given aggregation strategy:

python predict.py \
    --model_path [LLM_PATH] \
    --data_path [DATA_PATH] \
    --aggregation_strategy [MJ/BofN] \
    [--num_samples 32 --batch_size 16 ...]

Example

python predict.py \
    --model_path nvidia/OpenMath2-Llama3.1-8B \
    --data_path data/test_data.jsonl \
    --aggregation_strategy MJ

Output:

Predicted temperature:  [predicted temperature]

Arguments

  • --model_path: Path to or name of the model (e.g., a Hugging Face model like nvidia/OpenMath2-Llama3.1-8B, or a local checkpoint).
  • --data_path: Path to the JSONL file containing the test data.
  • --aggregation_strategy: Currently supports MJ (Majority Voting) or BofN (Best-of-N).
  • --num_samples (optional): Number of samples used to estimate entropy (default: 32).
  • --batch_size (optional): Batch size for inference (default: 16). Adjust if you face memory constraints.
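If you want to call the script from another program, a thin wrapper can shell out to predict.py and parse the "Predicted temperature:" line from its output. The wrapper function `run_turn` and the exact output-parsing regex are assumptions based on the CLI and output format shown above, not an API the repository exposes.

```python
import re
import subprocess

def parse_predicted_temperature(stdout):
    """Extract the float from a line like 'Predicted temperature:  0.7'."""
    match = re.search(r"Predicted temperature:\s*([0-9.]+)", stdout)
    if match is None:
        raise ValueError("no predicted temperature found in output")
    return float(match.group(1))

def run_turn(model_path, data_path, strategy="MJ", num_samples=32):
    """Hypothetical wrapper: invoke predict.py and return the predicted temperature."""
    cmd = [
        "python", "predict.py",
        "--model_path", model_path,
        "--data_path", data_path,
        "--aggregation_strategy", strategy,
        "--num_samples", str(num_samples),
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_predicted_temperature(out.stdout)
```

The returned temperature can then be passed to your own sampling pipeline for the chosen aggregation strategy.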

Reproducing Results

To replicate the experiments reported in our paper:

  1. MBPP (Code Generation)

  2. MATH (Mathematical Reasoning)


License

This project is released under the MIT License.

Citation

If you find our work useful in your research, please use the following BibTeX reference:

@article{du2025optimizing,
  title={Optimizing Temperature for Language Models with Multi-Sample Inference},
  author={Du, Weihua and Yang, Yiming and Welleck, Sean},
  journal={arXiv preprint arXiv:2502.05234},
  year={2025}
}

Acknowledgements

We extend our gratitude to the following open-source projects for their foundational contributions:


Contact

For any questions or inquiries, please contact:
