GitHub - horseee/CoT-Valve: CoT-Valve: Length-Compressible Chain-of-Thought Tuning

CoT-Valve: Length-Compressible Chain-of-Thought Tuning

After length-compressible CoT tuning, the reasoning model can generate reasoning paths ranging from long to short, using LoRA as a "Valve".

Xinyin Ma*, Guangnian Wan*, Runpeng Yu, Gongfan Fang, Xinchao Wang
Learning and Vision Lab, National University of Singapore
🥯[Arxiv] 🎄[Dataset] 🤖[Models] (coming soon)
* Equal Contribution

Introduction

We propose a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths.

We propose to identify a direction in the parameter space that, when manipulated, can effectively control the length of generated CoT.
We construct datasets with chains from long to short for the same questions and explore two enhanced strategies for CoT-Valve: (1) a precise length-compressible CoT tuning method, and (2) a progressive chain length compression approach.
CoT-Valve makes the reasoning chain both controllable and compressible, and it outperforms prompt-based length control.
We applied this method to QwQ-32B-Preview, reducing reasoning chains on GSM8K from 741 to 225 tokens with a minor performance drop (95.07% to 94.92%) and on AIME from 6827 to 4629 tokens, with only one additional incorrect answer.
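The idea of steering chain length along one direction in parameter space can be illustrated with a minimal sketch. It assumes the direction is a LoRA update (the product of two low-rank matrices) that is scaled by a single factor before being added to the base weights. The names `valve_weights`, `lora_a`, `lora_b`, and `alpha` are hypothetical and are not taken from the released code; how a given `alpha` maps to a chain length depends on training and is not specified here.

```python
# Minimal sketch (assumption, not the authors' implementation): scale a
# LoRA-style low-rank update by a factor `alpha` and add it to the base weights.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def valve_weights(base, lora_b, lora_a, alpha):
    """Return base + alpha * (lora_b @ lora_a).

    alpha = 0 recovers the base weights; alpha = 1 applies the full update.
    Values in between move partway along the same direction.
    """
    delta = matmul(lora_b, lora_a)  # low-rank direction in parameter space
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 weight matrix with a rank-1 update (illustrative values only).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]    # shape (2, 1)
lora_a = [[0.5, -0.5]]     # shape (1, 2)

for alpha in (0.0, 0.5, 1.0):
    print(alpha, valve_weights(base, lora_b, lora_a, alpha))
```

Because only the scalar changes, one set of tuned parameters can serve several chain lengths without retraining, which is the sense in which the update acts as a valve.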

TODO

Release the dataset
Release the model
Release the training code

🤗Datasets

We release the following datasets on Huggingface:

Dataset Name	Link	Description
MixChain-Z-GSM8K	Link	MixChain-Z-GSM8K is a dataset containing 6,863 samples, with each sample containing five different solutions.
MixChain-Z-PRM12K	Link	MixChain-Z-PRM12K is a dataset containing 12,000 samples (unfiltered), with each sample containing five different solutions.
MixChain-C-LIMO	Link	MixChain-C-LIMO contains two distinct solutions for each question from the LIMO dataset. These solutions vary in the number of samples and the average length of their CoT.
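Since each sample pairs one question with several solutions of different lengths, a common first step is to order those solutions from long to short. The sketch below shows one way to do that on a toy in-memory record. The field names `question` and `solutions`, the helper `order_long_to_short`, and the record itself are hypothetical; check the dataset cards on Huggingface for the actual schema. Length is approximated by whitespace-separated words rather than by model tokens.

```python
# Minimal sketch (assumption): order a sample's solutions from longest to
# shortest. Field names and the toy record are illustrative, not the real schema.

def order_long_to_short(solutions):
    """Sort solutions by whitespace word count, longest first."""
    return sorted(solutions, key=lambda s: len(s.split()), reverse=True)


# Toy record standing in for one dataset sample.
sample = {
    "question": "What is 2 + 3?",
    "solutions": [
        "2 + 3 = 5.",
        "Start with 2. Adding 3 gives 5. So the answer is 5.",
        "Add the numbers: 2 + 3 = 5. The answer is 5.",
    ],
}

ordered = order_long_to_short(sample["solutions"])
for solution in ordered:
    print(len(solution.split()), solution)
```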

Training Code

To be released

Models

To be released
