🎉🎉 Caco has been accepted to NeurIPS 2025!
We introduce Caco, a code-driven framework for generating diverse and verifiable reasoning data at scale. Unlike conventional augmentation methods that rewrite problems, Caco leverages executable code-based chains of thought (Code CoTs) to synthesize new problems and solutions with guaranteed correctness.
Caco implements this through three key stages:
- **Unifying Code CoT**: collecting diverse seed reasoning traces from both mathematical and algorithmic problems, and converting them into a standardized executable format.
- **Scaling Code CoT**: training a dedicated code generator that not only expands the dataset but also realizes pattern-level augmentation by restructuring reasoning logic (e.g., decomposition, reformulation, alternative solution paths).
- **Instruction Reversing**: back-translating code into natural-language problems with contextual and stylistic variations, followed by natural-language CoT solution generation and dual verification for correctness.
Caco yields 1.3M validated problem–solution pairs in under 55 GPU hours using only open-source models. Models trained on Caco data achieve consistent improvements across mathematics, logic puzzles, scientific QA, and code reasoning, surpassing strong baselines and demonstrating broad cross-domain generalization.
We release the Caco dataset, the Caco-CodeGen model, and three Caco models fine-tuned on this dataset.
| Dataset/Model | MATH | Olympiad | Theorem-QA | HuggingFace🤗 |
|---|---|---|---|---|
| Caco-1.3M | - | - | - | link |
| Caco-CodeGen | - | - | - | link |
| DeepSeekMath-7B-Caco | 68.2 | 29.5 | 33.8 | link |
| Qwen2.5-7B-Caco | 82.4 | 46.5 | 46.0 | link |
| Llama3-8B-Caco | 70.6 | 34.1 | 31.0 | link |
Install the dependencies:
```bash
conda create -n caco python=3.10
conda activate caco
pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu121

# Install LLaMA-Factory
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
git checkout v0.9.1
pip install transformers==4.46.1 accelerate==0.34.2 deepspeed==0.15.4
pip install -e ".[torch,metrics]"

# Install packages for evaluation
pip install flash-attn --no-build-isolation
pip install sympy==1.12.1 antlr4-python3-runtime==4.11.1 pebble word2number boto3 triton==2.3.1 ipython
pip install vllm==0.5.3.post1

# Install latex2sympy
cd ../evaluation_dart/latex2sympy
pip install -e .
cd ..

# Install dart-math evaluation
pip install -e .
```

You can directly download the Caco-1.3M data for training:

```bash
huggingface-cli download LHL3341/Caco-1.3M
```

We also provide our code in `./data_process` for:
- Code execution and input/output extraction
- Answer consistency filtering
- CodeGen training
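As one way the answer-consistency filter might work (a hedged sketch; the field names `code_answer` and `nl_answer` are assumptions, not the repo's actual schema), numeric answers can be compared exactly with the standard-library `Fraction`, falling back to string matching:

```python
from fractions import Fraction

def answers_equivalent(a: str, b: str) -> bool:
    """Compare numeric answers exactly (e.g. '1/2' vs '0.5'); fall back to strings."""
    try:
        return Fraction(a) == Fraction(b)
    except (ValueError, ZeroDivisionError):
        return a.strip() == b.strip()

def consistency_filter(samples: list[dict]) -> list[dict]:
    """Keep only samples whose executed-code answer matches the NL CoT answer."""
    return [s for s in samples if answers_equivalent(s["code_answer"], s["nl_answer"])]

samples = [
    {"code_answer": "1/2", "nl_answer": "0.5"},  # equivalent -> kept
    {"code_answer": "3", "nl_answer": "4"},      # mismatch -> dropped
]
print(consistency_filter(samples))
```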
Our training code depends on LLaMA-Factory.
```bash
bash ./scripts/sft.sh
```

For evaluation:

```bash
export MODEL_NAME=/path/to/your/model
bash ./scripts/test.sh
```

We highlight three directions for extending Caco:
- **Raising Difficulty**: incorporate harder and cleaner seed datasets (e.g., AM-Thinking-distill, DAPO) and apply hardness-aware sampling with adversarial program mutations.
- **Expanding Diversity**: extend beyond math to science, logic, proofs, and procedural planning; train a multi-domain CodeGen with domain tags and compositional templates.
- **RL with Verifiable Rewards (RLVR)**: Caco's executable traces provide a natural, low-noise reward signal, which can be seamlessly applied to scale up RLVR data.
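The RLVR direction can be sketched as a simple binary reward (illustrative only, not the paper's training code): each rollout is scored against the execution-verified answer.

```python
def verifiable_reward(predicted_answer: str, verified_answer: str) -> float:
    """Binary RLVR reward: 1.0 when the rollout matches the verified answer."""
    return 1.0 if predicted_answer.strip() == verified_answer.strip() else 0.0

# Score a batch of rollouts against a Caco-verified ground-truth answer.
rollouts = ["42", "41", "42"]
rewards = [verifiable_reward(r, "42") for r in rollouts]
print(rewards)  # -> [1.0, 0.0, 1.0]
```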
If you find our code, models, or data useful, please cite our paper:
```bibtex
@article{caco,
  title={Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning},
  author={Honglin Lin and Qizhi Pei and Xin Gao and Zhuoshi Pan and Yu Li and Juntao Li and Conghui He and Lijun Wu},
  journal={arXiv preprint arXiv:2510.04081},
  year={2025}
}
```
