Official implementation of "Adaptive Pruning of Pretrained Transformer via Differential Inclusions", published at ICLR 2025.
SPP (Solution Path Pruning) is a pruning framework that enables adaptive compression of pretrained transformers. Instead of pruning at a single fixed compression ratio, SPP constructs a Transformer Weight Family, a set of pruned sub-models at different sparsity levels, in one search stage. Users can then deploy transformers at whatever sparsity level they need without rerunning the pruning process for each ratio.
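The idea can be pictured with the toy sketch below. This is not the repository's API (the function `submodel_masks` and the score names are invented for illustration): a single set of learned importance scores is thresholded at several ratios, yielding one pruning mask per target sparsity from a single search.

```python
import torch

def submodel_masks(importance_scores, sparsity_ratios):
    """Illustrative only: derive one pruning mask per target sparsity
    ratio from a single set of importance scores, so every ratio reuses
    the same search stage instead of pruning from scratch."""
    flat = torch.cat([s.flatten() for s in importance_scores.values()])
    masks = {}
    for ratio in sparsity_ratios:
        # Keep the top (1 - ratio) fraction of scores globally.
        k = max(1, int((1.0 - ratio) * flat.numel()))
        threshold = torch.topk(flat, k).values.min()
        masks[ratio] = {name: (s >= threshold).float()
                        for name, s in importance_scores.items()}
    return masks

# Toy example: one set of scores, three deployable sparsity levels.
scores = {"blocks.0.attn.qk": torch.rand(6), "blocks.0.mlp.fc1": torch.rand(8)}
family = submodel_masks(scores, sparsity_ratios=[0.3, 0.5, 0.7])
```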
- 📉 Adaptive pruning: Generates pruned models with different sparsity levels in a single pruning process.
- ⚡ Efficient compression: Reduces computational costs while preserving model accuracy.
- 🔍 Pair-wise structured pruning: Maintains the transformer’s functional structure while enabling flexible compression (see the sketch after this list).
- 🏆 State-of-the-art performance: Outperforms existing pruning methods on DeiT, Swin, CLIP, and LLMs (Llama2, OPT).
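For a rough picture of the pair-wise constraint mentioned above, here is an illustrative sketch with invented names, not the repository's actual implementation: coupled projections such as query/key are pruned along the same inner dimension, so their product keeps a valid shape.

```python
import torch

def prune_coupled_pair(w_a, w_b, keep_idx):
    """Illustrative sketch of pair-wise structured pruning: drop the same
    inner dimension from two coupled projections (e.g. query/key) so
    their product remains well-defined.
    w_a, w_b: (inner_dim, d_model) weight matrices."""
    return w_a[keep_idx, :], w_b[keep_idx, :]

# Example: prune a 64-dim query/key pair down to 48 dims in both projections.
w_q, w_k = torch.randn(64, 384), torch.randn(64, 384)
keep_idx = torch.topk(w_q.abs().sum(dim=1) + w_k.abs().sum(dim=1), 48).indices
w_q_pruned, w_k_pruned = prune_coupled_pair(w_q, w_k, keep_idx)
# Attention logits (x @ w_q_pruned.T) @ (w_k_pruned @ x.T) still have matching shapes.
```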
```bash
git clone https://github.com/yizhuoDi/Solution-Path-Pruning.git
cd Solution-Path-Pruning
conda env create -n spp -f spp.yml
source activate spp
```

SPP supports pruning on multiple transformer architectures, such as DeiT. To run the pruning search stage:

```bash
sh deit_search.sh
```

After pruning, you can fine-tune the model to recover accuracy:

```bash
sh deit_retrain.sh
```

We evaluated SPP on multiple datasets and model architectures, demonstrating superior performance over traditional pruning methods.
| Model | Dataset | Params Reduction | Accuracy (%) |
|---|---|---|---|
| DeiT-Small | ImageNet-1k | 70.6% | 80.2 |
| Swin-Tiny | ImageNet-1k | 66.1% | 80.6 |
| CLIP-Large | COCO | 75.9% | 70.8 (Image-to-Text) |
| Llama2-7B | ARC-e | 50.0% | 71.8 |
More results and comparisons can be found in our paper.
If you find our work useful, please cite:
```bibtex
@article{ding2025adaptive,
  title={Adaptive Pruning of Pretrained Transformer via Differential Inclusions},
  author={Ding, Yizhuo and Fan, Ke and Wang, Yikai and Sun, Xinwei and Fu, Yanwei},
  journal={arXiv preprint arXiv:2501.03289},
  year={2025}
}
```

This project is released under the MIT License.
For questions, please open an issue on GitHub or contact us via email at [email protected].