📌 ICML 2025 Oral (Top 1.0%)
MGD³ presents a novel approach to dataset distillation that leverages pre-trained diffusion models without any fine-tuning. The method improves the diversity and representativeness of synthetic datasets through a three-stage process:
- Mode Discovery: Identifies distinct data modes within each class.
- Mode Guidance: Steers the diffusion process toward the discovered modes.
- Stop Guidance: Transitions to unguided diffusion to prevent artifacts.
Together, these stages produce representative, diverse synthetic datasets suitable for training downstream models.
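To make the three stages concrete, here is a minimal sketch, assuming modes are found by k-means over per-class feature embeddings and that guidance is applied as a feature-space gradient. `denoise_step`, `extract_features`, and the constants `K_MODES`, `STOP_STEP`, and `GUIDANCE_SCALE` are hypothetical placeholders, not this repository's API:

```python
# Minimal sketch of the three-stage pipeline. All names here
# (extract_features, denoise_step, K_MODES, STOP_STEP, GUIDANCE_SCALE)
# are illustrative placeholders, not the repository's actual API.
import torch
from sklearn.cluster import KMeans

K_MODES = 4           # assumed number of modes discovered per class
STOP_STEP = 10        # assumed timestep below which guidance is switched off
GUIDANCE_SCALE = 2.0  # assumed strength of the mode-guidance term


def discover_modes(class_features: torch.Tensor) -> torch.Tensor:
    """Stage 1 -- Mode Discovery: cluster one class's feature embeddings
    and treat the cluster centroids as that class's modes."""
    kmeans = KMeans(n_clusters=K_MODES, n_init=10).fit(class_features.numpy())
    return torch.from_numpy(kmeans.cluster_centers_).float()


def sample_with_mode_guidance(denoise_step, extract_features, mode, num_steps=50):
    """Stages 2 and 3 -- Mode Guidance followed by Stop Guidance."""
    x = torch.randn(1, 3, 256, 256)  # start the reverse process from noise
    for t in reversed(range(num_steps)):
        if t > STOP_STEP:
            # Stage 2 -- Mode Guidance: nudge the sample toward the target
            # mode by descending a feature-space distance to its centroid.
            x_in = x.detach().requires_grad_(True)
            dist = (extract_features(x_in) - mode).pow(2).sum()
            grad = torch.autograd.grad(dist, x_in)[0]
            with torch.no_grad():
                x = denoise_step(x_in, t) - GUIDANCE_SCALE * grad
        else:
            # Stage 3 -- Stop Guidance: finish with unguided denoising so
            # the final steps do not accumulate guidance artifacts.
            with torch.no_grad():
                x = denoise_step(x, t)
    return x
```

The key design point is the switch at `STOP_STEP`: guidance shapes the sample toward its mode early in the reverse process, while the final unguided steps preserve image fidelity.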
For more details, visualizations, and supplementary materials, visit the Project Page.
- No Fine-Tuning Required: Uses pre-trained diffusion models directly (see the example after this list).
- Enhanced Diversity: Achieves superior intra-class diversity compared to existing methods.
- Scalability: Demonstrates effectiveness on large-scale datasets like ImageNet-1K.
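As an illustration of the no-fine-tuning point, an off-the-shelf class-conditional diffusion model can be loaded and sampled directly through the stock `diffusers` API. The checkpoint and sampling settings below are examples, not this repository's configuration:

```python
# Example of sampling a pre-trained, class-conditional diffusion model
# with no fine-tuning. The checkpoint and settings are illustrative
# choices, not this repository's configuration.
import torch
from diffusers import DiTPipeline

pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Map human-readable ImageNet class names to the checkpoint's label ids.
class_ids = pipe.get_label_ids(["golden retriever", "tench"])
images = pipe(class_labels=class_ids, num_inference_steps=50).images
images[0].save("sample.png")
```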
- Clone the repository:

```bash
git clone https://github.com/jachansantiago/mode_guidance.git
cd mode_guidance
```

- Set up the environment:

```bash
conda create -n modeguidance python=3.8
conda activate modeguidance
pip install -r requirements.txt
```

- For text-to-image distillation, install our modified diffusers library:

```bash
pip install -e diffusers
```

To run the code on the ImageNette dataset:

```bash
./scripts/nette.sh
```

This project builds upon the following repositories: