FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [ICCV 2025]
[Project Page] [Paper] [arXiv] [Supplemental Material] [Introduction (in Chinese)]
Hao Li*, Xiang Chen*, Jiangxin Dong, Jinhui Tang, Jinshan Pan
IMAG Lab, Nanjing University of Science and Technology
You are welcome to visit our low-level vision website (an information service platform dedicated to low-level vision): https://lowlevelcv.com/
The potential of large-scale training data for universal image restoration. (a) Analysis of universal image restoration performance in real-world scenarios as the amount of training data varies: as the size of the real-world training data increases, the image restoration model achieves significant performance improvements. (b) Our proposed FoundIR, trained on our million-scale dataset, achieves state-of-the-art performance across a broad range of image restoration tasks compared with existing universal image restoration methods.
- ✅ July 07, 2025. Released the training set. Please fill out the request form to get the download link!
- ✅ July 02, 2025. Released the training code and script!
- ✅ June 26, 2025. 🎉 Our FoundIR was accepted by ICCV 2025!
- ✅ May 13, 2025. Updated the script for calculating the metrics, including PSNR, SSIM, LPIPS, FID, CLIP-IQA, MANIQA, MUSIQ, NIQE, and NIMA. Thanks to the awesome pyiqa.
- ✅ February 18, 2025. Released the testing code and pre-trained models of the specialist models, the testset (GT) on Google Drive (GT), and the visual results on Google Drive (FoundIR) and Baidu Yun (Others).
- ✅ February 05, 2025. Released the testing code and pre-trained model of the generalist model, and the testset (LQ) on Google Drive (LQ).
- ✅ December 03, 2024. Released the paper and supplemental material.
- ✅ November 22, 2024. Created the repository and the project page.
- ✅ Release training code
- ✅ Release training dataset
| Benchmark | GT | LQ | FoundIR | Compared methods |
|---|---|---|---|---|
| Download Link | Google Drive | Google Drive | Google Drive | Baidu Yun (pw: b6qb) |
The public data results of Table 2 can be found in Baidu Yun (pw: 58vg).
The total size of the training set is nearly 4.85 TB; make sure you have enough free disk space before unzipping the data (see the sketch after the table below).
| Training Set | GT&LQ |
|---|---|
| Download Link | Request Form |
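Before unzipping, a quick disk-space check such as the following minimal Python sketch may help; it is not one of the released scripts, and the unpack path is a placeholder.

```python
# Minimal sketch (not one of the released scripts): check free disk space
# before unzipping the ~4.85 TB training set. Replace the placeholder path
# with the drive you plan to unpack to.
import shutil

REQUIRED_TB = 4.85  # approximate size of the unpacked training set

free_bytes = shutil.disk_usage("/path/to/unpack/dir").free  # placeholder path
free_tb = free_bytes / 1024**4
print(f"Free space: {free_tb:.2f} TB")
if free_tb < REQUIRED_TB:
    raise SystemExit("Not enough free space to unzip the training set.")
```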
More samples can be found in the supplemental material (P7-P9).
conda env create -f environment.yml
- Download the pre-trained model and put it in the `./premodel` folder.
For our testset:
- Download the testset and organize it as follows (a small verification sketch follows the test command below):
|--dataset
| |--01Blur
| | |--GT
| | | |--0001.png
| | | |--0002.png
| | | |...
| | |--LQ
| | | |--0001.png
| | | |--0002.png
| | | |...
| |--02Blur_Noise
| | |--GT
| | | |--0151.png
| | | |--0152.png
| | | |...
| | |--LQ
| | | |--0151.png
| | | |--0152.png
| | | |...
| | ...
- Run the following command to test the generalist model.
python test.py --dataroot ./dataset --meta ./Testset_meta_info.txt
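To double-check the layout described above, here is a small, hypothetical verification script; it is not part of the released code and assumes only the GT/LQ pairing shown in the tree.

```python
# Hypothetical verification sketch (assumes only the GT/LQ layout shown
# above): report any LQ image that has no GT image with the same filename.
import os

DATAROOT = "./dataset"  # same root passed to test.py via --dataroot

for task in sorted(os.listdir(DATAROOT)):  # e.g. 01Blur, 02Blur_Noise, ...
    gt_dir = os.path.join(DATAROOT, task, "GT")
    lq_dir = os.path.join(DATAROOT, task, "LQ")
    if not (os.path.isdir(gt_dir) and os.path.isdir(lq_dir)):
        continue  # skip anything that is not a task folder
    gt_files = set(os.listdir(gt_dir))
    for name in sorted(os.listdir(lq_dir)):
        if name not in gt_files:
            print(f"[{task}] missing GT for {name}")
```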
For your own data:
- Put the testset in the `./dataset/LQ` folder (if you don't have GT images, simply copy the `LQ` folder and rename the copy `GT`).
- Comment/uncomment L40-L44 in `test.py` as follows to test your own data:
## For our testset
# dataset = CombinedDataset(opt, image_size, augment_flip=False, equalizeHist=True, crop_patch=False, generation=False, task='meta_info')
## For your own data
dataset = CombinedDataset(opt, image_size, augment_flip=False, equalizeHist=True, crop_patch=False, generation=False, task=None)
- Run the following command to test the generalist model.
python test.py --dataroot ./dataset --meta None
(If your GPU has less than 24 GB of memory, you can reduce the `crop_size` at L102 of `test.py`; a quick way to check is sketched below.)
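Here is a minimal sketch for checking your GPU memory; it assumes PyTorch, which `test.py` already requires, and is not part of the released scripts.

```python
# Minimal sketch (assumes PyTorch, which test.py already requires): query
# the GPU memory to decide whether to lower crop_size at L102 of test.py.
import torch

if torch.cuda.is_available():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"GPU 0: {total_gb:.1f} GB of memory")
    if total_gb < 24:
        print("Consider reducing crop_size at L102 of test.py.")
else:
    print("No CUDA device found.")
```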
We provide two specialist models, i.e., a Lowlight model and a Weather model, to refine the results of the generalist model.
In our experiments, we refine the generalist model's outputs as follows (a routing sketch is given after the commands below):
- Weather model: inputs 0501-0700 and 1051-1100
- Lowlight model: inputs 0701-0800, 1101-1250, and 1301-1500
Please note that this step is optional: you can further refine the generalist model's outputs using the following commands, which is especially helpful for challenging lowlight, hazy, and rainy inputs.
- Install the environment.
cd ./specialist_model
pip install -r requirements.txt
python setup.py develop
- Put the testset in the `./dataset` folder.
- Run the following command to test the specialist models.
python inference_lowlight.py
or
python inference_weather.py
You can find the output visual results in the `results/` folder.
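For reference, here is a hypothetical sketch of the input routing described above; the `specialist_for` helper is illustrative and not part of the released scripts.

```python
# Hypothetical routing sketch: choose the specialist model for each
# generalist output from the filename index ranges listed above. The
# helper below is illustrative and not part of the released scripts.
from typing import Optional

WEATHER_RANGES = [(501, 700), (1051, 1100)]
LOWLIGHT_RANGES = [(701, 800), (1101, 1250), (1301, 1500)]

def specialist_for(filename: str) -> Optional[str]:
    """Return 'weather', 'lowlight', or None for a name like '0523.png'."""
    idx = int(filename.split(".")[0])
    if any(lo <= idx <= hi for lo, hi in WEATHER_RANGES):
        return "weather"
    if any(lo <= idx <= hi for lo, hi in LOWLIGHT_RANGES):
        return "lowlight"
    return None  # keep the generalist output as-is

print(specialist_for("0550.png"))  # -> weather
print(specialist_for("1200.png"))  # -> lowlight
print(specialist_for("0001.png"))  # -> None
```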
- Install the pyiqa package (thanks to Chaofeng Chen).
pip install pyiqa
- Run the following command to calculate the metrics.
python cal_metrics.py --inp_imgs ./dataset/restored --gt_imgs ./dataset/GT --log path_to_save_log
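For reference, a single metric computation with pyiqa looks roughly like the following minimal sketch; the image paths are placeholders, and the provided `cal_metrics.py` covers the full set of metrics listed above.

```python
# Minimal sketch of a metric computation with pyiqa; the provided
# cal_metrics.py covers the full set of metrics. Image paths below
# are placeholders.
import torch
import pyiqa

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Full-reference metric: compares a restored image against its ground truth.
psnr = pyiqa.create_metric("psnr", device=device)
score = psnr("./dataset/restored/0001.png", "./dataset/GT/0001.png")
print(f"PSNR: {score.item():.2f} dB")

# No-reference metric: needs only the restored image.
niqe = pyiqa.create_metric("niqe", device=device)
print(f"NIQE: {niqe('./dataset/restored/0001.png').item():.4f}")
```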
Our FoundIR is trained in two stages; please follow the steps in `train.sh` to train the model.
sh train.sh
- Quantitative Results
- Qualitative Results
More qualitative results can be found in the supplemental material (P10-P37).
If this work is helpful for your research, please consider citing the following BibTeX entry.
@inproceedings{li2024foundir,
title={FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration},
author={Li, Hao and Chen, Xiang and Dong, Jiangxin and Tang, Jinhui and Pan, Jinshan},
booktitle={ICCV},
year={2025}
}
We would like to thank our team members (Hao Chen, Yinghui Fang, Jiashuo Liu, Ke Wu, Renyuan Situ, ...) for their contributions to the data collection and post-processing in this work.
If you have any questions, please feel free to reach out to us at [email protected] and [email protected].




