
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [ICCV 2025]


[Project Page]   [Paper]   [arXiv]   [Supplemental Material]   [Introduction in Chinese]

Hao Li*, Xiang Chen*, Jiangxin Dong, Jinhui Tang, Jinshan Pan
IMAG Lab, Nanjing University of Science and Technology

You are welcome to visit our website for low-level vision (an information-service platform dedicated to low-level vision): https://lowlevelcv.com/


The potential of large-scale training data for universal image restoration. (a) Universal image restoration performance in real-world scenarios as the amount of training data varies: as the size of the real-world training data grows, the restoration model achieves significant performance gains. (b) Our proposed FoundIR, trained on our million-scale dataset, achieves state-of-the-art performance across a broad range of image restoration tasks compared with existing universal image restoration methods.


🚩 New Features/Updates

  • ✅ July 07, 2025. Release the training set. Please fill out the request form to get the download link!
  • ✅ July 02, 2025. Release the training code and script!
  • ✅ June 26, 2025. 🎉 Our FoundIR was accepted by ICCV 2025!
  • ✅ May 13, 2025. Update the script for calculating the metrics, including PSNR, SSIM, LPIPS, FID, CLIP-IQA, MANIQA, MUSIQ, NIQE, NIMA. Thanks to the awesome pyiqa.
  • ✅ February 18, 2025. Release the testing code and pre-trained models of the specialist models, the testset (GT) on Google Drive (GT), and the visual results on Google Drive (FoundIR) and Baidu Yun (Others).
  • ✅ February 05, 2025. Release the testing code and pre-trained model of the generalist model, and the testset (LQ) on Google Drive (LQ).
  • ✅ December 03, 2024. Release paper and supplemental material.
  • ✅ November 22, 2024. Create the repository and the project page.

To Do

  • ✅ Release training code
  • ✅ Release training dataset

📖 Dataset

Testset & Visual Results (Table 1)

| Benchmark | GT | LQ | FoundIR | Compared methods |
| --- | --- | --- | --- | --- |
| Download Link | Google Drive | Google Drive | Google Drive | Baidu Yun (pw: b6qb) |

The results on the public data reported in Table 2 can be found on Baidu Yun (pw: 58vg).

Training Set

The total size of the training set is nearly 4.85 TB. Make sure you have enough space on your hard disk to unzip the data.

| Training Set | GT & LQ |
| --- | --- |
| Download Link | Request Form |
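
Since the unpacked data needs roughly 4.85 TB, you may want to check your free space programmatically before unzipping. A tiny sketch of ours (not part of the repo; "/data" is a placeholder for your download location):

import shutil

# Free space at the target location, in TiB (the unpacked training set needs ~4.85 TB).
free_tb = shutil.disk_usage("/data").free / 1024**4
print(f"Free space: {free_tb:.2f} TiB")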

More samples can be found in the supplemental material (P7-P9).


💻 Evaluation

➡️ Environment

conda env create -f environment.yml

➡️ Testing the Generalist Model

For our testset:
  • Download the testset and organize it as follows (a pairing sanity check is sketched after the test command below):
    |--dataset
    |    |--01Blur
    |    |    |--GT
    |    |    |    |--0001.png
    |    |    |    |--0002.png
    |    |    |    |...
    |    |    |--LQ
    |    |    |    |--0001.png
    |    |    |    |--0002.png
    |    |    |    |...
    |    |--02Blur_Noise
    |    |    |--GT
    |    |    |    |--0151.png
    |    |    |    |--0152.png
    |    |    |    |...
    |    |    |--LQ
    |    |    |    |--0151.png
    |    |    |    |--0152.png
    |    |    |    |...
    |    | ...
  • Run the following command to test the generalist model.
python test.py --dataroot ./dataset --meta ./Testset_meta_info.txt
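
Before testing, you can optionally verify that every LQ image has a matching GT counterpart. A minimal sketch of ours (not part of the repo; it assumes the folder layout shown above):

from pathlib import Path

root = Path("./dataset")
for task_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    gt = {f.name for f in (task_dir / "GT").glob("*.png")}
    lq = {f.name for f in (task_dir / "LQ").glob("*.png")}
    unpaired = gt ^ lq  # filenames present on only one side
    if unpaired:
        print(f"{task_dir.name}: {len(unpaired)} unpaired files, e.g. {sorted(unpaired)[:3]}")
    else:
        print(f"{task_dir.name}: {len(gt)} pairs OK")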
For your own data:
  • Put your test images in the ./dataset/LQ folder (if you don't have GT images, simply copy the LQ folder and rename the copy to GT; a scripted version is sketched after the command below).
  • Switch the commented block at L40-L44 in test.py so that the "your own data" branch is active:
## For our testset
# dataset = CombinedDataset(opt, image_size, augment_flip=False, equalizeHist=True, crop_patch=False, generation=False, task='meta_info')

## For your own data
dataset = CombinedDataset(opt, image_size, augment_flip=False, equalizeHist=True, crop_patch=False, generation=False, task=None)
  • Run the following command to test the generalist model.
python test.py --dataroot ./dataset --meta None
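
If you need the GT placeholder from the first step, it can also be created programmatically. A tiny sketch of ours (paths follow the layout above; the copied GT is only a placeholder, so metrics computed against it are meaningless):

import shutil
from pathlib import Path

root = Path("./dataset")
lq, gt = root / "LQ", root / "GT"
if lq.is_dir() and not gt.exists():
    shutil.copytree(lq, gt)  # duplicate LQ as a stand-in GT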

(If your GPU has less than 24 GB of memory, you can reduce crop_size at L102 of test.py.)

➡️ Testing the Specialist Models (Optional)

We provide two specialist models, i.e., a Lowlight model and a Weather model, to refine the results of the generalist model.

In our experiments, we refine the generalist model's outputs as follows:

  • Weather model: inputs 0501-0700 and 1051-1100
  • Lowlight model: inputs 0701-0800, 1101-1250, and 1301-1500
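
In scripted form, this routing could look like the following sketch (ours, not part of the repo; it assumes four-digit numeric filenames such as "0512.png", with the index ranges taken from the list above):

def specialist_for(filename):
    """Return the specialist that should refine this generalist output, or None."""
    idx = int(filename[:4])
    if 501 <= idx <= 700 or 1051 <= idx <= 1100:
        return "weather"
    if 701 <= idx <= 800 or 1101 <= idx <= 1250 or 1301 <= idx <= 1500:
        return "lowlight"
    return None  # keep the generalist result as-is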

Please note that this step is optional: you can further refine the generalist model's outputs using the following commands, which is especially helpful for challenging lowlight, hazy, and rainy inputs.

  • Install the environment.
cd ./specialist_model
pip install -r requirements.txt
python setup.py develop
  • Put the testset in the ./dataset folder.
  • Run the following command to test the specialist models.
python inference_lowlight.py
or
python inference_weather.py

You can find the output visual results in the results/ folder.

➡️ Evaluation Metrics

  • Install pyiqa.
pip install pyiqa
  • Run the following command to calculate the metrics.
python cal_metrics.py --inp_imgs ./dataset/restored --gt_imgs ./dataset/GT --log path_save_log
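
If you want to score individual images directly rather than through cal_metrics.py, pyiqa metrics can be called on image file paths. A minimal sketch of ours (the example paths are assumptions):

import torch
import pyiqa

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
psnr = pyiqa.create_metric("psnr", device=device)
lpips = pyiqa.create_metric("lpips", device=device)  # lower is better

restored, gt = "./dataset/restored/0001.png", "./dataset/GT/0001.png"
print("PSNR:", psnr(restored, gt).item())
print("LPIPS:", lpips(restored, gt).item())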

💻 Training

Our FoundIR is trained in two stages; please follow the steps in train.sh to train the model.

sh train.sh

Results

  • Quantitative Results

  • Qualitative Results

More qualitative results can be found in the supplemental material (P10-P37).


Citation

If this work is helpful for your research, please consider citing the following BibTeX entry.

@inproceedings{li2024foundir,
      title={FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration},
      author={Li, Hao and Chen, Xiang and Dong, Jiangxin and Tang, Jinhui and Pan, Jinshan},
      booktitle={ICCV},
      year={2025}
}

Acknowledgement

We would like to thank our team members (Hao Chen, Yinghui Fang, Jiashuo Liu, Ke Wu, Renyuan Situ, ...) for their contributions to the data collection and post-processing in this work.

Contact

If you have any questions, please feel free to reach out to us at [email protected] and [email protected].
