ykrmm/ViLU

ViLU: Learning Vision-Language Uncertainties for Failure Prediction

This is the official PyTorch implementation of ViLU, accepted at ICCV 2025.

Description

Installation

conda create -n vilu python=3.11
conda activate vilu
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -e .

Datasets

data/
|––––ImageNet/
|––––––––ILSVRC2012_devkit_t12.tar.gz
|––––––––train/
|––––––––val/
|––––CIFAR-10/
|––––––––cifar-10-batches-py/
|––––CIFAR-100/
|––––––––cifar-100-python/
|––––caltech101/
|––––––––101_ObjectCategories/
|––––EuroSAT_RGB/
|––––dtddataset/
|––––––––dtd/
|––––––––––––images/
|––––fgvc-aircraft-2013b/
|––––flowers102/
|––––food-101/
|––––––––images/
|––––oxford-iiit-pet/
|––––––––images/
|––––stanford_cars/
|––––SUN397/
|––––UCF-101-midframes/
|––––CC3M/
|––––––––train/
|––––––––val/
|––––CC12M/
|––––––––train/
|––––––––val/
|––––LAION/
|––––––––train/
|––––––––val/
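
Before launching a run, it can help to check that the folders above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` and the `EXPECTED` list are hypothetical, and the sub-paths are simply copied from the layout above, assuming the data root is a folder named `data`.

```python
from pathlib import Path

# Sub-paths copied from the layout above, relative to the data root.
# This list is illustrative; trim it to the datasets you actually use.
EXPECTED = [
    "ImageNet/ILSVRC2012_devkit_t12.tar.gz",
    "ImageNet/train",
    "ImageNet/val",
    "CIFAR-10/cifar-10-batches-py",
    "CIFAR-100/cifar-100-python",
    "caltech101/101_ObjectCategories",
    "EuroSAT_RGB",
    "dtddataset/dtd/images",
    "fgvc-aircraft-2013b",
    "flowers102",
    "food-101/images",
    "oxford-iiit-pet/images",
    "stanford_cars",
    "SUN397",
    "UCF-101-midframes",
    "CC3M/train",
    "CC3M/val",
    "CC12M/train",
    "CC12M/val",
    "LAION/train",
    "LAION/val",
]


def missing_datasets(root, expected=EXPECTED):
    """Return the expected sub-paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    # Assumes the data root is ./data, as in the layout above.
    for path in missing_datasets("data"):
        print("missing:", path)
```

Running the script from the repository root prints one line per missing entry and nothing if every listed path is present.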

Training

Training scripts are provided in the scripts folder. For instance, to train ViLU on CIFAR-10 with the ViT-B/32 backbone for 400 epochs, run:

sh scripts/train_vilu_cifar10.sh

You may need to adapt the value of CUDA_VISIBLE_DEVICES to your setup. Training on an RTX A6000 GPU takes around 2 hours. The main training loop code is located in vilu/engine/trainer.py.

Notebooks for reproducibility and visualisations

The notebooks folder contains the code for loading and evaluating the pre-trained ViLU models on each dataset, as well as the code for visualising the failure curves.
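
As a rough illustration of what evaluating failure prediction can look like, the sketch below computes the AUROC of uncertainty scores for separating misclassified samples from correctly classified ones. This is an assumption made for illustration: the function `failure_auroc` is hypothetical, is not taken from the notebooks, and the notebooks may use different metrics or implementations.

```python
def failure_auroc(uncertainty, is_failure):
    """AUROC of uncertainty scores for separating failures from successes.

    uncertainty: one score per sample, higher meaning "more likely wrong".
    is_failure:  truthy where the classifier's prediction was incorrect.
    """
    failures = [u for u, f in zip(uncertainty, is_failure) if f]
    successes = [u for u, f in zip(uncertainty, is_failure) if not f]
    if not failures or not successes:
        raise ValueError("need at least one failure and one success")
    # Fraction of (failure, success) pairs ranked correctly; ties count half.
    wins = sum((f > s) + 0.5 * (f == s) for f in failures for s in successes)
    return wins / (len(failures) * len(successes))


# Toy example: both failures receive higher uncertainty than both successes.
print(failure_auroc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 1.0
```

The pairwise formulation is quadratic in the number of samples, which keeps the sketch short; for large evaluation sets a rank-based implementation would be preferable.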
