Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery (NeurIPS 2023)
By Sarah Rastegar, Hazel Doughty, and Cees Snoek.
Install the required dependencies:

```
pip install -r requirements.txt
```
Since our work relies heavily on kmeans_pytorch for cluster assignments, make sure it is correctly installed to reproduce the results from the paper. You can install kmeans_pytorch directly in the project directory with the following commands:
```
cd InfoSieve
git clone https://github.com/subhadarship/kmeans_pytorch
cd kmeans_pytorch
pip install --editable .
```
Note: While using scikit-learn's KMeans provides improvements, the results in the paper have been reported using kmeans_pytorch.
Set paths to datasets, pre-trained models, and desired log directories in config.py. Set SAVE_DIR (log-file destination) and PYTHON (path to the Python interpreter) in the bash_scripts scripts.
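For reference, config.py typically gathers these paths in one place. The sketch below is illustrative only: every path is a placeholder, and the exact variable names in the repository may differ.

```python
# config.py -- example layout (all paths are placeholders; adjust to your machine).
# Variable names below are illustrative and may differ from the actual config.py.

# Dataset roots.
cifar_10_root = '/path/to/datasets/cifar10'
cifar_100_root = '/path/to/datasets/cifar100'
cub_root = '/path/to/datasets/CUB_200_2011'
imagenet_root = '/path/to/datasets/imagenet'

# Pre-trained backbone weights and experiment/log output directory.
dino_pretrain_path = '/path/to/pretrained/dino_vitbase16_pretrain.pth'
exp_root = '/path/to/experiments'
```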
We use fine-grained benchmarks in this paper, including:
- CUB, Stanford Cars, Oxford-IIIT Pets, FGVC-Aircraft, and Herbarium19
We also use generic object recognition datasets, including:
- CIFAR-10/100 and ImageNet
Train representation: to run the code with the hyperparameters used in the paper, execute the following command:

```
python contrastive_training.py
```
This script will automatically train the representations, extract features, and fit the semi-supervised KMeans algorithm. It also provides final evaluations on both the best checkpoint and the final checkpoint.
Dataset Hyperparameter Specifics: if you're working with the CUB dataset, set the unsupervised_smoothing parameter to 1.0; for the other fine-grained datasets (Scars, Pets, Aircraft) set it to 0.5, and for generic datasets to 0.1. Note that for Herbarium_19 you should also add --unbalanced 1, because the dataset is long-tailed.
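The per-dataset settings above can be summarized in a small helper. This is only a sketch: the `build_args` function and the dataset key strings are illustrative, not part of the codebase.

```python
# Smoothing values recommended above, keyed by (hypothetical) dataset names.
SMOOTHING = {
    'cub': 1.0,                                    # CUB: 1.0
    'scars': 0.5, 'pets': 0.5, 'aircraft': 0.5,    # other fine-grained: 0.5
    'cifar10': 0.1, 'cifar100': 0.1, 'imagenet': 0.1,  # generic: 0.1
}

def build_args(dataset):
    """Assemble command-line flags for contrastive_training.py (illustrative)."""
    args = ['--dataset_name', dataset]
    if dataset in SMOOTHING:
        args += ['--unsupervised_smoothing', str(SMOOTHING[dataset])]
    if dataset == 'herbarium_19':
        # Long-tailed dataset: enable the unbalanced flag as noted above.
        args += ['--unbalanced', '1']
    return args

print(build_args('cub'))
print(build_args('herbarium_19'))
```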
In the Final Reports section at the end of training, note that only the evaluations reported under:
Reports for the best checkpoint:
Reports for the last checkpoint:
are the test-time evaluations of the checkpoints. Additionally, note that Train ACC Unlabelled_v2 is the metric reported by our work and prior studies.
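For context, the ACC metric used throughout the generalized category discovery literature scores a clustering by matching predicted cluster ids to ground-truth labels with an optimal one-to-one assignment (implementations typically use the Hungarian algorithm). The stdlib-only sketch below brute-forces the assignment over permutations, which is feasible only for a handful of classes and is purely illustrative.

```python
from itertools import permutations

def clustering_acc(y_true, y_pred):
    """Best accuracy over all one-to-one mappings from cluster ids to labels.

    Brute force over permutations -- only feasible for small label sets;
    real implementations use the Hungarian algorithm instead.
    """
    labels = sorted(set(y_true) | set(y_pred))
    best = 0.0
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))  # cluster id -> candidate label
        hits = sum(mapping[p] == t for p, t in zip(y_pred, y_true))
        best = max(best, hits / len(y_true))
    return best

# Cluster ids are arbitrary: predictions [1,1,0,0] match labels [0,0,1,1]
# perfectly under the swap mapping {1 -> 0, 0 -> 1}.
print(clustering_acc([0, 0, 1, 1], [1, 1, 0, 0]))  # -> 1.0
```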
If you use this code in your research, please consider citing our paper:
```
@inproceedings{rastegar2023learn,
  title={Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery},
  author={Sarah Rastegar and Hazel Doughty and Cees Snoek},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=m0vfXMrLwF}
}
```
The codebase is mainly built on the generalized-category-discovery repository: https://github.com/sgvaze/generalized-category-discovery.
If you found our code helpful and are interested in exploring more, also check out our ECCV 2024 paper SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery.
