
ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts

Official Repository – NeurIPS 2025

🌐 Project Page | 📄 Paper | 🎬 Demo Video | 🚀 Run Interactive Demo

ConceptScope Teaser

ConceptScope is a framework for analyzing vision dataset bias by uncovering and quantifying visual concepts using Sparse Autoencoders (SAEs). This repository provides:

  • Code for training SAEs on vision models (e.g., CLIP) and extracting meaningful latent concepts

  • Tools for categorizing concepts into target, context, and bias types

  • Scripts for evaluating concept prediction and segmentation, and for benchmarking bias discovery tasks

  • An interactive demo for visualizing and analyzing dataset bias

Table of Contents

  • Getting Started
  • Run Demo
  • Train SAE
  • Construct Concept Dictionary
  • ConceptScope
  • License & Credits
  • Citation

Getting Started

1. Set Up the Environment

Create and activate a Conda environment, then install the required dependencies:

conda create --name conceptscope python=3.11
conda activate conceptscope 

pip install -r requirements.txt
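
Most of the commands below expect a CUDA GPU (they pass --device cuda:<n>). As a quick sanity check after installing the requirements, you can confirm that the installed PyTorch build sees your GPU. This snippet is only a convenience check, not part of the ConceptScope code:

import torch

# Verify that PyTorch can see a GPU before running the training/analysis
# commands below, which target CUDA devices.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device 0:", torch.cuda.get_device_name(0))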

2. Download Weights and Data

To reproduce the paper results without running the full analysis, you can directly download the pretrained SAE weights and pre-computed outputs as follows:

gdown 16EQFKCWRpSNR-LWmxpuTGqViLpVf3KEb  # out.zip
unzip out.zip

If you wish to analyze the ImageNet dataset, download the pre-computed SAE latents as follows:

gdown 1q33iBwUHEqgiNQGFqVgYAUjoquYHfTIL  # train_sae_latents.h5
mv train_sae_latents.h5 out/dataset_analysis/imagenet/openai_l14_32k_base

⚠️ Note: The SAE latent file from the ImageNet training set is approximately 32 GB in size.
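
The latents are stored as an HDF5 file. If you want to peek at its contents before running any analysis, a minimal sketch using h5py is shown below; the dataset keys inside the file are not documented here, so the snippet simply lists whatever it finds rather than assuming specific names:

import h5py

# List the groups/datasets inside the downloaded SAE latent file.
path = "out/dataset_analysis/imagenet/openai_l14_32k_base/train_sae_latents.h5"
with h5py.File(path, "r") as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))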

Run Demo

ConceptScope Demo Walkthrough

This demo provides an interactive UI for exploring concepts discovered by ConceptScope.
It allows you to:

  • Select a dataset and class
  • Visualize the distribution of target, context, and bias concepts
  • Inspect high-activation and low-activation samples for any selected latent

Requirements

  • Trained SAE checkpoint (from train_sae.py)
  • Concept categorization results (from main.py in conceptscope/)

Launching the Demo

1. Set Up Environment Variables

Create a .env file in the project root and set the following variables:

PROJECT_ROOT="<project_root>"
DEVICE="cuda:<your_device_number>"
PORT="<your_port>"
CHECKPOINT_NAME="openai_l14_32K_base"
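
The demo reads these values from the .env file at startup. If you want to double-check what will be picked up, here is a minimal sketch using python-dotenv (an assumption; install it with pip if it is not already pulled in by requirements.txt):

import os
from dotenv import load_dotenv  # pip install python-dotenv if missing

# Load .env from the project root and echo the variables the demo expects.
load_dotenv()
for key in ("PROJECT_ROOT", "DEVICE", "PORT", "CHECKPOINT_NAME"):
    print(f"{key} = {os.getenv(key)}")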

2. Start the Backend

From the project root, run:

PYTHONPATH=./ uvicorn src.demo.backend.main:app --port <PORT> 

3. Start the Frontend

In a separate terminal, run:

PYTHONPATH=./ streamlit run src/demo/frontend/streamlit_app.py

Once running, Streamlit will provide a local URL (e.g. http://localhost:8501). Open it in your browser to access the UI.

Train SAE

This step trains a Sparse Autoencoder (SAE) on a specified vision backbone (e.g., CLIP ViT-L/14) to learn a disentangled latent representation.

Run

Example training command:

PYTHONPATH=./ python src/sae_training/train_sae.py \
    --device cuda:0 \
    --block_layer -2 \
    --use_ghost_grads \
    --seed 1 \
    --n_checkpoints 1 \
    --total_training_tokens 5000000 \
    --log_to_wandb \
    --model_name openai/clip-vit-large-patch14 \
    --expansion_factor 32 \
    --clip_dim 1024 \
    --dataset imagenet \
    --b_dec_init_method geometric_median \
    --lr 0.0004 \
    --l1_coefficient 0.00008 \
    --batch_size 128 

Key arguments

| Argument | Description | Default |
| --- | --- | --- |
| --model_name | HuggingFace model name or path to a local checkpoint | openai/clip-vit-large-patch14 |
| --block_layer | Transformer block layer index to extract activations from | -2 |
| --clip_dim | CLIP embedding dimension (e.g., 1024 for ViT-L/14) | 1024 |
| --expansion_factor | SAE expansion factor (number of latents per input dimension) | 32 |
| --b_dec_init_method | Initialization method for decoder bias (e.g., geometric_median) | geometric_median |
| --l1_coefficient | L1 penalty weight for sparsity | 0.00008 |
| --total_training_tokens | Total number of tokens to train over | 50000000 |
| --use_ghost_grads | Enable ghost gradients to prevent dead neurons | disabled |
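
For intuition on how --clip_dim, --expansion_factor, and --l1_coefficient fit together, the sketch below shows a generic reconstruction-plus-L1 sparse autoencoder objective. It is not the repository's training code, only the standard form of the loss these flags parameterize:

import torch
import torch.nn as nn

# Generic SAE sketch: d_sae = clip_dim * expansion_factor latents.
d_in, expansion_factor, l1_coefficient = 1024, 32, 8e-5
d_sae = d_in * expansion_factor

encoder = nn.Linear(d_in, d_sae)
decoder = nn.Linear(d_sae, d_in)

x = torch.randn(128, d_in)        # a batch of CLIP residual activations
z = torch.relu(encoder(x))        # sparse latent codes
x_hat = decoder(z)                # reconstruction
loss = ((x_hat - x) ** 2).mean() + l1_coefficient * z.abs().sum(dim=-1).mean()
print(loss.item())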

Outputs

  • SAE checkpoint (.pt) stored under out/checkpoints/experiment_id
  • Training logs & metrics (optionally logged to Weights & Biases)

Notes

  • To train on a custom dataset, add it to the load_dataset function in src/utils/image_dataset_loader.py.
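
A hypothetical sketch of what such an entry might look like; the real load_dataset signature in src/utils/image_dataset_loader.py may differ, and "my_dataset" plus the torchvision ImageFolder loader are placeholders:

from torchvision import datasets, transforms

def load_dataset(dataset_name, split, **kwargs):
    # Hypothetical branch for a new dataset; adapt to the function's actual signature.
    if dataset_name == "my_dataset":
        tf = transforms.Compose([
            transforms.Resize(224),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ])
        return datasets.ImageFolder(f"/path/to/my_dataset/{split}", transform=tf)
    # ... existing branches (imagenet, nico, coco, ...) stay as they are
    raise ValueError(f"Unknown dataset: {dataset_name}")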

Construct Concept Dictionary

This step extracts meaningful latents from the trained SAE and assigns semantic labels using reference images and VLMs. It can also save SAE activations for later analysis.

Requirements

  • Trained SAE checkpoint (.pt)
  • Dataset for activation extraction (e.g., ImageNet validation split)

Run

Example command:

PYTHONPATH=./ python src/construct_concept_dict/main.py \
    --device cuda:1 \
    --sae_path ./out/checkpoints/openai_l14_32K_base/clip-vit-large-patch14_-2_resid_32768.pt \
    --batch_size 128 \
    --dataset_name imagenet \
    --split val \
    --use_gpt \
    --save_features

Key arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| --dataset_name | str | imagenet | Name of the dataset to process |
| --sae_path | str | (required) | Path to the trained SAE checkpoint |
| --backbone | str | openai/clip-vit-large-patch14 | Backbone model type used during SAE training |
| --seed | int | 1 | Random seed for reproducibility |
| --split | str | train | Dataset split to process (train, val, test) |
| --device | str | cuda | Device to run on (cpu or cuda) |
| --use_gpt | flag | disabled | If set, use GPT to generate human-readable latent names |
| --batch_size | int | 64 | Batch size for processing |
| --save_features | flag | disabled | If set, save extracted features for later use |

If you set --use_gpt, you must set OPENAI_API_KEY in the .env file.

Outputs

  • valid_latent.json – indices of meaningful latents
  • concept_dict.json (optional) – human-readable names if --use_gpt is enabled
  • <split>_sae_latents.h5 (optional) – cached activations for downstream analysis if --save_features is used
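
The JSON outputs are plain files and easy to inspect. A minimal sketch, assuming valid_latent.json is a list of latent indices and concept_dict.json maps latent indices to names (the exact schema may differ slightly from this guess):

import json
import pathlib

# Point this at the directory where construct_concept_dict/main.py wrote its outputs.
output_dir = pathlib.Path("<path_to_concept_dict_outputs>")

valid = json.loads((output_dir / "valid_latent.json").read_text())
print(f"{len(valid)} meaningful latents")

concept_path = output_dir / "concept_dict.json"   # only present if --use_gpt was set
if concept_path.exists():
    names = json.loads(concept_path.read_text())
    for idx, name in list(names.items())[:10]:
        print(idx, name)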

ConceptScope

This module runs ConceptScope analysis on a given dataset using a trained SAE. It computes alignment scores, categorizes latents into target, context, and bias concepts, and saves the results for downstream bias analysis and visualization.

Requirements

  • Trained SAE checkpoint (.pt)
  • Dataset (e.g., ImageNet, NICO, COCO) accessible locally or via huggingface datasets

Run

Example command:

PYTHONPATH=./ python src/conceptscope/main.py \
    --device cuda:0 \
    --sae_path ./out/checkpoints/openai_l14_32K_base/clip-vit-large-patch14_-2_resid_32768.pt \
    --dataset_name imagenet \
    --split train \
    --batch_size 128 \
    --num_samples 128 \
    --target_threshold 1.0 \
    --clip_model_name openai/clip-vit-large-patch14

Key arguments

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| --save_root | str | ./out | Root directory to save results |
| --dir_name | str | dataset_analysis | Subdirectory name for results |
| --sae_path | str | (required) | Path to the trained SAE checkpoint |
| --device | str | cuda | Device to run the model on (cpu or cuda) |
| --backbone | str | openai/clip-vit-large-patch14 | Backbone model used during SAE training |
| --clip_model_name | str | openai/clip-vit-large-patch14 | CLIP model for similarity computation |
| --dataset_name | str | imagenet | Name of the dataset to analyze |
| --split | str | train | Dataset split to analyze (train, val, test) |
| --target_attribute | str | None | Optional target attribute for subgroup analysis |
| --batch_size | int | 64 | Batch size for processing |
| --num_samples | int | 256 | Number of samples used to compute alignment scores |
| --target_threshold | float | 0.0 | Alignment score threshold for marking a latent as target |
| --bias_threshold_sigma | float | 1.0 | Multiplier for the standard deviation used to determine the concept strength threshold for marking a latent as bias |

Outputs

  • <split>_concept_categorization.json: dictionary with per-class lists of target, context, and bias latents.
  • Each latent entry contains:
{"target":
    [ 
        {
            "latent_idx": 123,
            "latent_name": "striped_pattern",
            "alignment_score": 0.73,
            "mean_activation": 0.012,
            "normalized_alignment_score": 1.42,
        }, ... 
    ]
}
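
A small sketch for summarizing this file, assuming the top level maps class names to the {"target": [...], "context": [...], "bias": [...]} structure shown above (adjust the path and nesting if your output differs):

import json

path = "<path_to>/train_concept_categorization.json"  # wherever conceptscope/main.py saved it
with open(path) as f:
    categorization = json.load(f)

# Per-class counts of target / context / bias concepts, plus the most active bias latent.
for cls, groups in list(categorization.items())[:5]:
    counts = {k: len(v) for k, v in groups.items()}
    top_bias = max(groups.get("bias", []), key=lambda d: d["mean_activation"], default=None)
    print(cls, counts, "top bias latent:", top_bias["latent_name"] if top_bias else None)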

Notes

  • Increasing --target_threshold makes target concepts more selective.
  • Adjust --bias_threshold_sigma if you want stricter or looser bias detection.
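
For intuition only: a sigma-based cutoff like the one --bias_threshold_sigma controls is typically of the form mean + sigma * std over per-latent concept strengths. The snippet below illustrates that reading of the argument description; it is not the repository's exact rule:

import numpy as np

# Illustrative sigma-based cutoff (NOT ConceptScope's exact computation):
# latents whose concept strength exceeds mean + sigma * std would be flagged as bias.
strengths = np.random.rand(32768)   # placeholder per-latent concept strengths
bias_threshold_sigma = 1.0
cutoff = strengths.mean() + bias_threshold_sigma * strengths.std()
print("bias candidates:", int((strengths > cutoff).sum()))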

📜 License & Credits

Reference Implementations

License Notice

Our code is distributed under the MIT License; please see the LICENSE file for details. The NOTICE file lists the licenses for all third-party code included in this repository. Please include the contents of the LICENSE and NOTICE files in all redistributions of this code.


Citation

If you find our code or models useful in your work, please cite our paper:

@inproceedings{
    choi2025characterizing,
    title     = {Characterizing Dataset Bias via Disentangled Visual Concepts},
    author    = {Choi, Jinho and Lim, Hyesu and Schneider, Steffen and Choo, Jaegul},
    booktitle = {Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS)},
    year      = {2025},
    url       = {https://openreview.net/forum?id=lkmlNHuzY4}
}
