DIY-SC enhances semantic correspondence by refining foundational features in a pose-aware manner. This approach is not limited to SPair-71k and can be adapted to other tasks requiring robust feature matching.
Below, we first demonstrate how DIY-SC can be seamlessly integrated into existing codebases, for example as a drop-in replacement for DINOv2 features. We then outline the procedures for SPair-71k evaluation, pseudo-label generation, and model training.
While the adapter itself requires only torch, the demonstration below additionally requires the following packages:
pip install pillow matplotlib
pip install torch torchvision torchaudio
DINOv2 features can be refined in the following way:
import torch
from model_utils.projection_network import AggregationNetwork

# Load model
ckpt_dir = 'ckpts/0300_dino_spair/best.pth'
aggre_net = AggregationNetwork(feature_dims=[768,], projection_dim=768)
aggre_net.load_pretrained_weights(torch.load(ckpt_dir))

# Refine features
desc_dino = <DINOv2 features of torch.Size([1, 768, N, N])>
with torch.no_grad():
    desc_proj = aggre_net(desc_dino)  # of shape torch.Size([1, 768, N, N])
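For reference, one way to obtain such a feature map is via the official DINOv2 ViT-B/14 model from PyTorch Hub. This is a minimal sketch, not necessarily the extractor used in this repo; the 840x840 input resolution (i.e., N = 60 patches per side) and the image path are assumptions:

import torch
from PIL import Image
from torchvision import transforms

# Minimal sketch: extract DINOv2 ViT-B/14 patch features.
# Assumption: 840x840 input, so N = 840 / 14 = 60 patches per side.
dinov2 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitb14').eval()
preprocess = transforms.Compose([
    transforms.Resize((840, 840)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)  # placeholder image
with torch.no_grad():
    tokens = dinov2.forward_features(img)['x_norm_patchtokens']  # [1, N*N, 768]
N = int(tokens.shape[1] ** 0.5)
desc_dino = tokens.permute(0, 2, 1).reshape(1, 768, N, N)  # [1, 768, N, N]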
We show an example application in demo_diy.py.
Additionally, we provide an interactive notebook in DIY-SC/notebooks/correspondence_demo.ipynb.
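For intuition, correspondence with the refined descriptors amounts to a nearest-neighbor lookup in feature space. Below is a minimal, hypothetical sketch; the function name and the [1, C, N, N] descriptor layout are our assumptions, not the repo's API:

import torch
import torch.nn.functional as F

# Hypothetical sketch: match a source keypoint (grid coordinates y, x) to the
# best-matching target location via cosine similarity of refined descriptors.
def match_keypoint(desc_proj_src, desc_proj_trg, y, x):
    C, N = desc_proj_trg.shape[1], desc_proj_trg.shape[-1]
    src_vec = F.normalize(desc_proj_src[0, :, y, x], dim=0)    # [C]
    trg = F.normalize(desc_proj_trg[0], dim=0).reshape(C, -1)  # [C, N*N]
    idx = (src_vec @ trg).argmax().item()
    return idx // N, idx % N  # (y, x) of the best match in the target grid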
For convenient loading of the model, we also support integration via PyTorch Hub:
import torch
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino', pretrained=True)
We also provide other pre-trained adapters via PyTorch Hub:
# adapter for SD+DINO features (SPair-71k trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_sd_dino', pretrained=True)
# adapter for SD+DINO features (ImageNet-3D trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_sd_dino_in3d', pretrained=True)
# adapter for SD+DINO features (ImageNet-3D trained, fine-tuned on SPair-71k)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_sd_dino_in3d_spair', pretrained=True)
# adapter for DINO features (SPair-71k trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino', pretrained=True)
# adapter for DINO features (ImageNet-3D trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino_in3d', pretrained=True)
# adapter for DINO features (ImageNet-3D trained, fine-tuned on SPair-71k)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino_in3d_spair', pretrained=True)
# adapter for DINO features, projecting to 384 channels (SPair-71k trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino_384', pretrained=True)
# adapter for DINO features, projecting to 128 channels (SPair-71k trained)
aggre_net = torch.hub.load('odunkel/DIY-SC-torchhub', 'agg_dino_128', pretrained=True)

In the following, we evaluate DIY-SC on the semantic correspondence benchmark SPair-71k.
We present two options for evaluation, pseudo-label generation, and training of the adapter:
I) Lightweight (DINOv2): This strategy requires only DINOv2 features, which reduces installation and compute requirements.
II) Full (SD+DINOv2): This strategy relies on DINOv2 and SD features and therefore requires third-party packages and is computationally heavier.
To support pseudo-label generation and SPair-71k experiments, install the following packages:
pip install tqdm opencv-python wandb ConfigArgParse loguru pandas scipy
To extract instance masks, install SAM and download the checkpoint:
pip install git+https://github.com/facebookresearch/segment-anything.git
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -P ckpts
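For reference, a minimal sketch of generating masks with the downloaded checkpoint (the settings in scripts/compute_sam_masks.sh may differ; 'image.jpg' is a placeholder):

import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Minimal sketch; the repo's script may use different generator settings.
sam = sam_model_registry['vit_h'](checkpoint='ckpts/sam_vit_h_4b8939.pth')
mask_generator = SamAutomaticMaskGenerator(sam)
image = cv2.cvtColor(cv2.imread('image.jpg'), cv2.COLOR_BGR2RGB)  # HxWx3 uint8 RGB
masks = mask_generator.generate(image)  # list of dicts with 'segmentation' bool arrays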
We refer to GeoAware-SC for instructions on installing the requirements for computing the SD features.
Download SPair-71k (as in GeoAware-SC) by running bash scripts/download_spair.sh.
Compute SAM masks with bash scripts/compute_sam_masks.sh.
Pre-compute the feature maps by running:
bash scripts/precompute_features.sh --dino
For option II) SD+DINOv2, precompute both feature types:
bash scripts/precompute_features.sh --dino --sd
Pre-trained adapters can be evaluated on the SPair-71k test split via:
python pck_train.py --config configs/eval_spair.yaml --ONLY_DINO --LOAD ckpts/0300_dino_spair/best.pth
For option II) SD+DINOv2, evaluate via:
python pck_train.py --config configs/eval_spair.yaml --EXP_ID 0 --LOAD ckpts/0280_spair/best.pth
We refer to the notebook notebooks/agg_spair_results.ipynb for loading sample-specific results and computing SPair-71k evaluation metrics.
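For reference, the standard SPair-71k metric is PCK@alpha: a predicted keypoint counts as correct if it lies within alpha times the maximum side of the object bounding box from the ground truth. A minimal sketch (function name and tensor layout are our choices, not the repo's API):

import torch

# Minimal sketch of per-image PCK@alpha with a bounding-box threshold.
def pck(pred_kps, gt_kps, bbox_hw, alpha=0.1):
    # pred_kps, gt_kps: [K, 2] (x, y) tensors; bbox_hw: (height, width) of the object bbox.
    thresh = alpha * max(bbox_hw)
    dists = torch.linalg.norm(pred_kps - gt_kps, dim=1)
    return (dists <= thresh).float().mean().item()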
We provide generated pseudo-labels on Google Drive. To download them, you can use the gdown tool:
pip install gdown
gdown --folder https://drive.google.com/drive/folders/1nGjNsWpqbcqUJS-fNXU_41pMBMdE42Je?usp=sharing -O data
This will download all pseudo-label files into the data directory.
Alternatively, to generate pseudo-labels yourself, perform the following steps. First, compute the spherical points:
bash scripts/precompute_features.sh --sph
Then, generate pseudo-labels for validation and training splits:
python gen_pseudo_labels.py --filter_sph --subsample 300 --split val --dataset_version v01 --only_dino
python gen_pseudo_labels.py --filter_sph --subsample 30000 --split trn --dataset_version v01 --only_dino
For option II) SD+DINOv2, generate pseudo-labels via:
python gen_pseudo_labels.py --filter_sph --subsample 300 --split val --dataset_version v01
python gen_pseudo_labels.py --filter_sph --subsample 30000 --split trn --dataset_version v01
Finally, the refinement adapter is trained via:
python pck_train.py --config configs/train_spair.yaml --EXP_ID 0
The following features are currently planned or already in development:
- ImageNet-3D training functionality
- OrientAnything integration
- Feature upsampling support
Contributions welcome: If you have feature suggestions or would like to help implement any of the above, feel free to open an issue or submit a pull request.
If you find our work useful, please consider giving a star ⭐ and a citation.
@article{duenkel2025diysc,
title={{Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels}},
author={D{\"u}nkel, Olaf and Wimmer, Thomas and Theobalt, Christian and Rupprecht, Christian and Kortylewski, Adam},
journal={arXiv preprint arXiv:2506.05312},
year={2025}
}
We thank GeoAware-SC and SphericalMaps for open-sourcing their great works.
Licensing information for GeoAware-SC.
Our code partially builds on the GeoAware-SC repository. However, that repository does not provide licensing information, which leaves the licensing status of the code adapted from it unclear.