[ICASSP 2025] KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual Classification
PyTorch implementation for the ICASSP 2025 paper "KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual Classification"
We introduce an innovative Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission (KARST) for various recognition tasks. Specifically, its multi-kernel design extends Kronecker projections horizontally and separates adaptation matrices into multiple complementary spaces, reducing parameter dependency and creating more compact subspaces. In addition, it incorporates extra learnable re-scaling factors to better align with pre-trained feature distributions, allowing for more flexible and balanced feature aggregation.
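As a rough illustration of the idea above (not the paper's actual implementation: the module name, ranks, and initialization below are our assumptions), a multi-kernel Kronecker adapter with per-kernel re-scaling factors might be sketched as:

```python
import torch
import torch.nn as nn


class MultiKernelKroneckerAdapter(nn.Module):
    """Illustrative sketch: each kernel k contributes delta_W_k = A_k kron B_k,
    splitting the adaptation across several complementary subspaces, and a
    learnable re-scaling factor s_k weights each kernel's contribution."""

    def __init__(self, d_in, d_out, num_kernels=2, a1=4, a2=4):
        super().__init__()
        assert d_out % a1 == 0 and d_in % a2 == 0
        b1, b2 = d_out // a1, d_in // a2
        # Small Kronecker factors; B_k is zero-initialized so the adapter
        # starts as an identity on the frozen layer's output (LoRA-style).
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(a1, a2) * 0.02) for _ in range(num_kernels)])
        self.B = nn.ParameterList(
            [nn.Parameter(torch.zeros(b1, b2)) for _ in range(num_kernels)])
        # One learnable re-scaling factor per kernel.
        self.scale = nn.Parameter(torch.ones(num_kernels))

    def forward(self, x, base_out):
        # base_out is the frozen pre-trained layer's output W x.
        out = base_out
        for k, (A, B) in enumerate(zip(self.A, self.B)):
            out = out + self.scale[k] * (x @ torch.kron(A, B).T)
        return out
```

With the B_k factors zero-initialized, the adapter initially leaves the frozen layer's output unchanged, so fine-tuning starts from the pre-trained behavior.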
The framework and application of KARST:
- Python 3.11.9
- torch 2.3.0
- timm 0.9.16
- avalanche-lib 0.5.0
Please refer to SSF or VPT for preparing the 19 datasets included in VTAB-1K.
We follow NOAH to conduct the few-shot evaluation. There are two parts you should pay attention to:
- Images: for improved organization and indexing, images from five datasets (fgvc-aircraft, food101, oxford-flowers102, oxford-pets, stanford-cars) should be consolidated into a folder named `FGFS`.
- Train/Val/Test splits: the content, copied from the `data/few-shot` directory in NOAH, should be placed in the `FGFS` folder and renamed `few-shot_split` for path correction.
The file structure should look like:
```
FGFS
├── few-shot_split
│   ├── fgvc-aircraft
│   │   └── annotations
│   │       ├── train_meta.list.num_shot_1.seed_0
│   │       └── ...
│   ├── ...
│   └── food101
│       └── annotations
│           ├── train_meta.list.num_shot_1.seed_0
│           └── ...
├── fgvc-aircraft
│   ├── img1.jpeg
│   ├── img2.jpeg
│   └── ...
├── ...
└── food101
    ├── img1.jpeg
    ├── img2.jpeg
    └── ...
```
- The pre-trained weights of ViT-B/16 are stored at this link.
- For Swin-B, the pre-trained weights will be automatically downloaded to the cache directory when you run the training scripts.
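To sanity-check the few-shot data layout described above before training, a small hypothetical helper (not part of this repo; the function name and dataset list below are ours) could be:

```python
# Checks that the FGFS folder contains one `few-shot_split/<dataset>/annotations`
# directory and one sibling image folder per dataset.
from pathlib import Path

DATASETS = ["fgvc-aircraft", "food101", "oxford-flowers102",
            "oxford-pets", "stanford-cars"]


def check_fgfs(root):
    """Return a list of expected paths that are missing under `root`."""
    root = Path(root)
    missing = []
    for name in DATASETS:
        for path in (root / "few-shot_split" / name / "annotations",
                     root / name):
            if not path.is_dir():
                missing.append(str(path))
    return missing
```

An empty return value means the layout matches the tree above; otherwise the listed paths still need to be created or renamed.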
If this project is helpful for you, please consider citing our paper:
