SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks (CVPR 2025 Highlight)
We will release more detailed materials (including datasets, training, and evaluation details) in the future.
- 2025.2.28: Build project page
- 2025.3.2: Add code
- TODO: Add the G2APS-ReID reconstruction code
- TODO: Add a detailed process description
- TODO: Add the LAGPeR dataset and its usage license (LAGPeR is undergoing systematic collation and revision)
- 2025.4.5: Our paper was selected as a CVPR 2025 Highlight!
- 2025.3.10: Our paper is available on arXiv.
- 2025.2.27: Our paper has been accepted by CVPR'25!
We propose two large-scale aerial-ground cross-view person re-identification datasets to extend the AGPReID task framework. As depicted in Figure 1, the AGPReID task configuration involves matching pedestrian identities across heterogeneous aerial and ground surveillance perspectives. The proposed datasets are formally characterized as follows:
Figure 1: Illustrative example of AGPReID task
We constructed the LAGPeR dataset, a large-scale AGPReID benchmark, from multi-scenario surveillance data collected in seven distinct real-world environments. To obtain usage rights for LAGPeR, fill in the Data Release Protocol here and send it to this email address.
- Real-World: The LAGPeR dataset is constructed from seven different real-world environments through autonomous data collection and manual annotation.
- Multi-View: Features a multi-camera framework comprising 21 surveillance nodes distributed across 7 scenes, with synchronized capture from three observation planes: aerial view, ground oblique view, and frontal ground view.
- Large Scale: Contains $63,841$ images of $4,231$ identities, establishing LAGPeR as one of the largest real-world aerial-ground ReID benchmarks to date.
We reconstructed the AGPReID dataset G2APS-ReID from G2APS, a large-scale pedestrian search dataset; example scenes are shown below. Owing to copyright constraints, the G2APS-ReID dataset cannot be publicly released. However, we provide the complete codebase for reconstructing it from G2APS, which can be accessed here.
- For the LAGPeR dataset, we selected 12 cameras (including 8 ground cameras and 4 drone cameras) from the first four scenes as the training set, while images from 9 cameras in the remaining three scenes were used for evaluation.
- For the G2APS-ReID dataset, we randomly selected 60% of the IDs for the training set and used the remainder for testing. We then manually adjusted the test set by reallocating IDs with too few or too many images to the training set.
- We calculated gradient histogram features for images sharing the same ID and view, used K-nearest-neighbor clustering to divide them into K groups, and randomly selected one image from each group as a query, yielding K representative query images (as shown in Tab 1).
- We added the $G \rightarrow A+G$ setting, in which the gallery includes images from both ground and aerial perspectives.
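The 60/40 identity split with reallocation described above can be sketched as follows. This is a hypothetical illustration, not the released protocol code; the function name `split_ids` and the thresholds `min_imgs`/`max_imgs` are our own placeholders.

```python
# Sketch of the G2APS-ReID identity split: randomly assign 60% of IDs to
# the training set, then move test IDs with too few or too many images
# back to training. Thresholds are illustrative, not the paper's values.
import random
from collections import defaultdict

def split_ids(samples, train_ratio=0.6, min_imgs=2, max_imgs=100, seed=0):
    """samples: list of (person_id, image_path) pairs."""
    by_id = defaultdict(list)
    for pid, path in samples:
        by_id[pid].append(path)
    ids = sorted(by_id)
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = int(len(ids) * train_ratio)
    train_ids, test_ids = set(ids[:cut]), set(ids[cut:])
    # Reallocate unbalanced test IDs to training, as in the protocol above.
    for pid in list(test_ids):
        n = len(by_id[pid])
        if n < min_imgs or n > max_imgs:
            test_ids.discard(pid)
            train_ids.add(pid)
    return train_ids, test_ids

# Toy usage: ID 0 has a single image, so it always ends up in training.
samples = [(0, "a.jpg")] + [(i, f"{i}_{j}.jpg")
                            for i in range(1, 10) for j in range(5)]
train, test = split_ids(samples)
```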
Table 1: Evaluation protocol statistics for LAGPeR and G2APS-ReID.

| Setting | Subset | View | LAGPeR (#Cam / #IDs / #Images) | G2APS-ReID (#Cam / #IDs / #Images) |
|---|---|---|---|---|
| - | Train | Aerial+Ground | 12 / 2,708 / 40,770 | 2 / 1,569 / 100,871 |
| A → G | Query | Aerial | 3 / 1,523 / 3,046 | 1 / 1,219 / 4,876 |
| A → G | Gallery | Ground | 6 / 1,523 / 15,533 | 1 / 1,219 / 37,202 |
| G → A | Query | Ground | 6 / 1,523 / 3,046 | 1 / 1,219 / 4,876 |
| G → A | Gallery | Aerial | 3 / 1,523 / 7,717 | 1 / 1,219 / 62,791 |
| G → A+G | Query | Ground | 6 / 1,523 / 3,046 | - |
| G → A+G | Gallery | Aerial+Ground | 9 / 1,523 / 20,204 | - |
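The query-selection step (gradient histograms plus clustering into K groups, one representative image per group) might look roughly like the sketch below. All names here (`grad_histogram`, `kmeans`, `select_queries`) are hypothetical stand-ins, and we use a plain k-means rather than the exact clustering from the paper; it only illustrates the idea.

```python
# Per (ID, view) group: compute a crude gradient-orientation histogram
# per image, cluster the histograms into K groups, and pick one
# representative image index from each group as a query.
import numpy as np

def grad_histogram(img, bins=9):
    """Orientation histogram of image gradients (a rough HOG stand-in)."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # orientations in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def kmeans(feats, k, iters=20, seed=0):
    """Minimal k-means on row vectors; returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            pts = feats[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return labels

def select_queries(images, k=2, seed=0):
    """Return one representative image index per cluster."""
    feats = np.stack([grad_histogram(im) for im in images])
    labels = kmeans(feats, min(k, len(images)), seed=seed)
    rng = np.random.default_rng(seed)
    return [int(rng.choice(np.flatnonzero(labels == j)))
            for j in np.unique(labels)]

# Toy usage with random grayscale "images" of one (ID, view) group.
rng = np.random.default_rng(0)
imgs = [rng.random((32, 16)) for _ in range(8)]
queries = select_queries(imgs, k=2)  # representative query indices
```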
Please refer to INSTALL.md.
Download the LAGPeR and G2APS-ReID datasets and modify the dataset path.
Download the ViT-Base pre-trained model and modify the path at line 13 of the config file:

```yaml
PRETRAIN_PATH: xxx
```
Training SeCap on the LAGPeR dataset with one GPU:

```shell
CUDA_VISIBLE_DEVICES=0 python3 tools/train_net.py --config-file ./configs/LAGPeR/secap.yml MODEL.DEVICE "cuda:0" SOLVER.IMS_PER_BATCH 64
```

Training SeCap on the LAGPeR dataset with 4 GPUs:

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 tools/train_net.py --config-file ./configs/LAGPeR/secap.yml --num-gpus 4 SOLVER.IMS_PER_BATCH 256
```

Testing SeCap on the LAGPeR dataset:

```shell
CUDA_VISIBLE_DEVICES=0 python3 tools/train_net.py --config-file ./configs/LAGPeR/secap.yml --eval-only MODEL.WEIGHTS xxx
```

