# [AAAI 2026] Empowering DINO Representations for Underwater Instance Segmentation via Aligner and Prompter
Zhiyang Chen*, Chen Zhang*, Hao Fang and Runmin Cong
*These authors contributed equally.
```bash
conda create --name DiveSeg python=3.10 -y
conda activate DiveSeg
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
# under your working directory
git clone [email protected]:facebookresearch/detectron2.git
cd detectron2
pip install -e .
cd ..
git clone https://github.com/ettof/Diveseg.git
cd Diveseg
pip install -r requirements.txt
cd mask2former/modeling/pixel_decoder/ops
sh make.sh
cd ../../../..
```
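Optionally, verify the setup before moving on. This is a minimal sanity-check sketch; the `MultiScaleDeformableAttention` module name is an assumption based on the extension that Mask2Former's `make.sh` builds:

```python
# Sanity-check the installation (a minimal sketch).
import torch
import detectron2

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)

# Assumption: make.sh builds Mask2Former's deformable-attention CUDA op
# under this module name.
import MultiScaleDeformableAttention  # noqa: F401
print("deformable-attention op compiled OK")
```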
Download the two benchmark datasets and organize them as follows:

```
data/
├── UIIS/
│   ├── train/
│   ├── val/
│   └── annotations/
│       ├── train.json
│       └── val.json
└── USIS10K/
    ├── multi_class_annotations/
    ├── foreground_annotations/
    ├── train/
    ├── val/
    └── test/
```
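The annotations are standard COCO-format JSON, so the splits can be registered directly with detectron2 for a quick sanity check. A minimal sketch for UIIS; the dataset names `uiis_train`/`uiis_val` are illustrative and may differ from the names DiveSeg's configs actually use:

```python
# Register the UIIS splits with detectron2's built-in COCO loader
# (a minimal sketch; dataset names here are illustrative).
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "uiis_train", {}, "data/UIIS/annotations/train.json", "data/UIIS/train"
)
register_coco_instances(
    "uiis_val", {}, "data/UIIS/annotations/val.json", "data/UIIS/val"
)
```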
The pre-trained weights of DINOv2 are available for download at link (we use the register-free version of DINOv2-Large). Place the downloaded files in the `checkpoints` directory as specified in the config file.
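A quick way to confirm the checkpoint is in place and loadable; this is a minimal sketch, and the filename follows the official DINOv2 release naming, so adjust it to whatever your config points at:

```python
# Verify the DINOv2-Large checkpoint loads (a minimal sketch; the filename
# is an assumption based on the official DINOv2 release naming).
import torch

state_dict = torch.load(
    "checkpoints/dinov2_vitl14_pretrain.pth", map_location="cpu"
)
print(f"{len(state_dict)} tensors loaded, e.g. '{next(iter(state_dict))}'")
```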
Train the DiveSeg model on the UIIS or USIS10K dataset:

```bash
bash train.sh
```
Evaluate the pre-trained models on the test sets:

```bash
bash eval.sh
```

You are expected to get results like this:
| Dataset | Setting | Backbone | mAP | AP50 | AP75 | Weights |
|---|---|---|---|---|---|---|
| UIIS | Instance | ViT-L | 35.6 | 52.0 | 38.5 | model |
| USIS10K | Class-Agnostic | ViT-L | 64.1 | 82.8 | 72.2 | model |
| USIS10K | Multi-Class | ViT-L | 48.4 | 62.3 | 54.4 | model |
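The three numbers per row are COCO-style mask AP, AP50, and AP75. If you want to re-score a saved prediction file yourself, a minimal pycocotools sketch looks like the following; the prediction file name is illustrative, and `eval.sh` already reports these metrics for you:

```python
# Re-score saved predictions with pycocotools (a minimal sketch; the
# "predictions.json" file name is illustrative).
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("data/UIIS/annotations/val.json")
dt = gt.loadRes("predictions.json")

evaluator = COCOeval(gt, dt, iouType="segm")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP50, AP75, etc.
```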
```bibtex
@article{chen2025empowering,
title={Empowering DINO Representations for Underwater Instance Segmentation via Aligner and Prompter},
author={Chen, Zhiyang and Zhang, Chen and Fang, Hao and Cong, Runmin},
journal={arXiv preprint arXiv:2511.08334},
year={2025}
}
```

This repo is based on DINOv2, detectron2, and Mask2Former. Thanks for their great work!
