FOCUS: Towards Universal Foreground Segmentation

Overview

We introduce FOCUS, a Foreground ObjeCts Universal Segmentation framework that handles multiple foreground tasks with one unified architecture. To achieve boundary-aware segmentation, we develop a multi-scale semantic network that uses the edge information of objects to enhance image features, and we propose a novel distillation method that integrates a contrastive learning strategy to refine the prediction mask in the multi-modal feature space. Extensive experiments demonstrate that FOCUS achieves state-of-the-art (SoTA) performance across five foreground segmentation tasks: Salient Object Detection (SOD), Camouflaged Object Detection (COD), Shadow Detection (SD), Defocus Blur Detection (DBD), and Forgery Detection (FD).

News

  • [2025.07.06] FOCUS (DINOv2-L) checkpoints and prediction results are now open source. We have also updated the training scripts to support DINOv2-L as the backbone, so you can now train FOCUS on a single NVIDIA A6000 GPU. Hope you enjoy it!
  • [2025.06.27] Our new paper Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning is released. In this paper, we explore how to endow large language models (LLMs) with open-world segmentation capabilities through pure reinforcement learning, relying solely on foreground segmentation data.
  • [2025.01.03] FOCUS (DINOv2-G) checkpoints and prediction results are now open source. You can follow the guidelines here to quickly leverage the state-of-the-art performance of our model. Hope you enjoy it!
  • [2024.12.16] Our code is released! Feel free to contact us if you have any questions!
  • [2024.12.10] Our paper has been accepted by AAAI 2025!🔥

Getting Started

Environment Setup

  • We use CUDA 12.2 in our implementation.
  • Our code is built on PyTorch 2.1.1. Please make sure you are using PyTorch ≥ 2.1 with a matching torchvision, and that your PyTorch version satisfies the requirements of Detectron2.
  • We train our models on 2 NVIDIA A6000 GPUs with 48 GB of memory each. Please make sure your VRAM is sufficient to avoid OOM issues during training.
#create environment
conda create --name focus python=3.8
conda activate focus
pip install -r requirements.txt

#install detectron2
git clone git@github.com:facebookresearch/detectron2.git # under your working directory
cd detectron2 && pip install -e . && cd ..

#install other dependencies
pip install git+https://github.com/cocodataset/panopticapi.git
cd third_party/CLIP
python -m pip install -Ue .
cd ../../

#compile CUDA kernel for MSDeformAttn
cd focus/modeling/pixel_decoder/ops && sh make.sh && cd ../../../../
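
After the setup completes, a quick sanity check (our suggestion, not part of the official instructions) confirms that the core dependencies import cleanly and that CUDA is visible:

# optional sanity check: verify PyTorch, Detectron2, and CUDA availability
python -c "import torch, detectron2; print(torch.__version__, detectron2.__version__, torch.cuda.is_available())"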

Quick Start

We provide an inference demo here if you want to try out our model. You should first download the weights from our MODEL_ZOO.md, then run the following command. Make sure that the config file matches the downloaded weights.

python demo/demo.py --config-file path/to/your/config \
  --input path/to/your/image \
  --output path/to/your/output_file \
  --opts MODEL.WEIGHTS path/to/your/weights
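
For example, a concrete invocation might look like the following. The config, image, and checkpoint paths below are placeholders for illustration; substitute the actual files from MODEL_ZOO.md.

# hypothetical paths for illustration only; use your real config and weights
python demo/demo.py --config-file configs/focus_dinov2_l.yaml \
  --input assets/example.jpg \
  --output outputs/example_pred.png \
  --opts MODEL.WEIGHTS ckpt/focus_dinov2_l.pth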

Prepare Datasets

You should download the required datasets (CAMO, COD10K, CHAMELEON, NC4K, DUTS, DUT-OMRON, HKU-IS, ECSSD, PASCAL-S, ISTD, DUT/CUHK, CASIA 1.0, CASIA 2.0) into the datasets folder, following the structure below:

datasets/
├── CAMO-V.1.0-CVIU2019
│   ├── GT
│   ├── Images
│   │   ├── Test
│   │   └── Train
├── CASIA
│   ├── CASIA 1.0 dataset
│   ├── CASIA 1.0 groundtruth
│   ├── CASIA2.0_Groundtruth
│   └── CASIA2.0_revised
├── CHAMELEON
│   ├── GT
│   └── Imgs
├── COD10K-v3
│   ├── Test
│   └── Train
├── DEFOCUS
│   └── dataset
│       ├── test_data
│       │   ├── CUHK
│       │   └── DUT
│       └── train_data
│           ├── 1204gt
│           └── 1204source
├── DUTOMRON
│   ├── DUT-OMRON-image
│   └── pixelwiseGT-new-PNG
├── DUTS
│   ├── DUTS-TE
│   └── DUTS-TR
├── ECSSD
│   ├── ground_truth_mask
│   └── images
├── HKU-IS
│   ├── gt
│   └── imgs
├── ISTD_Dataset
│   ├── test
│   └── train
├── NC4K
│   ├── GT
│   └── Imgs
├── PASCAL
│   └── Imgs

and run the corresponding dataset preparation script:

python utils/prepare/prepare_<dataset>.py

# e.g. python utils/prepare/prepare_camo.py
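
If you have downloaded all of the datasets, you can also run every preparation script in one pass. This loop is a convenience sketch of ours and assumes each script is standalone:

# run all dataset preparation scripts in sequence (assumes each is standalone)
for script in utils/prepare/prepare_*.py; do
    python "$script"
done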

Prepare Pretrained Weights

Download the pre-trained DINOv2 weights:

#dinov2-g
wget -P ./ckpt https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_reg4_pretrain.pth

#dinov2-l
wget -P ./ckpt https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_reg4_pretrain.pth

and run the following commands to convert the DINOv2 weights into detectron2 format and prepare the ResNet weights for the edge enhancer:

#dinov2-g
python utils/convert_dinov2.py ./ckpt/dinov2_vitg14_reg4_pretrain.pth ./ckpt/dinov2_vitg14_pretrain_updated.pkl

#dinov2-l
python utils/convert_dinov2.py ./ckpt/dinov2_vitl14_reg4_pretrain.pth ./ckpt/dinov2_vitl14_pretrain_updated.pkl
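
Optionally, you can verify that the converted checkpoint loads. This quick check is our suggestion and assumes the conversion script writes a standard pickle file:

# optional: confirm the converted checkpoint deserializes (assumes a pickled dict)
python -c "import pickle; d = pickle.load(open('./ckpt/dinov2_vitg14_pretrain_updated.pkl', 'rb')); print(type(d))"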

Training

python train_net.py \
  --config-file path/to/your/config \
  --num-gpus NUM_GPUS
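
For instance, reproducing the two-GPU setup described above might look like this; the config path is a placeholder, so pick the config that matches your backbone:

# hypothetical config path; substitute the actual config for your backbone
python train_net.py \
  --config-file configs/focus_dinov2_g.yaml \
  --num-gpus 2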

Evaluation

python train_net.py --eval-only \
  --config-file path/to/your/config \
  --num-gpus NUM_GPUS \
  MODEL.WEIGHTS path/to/your/weights
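
Similarly, evaluating a released checkpoint might look like the following, again with placeholder paths standing in for the files from MODEL_ZOO.md:

# placeholder config/weights; use the files downloaded from MODEL_ZOO.md
python train_net.py --eval-only \
  --config-file configs/focus_dinov2_g.yaml \
  --num-gpus 1 \
  MODEL.WEIGHTS ckpt/focus_dinov2_g.pth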

Citation

If you find our work helpful, please star this repo and cite our paper!

@inproceedings{you2025focus,
  title={{FOCUS}: Towards Universal Foreground Segmentation},
  author={You, Zuyao and Kong, Lingyu and Meng, Lingchen and Wu, Zuxuan},
  booktitle={AAAI},
  year={2025},
}

Acknowledgements

FOCUS is built upon Mask2Former, CLIP, ViT-Adapter, OVSeg, and detectron2. We express our gratitude to the authors for their remarkable work.
