[ICCV'25] ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement

KahimWong/ADCD-Net


Description

The official source code of the paper "ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement".

[Figure: ADCD-Net model overview]

We present a robust document forgery localization model that adaptively leverages RGB/DCT forensic traces and incorporates key document image traits. To counter the sensitivity of DCT traces to block misalignment, we modulate DCT feature contributions via predicted alignment scores, which improves resilience to distortions such as resizing and cropping. A hierarchical content disentanglement method improves localization by reducing text-background disparities. Leveraging pristine background regions, we build an untampered prototype to improve accuracy and robustness.
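
As a rough illustration of the alignment idea above, the toy sketch below shows how a crop shifts the 8x8 JPEG block grid and how a score in [0, 1] could scale a DCT feature before it is combined with an RGB feature. The names `grid_offset`, `is_aligned`, and `fuse`, as well as the additive fusion, are assumptions made for illustration only. In ADCD-Net the score is predicted by the network and the features are learned, so this is not the paper's implementation.

```python
BLOCK = 8  # JPEG codes images in 8x8 pixel blocks


def grid_offset(crop_left, crop_top):
    """Position of the first original block boundary inside the cropped image.

    Cropping `crop_left` columns and `crop_top` rows moves the original
    block grid; (0, 0) means the grid still starts at the image origin.
    """
    return (-crop_left) % BLOCK, (-crop_top) % BLOCK


def is_aligned(crop_left, crop_top):
    """True when the crop keeps the original 8x8 grid aligned to the origin."""
    return grid_offset(crop_left, crop_top) == (0, 0)


def fuse(rgb_feat, dct_feat, align_score):
    """Hypothetical fusion: scale the DCT feature by a score in [0, 1].

    A score near 0 suppresses unreliable (misaligned) DCT evidence,
    a score near 1 lets it contribute fully.
    """
    if not 0.0 <= align_score <= 1.0:
        raise ValueError("align_score must lie in [0, 1]")
    return [r + align_score * d for r, d in zip(rgb_feat, dct_feat)]
```

For example, a crop of 3 pixels from the left breaks alignment, while a crop of 8 or 16 pixels preserves it, which is why arbitrary cropping and resizing make DCT traces less reliable.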

TODO

  • 2026.1.1 Correct our Doc Protocol evaluation settings and results.
  • Update DDP training script and make training more stable
  • Retrain model with fixed NonAlignCrop
  • General inference pipeline for images outside DocTamper
  • Switch to a better OCR model
  • Evaluate ADCD-Net on ForensicHub benchmark (Doc Protocol)
  • Release model checkpoint and OCR marks of DocTamper
  • Release training and inference code

ForensicHub Benchmark (Doc Protocol)

[Figure: ForensicHub Doc Protocol benchmark]

Models are trained on the DocTamper training set and evaluated on seven test sets. The samples in FCD, SCD, and the Test set are compressed once, using the final quality factor specified in the official DocTamper pickle file. Authentic images are skipped in all test sets because they contain no true positives. Please refer to ForensicHub for more details.
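
To make the skipping rule concrete, here is a minimal sketch of a pixel-level F1 score that returns `None` for an authentic image (a ground-truth mask with no tampered pixels) and an averaging helper that ignores those samples. The function names and the flat 0/1 mask representation are assumptions for illustration; the metric code used by ForensicHub may differ.

```python
def pixel_f1(pred, gt):
    """Pixel-level F1 for flat binary masks (sequences of 0/1).

    Returns None when the ground truth has no tampered pixel, because
    no true positive is possible and the sample is skipped.
    """
    if not any(gt):
        return None
    tp = sum(1 for p, g in zip(pred, gt) if p and g)
    fp = sum(1 for p, g in zip(pred, gt) if p and not g)
    fn = sum(1 for p, g in zip(pred, gt) if not p and g)
    # The denominator is positive: gt has a tampered pixel, so tp + fn >= 1.
    return 2 * tp / (2 * tp + fp + fn)


def mean_f1(samples):
    """Average F1 over (pred, gt) pairs, skipping authentic images."""
    scores = [s for s in (pixel_f1(p, g) for p, g in samples) if s is not None]
    return sum(scores) / len(scores) if scores else None
```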

ADCD-Net is trained on four NVIDIA GeForce RTX 4090 (24 GB) GPUs. Training takes about 27 hours for 100k steps with a batch size of 40.
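
The back-of-the-envelope arithmetic below may help when budgeting a run on different hardware. It only restates the numbers above. Reading 40 as the global batch size, split evenly across the GPUs, is an assumption and not something the repository states here.

```python
gpus = 4
steps = 100_000
batch_size = 40  # assumed to be the global batch size
hours = 27

seconds_per_step = hours * 3600 / steps  # about 0.97 s per training step
samples_seen = steps * batch_size        # 4,000,000 samples drawn in total
per_gpu_batch = batch_size // gpus       # 10 per GPU under the assumption above
gpu_hours = gpus * hours                 # 108 GPU-hours

print(f"{seconds_per_step:.3f} s/step, {samples_seen:,} samples, "
      f"{per_gpu_batch} per GPU, {gpu_hours} GPU-hours")
```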

Environment Setup

Install dependencies: Python 3.10.13, PyTorch 2.3.0+cu121, Albumentations 2.0.8

Prepare DocTamper Data

Download the DocTamper dataset from DocTamper (qt_table.pk and the files in pks can be found in the DocTamper repository), and download the OCR masks and model checkpoints from ADCD-Net. The files from ADCD-Net are organized as follows:

ADCDNet.pth # ADCD-Net checkpoint
docres.pkl # DocRes checkpoint
DocTamperOCR/ # OCR mask directory
    ├── TrainingSet # Training set directory
    ├── TestingSet # Testing set directory
    ├── FCD # FCD dataset directory
    └── SCD # SCD dataset directory
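
Before editing cfg.py, it can be handy to confirm that the download matches the layout above. The helper below is a small sketch of such a check. The function name `missing_entries` is hypothetical and not part of the repository, and the root directory is whatever location you unpacked the files into.

```python
from pathlib import Path

# Entries expected under the download root, taken from the layout above.
EXPECTED = [
    "ADCDNet.pth",
    "docres.pkl",
    "DocTamperOCR/TrainingSet",
    "DocTamperOCR/TestingSet",
    "DocTamperOCR/FCD",
    "DocTamperOCR/SCD",
]


def missing_entries(root):
    """Return the expected files or directories that are absent under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries("path/to/root")` returns an empty list when everything is in place.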

Prepare Doc Protocol Data

Besides DocTamper, we provide the four cross-domain test sets together with the corresponding OCR masks and path pickle files here. Update the paths so that they point to your own directories.

Get OCR masks

Please refer to seg_char.py. For the environment of PaddleOCR, please check PaddleOCR.

Train with DocTamper

  1. Set the paths of the dataset, OCR masks and model checkpoint in cfg.py.
  2. Run main.py.
mode = 'train'
root = 'path/to/root' # TODO:
docres_ckpt_path = 'path/to/docres.pkl' # TODO:

Evaluate with DocTamper

  1. Set the paths of the dataset, distortions, OCR masks and model checkpoint in cfg.py.
  2. Run main.py.
mode = 'val'
root = 'path/to/root' # TODO:
ckpt = 'path/to/ADCDNet.pth' # TODO:
docres_ckpt_path = 'path/to/docres.pkl' # TODO:

multi_jpeg_val = False  # enable multi-JPEG distortion
jpeg_record = False  # manually set the multi-JPEG distortion record
min_qf = 75  # minimum JPEG quality factor
shift_1p = False  # shift the image by 1 pixel for evaluation
val_aug = None  # other distortions can be added here

General Evaluation

  1. Generate OCR masks with seg_char.py.
  2. Build a pickle file containing a list of tuples (img_path, mask_path, ocr_mask_path).
  3. In cfg.py, set mode='general_val', and specify the paths to the pickle file and the model checkpoint.
  4. See the GeneralValDs class in ds.py for details on general dataset construction.
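
Step 2 can be sketched as follows. This is a minimal example under stated assumptions: images, ground-truth masks, and OCR masks live in three separate directories and share file stems, and the extensions `.jpg` and `.png` are placeholders. The helper names `build_sample_list` and `save_sample_list` are hypothetical. The authoritative format is whatever GeneralValDs in ds.py expects.

```python
import pickle
from pathlib import Path


def build_sample_list(img_dir, mask_dir, ocr_dir, img_ext=".jpg", mask_ext=".png"):
    """Collect (img_path, mask_path, ocr_mask_path) tuples for matching stems.

    Images without both a ground-truth mask and an OCR mask are skipped.
    """
    samples = []
    for img_path in sorted(Path(img_dir).glob(f"*{img_ext}")):
        mask_path = Path(mask_dir) / (img_path.stem + mask_ext)
        ocr_path = Path(ocr_dir) / (img_path.stem + mask_ext)
        if mask_path.exists() and ocr_path.exists():
            samples.append((str(img_path), str(mask_path), str(ocr_path)))
    return samples


def save_sample_list(samples, out_path):
    """Write the list of tuples to a pickle file."""
    with open(out_path, "wb") as f:
        pickle.dump(samples, f)
```

The resulting pickle path is the one to reference in cfg.py when mode='general_val'.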

Citation

If you find our project useful in your research, please cite it in your publications.

@inproceedings{wong2025adcd,
  title={ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement},
  author={Wong, Kahim and Zhou, Jicheng and Wu, Haiwei and Si, Yain-Whar and Zhou, Jiantao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025}
}
