[ICCV'25] ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement
The official source code of the paper "ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement".
We present a robust document forgery localization model that adaptively leverages RGB/DCT forensic traces and incorporates key document image traits. To counter the sensitivity of DCT traces to block misalignment, we modulate the DCT feature contribution via predicted alignment scores, enhancing resilience to distortions such as resizing and cropping. A hierarchical content disentanglement method boosts localization by reducing text–background disparities. Leveraging pristine background regions, we build an untampered prototype to improve accuracy and robustness.
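Conceptually, the adaptive DCT fusion amounts to scaling the DCT feature branch by a predicted alignment score before combining it with the RGB branch. The sketch below is only an illustration of this idea with made-up tensors; the names `rgb_feat`, `dct_feat`, and `align_score` are hypothetical and do not come from the released code.

```python
import numpy as np

def fuse_features(rgb_feat, dct_feat, align_score):
    """Down-weight DCT features when JPEG blocks are misaligned.

    align_score in [0, 1]: near 1 means the 8x8 JPEG grid is aligned
    (DCT traces reliable); near 0 means misaligned (e.g. after cropping
    or resizing), so the DCT contribution is suppressed.
    Illustrative only -- not the authors' implementation.
    """
    return rgb_feat + align_score * dct_feat

rgb_feat = np.ones((4, 4))
dct_feat = np.full((4, 4), 2.0)

aligned = fuse_features(rgb_feat, dct_feat, align_score=1.0)  # DCT fully used
shifted = fuse_features(rgb_feat, dct_feat, align_score=0.1)  # DCT suppressed
```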
- 2026.1.1 Correct our Doc Protocol evaluation settings and results.
- Update the DDP training script and make training more stable
- Retrain the model with the fixed `NonAlignCrop`
- General inference pipeline for images outside DocTamper
- Update a better OCR model
- Evaluate ADCD-Net on ForensicHub benchmark (Doc Protocol)
- Release the model checkpoint and OCR masks of DocTamper
- Release training and inference code
Models are trained on the DocTamper training set and evaluated on seven test sets. The samples in FCD, SCD, and the Testing Set are compressed once, using the final quality factor specified in the official DocTamper pickle file. Authentic images are skipped in all test sets since they contain no true positives. Please refer to ForensicHub for more details.
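The single-compression protocol above can be reproduced with Pillow, assuming the final quality factor has already been read from the official DocTamper pickle file (the `qf=75` below is a placeholder, not a value from the dataset):

```python
import io
from PIL import Image

def jpeg_compress_once(img: Image.Image, qf: int) -> Image.Image:
    """Compress an image exactly once at the given JPEG quality factor."""
    buf = io.BytesIO()
    img.convert('RGB').save(buf, format='JPEG', quality=qf)
    buf.seek(0)
    return Image.open(buf)

img = Image.new('RGB', (64, 64), color='white')
out = jpeg_compress_once(img, qf=75)  # qf would come from the official pickle file
```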
ADCD-Net is trained on 4 NVIDIA GeForce RTX 4090 24G GPUs, which takes about 27 hours for 100k training steps with a batch size of 40.
Install the dependencies: Python 3.10.13, PyTorch 2.3.0+cu121, Albumentations 2.0.8.
Download the DocTamper dataset from DocTamper (`qt_table.pk` and the files in `pks` can be found in the DocTamper repository), and the OCR masks and model checkpoints from ADCD-Net.
The files from ADCD-Net are organized as follows:
ADCDNet.pth # ADCD-Net checkpoint
docres.pkl # DocRes checkpoint
DocTamperOCR/ # OCR mask directory
├── TrainingSet # Training set directory
├── TestingSet # Testing set directory
├── FCD # FCD dataset directory
└── SCD # SCD dataset directory
Besides DocTamper, we provide the data of the 4 cross-domain test sets, along with the corresponding OCR masks and path pickle files, here; update the paths to match your own directory layout.
Please refer to `seg_char.py`. For the PaddleOCR environment, please check PaddleOCR.
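Downstream, an OCR mask is simply a binary map marking character regions. The sketch below rasterizes OCR bounding boxes into such a mask; the `(x0, y0, x1, y1)` box format is an assumption for illustration, and the actual output of `seg_char.py` may differ.

```python
import numpy as np

def boxes_to_mask(h, w, boxes):
    """Rasterize character bounding boxes into a binary OCR mask.

    boxes: iterable of (x0, y0, x1, y1) pixel coordinates (assumed
    format -- the real seg_char.py pipeline may use a different one).
    """
    mask = np.zeros((h, w), dtype=np.uint8)
    for x0, y0, x1, y1 in boxes:
        mask[y0:y1, x0:x1] = 1
    return mask

mask = boxes_to_mask(8, 8, [(0, 0, 2, 2), (4, 4, 8, 8)])
```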
Training:
- Set the paths of the dataset, OCR masks and model checkpoint in `cfg.py`:

```python
mode = 'train'
root = 'path/to/root'  # TODO:
docres_ckpt_path = 'path/to/docres.pkl'  # TODO:
```

- Run `main.py`.

Evaluation:
- Set the paths of the dataset, distortions, OCR masks and model checkpoint in `cfg.py`:

```python
mode = 'val'
root = 'path/to/root'  # TODO:
ckpt = 'path/to/ADCDNet.pth'  # TODO:
docres_ckpt_path = 'path/to/docres.pkl'  # TODO:
multi_jpeg_val = False  # whether to enable multi-JPEG distortion
jpeg_record = False  # manually set the multi-JPEG distortion record
min_qf = 75  # minimum JPEG quality factor
shift_1p = False  # shift by 1 pixel for evaluation
val_aug = None  # other distortions can be added here
```

- Run `main.py`.

General inference:
- Generate OCR masks with `seg_char.py`.
- Build a pickle file containing a list of `(img_path, mask_path, ocr_mask_path)` tuples.
- In `cfg.py`, set `mode = 'general_val'` and specify the paths to the pickle file and the model checkpoint.
- See the `GeneralValDs` class in `ds.py` for details of general dataset construction.
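The pickle file of `(img_path, mask_path, ocr_mask_path)` tuples can be built with the standard library. This is only a sketch assuming images, masks, and OCR masks share file names across three directories; the directory names and the helper below are placeholders, not part of the released code.

```python
import os
import pickle

def build_path_pickle(img_dir, mask_dir, ocr_dir, out_path):
    """Write a pickle holding a list of (img_path, mask_path, ocr_mask_path) tuples.

    Assumes matching file names across the three directories --
    adapt the pairing logic to your own layout if they differ.
    """
    entries = []
    for name in sorted(os.listdir(img_dir)):
        entries.append((
            os.path.join(img_dir, name),
            os.path.join(mask_dir, name),
            os.path.join(ocr_dir, name),
        ))
    with open(out_path, 'wb') as f:
        pickle.dump(entries, f)
    return entries

# Usage: build_path_pickle('imgs/', 'masks/', 'ocr_masks/', 'paths.pk')
```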
If you find our project useful in your research, please cite it in your publications.
@inproceedings{wong2025adcd,
title={ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement},
author={Wong, Kahim and Zhou, Jicheng and Wu, Haiwei and Si, Yain-Whar and Zhou, Jiantao},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
year={2025}
}
