The DocTamper dataset is now available at BaiduDrive and Google Drive (part1 and part2).
The DocTamper dataset is only available for non-commercial use. You can request the password by sending an email from an educational email address to [email protected], explaining your intended purpose.
To visualize the images and their corresponding ground truths from the provided .mdb files, run a command such as "python vizlmdb.py --input DocTamperV1-FCD --i 0".
The official implementation of the paper "Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution" is in the "models" directory.
The release of the training code is delayed at the request of my supervisor and the cooperating enterprise that purchased it. My training pipeline for the DocTamper dataset and the IoU metric are heavily borrowed from a well-known project in this area; the results of the paper can be easily reproduced with it, you just need to adjust the loss functions and the learning-rate decay curve. I also used its augmentation pipeline, except for RandomBrightnessContrast, ShiftScaleRotate, and CoarseDropout.
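For reference, a pixel-level IoU over binary tamper masks can be computed as below. This is a generic sketch of the common definition, not necessarily the exact metric implementation borrowed for the paper:

```python
import numpy as np


def pixel_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between a predicted and a ground-truth binary mask (1 = tampered pixel).

    Returns 1.0 when both masks are empty (a common convention; the
    project's actual metric code may handle this case differently).
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection) / float(union) if union else 1.0
```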
Open-source schedule:
1. Inference models and code: June 2023.
2. Training code: TBD.
3. Data synthesis code: within 2024.
For any questions about this work, please contact [email protected].
Dear researcher,
Thank you for your attention. The password for the dataset is IntSig_DLVC_411
After extracting the .rar archives, the resulting .mdb files should be opened with Python scripts:
To visualize the images and their corresponding ground truths from the provided .mdb files, you can run a command like "python vizlmdb.py --input DocTamperV1-FCD --i 0". The vizlmdb.py script is at https://github.com/qcf-568/DocTamper/blob/main/vizlmdb.py
A dataloader for training or inference can be referred to at Line 43~Line 107 of https://github.com/qcf-568/DocTamper/blob/main/models/eval_dtd.py
To run eval_dtd.py, you can refer to this Colab notebook: https://colab.research.google.com/drive/1rWaSKy2Rsy5welyvj6FbzF01o2zv8ips?usp=sharing
Kind Regards,
Chenfan Qu