This webpage presents a dataset dedicated to the detection and spotting of forgeries in payslip images. The dataset consists of a corpus of 477 altered payslips in which nearly 6000 characters were forged. Provided with a reliable ground truth, it should be useful for much of the work in the digital forensics research domain.

Publication :

Dataset of Genuine Documents : Synthetic Realistic Payslips
  • 200 documents
  • 5 fonts, 4 text sizes
  • Fixed layout, varying data :
    • Company information
    • Employee information
    • Wage information


Dataset of Forged Documents
  • Created during a workshop with one-day fraudsters, experts and non-experts, working on standard computers with common tools (Windows 7®, MS Paint®)
  • 3 types of forgeries : imitation, inter-document copy/paste and intra-document copy/paste


XML File Dataset of Forged Documents

You can download a zip file containing a sample of the dataset here

If you want to download the dataset, please contact us at: nicolas.sidereATuniv-lr.fr, mickael.coustatyATuniv-lr.fr