Signature Verify

AI-Powered Signature Detection & Verification

Open Source · Computer Vision · Fine-Tuning · Metric Learning
Try the Live Demo →

Upload a document and reference signature to verify authenticity

About

Signature Verify is a two-stage AI pipeline that detects signatures in document images and verifies whether they're genuine or forged. It demonstrates the full ML lifecycle: dataset curation, model fine-tuning, evaluation, and deployment — all with custom-trained models, no commercial APIs.

Stage 1 uses YOLOv12s (attention-centric object detection) to locate signatures in scanned documents. Stage 2 uses a Siamese SigNet encoder with a binary verification head to compare detected signatures against a reference, outputting a genuine/forged verdict with confidence scoring.

Both models were fine-tuned on GPU (RunPod RTX 4090) at a total cost of ~$5. The project honestly documents the iterative training process — including approaches that didn't work (triplet loss, contrastive loss) and why the final binary classifier approach succeeded.

Key Numbers

- 0.91 mAP@0.5 (detection)
- 20.4% EER (verification)
- ~$5 total training cost
- 0 commercial APIs used
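The EER (equal error rate) number above is the operating point where the false-accept and false-reject rates coincide. A minimal brute-force sketch of how it can be computed from verification scores (toolkits usually derive it from a ROC curve instead):

```python
import numpy as np

def compute_eer(genuine_scores, forged_scores):
    """Equal Error Rate: the threshold where the rate of rejected
    genuine signatures equals the rate of accepted forgeries.
    Scores are "probability genuine"; higher = more likely genuine."""
    thresholds = np.sort(np.concatenate([genuine_scores, forged_scores]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        frr = np.mean(genuine_scores < t)   # genuine rejected
        far = np.mean(forged_scores >= t)   # forgery accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

With perfectly separated score distributions this returns 0; a 20.4% EER means roughly one in five pairs is misjudged at the balanced threshold.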

Pipeline Architecture

1. Signature Detection: YOLOv12s (9.3M params)
   Attention-centric object detector fine-tuned on 2,819 document images. Locates signatures with bounding boxes.

2. Preprocessing: Pillow + scikit-image
   Crop, grayscale, Otsu threshold, resize to 220×155 px. Normalizes any input to a consistent format.

3. Signature Verification: SigNet + Classifier (16.9M params)
   SigNet CNN encoder (pretrained on signatures) + binary classifier on |emb_a - emb_b|. Outputs the genuine probability.
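The preprocessing stage can be sketched in pure NumPy (the project itself uses Pillow + scikit-image for the same steps; Otsu's threshold and a nearest-neighbour resize are inlined here only to keep the sketch dependency-free):

```python
import numpy as np

def preprocess_signature(crop: np.ndarray, size=(155, 220)) -> np.ndarray:
    """Grayscale -> Otsu binarize -> resize to 220x155 (H=155, W=220).
    `crop` is an HxWx3 uint8 RGB array: the detected signature region."""
    gray = (crop @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)

    # Otsu's method: pick the threshold maximizing between-class variance.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability mass
    mu = np.cumsum(prob * np.arange(256))      # class-0 weighted mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    t = int(np.nanargmax(sigma_b))
    binary = (gray > t).astype(np.float32)

    # Nearest-neighbour resize to the SigNet input size.
    h, w = binary.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return binary[rows][:, cols]
```

Normalizing every crop to the same binarized 220×155 canvas means the verification encoder never sees scanner contrast, ink color, or resolution differences.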

What We Found

Two-phase training prevents catastrophic forgetting

Freezing the backbone first (20 epochs), then fine-tuning everything at a 100x lower learning rate (80 epochs), preserves pretrained features while adapting to the target domain. Detection mAP jumped from 0.85 to 0.91 in Phase 2.
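The two-phase schedule can be sketched in PyTorch. This is a minimal outline, not the project's training loop: it assumes the model exposes `backbone` and `head` submodules and that the caller supplies a hypothetical `train_one_epoch(model, optimizer)` step; only the epoch counts and the 100x learning-rate drop come from the text.

```python
import torch
import torch.nn as nn

def two_phase_finetune(model: nn.Module, train_one_epoch, base_lr=1e-3):
    """Phase 1: freeze the pretrained backbone, train only the head.
    Phase 2: unfreeze everything, fine-tune at 100x lower learning rate."""
    # Phase 1 (20 epochs): only the randomly initialized head moves,
    # so its large early gradients cannot wreck the pretrained features.
    for p in model.backbone.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(model.head.parameters(), lr=base_lr)
    for _ in range(20):
        train_one_epoch(model, opt)

    # Phase 2 (80 epochs): full fine-tune, gently, at base_lr / 100.
    for p in model.backbone.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(model.parameters(), lr=base_lr / 100)
    for _ in range(80):
        train_one_epoch(model, opt)
```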

Metric learning hit a ceiling — classification broke through

Triplet loss and contrastive loss both plateaued at ~25% EER. SigNet features are classification-oriented, not metric-oriented. A binary classifier on |emb_a - emb_b| reduced EER to 20.4% by learning which dimensions matter.

Bbox format matters more than you think

Our first detection training showed 0.93 mAP on validation but 0.52 on test — a conversion bug interpreted COCO [x,y,w,h] as [x,y,x2,y2]. Fixing one line of code aligned val and test results. Always verify label formats.
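The conversion the bug got wrong is a one-liner. COCO stores a corner plus a size, so treating the last two values as a second corner silently shrinks and shifts every label:

```python
def coco_to_xyxy(box):
    """Convert a COCO [x, y, width, height] box to corner [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]   # the bug passed [x, y, w, h] through as-is
```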

Tech Stack

- Detection: YOLOv12s (Ultralytics)
- Verification: SigNet + PyTorch
- Pretrained: SigNet-F (forgery-aware)
- Training: RunPod RTX 4090
- API: FastAPI + Uvicorn
- UI: Next.js 15 + Tailwind
- Deploy: GCP Cloud Run
- Detection data: HuggingFace (2,819 images)
- Verification data: CEDAR (2,640 images)