What is Document Liveness?
Digital account opening often requires that users submit proof of identity in the form of a physical document. Or, more accurately, it requires that users capture a picture of their identity document. If the captured ID is real, unaltered, and physically present at the moment of capture, we say that the document was “live”. Conversely, attempts to deceive or circumvent ID verification are known as “spoofs” or document liveness attacks.
There are countless ways to alter a document or document image, but only a few methods to submit that image for identity verification. The IDLive Doc suite detects the methods most commonly used by fraudsters to bypass digital identity verification:
- Screen Replay attacks – capturing a document image from a digital screen, rather than from the original physical copy
- Printed Copy attacks – submitting paper reproductions of a document as if they were originals
- Portrait Substitution attacks – physically replacing a document portrait, often by simple overlay
- Injection attacks – bypassing a requirement for real-time image capture by “injecting” a previously-recorded image as if it were live
What are pipelines? How do I select which pipelines to use?
A pipeline is the smallest functional unit of the IDLive Doc suite. Each pipeline operates independently and focuses on a specific vector of document attack. A suffix in the pipeline name denotes the attack vector that the pipeline covers, like -sr for screen replay attacks and -pc for printed copy attacks. We recommend using the most recent pipeline for each attack vector. To avoid updating hard-coded pipeline names after each new release, we encourage using pipeline aliases like default-sr, which automatically select the latest pipeline in each release version.
How can I learn more about the attack vectors covered by the IDLive Doc suite?
We offer a dataset of 140 attack images to all IDLive Doc evaluators and customers. The images are categorized by attack vector and subvector to illustrate common fraud techniques, and to help our customers design better tests based on their own production data. Please reach out to your technical consultant to gain access to this dataset, or send a request to the Support team.
What documents are supported by the IDLive Doc suite?
IDLive Doc works with all standard government-issued documents, like IDs, driver’s licenses, and passports. The solution is document-agnostic, meaning that IDLive Doc pipelines do not require training against specific document templates to detect spoofed versions. It’s a powerful approach that supports new documents without model retraining or version upgrades, and the lack of template-specific maintenance allows us to keep our prices highly competitive. Exceptions to this document-agnostic approach are listed below:
- Card Backs – The IDLive Doc suite does not currently support the backside of card documents, like IDs and driver’s licenses. However, most document backs only contain a selection of the data available on the front side, so a second liveness evaluation would be redundant.
- Unsupportable Cases – The Printed Copy pipeline is trained to identify documents printed on paper, and the Portrait Substitution pipeline is trained to identify portrait images that are physically attached to the identity document. Authentic documents with those physical attributes will trigger a false detection. Plastic documents have largely replaced paper worldwide, but valid paper-based examples still exist in certain countries (e.g., Brazil, Italy, and Russia). We recommend that you disable or disregard printed copy and/or portrait substitution detection when a document with paper-based features is detected.
- Nonstandard Identity Documents – Other documents, even if they are official and government-issued, are not currently supported. This includes birth certificates, military IDs, government benefit cards, and any documents printed in the Letter/A4 form factor.
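The recommendation for paper-based documents can be sketched as a small selection step that runs before liveness evaluation. This is a minimal illustration, not part of the IDLive Doc API: only the default-sr alias and the -pc suffix appear in this FAQ, so the names default-pc and default-ps, the DocumentTraits type, and the select_pipelines function are all assumptions, as is the existence of an upstream step that recognizes paper-based documents.

```python
from dataclasses import dataclass
from typing import List

# Alias names: "default-sr" is taken from this FAQ. "default-pc" and
# "default-ps" are assumed placeholders; check your release for real names.
SCREEN_REPLAY = "default-sr"
PRINTED_COPY = "default-pc"
PORTRAIT_SUBSTITUTION = "default-ps"


@dataclass(frozen=True)
class DocumentTraits:
    """Hypothetical traits reported by an upstream document classifier."""
    paper_based: bool = False        # the authentic document is printed on paper
    attached_portrait: bool = False  # the authentic portrait is physically attached


def select_pipelines(traits: DocumentTraits) -> List[str]:
    """Return the pipeline aliases worth running for this document."""
    pipelines = [SCREEN_REPLAY]  # screen replay detection always applies
    if not traits.paper_based:
        # An authentic paper document would trigger a false printed-copy detection.
        pipelines.append(PRINTED_COPY)
    if not traits.attached_portrait:
        # An authentic attached portrait would trigger a false substitution detection.
        pipelines.append(PORTRAIT_SUBSTITUTION)
    return pipelines
```

For a plastic card, select_pipelines(DocumentTraits()) returns all three aliases. For a paper document with an attached portrait, only the screen replay alias remains.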
How does IDLive Doc detect presentation attacks?
Each IDLive Doc pipeline is composed of several individual machine learning models, which are trained to detect evidence of fraud in different, complementary ways. Models are trained on diverse sets of labeled data, which teaches each model how to distinguish between authentic document images and a particular type of fraud.
When presented with an image, each model first produces a raw, unbounded score with a theoretical range of (-∞, ∞). The model outputs are then combined in an ensemble layer, which weights the individual scores to produce consistent results and the highest possible generalized accuracy. The output of each ensemble, or pipeline, is a normalized probability value between 0 and 1, where 0.5 is the default threshold separating live documents from spoofs.
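To make the flow from raw scores to a normalized probability concrete, here is a minimal sketch. The actual ensemble technique is not disclosed, so a weighted sum passed through a logistic (sigmoid) function is used purely as a common stand-in. The function names, weights, and bias are illustrative assumptions.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Map an unbounded score to the range 0 to 1 without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def ensemble_probability(
    raw_scores: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Combine raw model scores into one probability.

    Assumed stand-in: a weighted sum followed by a sigmoid. The real
    IDLive Doc ensemble layer may work differently.
    """
    if len(raw_scores) != len(weights):
        raise ValueError("raw_scores and weights must have the same length")
    combined = bias + sum(w * s for w, s in zip(weights, raw_scores))
    return sigmoid(combined)
```

With this stand-in, a combined score of exactly 0 maps to a probability of 0.5, which lines up with the default threshold described above.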
While we can’t share the specific techniques we use to train our models, each presentation attack vector produces recognizable artifacts that are useful for fraud detection. Where traditional ID verification (IDV) solutions attempt to authenticate a document by its security features (checksums, holograms, fonts, etc.), IDLive Doc instead focuses on detecting evidence of fraud, which is far less variable across document types. Screen replay attacks, for example, often produce a Moiré pattern caused by interference between the “donor” screen and the sensor of the capturing device. Printed copies, especially those produced by consumer-grade printers, are unable to faithfully reproduce the full color gamut of authentic documents (see page 4). And the differences in physical texture between document and overlaid portrait image often aid in the detection of portrait substitution attacks.
How do I interpret IDLive Doc’s output?
Each IDLive Doc pipeline generates a probability value normalized to the range 0 to 1, where 1 indicates the highest likelihood of a legitimate document and 0 the lowest. For example, an image with a probability of 0.35 more closely resembles known examples of fraudulent documents, while a 0.92 indicates a high chance that the document is legitimate. We recommend using the default probability of 0.5 as the threshold to separate live and spoof documents. If this default threshold does not satisfy your business needs, we recommend exploring the preconfigured calibration options before attempting to set a custom threshold.
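Applying the threshold can be sketched in a few lines. The function name is hypothetical, and treating a probability exactly equal to the threshold as live is an assumption, since this FAQ does not say how ties are handled.

```python
DEFAULT_THRESHOLD = 0.5  # default threshold described in this FAQ


def is_live(probability: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when a pipeline's probability meets the threshold.

    Assumption: a value exactly at the threshold counts as live.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability >= threshold
```

Using the examples above, is_live(0.92) returns True and is_live(0.35) returns False.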
As with human review, IDLive Doc’s judgments can be wrong. They can even be inconsistent between nearly-identical images, just like people can be. But unlike human review, the models are sometimes wrong in ways that seemingly defy common sense. That’s why it’s important to conduct a thorough product evaluation to understand how IDLive Doc performs, rather than judging the pipelines in human terms.
How accurate is the IDLive Doc suite of products?
In general, IDLive Doc pipelines achieve a 1% Bona Fide Presentation Classification Error Rate (BPCER), also known as the False Rejection Rate. However, product accuracy is challenging to communicate in a meaningful way, for several reasons.
- Accurately detecting spoofing attacks and approving genuine images is dependent on image quality. The IDLive Doc suite offers a robust image quality assessment (IQA) module to identify low-quality images and to offer end-users recapture guidance before evaluating for liveness. Even so, low-quality images like those with glare, blur, or poor resolution will generally receive lower-accuracy judgments.
- To some extent, accuracy is also dependent on the production quality and sophistication of the issued document. While IDLive Doc is designed to be globally document agnostic, certain documents may have lower print quality or may have security features which interfere with the accurate evaluation of the document’s liveness. If your business focuses on specific documents or regions, please inquire about IDLive Doc’s accuracy on your particular document cohort.
- The rate and relative sophistication of document fraud is highly variable by country, industry, and individual business. Although ID R&D has curated a large repository of attack images, fraud techniques change rapidly and our test data may not faithfully represent your specific challenges. The best way to evaluate the IDLive Doc suite is with your own data.
- Unlike in the field of biometrics, there is no published standard or public competition to compare the accuracy of document liveness solutions. ID R&D strives to make its test data as true to real-world conditions as possible, but direct comparison to other solutions should only be performed using the same set of production or production-like images.
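When evaluating with your own data, the two standard error rates can be computed from labeled results as sketched below. BPCER is the share of genuine documents that were rejected, and APCER (Attack Presentation Classification Error Rate) is the share of attacks that were accepted. The function name and the input format, pairs of probability and ground-truth label, are assumptions for illustration.

```python
from typing import Iterable, Tuple


def error_rates(
    samples: Iterable[Tuple[float, bool]],
    threshold: float = 0.5,
) -> Tuple[float, float]:
    """Return (BPCER, APCER) for labeled evaluation results.

    Each sample is (probability, is_genuine), where is_genuine is the
    ground-truth label from your own test set.
    """
    genuine_total = genuine_rejected = 0
    attack_total = attack_accepted = 0
    for probability, is_genuine in samples:
        accepted = probability >= threshold
        if is_genuine:
            genuine_total += 1
            genuine_rejected += 0 if accepted else 1
        else:
            attack_total += 1
            attack_accepted += 1 if accepted else 0
    if genuine_total == 0 or attack_total == 0:
        raise ValueError("need at least one genuine and one attack sample")
    return genuine_rejected / genuine_total, attack_accepted / attack_total
```

Running the same function over the same image set for each candidate solution, or for each candidate threshold, keeps the comparison like-for-like.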
What attacks cannot be detected by the IDLive Doc suite?
- IDLive Doc pipelines are not trained to detect counterfeit cards printed on plastic materials like polycarbonate, PVC, and Teslin. To detect these attack documents, which are often of high quality, we recommend corroborating a user’s identity against independent data like a government database, credit bureau, or other reputable source.
- Sometimes, images are submitted in a way that bypasses the capture device. This can be allowed behavior, as in the upload of a previously-captured image, or it can be malicious, in the form of an injection attack. In either case, IDLive Doc can only evaluate the image’s liveness relative to the moment it was captured, which is not particularly useful for onboarding or eKYC. We recommend that you never allow image uploads for identity documents, and that you consider ID R&D’s Document Injection Attack Detection (Doc IAD) solution to prevent injection attacks.
How does ID R&D handle my users’ sensitive data?
The IDLive Doc suite is designed to run on-premises or in a cloud environment controlled by your business. No personally identifiable information (PII) is received or processed by ID R&D under normal operating conditions. However, you may choose to share data for troubleshooting purposes, or with the explicit intent to support model training. Unless otherwise agreed, any data you share with ID R&D remains the property of your business, to be managed and deleted according to your instructions. We offer a method for secure data upload, and all customer data is stored in a secure cloud environment to which access is granted only on an as-needed basis.
If you wish to contribute data to improve IDLive Doc’s accuracy against your documents, please contact us.