Face Search at Scale: Matching Faces Across Watchlist Databases

Dinesh · Founder · 9 min read

Document verification answers one question: is this document genuine? Regulated onboarding also needs answers to two harder ones: is this person who they claim to be, and should they be onboarded at all?

A valid Aadhaar card tells you that an Aadhaar card with those details exists. It does not tell you that the person submitting it is the person pictured on it. It does not tell you whether that person is on a sanctions list. It does not tell you whether they are a politically exposed person requiring enhanced due diligence. It does not tell you whether they opened three accounts at your institution last month under different identities.

These are the questions that biometric screening exists to answer. 9thSense Identity Sense provides the capabilities to answer all of them, running alongside document verification in the same case.

Three Biometric Capabilities

1:1 Face Match compares two face images and returns a confidence score for whether they show the same individual. The primary use case in KYC is matching a live selfie against the photo on an identity document. The document tells you the person's details; the face match confirms that the person holding the document is the person pictured on it.

Face matching handles the real-world variation in how faces appear across different photos: different lighting conditions, different ages, different camera quality, different poses. The system produces a match or no-match verdict with a confidence level that your workflow can act on.
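To make the 1:1 decision concrete, here is a minimal sketch. It assumes each face image has already been turned into a numeric embedding by some face model, and that similarity is measured with cosine similarity. This post does not describe how Identity Sense computes its score, so `face_match` and the 0.6 threshold are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def face_match(selfie_embedding, document_embedding, threshold=0.6):
    """Return (verdict, similarity) for a 1:1 comparison.

    The threshold is an illustrative assumption, not a product default.
    """
    similarity = cosine_similarity(selfie_embedding, document_embedding)
    verdict = "match" if similarity >= threshold else "no_match"
    return verdict, similarity
```

A workflow would branch on the verdict and keep the similarity for the audit trail.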

1:N Face Search searches a single face image against a database of enrolled faces and returns the closest matches ranked by confidence. This is what makes watchlist screening possible. Instead of comparing two known images, you are asking: does this face appear anywhere in this database of millions of records?

1:N search runs against multiple databases simultaneously. A face submitted during onboarding can be screened against PEP databases, sanctions lists, internal fraud watchlists, and other configured sources in a single operation. Each match result identifies which database produced it, so your compliance team knows exactly what kind of hit they are looking at and what the appropriate response is.
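A 1:N search over several databases can be sketched as a ranked scan in which every hit keeps the name of the database it came from. The example below is a brute-force illustration under stated assumptions: faces are embedding vectors, databases are plain dictionaries, and `search_databases` is a hypothetical helper. A production system searching millions of records would normally use an approximate nearest-neighbour index rather than a full scan.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity; assumes equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search_databases(query, databases, top_k=5):
    """Rank enrolled faces across all databases by similarity to the query.

    databases maps a database name to {record_id: embedding}.
    Each result names the database that produced it.
    """
    hits = []
    for db_name, records in databases.items():
        for record_id, embedding in records.items():
            score = cosine_similarity(query, embedding)
            hits.append((score, db_name, record_id))
    best = heapq.nlargest(top_k, hits)  # highest similarity first
    return [
        {"database": db, "record_id": rid, "similarity": score}
        for score, db, rid in best
    ]
```

Because each result carries its `database` field, downstream logic can treat a sanctions hit differently from a PEP hit without a second lookup.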

Liveness Detection is the gate before biometric matching can be trusted. Without liveness verification, face matching is trivially defeated by a photograph. Someone submits a printed photo of the person they are impersonating, the 1:1 match succeeds, and the fraud proceeds.

Liveness verification confirms that the submitted face image is of a live person physically present at the camera. The types of spoof it detects include printed photo attacks, digital display replays, and camera injection attacks where a pre-recorded video is fed directly into the camera API to bypass the capture step. Liveness runs before face matching — if liveness fails, the biometric workflow does not proceed.
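The ordering described above, liveness first and matching only if it passes, can be expressed as a simple gate. This is a sketch: `check_liveness` and `match_face` are hypothetical callables standing in for whatever detectors a deployment uses, not Identity Sense functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BiometricResult:
    liveness_passed: bool
    match_verdict: Optional[str]  # None when matching was never run


def run_biometric_checks(
    image: bytes,
    check_liveness: Callable[[bytes], bool],
    match_face: Callable[[bytes], str],
) -> BiometricResult:
    """Run liveness first; call the matcher only if liveness passes."""
    if not check_liveness(image):
        # Stop here: a match on a spoofed capture would be meaningless.
        return BiometricResult(liveness_passed=False, match_verdict=None)
    return BiometricResult(liveness_passed=True, match_verdict=match_face(image))
```

Returning `None` for the verdict, rather than a "no match", keeps "never checked" distinct from "checked and did not match" in the case record.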

The Six Databases

Identity Sense searches across six biometric databases, each serving a distinct compliance or fraud prevention purpose.

Politically Exposed Persons (PEP) — Politicians, senior government officials, and their immediate family members. PEP screening is a FATF requirement for financial services. A PEP match does not block onboarding — it triggers enhanced due diligence requirements.

Sanctions — Individuals and entities on OFAC, EU, UN, and domestic sanctions lists. A sanctions match is a hard block. Onboarding cannot proceed when there is a confirmed sanctions hit.

Known Fraudsters — Internal and shared fraud intelligence: previously identified fraudsters, synthetic identity patterns, and accounts linked to money mule activity. This is the database that catches repeat offenders who cycle through onboarding flows under different names or identities.

Celebrity — Public figures. Celebrity impersonation is a significant fraud vector, particularly in financial services, where someone claims to be a well-known person to add credibility to a false identity. A celebrity match flags the submission for review.

Ecosystem Users — Existing users in the platform. Cross-tenant deduplication ensures that the same person cannot onboard under different identities across different clients using the platform. This is the database that catches someone who was already rejected or flagged at another institution that shares the ecosystem.

Employee — Internal workforce. Used for employee onboarding, access control, and to detect scenarios where staff members attempt customer-side transactions under a different identity.

Which databases are searched for a given case is configurable. A standard KYC onboarding flow might search all six. A simpler employee badge verification might search only the employee database.
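One way to picture the configuration is a pair of tables: which databases each flow searches, and what a match in each database triggers. The sketch below takes three actions from the text (sanctions blocks, PEP triggers enhanced due diligence, celebrity flags for review). Every other name, including the profile names and the `REVIEW` default for the remaining databases, is an assumption made for illustration.

```python
from enum import Enum


class Action(Enum):
    PROCEED = 0
    REVIEW = 1
    ENHANCED_DUE_DILIGENCE = 2
    BLOCK = 3


MATCH_POLICY = {
    "sanctions": Action.BLOCK,                   # from the text: hard block
    "pep": Action.ENHANCED_DUE_DILIGENCE,        # from the text: triggers EDD
    "celebrity": Action.REVIEW,                  # from the text: flag for review
    "known_fraudsters": Action.REVIEW,           # assumption
    "ecosystem_users": Action.REVIEW,            # assumption
    "employee": Action.REVIEW,                   # assumption
}

SEARCH_PROFILES = {
    "kyc_onboarding": list(MATCH_POLICY),  # searches all six
    "employee_badge": ["employee"],        # searches one
}


def decide(profile, matched_databases):
    """Return the most severe action among matches in searched databases."""
    searched = set(SEARCH_PROFILES[profile])
    actions = [MATCH_POLICY[db] for db in matched_databases if db in searched]
    return max(actions, key=lambda a: a.value, default=Action.PROCEED)
```

Taking the most severe action means a case that hits both the PEP and sanctions databases is blocked rather than merely escalated.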

Confidence Scoring

Raw similarity measures are not operationally useful. A number that a biometric engineer understands does not help a compliance analyst decide what to do next.

Identity Sense translates similarity measures into a confidence score on a human-readable scale. The score indicates the strength of a biometric match, from no match at the low end to definitive match at the high end. Results above a configured threshold are surfaced to your compliance team. The threshold is configurable. For high-risk screening you can lower it so that more candidate matches reach an analyst; for lower-stakes screening you can raise it to reduce review volume.

The match result that reaches your compliance analyst includes the confidence score, the database source, and the name or identifier of the matched record. Not a distance value. A clear signal: "Match found in sanctions list — [Name] — confidence 94. Review required."
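As a rough illustration of the translation step, the sketch below rescales a raw similarity onto a 0 to 100 scale and only produces an analyst-facing message above a threshold. The linear mapping, the `floor` and `ceiling` values, and the function names are all assumptions; a real system would more likely use a mapping calibrated on labelled match data.

```python
def to_confidence(similarity, floor=0.2, ceiling=0.8):
    """Map a raw similarity to an integer confidence from 0 to 100.

    Values at or below floor become 0, at or above ceiling become 100.
    floor and ceiling are illustrative, not calibrated values.
    """
    scaled = (similarity - floor) / (ceiling - floor)
    clamped = min(1.0, max(0.0, scaled))
    return round(100 * clamped)


def analyst_message(hit, threshold=80):
    """Return a readable message, or None if the hit is below threshold.

    hit is a dict with 'database', 'name' and 'similarity' keys.
    """
    confidence = to_confidence(hit["similarity"])
    if confidence < threshold:
        return None
    return (
        f"Match found in {hit['database']} list: {hit['name']}, "
        f"confidence {confidence}. Review required."
    )
```

The analyst never sees the raw similarity; it stays available in the hit record for engineers who need it.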

How Multiple Photos Improve Match Quality

One of the practical challenges in biometric databases is that reference photos vary in quality. A PEP database might have official photographs for senior politicians but poor-quality press photos for more obscure officials. A fraud watchlist might have a driver's license photo but nothing else.

Identity Sense handles this by supporting multiple reference photos per enrolled person. When searching, the system finds the best match across all available reference photos for each record. A query face only needs to match any one of the reference photos to produce a high-confidence result.

This design matters significantly in practice. It reduces false negatives — cases where a genuine match is missed because the only reference photo available was a poor angle or an old image. Enrollment is incremental: additional reference photos can be added to an existing record as they become available, improving match quality over time without disrupting existing enrollments.
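The best-of-many idea is easy to show in code. In this sketch each record holds a list of reference embeddings, a record's score is the maximum similarity over that list, and enrolling a new photo simply appends to it. The `Gallery` class and the use of cosine similarity on embeddings are illustrative assumptions, not the product's data model.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity; assumes equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


class Gallery:
    """Enrolled records, each with one or more reference embeddings."""

    def __init__(self):
        self._references = defaultdict(list)

    def enroll(self, record_id, embedding):
        """Add a reference photo; existing references are left untouched."""
        self._references[record_id].append(list(embedding))

    def search(self, query):
        """Score each record by its best-matching reference, best first."""
        scores = {
            record_id: max(cosine_similarity(query, ref) for ref in refs)
            for record_id, refs in self._references.items()
        }
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With a single poor reference a record may score low; after a better photo is enrolled, the same query scores high without any change to the earlier enrollment.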

Liveness as a Non-Negotiable

Liveness detection is enforced at the service level, not as an optional configuration. This is a deliberate design decision.

The temptation in any verification pipeline is to make security checks optional to reduce friction for legitimate applicants. Liveness verification does add a small step for the applicant. But liveness is also the check that prevents the entire face matching capability from being trivially bypassed. A face match without liveness is not a security control — it is a false sense of security.

A genuine applicant submitting a live selfie should pass liveness detection, while a fraudster submitting a photograph should not. The check is asymmetric in its impact: it is designed to stop presentation attacks while adding only a small step for genuine applicants using a normal camera.

Shield Sense adds a second layer beyond liveness: deepfake detection on submitted face images. A liveness check confirms the capture is live; deepfake detection confirms the face itself is genuine and not AI-generated. Both checks together close the vectors that a sophisticated fraud attempt would try to exploit.

Fitting Into a Verification Workflow

Identity Sense does not require a separate integration. Face screening runs within the same 9thSense case as document verification. An intelligent agent can be configured to require a selfie alongside documents and to run biometric checks automatically as part of the workflow.

The compliance analyst reviewing the case sees the full picture: document extractions, fraud detection verdicts, face match results, and watchlist screening outcomes — all in one place, all attached to the same case record, all auditable.

This is the operational advantage of a unified platform over separate-vendor integrations. When a case surfaces a potential issue, the analyst does not need to cross-reference results from multiple systems. The case record contains everything, correlated by the platform, with a clear indication of what checks passed, what flagged, and what requires a decision.
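A unified case record of this kind might look something like the sketch below: one object that collects every check and can report what passed, what flagged, and what needs a decision. The class names, field names, and status values are hypothetical and are not the 9thSense case schema.

```python
from dataclasses import dataclass, field

STATUSES = ("passed", "flagged", "needs_decision")


@dataclass
class CheckResult:
    name: str
    status: str
    detail: str = ""


@dataclass
class Case:
    case_id: str
    checks: list = field(default_factory=list)

    def add(self, name, status, detail=""):
        """Attach a check result to this case."""
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.checks.append(CheckResult(name, status, detail))

    def summary(self):
        """Group check names by status for the reviewing analyst."""
        grouped = {status: [] for status in STATUSES}
        for check in self.checks:
            grouped[check.status].append(check.name)
        return grouped
```

A short usage example:

```python
case = Case("case-001")
case.add("document_extraction", "passed")
case.add("liveness", "passed")
case.add("watchlist_search", "needs_decision", "possible PEP match")
print(case.summary())
```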

Use Cases

KYC onboarding is the primary use case. Every major regulated onboarding flow in financial services requires some form of identity verification. 1:1 matching confirms identity; watchlist search checks that the person is permitted to onboard.

Sanctions compliance specifically requires the ability to screen against regularly updated sanctions lists. The platform maintains these databases and applies updates automatically. Your compliance team does not need to manage database refresh cycles.

Employee verification for high-security environments, including access control systems, requires periodic re-verification rather than one-time onboarding. Identity Sense supports verification as an ongoing operation.

Fraud investigation uses the 1:N search capability in reverse: when a fraud case is opened against a known individual, searching their face across databases identifies linked accounts and connected fraud patterns across the ecosystem.

Merchant onboarding for payment aggregators and lenders increasingly requires face verification of proprietors and directors, not just document submission. Identity Sense integrates into merchant onboarding agents the same way it integrates into consumer KYC flows.

The Self-Hosted Consideration

Biometric data — face images and their derived representations — is among the most sensitive data a financial institution handles. Many enterprise and regulatory frameworks impose strict requirements on where biometric data can be processed and stored.

Identity Sense runs in an environment you designate. Whether you deploy on 9thSense-managed infrastructure in your designated region or on your own servers, biometric processing and storage stay within that environment. Customer face images do not transit a third-party AI API.

This is not a configuration option. It is a design baseline — built to meet the requirements of the financial institutions that cannot compromise on data residency for biometric data.

Try it yourself →

pip install 9thsense