The case for privacy-preserving record linkage in justice

Justice agencies no longer have to choose between collaboration and privacy as new technologies enable insight without identity exposure.


In brief
  • Learn how justice agencies can share insight without exposing identities or increasing privacy risk. 
  • See why privacy-preserving record linkage changes the long-standing trade-off between data access and trust.
  • Understand the technologies enabling cross-system intelligence without sharing personally identifiable information.

Across justice, law enforcement and adjacent domains, the ability to understand how individuals interact with systems over time and across jurisdictions is critical. Investigations span agencies. Individuals move across geographic boundaries. Policy leaders demand accurate, de-duplicated reporting. Analysts seek patterns that emerge only when data from multiple systems is viewed as a whole.

At the same time, few data environments carry higher privacy stakes. Criminal histories, investigative records, juvenile data and victim or witness information are among the most sensitive categories of data governments hold. Systems processing this data are required to meet a Criminal Justice Information Services (CJIS)/National Institute of Standards and Technology (NIST) 800-53 High baseline. But even when baseline security standards are met, real-world use cases frequently demand protections that go well beyond compliance checklists.

The result is a familiar impasse: Agencies know that connecting data would improve outcomes but hesitate to share identifying information, especially at scale, across organizations or in cloud-based analytics environments.

What if that trade-off were no longer necessary?

From data sharing to intelligence sharing

The central challenge is not data access but identity linkage: how to determine that records from different systems refer to the same individual without revealing the underlying personal information.

This problem has gained prominence under the umbrella of privacy-enhancing technologies (PETs) — a growing set of approaches designed to extract value from data while minimizing exposure of personally identifiable information (PII). As artificial intelligence (AI) adoption accelerates and privacy legislation expands, PETs are rapidly moving from niche research topics to operational necessities.

In the justice context, the implications are profound. Instead of moving raw identity data between agencies or vendors, PET-based approaches allow systems to exchange privacy-enhanced representations of identity attributes. These mathematical representations preserve the properties needed for matching without exposing the original identifiers.

How privacy-preserving record linkage works

Privacy-preserving record linkage relies on three complementary methods.

 

First, identity attributes are transformed at the source. Names, addresses, identifiers and contact information are converted into encrypted and de-identified representations before they ever leave an agency’s control. Multiple representations of the same attribute can coexist, each optimized for a different type of matching scenario: high-precision identifiers, imprecise free-text fields or phonetic similarities. Deterministic, irreversible encryption forms the base, but it is not enough on its own: the encrypted values of “David” and “Dave” may be very far apart, which prevents matching by similarity. That’s where locality-sensitive hashing comes in. Identity attributes are encoded so that the vector distance between protected values is roughly proportional to the difference between the inputs, which means similar inputs yield nearby outputs.
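The contrast between the two kinds of representation can be sketched in a few lines of Python. This is an illustrative sketch only, not the scheme any particular product uses: the key, the `exact_token` and `minhash_signature` helpers, the bigram tokenization and the signature length are all assumptions, and keyed MinHash stands in here for whichever locality-sensitive hashing family a real deployment would choose.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real deployment needs proper key management.
SECRET_KEY = b"demo-key-for-illustration-only"
NUM_HASHES = 64  # signature length, an arbitrary choice for this sketch


def _keyed_digest(message: str) -> bytes:
    """HMAC-SHA256 of the message under the agency's secret key."""
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).digest()


def exact_token(value: str) -> str:
    """Deterministic one-way token: equal inputs match, similar inputs do not."""
    return _keyed_digest("exact:" + value.strip().lower()).hex()


def bigrams(value: str) -> set:
    """Character bigrams of the normalized value, padded at both ends."""
    padded = "_" + value.strip().lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def minhash_signature(value: str) -> list:
    """Keyed MinHash: for each seed, keep the smallest keyed hash over the bigrams."""
    grams = bigrams(value)
    return [
        min(int.from_bytes(_keyed_digest(f"{seed}:{gram}")[:8], "big") for gram in grams)
        for seed in range(NUM_HASHES)
    ]


def similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of agreeing positions, an estimate of bigram-set (Jaccard) overlap."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    print(exact_token("David") == exact_token("Dave"))  # exact tokens do not match
    print(similarity(minhash_signature("David"), minhash_signature("Dave")))
    print(similarity(minhash_signature("David"), minhash_signature("Maria")))
```

The exact tokens for “David” and “Dave” share nothing, while their signatures agree in a share of positions that tracks how many bigrams the two names have in common. A sketch like this says nothing about resistance to frequency or dictionary attacks, which a production design would need to address separately.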

 

Second, matching occurs without decryption. Using vector algebra and symbolic logic, systems can evaluate whether two protected records are likely to refer to the same individual. Because the process relies on deductive rules rather than probabilistic inference, the results are repeatable, explainable and auditable — qualities that are especially important in justice environments.
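A minimal sketch of matching on protected values alone might look like the following. The `ProtectedRecord` fields, the tokens, the bit vectors and the 0.8 threshold are all hypothetical values chosen for illustration; the point is only that the comparison uses equality of tokens and a vector similarity, and never recovers a name or date of birth.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProtectedRecord:
    record_id: str
    dob_token: Optional[str]      # opaque token derived from a date of birth
    name_vector: Tuple[int, ...]  # similarity-preserving encoding of a name


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


NAME_THRESHOLD = 0.8  # assumed cut-off; a real ruleset would set this deliberately


def likely_same_individual(a: ProtectedRecord, b: ProtectedRecord) -> bool:
    """Fixed rule: identical date-of-birth token AND sufficiently similar name vector."""
    same_dob = a.dob_token is not None and a.dob_token == b.dob_token
    similar_name = cosine_similarity(a.name_vector, b.name_vector) >= NAME_THRESHOLD
    return same_dob and similar_name


# Made-up protected records from two agencies.
r1 = ProtectedRecord("agency1:001", "9f2c", (1, 0, 1, 1, 0, 1))
r2 = ProtectedRecord("agency2:417", "9f2c", (1, 0, 1, 1, 0, 0))
r3 = ProtectedRecord("agency2:982", "9f2c", (0, 1, 0, 0, 1, 0))

if __name__ == "__main__":
    print(likely_same_individual(r1, r2))  # same token, name vectors close together
    print(likely_same_individual(r1, r3))  # same token, name vectors unrelated
```

Because the rule is a fixed predicate over the protected values, running it twice on the same pair always gives the same answer.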

 

Third, relationships are modeled as graphs in which links are transitive. If record A links to B, and B links to C, the system infers that A links to C, thus detecting broader identity clusters even when individual records are sparse or incomplete. Crucially, the logic used to establish each link remains transparent and is inscribed on the edges of the graph.
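The idea can be sketched with a small undirected graph whose edges carry the name of the rule that created them. The record labels, rule names and the `build_graph`, `cluster_of` and `explain` helpers below are hypothetical; a breadth-first search is just one simple way to follow links transitively.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Undirected graph: graph[a][b] holds the rule that linked a and b."""
    graph = defaultdict(dict)
    for a, b, rule in links:
        graph[a][b] = rule
        graph[b][a] = rule
    return graph


def cluster_of(graph, start):
    """All records reachable from start, i.e. its identity cluster."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def explain(graph, source, target):
    """Chain of (record, rule, record) steps joining source to target, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    if target not in previous:
        return None
    steps = []
    node = target
    while previous[node] is not None:
        steps.append((previous[node], graph[previous[node]][node], node))
        node = previous[node]
    return list(reversed(steps))


# Hypothetical links produced by a matching step, each labeled with its rule.
links = [("A", "B", "same_id_token"), ("B", "C", "same_dob_and_similar_name"), ("D", "E", "same_id_token")]
graph = build_graph(links)

if __name__ == "__main__":
    print(sorted(cluster_of(graph, "A")))  # A, B and C form one cluster
    print(explain(graph, "A", "C"))        # the two rules that connect A to C
```

Here A and C were never compared directly, yet they land in the same cluster, and `explain` recovers the chain of rules that justifies the inference.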

 

Why determinism and explainability matter

Many modern data systems lean heavily on probabilistic models and machine learning. While powerful, these approaches can introduce opacity, drift and challenges around explainability. In the language of mathematics, any approach that uses inductive logic — drawing general conclusions from specific instances — can never be “truth preserving.” And truth is an important word in justice.

 

In contrast, deterministic, rule-based linkage is “truth preserving” and provides repeatable, explainable results. Analysts can add rules as they learn more about the data. They can author different rulesets for different matching confidence levels. Every matching decision can be traced back to the exact predicates that produced it. This matters not just for technical teams but also for governance, legal review and public trust.
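One way to picture tiered rulesets with a built-in audit trail is the sketch below. The field names, the confidence labels and the `same`, `decide` and `Decision` helpers are assumptions made for illustration; records are shown as plain dictionaries of opaque tokens.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple

Record = Dict[str, str]  # field name -> protected token (opaque string)
Predicate = Callable[[Record, Record], bool]


def same(field: str) -> Predicate:
    """Predicate that holds when both records carry the same token for a field."""
    def predicate(a: Record, b: Record) -> bool:
        return field in a and a.get(field) == b.get(field)
    predicate.__name__ = f"same_{field}"
    return predicate


# Rulesets are tried in order; each one requires all of its predicates to hold.
RULESETS: List[Tuple[str, List[Predicate]]] = [
    ("high", [same("id_token")]),
    ("high", [same("name_token"), same("dob_token"), same("address_token")]),
    ("medium", [same("name_token"), same("dob_token")]),
]


class Decision(NamedTuple):
    level: Optional[str]         # confidence level of the ruleset that fired, if any
    predicates: Tuple[str, ...]  # names of the predicates that produced the match


def decide(a: Record, b: Record) -> Decision:
    """Return the first ruleset that matches, with the predicates that justified it."""
    for level, predicates in RULESETS:
        if all(p(a, b) for p in predicates):
            return Decision(level, tuple(p.__name__ for p in predicates))
    return Decision(None, ())


if __name__ == "__main__":
    a = {"name_token": "n1", "dob_token": "d1", "id_token": "x"}
    b = {"name_token": "n1", "dob_token": "d1", "id_token": "y"}
    print(decide(a, b))  # medium-confidence match, justified by name and dob tokens
```

Adding a rule means appending one entry to `RULESETS`, and every decision carries the names of the predicates behind it, which is what a reviewer or auditor would inspect.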

 

Unlocking value without increasing risk

When identity linkage no longer requires sharing raw PII, new possibilities emerge.

 

Agencies can generate cross-jurisdictional metrics without exposing personal identities. Situational awareness can be enhanced through event-based notifications that do not require the exposure of identity information. Third-party analytics and cloud environments can be leveraged without expanding breach risk or liability.
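As a rough sketch of a de-duplicated metric, suppose the linkage step has already assigned each record an opaque cluster label. The agency names, cluster labels and the `deduplicated_metrics` helper are made up for illustration; no name or identifier appears anywhere in the input.

```python
from collections import defaultdict

# (agency, cluster_id) pairs; cluster ids are opaque labels from the linkage step.
events = [
    ("county_jail", "c1"),
    ("state_court", "c1"),
    ("state_court", "c2"),
    ("city_police", "c3"),
    ("county_jail", "c3"),
    ("county_jail", "c3"),
]


def deduplicated_metrics(events):
    """Count records, distinct individuals and individuals known to several agencies."""
    agencies_by_cluster = defaultdict(set)
    for agency, cluster_id in events:
        agencies_by_cluster[cluster_id].add(agency)
    return {
        "raw_records": len(events),
        "distinct_individuals": len(agencies_by_cluster),
        "seen_by_multiple_agencies": sum(
            1 for agencies in agencies_by_cluster.values() if len(agencies) > 1
        ),
    }


if __name__ == "__main__":
    print(deduplicated_metrics(events))
```

Six records collapse to three individuals, two of whom appear in more than one agency's data, and the analyst producing the count never sees who they are.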

 

Perhaps most importantly, agencies retain control over their own confidentiality posture. Different organizations can apply different privacy transformations to the same types of data, without needing to coordinate or disclose their methods. Intelligence is shared, but identity remains protected.

 

A shift in mindset

Privacy-preserving record linkage is more than a technical solution — it represents a shift in how institutions think about data collaboration. In an era of accelerating policy change, rising public scrutiny and expanding analytical ambition, that shift may prove essential. The future of justice data is not about choosing between insight and privacy. It is about designing systems that deliver both — by design, not by exception.

The views reflected in this article are those of the author and do not necessarily reflect the views of Ernst & Young LLP or other members of the global EY organization. 

Summary 

Justice agencies have long been forced to choose between intelligence sharing and confidentiality. That trade-off is no longer necessary. Privacy-preserving record linkage can enable cross-agency intelligence, de-duplicated metrics and situational awareness without exposing personal identities. By combining cryptography, mathematics and graph-based analysis, agencies can collaborate more effectively while strengthening trust and confidentiality.
