What are PETs? (The privacy kind, not the furry kind!)

In an era where data drives innovation but privacy concerns loom large, Privacy Enhancing Technologies (PETs) have emerged as essential tools for organizations seeking to unlock insights while protecting individual privacy. These technologies enable data analysis, sharing, and computation without unnecessarily exposing sensitive personal information. IAB Tech Lab’s Addressability and PETs working group has been applying PETs to advertising use cases since 2022, including in the ADMaP (Attribute Data Matching Protocol) specification and the PAIR (Publisher Advertiser Identity Reconciliation) protocol.

To further industry awareness and understanding around PETs, we have launched the Privacy Lab in our Tools Portal, providing companies with the opportunity to run their data sets through different types of PETs to see what makes the most sense for their own use case. Below, we describe four foundational approaches for PETs and summarize how each balances utility with privacy protection.

Differential Privacy


Differential Privacy is a mathematical framework that adds carefully calibrated noise to data or query results, ensuring that individual records cannot be identified or inferred.

It works by injecting random noise into either the data itself or the results of queries. The amount of noise is precisely calibrated using a privacy budget, denoted epsilon (ε), which quantifies the privacy-utility tradeoff. A smaller epsilon means stronger privacy but potentially less accurate results, whereas a larger epsilon yields more accurate results with less privacy. Epsilon values below 1 are generally considered to provide plausible deniability from a privacy perspective.
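As a sketch of this idea, the widely used Laplace mechanism adds noise drawn from a Laplace distribution whose scale is the query's sensitivity divided by epsilon. The dataset, predicate, and epsilon values below are illustrative only, not from any particular deployment:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Draw a Laplace(0, scale) variate via inverse-transform sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    # A counting query has sensitivity 1: adding or removing one record
    # changes the true count by at most 1, so the noise scale is 1/epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 41, 29, 52, 38, 44, 31]
# Smaller epsilon -> more noise -> stronger privacy, lower accuracy.
noisy = dp_count(ages, lambda a: a >= 35, epsilon=0.5)
```

Running the same query repeatedly with fresh noise is exactly what consumes the privacy budget mentioned below: each answer leaks a little more about the true count.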

Privacy Pros

- Provides mathematically rigorous, quantifiable privacy guarantees that can be proven and audited
- Protects against a wide range of attacks, including linkage attacks and inference from multiple queries
- Privacy guarantees hold regardless of what auxiliary information an attacker possesses
- Allows for controlled privacy-utility tradeoffs through the privacy budget parameter
- Well-suited for scenarios involving repeated queries or public data releases; for example, the 2020 US Census results were released with differential privacy

Privacy Cons

- Introduces noise that reduces data accuracy and utility, with stronger privacy requiring more significant accuracy losses
- Privacy budgets are exhaustible: each query or analysis consumes privacy budget, eventually making further queries too noisy to be useful
- Difficult to explain to non-technical stakeholders, which can create trust and adoption challenges
- Choosing appropriate privacy parameters requires deep expertise and depends heavily on the specific use case
- Can provide weaker protection for outliers or rare populations in the dataset

Homomorphic Encryption

Homomorphic Encryption allows computations to be performed directly on encrypted data without decrypting it first. The results, when decrypted, match what would have been obtained by computing on the original plaintext data.

In this approach to privacy, data is encrypted using special cryptographic schemes that preserve certain mathematical properties. Computations (like addition or multiplication) can then be performed directly on the ciphertexts, producing an encrypted result that decrypts to the correct answer.
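To make the property concrete, here is a toy sketch of the Paillier cryptosystem, a partially homomorphic scheme in which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The tiny primes are for illustration only; a real deployment would use keys of 1024 bits or more and a vetted library:

```python
import math
import random

# Toy Paillier keypair with tiny primes -- illustrative only, not secure.
p, q = 2003, 2477
n = p * q
n_sq = n * n
g = n + 1                       # standard choice of generator
lam = math.lcm(p - 1, q - 1)    # private key component
mu = pow(lam, -1, n)            # valid because g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(2, n)  # random blinding factor, coprime to n
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    # L(x) = (x - 1) // n extracts the message from c^lambda mod n^2.
    return (((pow(c, lam, n_sq) - 1) // n) * mu) % n

# Homomorphic property: multiplying ciphertexts adds plaintexts.
c1, c2 = encrypt(20), encrypt(22)
assert decrypt((c1 * c2) % n_sq) == 42
```

A server holding only `c1` and `c2` can compute the encrypted sum without ever learning 20, 22, or 42; only the key holder can decrypt the result.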

Privacy Pros

- Data remains encrypted throughout processing, providing strong confidentiality guarantees
- Enables cloud computing scenarios where service providers can process data without accessing its contents
- Particularly valuable for highly sensitive data like medical records or financial information
- Eliminates the need to trust the party performing the computation
- Provides cryptographic security rather than relying on statistical privacy

Privacy Cons

- The final decrypted output is not privacy-protected: it may reveal sensitive information about individuals
- Does not prevent inference attacks or protect against threats like membership inference
- Computationally intensive, resulting in significant performance overhead (though improving with newer schemes)
- Limited to specific types of computations, with fully homomorphic encryption still impractical for many real-world applications
- Key management creates security challenges: whoever holds the decryption key can access all information

k-Anonymity


k-Anonymity is a property of datasets where each record is indistinguishable from at least k-1 other records with respect to certain identifying attributes (quasi-identifiers). A dataset with k=5 means every individual is hidden within a group of at least 5 people.

This approach removes direct identifiers, and quasi-identifiers (like age, zip code, and gender) are generalized or suppressed until each combination appears at least k times in the dataset. For example, specific ages might be replaced with age ranges, or precise locations with broader regions.
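A minimal sketch of that check, using made-up records and a hypothetical generalization scheme (ages coarsened to decade ranges, zip codes to three-digit prefixes):

```python
from collections import Counter

def generalize(record):
    # Coarsen quasi-identifiers: age -> decade range, zip -> 3-digit prefix.
    age, zip_code, gender = record
    decade = (age // 10) * 10
    return (f"{decade}-{decade + 9}", zip_code[:3] + "**", gender)

def is_k_anonymous(records, k: int) -> bool:
    # A table is k-anonymous if every quasi-identifier combination
    # appears at least k times.
    counts = Counter(generalize(r) for r in records)
    return all(c >= k for c in counts.values())

raw = [
    (34, "90210", "F"),
    (37, "90211", "F"),
    (31, "90213", "F"),
    (52, "10014", "M"),
    (58, "10016", "M"),
]
```

Here the five records collapse into two generalized groups of sizes 3 and 2, so the table satisfies k=2 but not k=3; in practice an anonymizer would keep coarsening (or suppressing outliers) until the target k is met.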

Privacy Pros

- Intuitive and easy to understand, making it accessible to non-technical stakeholders
- Provides a clear, measurable privacy metric
- Relatively straightforward to implement compared to cryptographic approaches
- Maintains data utility for many types of analysis, especially statistical queries
- No computational overhead when querying the anonymized dataset

Privacy Cons

- Vulnerable to several well-known attacks, including homogeneity attacks (when sensitive attributes lack diversity within groups) and background knowledge attacks
- Does not protect against inference from multiple data releases or when combined with auxiliary information
- Can significantly reduce data utility through generalization, potentially eliminating useful patterns
- The “right” value of k is context-dependent and difficult to determine
- Extensions like l-diversity and t-closeness are needed to address some vulnerabilities, adding complexity
- Provides no formal privacy guarantees in the mathematical sense

Secure Multiparty Computation (MPC)

Secure Multiparty Computation allows multiple parties to jointly compute a function over their private inputs without revealing those inputs to each other. Imagine three hospitals wanting to determine the average salary of their combined workforce without sharing individual employee data with one another—MPC makes this possible.

MPC protocols use cryptographic techniques to split data into shares distributed across participants. Computations happen on these shares, and only the final result is revealed. No single party ever sees the complete raw data from the other parties.
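The hospital example above can be sketched with additive secret sharing, one of the simplest MPC building blocks: each input is split into random shares that sum to it, parties sum their shares locally, and only the aggregate is reconstructed. The payroll figures and party count are hypothetical:

```python
import random

P = 2**61 - 1  # a large prime modulus; all arithmetic is done mod P

def share(secret: int, n_parties: int):
    # Split a secret into n additive shares that sum to it mod P.
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Each hospital secret-shares its payroll total across all three parties.
payrolls = [1_200_000, 950_000, 1_450_000]  # private inputs
all_shares = [share(pay, 3) for pay in payrolls]

# Party i locally sums the i-th share of every input; each share on its
# own is uniformly random, so no party learns another hospital's payroll.
partial_sums = [sum(col) % P for col in zip(*all_shares)]

# Combining the partial sums reveals only the aggregate result.
total = sum(partial_sums) % P
assert total == sum(payrolls)
```

Dividing the revealed total by the (public) headcount gives the average; note that the output itself is unprotected, which is exactly the limitation listed below.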

Privacy Pros

- Provides strong mathematical privacy guarantees, ensuring input data remains completely hidden from other participants
- Enables collaboration between mutually distrustful parties without requiring a trusted third party
- Protects privacy even if some participants attempt to cheat or collude (depending on the protocol’s threat model)
- Maintains privacy throughout the entire computation process

Privacy Cons

- The final output itself receives no privacy protection: if the result reveals sensitive information about individuals, MPC alone won’t prevent this
- Vulnerable to inference attacks when combined with auxiliary information or repeated queries
- Requires all participating parties to actively engage in the protocol, creating coordination challenges
- Privacy guarantees can break down if too many parties collude (typically more than half in standard protocols)

Choosing the Right Approach

No single PET fits every scenario. Secure Multiparty Computation excels when multiple parties need to collaborate without sharing raw data. Differential Privacy shines for statistical analysis and public data releases where formal guarantees are essential. Homomorphic Encryption is ideal for outsourced computation on highly sensitive data. k-Anonymity works well for simpler use cases with clear quasi-identifiers and less sophisticated threat models.

In practice, organizations increasingly combine multiple PETs to achieve comprehensive privacy protection. A system might use homomorphic encryption for secure computation, apply differential privacy to query results, and implement k-anonymity as an additional safeguard for released datasets.

As privacy regulations evolve, Privacy Enhancing Technologies will become increasingly central to responsible data use. Understanding the strengths and limitations of each approach is essential for building systems that protect privacy while enabling the beneficial use of data.


Barnaby Edwards
Sr Director, Product Marketing
IAB Tech Lab