2005, Journal of Privacy …
We consider the problem of releasing a table containing personal records, while ensuring individual privacy and maintaining data integrity to the extent possible. One of the techniques proposed in the literature is k-anonymization. A release is considered k-anonymous if the information corresponding to any individual in the release cannot be distinguished from that of at least k − 1 other individuals whose information also appears in the release. In order to achieve k-anonymization, some of the entries of the table are either suppressed or generalized (e.g., an Age value of 23 could be changed to the Age range 20-25). The goal is to lose as little information as possible while ensuring that the release is k-anonymous. This optimization problem is referred to as the k-anonymization problem.
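To make the generalization and suppression mechanics concrete, here is a minimal Python sketch (illustrative only, not the paper's algorithm; the attribute names, the fixed 10-year age bins, and the two-digit ZIP suppression are assumptions):

from collections import Counter

# Toy records: (age, zip_code). Both are quasi-identifiers.
records = [(23, "94301"), (27, "94305"), (24, "94301"),
           (52, "10011"), (55, "10014"), (51, "10011")]

def generalize(rec):
    """Generalize Age to a 10-year range and suppress the last 2 ZIP digits."""
    age, zip_code = rec
    lo = (age // 10) * 10
    return (f"{lo}-{lo + 9}", zip_code[:3] + "**")

def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(rows)
    return all(c >= k for c in counts.values())

release = [generalize(r) for r in records]
print(release)
print(is_k_anonymous(release, k=3))  # True: each combination occurs 3 times

Real anonymizers search over many such recodings, trying to find one that reaches k-anonymity with the least information loss.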
k-Anonymity is a privacy-preserving method for limiting disclosure of private information in data mining. The process of anonymizing a database table typically involves generalizing table entries and, consequently, it incurs loss of relevant information. This motivates the search for anonymization algorithms that achieve the required level of anonymization while incurring a minimal loss of information. The problem of k-anonymization with minimal loss of information is NP-hard. We present a practical approximation algorithm that enables solving the k-anonymization problem with an approximation guarantee of O(ln k). That algorithm improves an algorithm due to Aggarwal et al. (Proceedings of the International Conference on Database Theory (ICDT), 2005) that offers an approximation guarantee of O(k), and generalizes that of Park and Shim (SIGMOD '07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, 2007), which was limited to the case of generalization by suppression. Our algorithm uses techniques that we introduce herein for mining closed frequent generalized records. Our experiments show that the significance of our algorithm is not limited to the theory of k-anonymization: the proposed algorithm achieves lower information losses than the leading approximation algorithm, as well as the leading heuristic algorithms. A modified version of our algorithm that issues ℓ-diverse k-anonymizations also achieves lower information losses than the corresponding modified versions of the leading algorithms.
Data & Knowledge Engineering, 2008
When releasing microdata for research purposes, one needs to preserve the privacy of respondents while maximizing data utility. An approach that has been studied extensively in recent years is to use anonymization techniques such as generalization and suppression to ensure that the released data table satisfies the k-anonymity property. A major thread of research in this area aims at developing more flexible generalization schemes and more efficient searching algorithms to find better anonymizations (i.e., those that have less information loss). This paper presents three new generalization schemes that are more flexible than existing schemes. This flexibility can lead to better anonymizations. We present a taxonomy of generalization schemes and discuss their relationship. We present enumeration algorithms and pruning techniques for finding optimal generalizations in the new schemes. Through experiments on real census data, we show that more-flexible generalization schemes produce higher-quality anonymizations, and that the bottom-up approach works better than the top-down approach for small k values and a small number of quasi-identifier attributes.
2008
In this paper we introduce new notions of k-type anonymizations. Those notions achieve privacy goals similar to those targeted by Sweeney and Samarati when they proposed the concept of k-anonymization: an adversary who knows the public data of an individual cannot link that individual to fewer than k records in the anonymized table. Every anonymized table that satisfies k-anonymity complies also with the anonymity constraints dictated by the new notions, but the converse is not necessarily true. Thus, those new notions allow generalized tables that may offer higher utility than k-anonymized tables, while still preserving the required privacy constraints. We discuss and compare the new anonymization concepts, which we call (1,k)-, (k,k)- and global (1,k)-anonymizations, according to several utility measures. We propose a collection of agglomerative algorithms for the problem of finding such anonymizations with high utility, and demonstrate the usefulness of our definitions and our algorithms through extensive experimental evaluation on real and synthetic datasets.
2004
The technique of k-anonymization has been proposed in the literature as an alternative way to release public information, while ensuring both data privacy and data integrity. We prove that two general versions of optimal k-anonymization of relations are NP-hard, including the suppression version, which amounts to choosing a minimum number of entries to delete from the relation. We also present a polynomial-time algorithm for optimal k-anonymity that achieves an approximation ratio independent of the size of the database, when k is constant. In particular, it is an O(k log k)-approximation where the constant in the big-O is no more than 4. However, the runtime of the algorithm is exponential in k. A slightly more clever algorithm removes this condition, but is an O(k log m)-approximation, where m is the degree of the relation. We believe this algorithm could potentially be quite fast in practice.
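As a sketch of the objective in the suppression version (an illustration under the assumption that a partition of the rows into clusters of size at least k is already given; this is not the paper's approximation algorithm):

# Given a partition into clusters of size >= k, every column that is not
# constant within a cluster must be suppressed ('*') in all of that
# cluster's rows; the cost to minimize is the number of suppressed entries.
def suppression_cost(clusters):
    cost = 0
    for cluster in clusters:
        for col in range(len(cluster[0])):
            values = {row[col] for row in cluster}
            if len(values) > 1:          # column must be suppressed
                cost += len(cluster)     # one '*' per row in the cluster
    return cost

rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "x")]
# One possible 2-anonymous partition:
print(suppression_cost([[rows[0], rows[1]], [rows[2], rows[3]]]))  # 2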
Journal of Combinatorial Optimization, 2011
The problem of publishing personal data without giving up privacy is becoming increasingly important. A clean formalization that has recently been proposed is k-anonymity, where the rows of a table are partitioned into clusters of size at least k and all rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is hard even when the stored values are over a binary alphabet and the table consists of a bounded number of columns. In this paper we study how the complexity of the problem is influenced by different parameters. First we show that the problem is W[1]-hard when parameterized by the value of the solution (and k). Then we exhibit a fixed-parameter algorithm when the problem is parameterized by the number of columns and the maximum number of different values in any column. Finally, we prove that k-anonymity is still APX-hard even when restricted to instances with 3 columns and k = 3.
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
Consider a data holder, such as a hospital or a bank, that has a privately held collection of person-specific, field structured data. Suppose the data holder wants to share a version of the data with researchers. How can a data holder release a version of its private data with scientific guarantees that the individuals who are the subjects of the data cannot be re-identified while the data remain practically useful? The solution provided in this paper includes a formal protection model named k-anonymity and a set of accompanying policies for deployment. A release provides k-anonymity protection if the information for each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release. This paper also examines re-identification attacks that can be realized on releases that adhere to k-anonymity unless accompanying policies are respected. The k-anonymity protection model is important because it forms the basis on which the real-world systems known as Datafly, µ-Argus and k-Similar provide guarantees of privacy protection.
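A minimal Python check of this definition (the attribute names below are illustrative): the achieved k of a release is simply the size of its smallest equivalence class over the quasi-identifier columns.

from collections import Counter

def achieved_k(table, quasi_identifiers):
    """Smallest equivalence-class size over the quasi-identifier columns;
    the release is k-anonymous for any k up to this value."""
    keys = [tuple(row[q] for q in quasi_identifiers) for row in table]
    return min(Counter(keys).values())

release = [
    {"Age": "20-29", "ZIP": "943**", "Disease": "flu"},
    {"Age": "20-29", "ZIP": "943**", "Disease": "cold"},
    {"Age": "50-59", "ZIP": "100**", "Disease": "flu"},
    {"Age": "50-59", "ZIP": "100**", "Disease": "asthma"},
]
print(achieved_k(release, ["Age", "ZIP"]))  # 2 -> the release is 2-anonymous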
ACM Transactions on Database Systems, 2009
Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and ℓ-diversity. k-anonymity protects against the identification of an individual's record. ℓ-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) ℓ-diversification is solved by techniques developed for the simpler k-anonymization problem, causing unnecessary information loss. (ii) The anonymization process is inefficient in terms of computational and I/O cost. (iii) Previous research focused exclusively on the privacy-constrained problem and ignored the equally important accuracy-constrained (or dual) anonymization problem.
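For concreteness, a small Python sketch of the simplest (distinct) ℓ-diversity check, which strengthens k-anonymity by counting distinct sensitive values per equivalence class (illustrative; the literature also uses entropy and recursive variants):

from collections import defaultdict

def is_l_diverse(table, quasi_identifiers, sensitive, l):
    """Distinct ℓ-diversity: every equivalence class over the
    quasi-identifiers contains at least l distinct sensitive values."""
    groups = defaultdict(set)
    for row in table:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return all(len(values) >= l for values in groups.values())

release = [
    {"Age": "20-29", "ZIP": "943**", "Disease": "flu"},
    {"Age": "20-29", "ZIP": "943**", "Disease": "cold"},
    {"Age": "50-59", "ZIP": "100**", "Disease": "flu"},
    {"Age": "50-59", "ZIP": "100**", "Disease": "flu"},
]
print(is_l_diverse(release, ["Age", "ZIP"], "Disease", 2))  # False: one class has a single value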
2015 IEEE International Conference on Big Data (Big Data), 2015
Among the privacy-preserving approaches that are known in the literature, k-anonymity remains the basis of more advanced models while still being useful as a stand-alone solution. Applying k-anonymity in practice, though, incurs severe loss of data utility, thus limiting its effectiveness and reliability in real-life applications and systems. However, such loss in utility does not necessarily arise from an inherent drawback of the model itself, but rather from the deficiencies of the algorithms used to implement the model. Conventional approaches rely on a methodology that publishes data in homogeneous generalized groups. An alternative modern data publishing scheme focuses on publishing the data in heterogeneous groups and achieves higher utility, while ensuring the same privacy guarantees. As conventional approaches cannot anonymize data following this heterogeneous scheme, innovative solutions are required for this purpose. Following this approach, in this paper we provide a set of algorithms that ensure high-utility k-anonymity, via solving an equivalent graph processing problem.
Advances in data storage, data collection and inference techniques have enabled the creation of huge databases of personal information. Dissemination of information from such databases, even if formally anonymised, creates a serious threat to individual privacy through statistical disclosure. One of the key methods developed to limit statistical disclosure risk is k-anonymity. Several methods have been proposed to enforce k-anonymity, notably Samarati's algorithm and Sweeney's Datafly, which both adhere to full-domain generalisation. Such methods require a trade-off between computing time and information loss. This paper describes an improved greedy heuristic for enforcing k-anonymity with full-domain generalisation. The improved greedy algorithm was compared with the original methods, using metrics such as information loss, computing time and level of generalisation. Results show that the improved greedy algorithm maintains a better balance between computing time and information loss.
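A toy version of the greedy full-domain idea (the hierarchies, attribute names, and the use of the number of violating records as the greedy score are illustrative assumptions, not Datafly's exact heuristic):

from collections import Counter

# Illustrative full-domain generalization hierarchies (level 0 = raw value).
def gen_age(age, level):
    if level == 0: return str(age)
    if level == 1: return f"{(age // 10) * 10}-{(age // 10) * 10 + 9}"
    return "*"

def gen_zip(z, level):
    return z[:5 - 2 * level] + "*" * (2 * level) if level < 3 else "*****"

rows = [(23, "94301"), (27, "94305"), (24, "94301"), (29, "94302")]

def view(levels):
    return [(gen_age(a, levels[0]), gen_zip(z, levels[1])) for a, z in rows]

def violations(levels, k):
    """Number of records lying in equivalence classes smaller than k."""
    counts = Counter(view(levels))
    return sum(c for c in counts.values() if c < k)

# Greedy: raise the level of whichever attribute removes the most violations.
levels, k = [0, 0], 2
while violations(levels, k) > 0:
    candidates = []
    for i, max_level in enumerate([2, 3]):
        if levels[i] < max_level:
            trial = levels[:]
            trial[i] += 1
            candidates.append((violations(trial, k), trial))
    levels = min(candidates)[1]
print(levels, view(levels))  # e.g. [1, 1] -> all rows become ('20-29', '943**')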
2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, 2013
A common view in some data anonymization literature is to oppose the "old" k-anonymity model to the "new" differential privacy model, which offers more robust privacy guarantees. However, the utility of the masked results provided by differential privacy is usually limited, due to the amount of noise that needs to be added to the output, or because utility can only be guaranteed for a restricted type of queries. This is in contrast with the general-purpose anonymized data resulting from k-anonymity mechanisms, which also focus on preserving data utility. In this paper, we show that a synergy between differential privacy and k-anonymity can be found when the objective is to release anonymized data: k-anonymity can help improve the utility of the differentially private release. Specifically, we show that the amount of noise required to fulfill ε-differential privacy can be reduced if noise is added to a k-anonymous version of the data set, where k-anonymity is reached through a specially designed microaggregation of all attributes. As a result of noise reduction, the analytical utility of the anonymized output data set is increased. The theoretical benefits of our proposal are illustrated in a practical setting with an empirical evaluation on a reference data set.
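The noise-reduction idea can be sketched numerically. In the simplified illustration below, the grouping is naive univariate sorting rather than the paper's specially designed multivariate microaggregation, and delta/k stands in for the idealized reduced sensitivity:

import math
import random

def microaggregate(values, k):
    """Sort the values and replace each consecutive group of k with its mean."""
    s = sorted(values)
    out = []
    for i in range(0, len(s), k):
        group = s[i:i + k]
        out.extend([sum(group) / len(group)] * len(group))
    return out

def laplace_noise(scale):
    """One sample from Laplace(0, scale) via inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

ages = [23, 27, 24, 52, 55, 51, 33, 38, 31]
k, eps, delta = 3, 1.0, 100.0  # delta: assumed range (sensitivity) of Age

# Each released value is now the mean of a group of k records, so one
# individual's influence drops from delta to roughly delta / k, and the
# Laplace noise scale shrinks accordingly.
noisy = [v + laplace_noise((delta / k) / eps) for v in microaggregate(ages, k)]
print(noisy)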
Journal of Intelligent Information Systems, 2009
Privacy preservation is an important issue in the release of data for mining purposes. The k-anonymity model has been introduced for protecting individual identification. Recent studies show that a more sophisticated model is necessary to protect the association of individuals to sensitive information. In this paper, we propose an (α, k)-anonymity model to protect both identifications and relationships to sensitive information in data. We discuss the properties of the (α, k)-anonymity model. We prove that the optimal (α, k)-anonymity problem is NP-hard. We first present an optimal global-recoding method for the (α, k)-anonymity problem. Next we propose two local-recoding algorithms, which are both more scalable and result in less data distortion. Their effectiveness and efficiency are demonstrated by experiments. We also describe how the model can be extended to more general cases.
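A direct Python check of the definition (attribute names are illustrative; α bounds the relative frequency of every sensitive value inside each equivalence class):

from collections import defaultdict

def is_alpha_k_anonymous(table, quasi_identifiers, sensitive, alpha, k):
    """(α, k)-anonymity: every equivalence class has size >= k, and within
    each class the relative frequency of each sensitive value is <= α."""
    groups = defaultdict(list)
    for row in table:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row[sensitive])
    for values in groups.values():
        if len(values) < k:
            return False
        if any(values.count(v) / len(values) > alpha for v in set(values)):
            return False
    return True

release = [
    {"Age": "20-29", "Disease": "flu"},
    {"Age": "20-29", "Disease": "cold"},
    {"Age": "20-29", "Disease": "flu"},
    {"Age": "20-29", "Disease": "asthma"},
]
print(is_alpha_k_anonymous(release, ["Age"], "Disease", alpha=0.5, k=2))  # True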
2008 8th IEEE International Conference on Computer and Information Technology, 2008
Publishing data for analysis from a microdata table containing sensitive attributes, while maintaining individual privacy, is a problem of increasing significance today. The k-anonymity model was proposed for privacy-preserving data publication. While focusing on identity disclosure, the k-anonymity model fails to protect against attribute disclosure to some extent. Many efforts have recently been made to enhance the k-anonymity model. In this paper, we propose a new privacy protection model called (p+, α)-sensitive k-anonymity, where sensitive attributes are first partitioned into categories by their sensitivity, and then the categories that sensitive attributes belong to are published. Different from previous enhanced k-anonymity models, this model allows us to release a lot more information without compromising privacy. We also provide testing and heuristic generating algorithms. Experimental results show that our introduced model can significantly reduce the privacy breach.
Lecture Notes in Computer Science, 2015
The problem of the release of anonymized microdata is an important topic in the fields of statistical disclosure control (SDC) and privacy-preserving data publishing (PPDP), and yet it remains insufficiently solved. In these research fields, k-anonymity has been widely studied as an anonymity notion, mainly for deterministic anonymization algorithms, and some probabilistic relaxations have been developed. However, they are not sufficient due to their limitations, i.e., being weaker than the original k-anonymity or requiring strong parametric assumptions. First we propose Pk-anonymity, a new probabilistic k-anonymity, and prove that Pk-anonymity is a mathematical extension of k-anonymity rather than a relaxation. Furthermore, Pk-anonymity requires no parametric assumptions. This property is significant in that it enables us to compare the privacy levels of probabilistic microdata release algorithms with those of deterministic ones. Second, we apply Pk-anonymity to the post-randomization method (PRAM), which is an SDC algorithm based on randomization. PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can control PRAM's parameters so that Pk-anonymity is satisfied. On the other hand, PRAM is also known to satisfy ε-differential privacy, a recent popular and strong privacy notion. This fact means that our results significantly enhance PRAM, since they imply the satisfaction of both important notions: k-anonymity and ε-differential privacy.
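PRAM itself is easy to illustrate. Below is a toy "keep with probability p, otherwise switch uniformly" transition rule (the categories and p are assumptions; the paper's contribution is choosing such parameters so that Pk-anonymity, and hence the stated guarantees, hold):

import random

categories = ["flu", "cold", "asthma"]  # illustrative sensitive categories

def pram(value, p=0.8):
    """Post-randomization: keep the value with probability p, otherwise
    replace it with a uniformly chosen different category."""
    if random.random() < p:
        return value
    return random.choice([c for c in categories if c != value])

column = ["flu", "flu", "cold", "asthma", "cold"]
print([pram(v) for v in column])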
Classification is a fundamental problem in data analysis. Training a classifier requires accessing a large collection of data. Releasing person-specific data, such as customer data or patient records, may pose a threat to individuals' privacy. Even after removing explicit identifying information such as Name and SSN, it is still possible to link released records back to their identities by matching some combination of non-identifying attributes such as {Sex, Zip, Birth date}. A useful approach to combat such linking attacks, called k-anonymization, is to anonymize the linking attributes so that at least k released records match each value combination of the linking attributes. Previous work attempted to find an optimal k-anonymization that minimizes some data distortion metric. We argue that minimizing the distortion to the training data is not relevant to the classification goal, which requires extracting the structure of prediction on "future" data. In this paper, we propose an anonymization solution for classification. Our goal is to find an anonymization, not necessarily optimal in the sense of minimizing data distortion, that preserves the classification structure. We conducted intensive experiments to evaluate the impact of anonymization on classification of future data. Experiments on real-life data show that the quality of classification can be preserved even for highly restrictive anonymity requirements.
ArXiv, 2017
The explosion in volume and variety of data offers enormous potential for research and commercial use. Increased availability of personal data is of particular interest in enabling highly customised services tuned to individual needs. Preserving the privacy of individuals against reidentification attacks in this fast-moving ecosystem poses significant challenges for a one-size-fits-all approach to anonymisation. In this paper we present (k,ε)-anonymisation, an approach that combines the k-anonymisation and ε-differential privacy models into a single coherent framework, providing privacy guarantees at least as strong as those offered by the individual models. Linking risks of less than 5% are observed in experimental results, even with modest values of k and ε. Our approach is shown to address well-known limitations of k-anonymity and ε-differential privacy and is validated in an extensive experimental campaign using openly available datasets.
The k-anonymity privacy model for publishing microdata requires that each equivalence class contains at least k records. Many authors have shown that k-anonymity cannot prevent attribute disclosure. The technique of ℓ-diversity has been introduced to address this; ℓ-diversity requires that each equivalence class have at least ℓ well-represented values for every sensitive attribute. In this paper, we show that ℓ-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new privacy notion called closeness. We first present the base model, t-closeness, which requires that the distribution of a sensitive attribute in each equivalence class be close to the distribution of the attribute in the overall table (i.e., the difference between the two distributions should be no more than a threshold value t). We then describe a more flexible variant of t-closeness that gives higher utility. We describe how to design a distance measure between two probability distributions and give two such measures. We discuss how to implement closeness as a privacy requirement and illustrate its advantages through examples and experiments.
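A minimal t-closeness check in Python (illustrative: total variation distance is used here as the distance between distributions, whereas the measures in the t-closeness literature are built on the Earth Mover's Distance):

from collections import Counter, defaultdict

def distribution(values):
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def is_t_close(table, quasi_identifiers, sensitive, t):
    """t-closeness: each equivalence class's sensitive-value distribution
    lies within distance t of the overall distribution."""
    overall = distribution([row[sensitive] for row in table])
    groups = defaultdict(list)
    for row in table:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row[sensitive])
    return all(total_variation(distribution(vals), overall) <= t
               for vals in groups.values())

release = [
    {"Age": "20-29", "Disease": "flu"},
    {"Age": "20-29", "Disease": "cold"},
    {"Age": "50-59", "Disease": "flu"},
    {"Age": "50-59", "Disease": "cold"},
]
print(is_t_close(release, ["Age"], "Disease", t=0.1))  # True: classes match the overall distribution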
2009 IEEE International Conference on Data Mining Workshops, 2009
The k-anonymization method is a commonly used privacy-preserving technique. Previous studies used various measures of utility that aim at enhancing the correlation between the original public data and the generalized public data. Bearing in mind that a primary goal in releasing the anonymized database for data mining is to deduce methods of predicting the private data from the public data, we propose a new information-theoretic measure that aims at enhancing the correlation between the generalized public data and the private data. Such a measure significantly enhances the utility of the released anonymized database for data mining. We then proceed to describe a new and highly efficient algorithm that is designed to achieve k-anonymity with high utility. That algorithm is based on a modified version of sequential clustering, which is the method of choice in clustering, and it is independent of the underlying measure of utility.
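A toy rendering of the sequential-clustering loop (illustrative: the cost function below is a simple numeric spread rather than the paper's information-theoretic measure, and the repair step for clusters that fall below size k is omitted):

import random

def info_loss(cluster):
    """Illustrative per-cluster cost: the spread of each numeric attribute,
    weighted by the cluster size (any utility measure can be plugged in)."""
    if not cluster:
        return 0.0
    spread = sum(max(r[c] for r in cluster) - min(r[c] for r in cluster)
                 for c in range(len(cluster[0])))
    return spread * len(cluster)

def total_loss(rows, assign, i, c, n_clusters):
    """Total loss if record i were moved to cluster c."""
    trial = assign[:i] + [c] + assign[i + 1:]
    clusters = [[r for j, r in enumerate(rows) if trial[j] == cl]
                for cl in range(n_clusters)]
    return sum(info_loss(cl) for cl in clusters)

def sequential_clustering(rows, k, rounds=10):
    """Random initial partition into n // k clusters, then repeated sweeps
    that move each record to the cluster minimizing the total loss."""
    n_clusters = max(1, len(rows) // k)
    assign = [i % n_clusters for i in range(len(rows))]
    random.shuffle(assign)
    for _ in range(rounds):
        for i in range(len(rows)):
            assign[i] = min(range(n_clusters),
                            key=lambda c: total_loss(rows, assign, i, c, n_clusters))
    return assign

rows = [(23,), (27,), (24,), (52,), (55,), (51,)]
print(sequential_clustering(rows, k=3))  # e.g. [0, 0, 0, 1, 1, 1]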
Journal of Communications Software and Systems, 2020
The Open Science movement has enabled extensive knowledge sharing by making research publications, software, data and samples available to society and researchers. The demand for data sharing is increasing day by day due to the tremendous knowledge hidden in the digital data generated by humans and machines. However, data cannot be published as such due to the information leaks that can occur by linking the published data with other publicly available datasets or with the help of some background knowledge. Various anonymization techniques have been proposed by researchers for privacy-preserving sensitive data publishing. This paper proposes a (k,n,m) anonymity approach for sensitive data publishing by making use of the traditional k-anonymity technique. The selection of quasi-identifiers is automated in this approach using graph-theoretic algorithms and is further enhanced by choosing similar quasi-identifiers based on the derived and composite attributes. The usual method of choosing a single value of 'k' is modified in this technique by selecting different values of 'k' for the same dataset based on the risk of exposure and the sensitivity rank of the sensitive attributes. The proposed anonymity approach can be used for sensitive big data publishing after applying a few extension mechanisms. Experimental results show that the proposed technique is practical and can be implemented efficiently on a plethora of datasets.
IET Information Security, 2018
Individual privacy protection in released data sets has become an important issue in recent years. The release of microdata provides a significant information resource for researchers, whereas the release of person-specific data poses a threat to individual privacy. Unfortunately, microdata can be linked with publicly available information to exactly re-identify individuals' identities. In order to relieve privacy concerns, data has to be protected with a privacy protection mechanism before its disclosure. The k-anonymity model is an important method in privacy protection to reduce the risk of re-identification in microdata release. This model necessitates the indistinguishability of each tuple from at least k − 1 other tuples in the released data. While k-anonymity preserves the truthfulness of the released data, the privacy level of anonymisation is the same for each individual. However, different individuals have different privacy needs in the real world. Therefore, personalisation plays an important role in supporting the notion of individual privacy protection. This study proposes a personalised anonymity model that provides distinct privacy levels for each individual by allowing them to control their anonymity in the released data. To satisfy the personal anonymity requirements with low information loss, the authors introduce a clustering-based algorithm.