The 4th International Workshop on

Privacy and Anonymity in the Information Society (PAIS)

March 25, 2011, Uppsala (Sweden)

Collocated with EDBT/ICDT 2011

Invited Talk
Dr. Yücel Saygin  is the Keynote Speaker for PAIS 2011.

Title: A Probabilistic Look Ahead of Anonymization

Abstract: Data anonymization is an expensive process, and sometimes the utility of the anonymized data may not justify the cost of anonymization. For example in a distributed setting where the data reside at different sites and needs to be anonymized without a trusted server, Secure Multiparty Computation (SMC) protocols need to be employed. However, the cost of SMC protocols could be prohibitive, and therefore the parties may want to look ahead of anonymization to decide if it is worth running the expensive SMC protocols. In this work, we describe a probabilistic fast look ahead of k-anonymization of horizontally partitioned data. The look ahead returns an upper bound on the probability that k-anonymity will be achieved at a certain utility where the utility is quantified by commonly used metrics from the anonymization literature. The look ahead process exploits prior information such as total data size, attribute distributions, or attribute correlations, all of which require simple SMC operations to compute. More specifically, given only statistics on the private dataset, we show how to calculate the probability that a mapping of values to generalizations will make a private dataset k-anonymous.