Using k-anonymization for registry data: pitfalls and alternatives

Sten Anspal; Mart Kaska; Indrek Seppo

doi:10.12697/ACUTM.2017.21.05

Authors

Sten Anspal, Estonian Centre for Applied Research, Tallinn
Mart Kaska, Estonian Centre for Applied Research, Tallinn
Indrek Seppo, Estonian Centre for Applied Research, Tallinn

Keywords:

privacy-preserving computing, k-anonymization

Abstract

We describe an applied study of ICT students' employment in Estonia based on data from two national registries. The study offered an opportunity to compare results obtained from k-anonymized data with those obtained on the novel Sharemind platform for privacy-preserving statistical computing, which offers a way to use confidential data for research without loss of information. The comparison of results from k-anonymized and lossless data indicates substantial differences in estimates of students' employment rates. The results illustrate, on the basis of a real-world study, how k-anonymization can lead to considerable bias in estimates. While privacy-preserving computing does entail inconveniences because the original microdata are not revealed to the statistician, these can be offset by greater confidence in the results.
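To make the bias mechanism concrete, the following is a minimal sketch of one way k-anonymization can distort an employment-rate estimate. It is not the procedure used in the study: it assumes k-anonymity is enforced purely by suppression (dropping records whose quasi-identifier combination is shared by fewer than k people), and the records, the field names `curriculum`, `birth_year` and `employed`, and the helper functions are all invented for illustration.

```python
from collections import Counter


def k_anonymize_by_suppression(records, quasi_identifiers, k):
    """Keep only records whose quasi-identifier combination occurs at least k times.

    Assumption: suppression is the only anonymization step (no generalization).
    """
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]


def employment_rate(records):
    """Share of records with employed == 1."""
    return sum(r["employed"] for r in records) / len(records)


# Invented toy data: one large group with low employment and
# several small groups with high employment.
records = (
    [{"curriculum": "A", "birth_year": 1995, "employed": e} for e in (1, 0, 0, 0)]
    + [{"curriculum": "B", "birth_year": 1990, "employed": e} for e in (1, 1)]
    + [{"curriculum": "C", "birth_year": 1988, "employed": 1}]
    + [{"curriculum": "D", "birth_year": 1985, "employed": 1}]
)

full_rate = employment_rate(records)
anonymized = k_anonymize_by_suppression(records, ["curriculum", "birth_year"], k=3)
anon_rate = employment_rate(anonymized)

print(f"rate on full data:         {full_rate:.3f}")
print(f"rate on k-anonymized data: {anon_rate:.3f}")
```

In this toy data the suppressed records are not a random subset: the small groups happen to have higher employment, so the estimate from the k-anonymized data is lower than the one from the full data. The direction and size of any such bias in real registry data depend on how the rare quasi-identifier combinations relate to the outcome.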
