Scalable k-anonymous Microaggregation: Exploiting the Tradeoff between Computational Complexity and Information Loss

Florian Thaeter, Rüdiger Reischuk

2021

Abstract

k-anonymous microaggregation is a standard technique to improve the privacy of individuals whose personal data is used in microdata databases. Unlike semantic privacy requirements such as differential privacy, k-anonymity allows the unrestricted publication of data suitable for all kinds of analysis, since every individual is hidden in a cluster of size at least k. Compared to other anonymization techniques such as generalization or suppression, microaggregation can preserve a high level of utility, that is, the aggregation procedure causes only a small information loss. Minimizing the information loss in k-anonymous microaggregation is an NP-hard clustering problem for k ≥ 3. Moreover, no efficient approximation algorithms with a nontrivial approximation ratio are known. Therefore, a variety of heuristics have been developed to retain high utility, all with at least quadratic time complexity in the size of the database. We improve this situation in several respects, providing a tradeoff between computational effort and utility. First, a quadratic time algorithm ONA* is presented that achieves significantly better utility on standard benchmarks. Next, an almost linear time algorithm is developed that gives worse, but still acceptable, utility. This is achieved by a suitable adaptation of the Mondrian clustering algorithm. Finally, combining both techniques, a new class MONA of parameterized algorithms is designed that deliver competitive utility for user-specified time constraints between almost linear and quadratic.
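To make the setting concrete, the following is a minimal Python sketch of k-anonymous microaggregation using a simple Mondrian-style median split. It is not the ONA*, MONA, or adapted Mondrian algorithm of the paper; the function names, the choice of the widest attribute as split dimension, and the use of SSE/SST as the information loss measure are assumptions made for illustration only.

```python
from statistics import mean


def mondrian_clusters(records, k):
    """Split at the median of the widest attribute while both halves keep >= k records."""
    if len(records) < 2 * k:
        return [records]
    dims = range(len(records[0]))
    # Assumption: split along the attribute with the largest value range.
    d = max(dims, key=lambda j: max(r[j] for r in records) - min(r[j] for r in records))
    ordered = sorted(records, key=lambda r: r[d])
    mid = len(ordered) // 2  # both halves have at least k records
    return mondrian_clusters(ordered[:mid], k) + mondrian_clusters(ordered[mid:], k)


def centroid(cluster):
    """Attribute-wise mean of a non-empty list of numeric tuples."""
    return tuple(mean(col) for col in zip(*cluster))


def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def microaggregate(records, k):
    """Return (released records, information loss), with loss assumed to be SSE/SST."""
    if len(records) < k:
        raise ValueError("need at least k records")
    released, sse = [], 0.0
    for cluster in mondrian_clusters(records, k):
        c = centroid(cluster)
        # Every record in the cluster is published as the cluster centroid.
        released.extend([c] * len(cluster))
        sse += sum(sq_dist(r, c) for r in cluster)
    g = centroid(records)
    sst = sum(sq_dist(r, g) for r in records)
    return released, (sse / sst if sst > 0 else 0.0)


# Hypothetical toy data: six records with two numeric attributes.
data = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]
released, loss = microaggregate(data, 3)
print(released)
print(loss)
```

Each released record is shared by at least k original records, and the loss lies between 0 (no distortion) and 1 (all records replaced by the global mean). The recursion sorts at every level, so this sketch runs in roughly O(n log² n) time, well below the quadratic cost of the heuristics discussed above, but typically with higher information loss.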


Paper Citation


in Harvard Style

Thaeter F. and Reischuk R. (2021). Scalable k-anonymous Microaggregation: Exploiting the Tradeoff between Computational Complexity and Information Loss. In Proceedings of the 18th International Conference on Security and Cryptography - Volume 1: SECRYPT, ISBN 978-989-758-524-1, pages 87-98. DOI: 10.5220/0010536600870098


in Bibtex Style

@conference{secrypt21,
author={Florian Thaeter and Rüdiger Reischuk},
title={Scalable k-anonymous Microaggregation: Exploiting the Tradeoff between Computational Complexity and Information Loss},
booktitle={Proceedings of the 18th International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2021},
pages={87-98},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010536600870098},
isbn={978-989-758-524-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 18th International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Scalable k-anonymous Microaggregation: Exploiting the Tradeoff between Computational Complexity and Information Loss
SN - 978-989-758-524-1
AU - Thaeter F.
AU - Reischuk R.
PY - 2021
SP - 87
EP - 98
DO - 10.5220/0010536600870098