Towards Practical k-Anonymization: Correlation-based Construction of Generalization Hierarchy

Tomoaki Mimoto, Anirban Basu, Shinsaku Kiyomoto


When sensitive datasets are published, the privacy of the individuals they contain must be preserved. Anonymization algorithms such as k-anonymization have been proposed to reduce the risk that individuals in a dataset are identified. k-anonymization, the most common of these techniques, modifies attribute values in a dataset until every record is indistinguishable from at least k-1 other records. Many algorithms can achieve k-anonymity. However, existing algorithms suffer from information loss because of the tradeoff between data quality and anonymity. In this paper, we propose a novel method of constructing a generalization hierarchy for k-anonymization algorithms. Our method analyses the correlation between attributes and generates an optimal hierarchy according to that correlation. The effect of the proposed scheme has been verified using actual data: the average k of the datasets is 83.14, which is around 1/3 of the value obtained by conventional methods.
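
To make the notion of k concrete, the following is a minimal Python sketch of how the k of a dataset can be measured before and after generalization. It is an illustration only and not the proposed method: the toy records, the `generalize` function with its hand-written one-level hierarchy, and the `anonymity_k` helper are all assumptions introduced here, and the hierarchy is fixed by hand rather than derived from attribute correlation.

```python
from collections import Counter

# Hypothetical toy records (age, ZIP code); not taken from the paper's data.
records = [
    (23, "10115"),
    (27, "10117"),
    (25, "10119"),
    (41, "20095"),
    (45, "20097"),
    (48, "20099"),
]


def generalize(record, level):
    """Apply one level of a hand-written generalization hierarchy (an assumption)."""
    age, zip_code = record
    if level == 0:
        return (age, zip_code)
    # Level 1: ages become 10-year ranges, ZIP codes keep only 3 leading digits.
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zip_code[:3] + "**")


def anonymity_k(rows):
    """Return the size of the smallest group of identical rows."""
    return min(Counter(rows).values())


k_raw = anonymity_k(records)
k_generalized = anonymity_k([generalize(r, 1) for r in records])
print(k_raw, k_generalized)
```

In this toy case every raw record is unique, while one level of generalization merges the records into two groups of three. The coarser the hierarchy level, the larger k becomes and the more detail is lost, which is the tradeoff between data quality and anonymity described above.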


