Paper

Authors: Martin Franke, Ziad Sehili and Erhard Rahm

Affiliation: University of Leipzig, Germany

ISBN: 978-989-758-296-7

Keyword(s): Record Linkage, Privacy, Locality Sensitive Hashing, Blocking, Bloom Filter, Apache Flink.

Abstract: Privacy-preserving record linkage (PPRL) aims at integrating person-related data without revealing sensitive information. For this purpose, PPRL schemes typically use encoded attribute values and a trusted party for conducting the linkage. To achieve high scalability of PPRL to large datasets with millions of records, we propose parallel PPRL (P3RL) approaches that build on current distributed dataflow frameworks such as Apache Flink or Spark. The proposed P3RL approaches also include blocking for further performance improvements, in particular the use of LSH (locality sensitive hashing) that supports a flexible configuration and can be applied on encoded records. An extensive evaluation for different datasets and cluster sizes shows that the proposed LSH-based P3RL approaches achieve both high quality and high scalability. Furthermore, they clearly outperform approaches using phonetic blocking.
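
To illustrate how LSH-based blocking can operate directly on encoded records, the following minimal Python sketch applies Hamming LSH to Bloom-filter bit vectors: each blocking key is built from the bits at a fixed set of randomly sampled positions, and two records become a candidate pair if they agree on at least one key. This is a single-machine sketch under stated assumptions, not the paper's Flink-based implementation; the function names (`sample_positions`, `lsh_keys`, `candidate_pairs`) and the parameters `key_len` and `num_keys` are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def sample_positions(filter_len, key_len, num_keys, seed=42):
    """Draw num_keys groups of key_len distinct bit positions (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.sample(range(filter_len), key_len) for _ in range(num_keys)]


def lsh_keys(bits, positions):
    """Return one blocking key per position group for a 0/1 bit sequence.

    The group index is part of the key so that keys from different
    groups never collide with each other.
    """
    return [
        (group, "".join(str(bits[p]) for p in pos))
        for group, pos in enumerate(positions)
    ]


def candidate_pairs(records_a, records_b, positions):
    """Pair up records from two sources that share at least one blocking key."""
    blocks = defaultdict(lambda: ([], []))
    for rid, bits in records_a.items():
        for key in lsh_keys(bits, positions):
            blocks[key][0].append(rid)
    for rid, bits in records_b.items():
        for key in lsh_keys(bits, positions):
            blocks[key][1].append(rid)

    pairs = set()
    for left, right in blocks.values():
        for a in left:
            for b in right:
                pairs.add((a, b))
    return pairs


# Toy data: 32-bit vectors standing in for Bloom-filter encodings.
base = [1, 0, 1, 1, 0, 0, 1, 0] * 4
flipped = [1 - b for b in base]

positions = sample_positions(filter_len=32, key_len=4, num_keys=3)
pairs = candidate_pairs({"a1": base}, {"b1": base, "b2": flipped}, positions)
```

In this toy run, `a1` and `b1` have identical encodings and therefore agree on every key, while `b2` differs from `a1` in every bit and can never share a key with it. In general, a longer key (`key_len`) makes blocks more selective, and more keys (`num_keys`) raise the chance that similar records meet in at least one block.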

Paper citation in several formats:
Franke M., Sehili Z. and Rahm E. (2018). Parallel Privacy-preserving Record Linkage using LSH-based Blocking. In Proceedings of the 3rd International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS, ISBN 978-989-758-296-7, pages 195-203. DOI: 10.5220/0006682701950203

@conference{iotbds18,
author={Martin Franke and Ziad Sehili and Erhard Rahm},
title={Parallel Privacy-preserving Record Linkage using LSH-based Blocking},
booktitle={Proceedings of the 3rd International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS},
year={2018},
pages={195-203},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006682701950203},
isbn={978-989-758-296-7},
}

TY - CONF

JO - Proceedings of the 3rd International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS
TI - Parallel Privacy-preserving Record Linkage using LSH-based Blocking
SN - 978-989-758-296-7
AU - Franke M.
AU - Sehili Z.
AU - Rahm E.
PY - 2018
SP - 195
EP - 203
DO - 10.5220/0006682701950203
