Paper

Authors: Gabrielle Karine Canalle; Bernadette Farias Lóscio; Ana Carolina Salgado

Affiliation: Federal University of Pernambuco, Brazil

ISBN: 978-989-758-247-9

Keyword(s): Attribute Selection, Entity Resolution, Data Integration.

Related Ontology Subjects/Areas/Topics: Coupling and Integrating Heterogeneous Data Sources; Databases and Information Systems Integration; Enterprise Information Systems

Abstract: Data integration is an essential task for achieving a unified view of data stored in heterogeneous and distributed data sources. A key step in this process is Entity Resolution, which consists of identifying instances that refer to the same real-world entity. In general, similarity functions are used to discover equivalent instances. The quality of the Entity Resolution result is directly affected by the set of attributes selected for comparison. However, selecting these attributes can be challenging. In this context, this work proposes a strategy for selecting the relevant attributes to be considered in the Entity Resolution process, more precisely in the instance matching phase. The strategy considers characteristics of the attributes, such as the quantity of duplicated and null values, in order to identify the most relevant ones for the instance matching process. In our experiments, the proposed strategy achieved good results for the Entity Resolution process: the attributes classified as relevant were the ones that contributed to finding the greatest number of true matches with few incorrect matches.
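
The idea of profiling attributes by their null and duplicated values can be illustrated with a minimal sketch. This is not the authors' strategy: the function names, the sample records, and the scoring formula (which assumes that fewer nulls and fewer repeated values make an attribute more useful for matching) are all hypothetical and serve only to show the kind of statistics the abstract mentions.

```python
from collections import Counter


def attribute_profile(records, attribute):
    """Return (null_ratio, duplicate_ratio) for one attribute.

    null_ratio: share of records whose value is missing (None or "").
    duplicate_ratio: share of non-null values that occur more than once.
    """
    values = [r.get(attribute) for r in records]
    total = len(values)
    non_null = [v for v in values if v not in (None, "")]
    null_ratio = (total - len(non_null)) / total if total else 0.0
    counts = Counter(non_null)
    duplicated = sum(c for c in counts.values() if c > 1)
    duplicate_ratio = duplicated / len(non_null) if non_null else 0.0
    return null_ratio, duplicate_ratio


def rank_attributes(records, attributes):
    """Rank attributes by a hypothetical relevance score (assumption)."""
    scored = []
    for name in attributes:
        null_ratio, duplicate_ratio = attribute_profile(records, name)
        # Assumed score, not taken from the paper: penalize both
        # missing values and repeated values.
        score = (1 - null_ratio) * (1 - duplicate_ratio)
        scored.append((name, round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample data for illustration only.
records = [
    {"name": "Ana Silva", "city": "Recife", "phone": None},
    {"name": "Ana C. Silva", "city": "Recife", "phone": "555-0101"},
    {"name": "Bruno Costa", "city": "Olinda", "phone": None},
    {"name": "Carla Dias", "city": "Caruaru", "phone": None},
]

ranking = rank_attributes(records, ["name", "city", "phone"])
for name, score in ranking:
    print(name, score)
```

On this toy data, an attribute that is mostly null or mostly repeated scores lower than one with distinct, populated values. A real strategy would need thresholds and weights validated against labeled matches, which the abstract does not specify.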

CC BY-NC-ND 4.0

Paper citation in several formats:
Karine Canalle, G.; Lóscio, B. and Salgado, A. (2017). A Strategy for Selecting Relevant Attributes for Entity Resolution in Data Integration Systems. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 3: ICEIS, ISBN 978-989-758-247-9, pages 80-88. DOI: 10.5220/0006316100800088

@conference{iceis17,
author={Gabrielle Karine Canalle and Bernadette Farias Lóscio and Ana Carolina Salgado},
title={A Strategy for Selecting Relevant Attributes for Entity Resolution in Data Integration Systems},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 3: ICEIS},
year={2017},
pages={80-88},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006316100800088},
isbn={978-989-758-247-9},
}

TY - CONF

JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 3: ICEIS
TI - A Strategy for Selecting Relevant Attributes for Entity Resolution in Data Integration Systems
SN - 978-989-758-247-9
AU - Karine Canalle, G.
AU - Lóscio, B.
AU - Salgado, A.
PY - 2017
SP - 80
EP - 88
DO - 10.5220/0006316100800088
