Paper

Authors: Lamisse F. Bouabdelli 1,2 ; Fatma Abdelhedi 2 ; Slimane Hammoudi 3 and Allel Hadjali 1

Affiliations: 1 LIAS Laboratory, ISAE-ENSMA, Poitiers, France ; 2 CBI² Research laboratory, Trimane, Paris, France ; 3 ESEO, Angers, France

Keyword(s): Data Lakes, Data Quality, Entity Resolution, Entity Matching, Machine Learning.

Abstract: Entity Resolution (ER) is a critical challenge for maintaining data quality in data lakes, aiming to identify different descriptions that refer to the same real-world entity. We address here the problem of entity resolution in data lakes, whose schema-less architecture and heterogeneous data sources often lead to entity duplication, inconsistency, and ambiguity, causing serious data quality issues. Although ER has been well studied in both academic research and industry, many state-of-the-art ER solutions face significant drawbacks. Existing ER solutions typically compare two entities based on attribute similarity, without taking into account that some attributes contribute more significantly than others in distinguishing entities. In addition, traditional validation methods that rely on human experts are often error-prone, time-consuming, and costly. We propose an efficient ER approach that leverages deep learning, knowledge graphs (KG), and large language models (LLM) to automate and enhance entity disambiguation. Furthermore, the matching task incorporates attribute weights, thereby improving accuracy. By integrating LLM for automated validation, this approach significantly reduces the reliance on manual expert verification while maintaining high accuracy.

CC BY-NC-ND 4.0

Paper citation in several formats:
Bouabdelli, L. F., Abdelhedi, F., Hammoudi, S. and Hadjali, A. (2025). An Advanced Entity Resolution in Data Lakes: First Steps. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0; ISSN 2184-285X, SciTePress, pages 661-668. DOI: 10.5220/0013643200003967

@conference{data25,
author={Lamisse F. Bouabdelli and Fatma Abdelhedi and Slimane Hammoudi and Allel Hadjali},
title={An Advanced Entity Resolution in Data Lakes: First Steps},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={661-668},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013643200003967},
isbn={978-989-758-758-0},
issn={2184-285X},
}

TY  - CONF
JO  - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI  - An Advanced Entity Resolution in Data Lakes: First Steps
SN  - 978-989-758-758-0
IS  - 2184-285X
AU  - Bouabdelli, L.
AU  - Abdelhedi, F.
AU  - Hammoudi, S.
AU  - Hadjali, A.
PY  - 2025
SP  - 661
EP  - 668
DO  - 10.5220/0013643200003967
PB  - SciTePress