
Paper

Authors: Ginés Almagro-Hernández (1, 2) and Jesualdo Fernández-Breis (2, 1)

Affiliations: (1) Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, Murcia, Spain; (2) IMIB-Pascual Parrilla, Murcia, 30100, Spain

Keyword(s): Knowledge Engineering, Schema Inference, Functional Probability.

Abstract: In the information age, tabular data often lacks explicit semantic metadata, which makes it hard to infer the underlying schema, particularly when no prior information is available. Existing methodologies often assume perfect data or require supervised training, which limits their applicability in real-world scenarios. The relational database model uses functional dependencies (FDs) to support normalization tasks. However, applying strict FDs directly to real-world data is problematic because of inconsistencies, errors, and missing values. Previous proposals, such as fuzzy functional dependencies (FFDs), have shown weaknesses, including a lack of clear semantics and ambiguous benefits for database design. This article proposes the concept of functional probability (FP), a novel approach for quantifying the probability that a functional dependency exists in incomplete and uncertain data, in support of semantic schema inference. FP measures the probability that a randomly selected tuple satisfies the functional dependency with respect to the most frequent association observed. Based on Codd's relational model and Armstrong's axioms, this methodology allows a minimal, non-redundant set of FDs to be inferred, with weak candidates filtered out by probability thresholds. The method has been evaluated on two tabular datasets, yielding expected results that demonstrate its applicability. This approach overcomes the limitations of strict dependencies, which are binary, and of FFDs, which lack clear semantics, offering a robust analysis of data quality and the inference of more realistic and fault-tolerant database structures.
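The abstract's description of FP (the probability that a randomly selected tuple agrees with the most frequent association observed) can be illustrated with a minimal Python sketch. This is one plausible reading of that sentence, not the authors' implementation: the function names `functional_probability` and `filter_candidates`, the choice to skip rows with missing values, the restriction to single-attribute left-hand sides, and the default threshold of 0.9 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def functional_probability(rows, lhs, rhs):
    """Estimate the share of rows that agree with the most frequent
    rhs value observed for their lhs value (hypothetical helper)."""
    groups = defaultdict(Counter)
    total = 0
    for row in rows:
        x = tuple(row.get(a) for a in lhs)
        y = row.get(rhs)
        # Assumption: rows with a missing value on either side are skipped.
        if y is None or any(v is None for v in x):
            continue
        groups[x][y] += 1
        total += 1
    if total == 0:
        return 0.0
    # For each lhs value, count the rows carrying its dominant rhs value.
    agreeing = sum(max(counter.values()) for counter in groups.values())
    return agreeing / total


def filter_candidates(rows, attributes, threshold=0.9):
    """Keep single-attribute candidates lhs -> rhs whose estimate reaches
    the threshold (the threshold value is an assumption)."""
    kept = []
    for lhs in attributes:
        for rhs in attributes:
            if lhs == rhs:
                continue
            fp = functional_probability(rows, [lhs], rhs)
            if fp >= threshold:
                kept.append((lhs, rhs, fp))
    return kept
```

Under these assumptions, a strict FD corresponds to an estimate of exactly 1, while a single misspelled value lowers the estimate in proportion to the affected rows instead of rejecting the dependency outright. Deriving a minimal, non-redundant cover via Armstrong's axioms is outside the scope of this sketch.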

CC BY-NC-ND 4.0

Paper citation in several formats:
Almagro-Hernández, G. and Fernández-Breis, J. (2025). Inferring Semantic Schemas on Tabular Data Using Functional Probabilities. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KEOD; ISBN 978-989-758-769-6; ISSN 2184-3228, SciTePress, pages 156-163. DOI: 10.5220/0013773000004000

@conference{keod25,
author={Ginés Almagro{-}Hernández and Jesualdo Fernández{-}Breis},
title={Inferring Semantic Schemas on Tabular Data Using Functional Probabilities},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KEOD},
year={2025},
pages={156-163},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013773000004000},
isbn={978-989-758-769-6},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KEOD
TI - Inferring Semantic Schemas on Tabular Data Using Functional Probabilities
SN - 978-989-758-769-6
IS - 2184-3228
AU - Almagro-Hernández, G.
AU - Fernández-Breis, J.
PY - 2025
SP - 156
EP - 163
DO - 10.5220/0013773000004000
PB - SciTePress