Authors: Ricardo Almeida 1 ; Paulo Maio 2 ; Paulo Oliveira 2 and João Barroso 3

Affiliations: 1 ISEP-IPP, School of Engineering of Polytechnic of Porto, Portugal ; 2 ISEP-IPP, School of Engineering of Polytechnic of Porto, GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Portugal ; 3 UTAD – University of Trás-os-Montes and Alto Douro, Portugal

ISBN: 978-989-758-158-8

Keyword(s): Data Quality, Data Cleaning, Knowledge Reuse, Vocabulary, Ontologies.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Biomedical Engineering ; Expert Systems ; Health Information Systems ; Knowledge Engineering and Ontology Development ; Knowledge Representation ; Knowledge-Based Systems ; Symbolic Systems

Abstract: Organizations' need to integrate several heterogeneous data sources, together with an ever-increasing volume of data, is exposing quality problems in data. Currently, most data cleaning approaches (for detecting and correcting data quality problems) are tailored to data sources that have the same schema and share the same data model (e.g., the relational model). Moreover, these approaches depend heavily on a domain expert to specify the data cleaning operations. This paper extends a previously proposed data cleaning methodology that reuses cleaning knowledge specified for other data sources. The methodology is further detailed and refined by specifying the requirements that a data cleaning operations vocabulary must satisfy. Ontologies in RDF/OWL are proposed as the data model for an abstract representation of data schemas, regardless of the underlying data model (e.g., relational or graph). Existing approaches, methods and techniques that support the implementation of the proposed methodology in general, and of the data cleaning operations vocabulary in particular, are also presented and discussed.

License: CC BY-NC-ND 4.0

Paper citation in several formats:
Almeida, R.; Maio, P.; Oliveira, P. and Barroso, J. (2015). An Ontology-based Methodology for Reusing Data Cleaning Knowledge. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015), ISBN 978-989-758-158-8, pages 202-211. DOI: 10.5220/0005596402020211

@conference{keod15,
author={Ricardo Almeida and Paulo Maio and Paulo Oliveira and João Barroso},
title={An Ontology-based Methodology for Reusing Data Cleaning Knowledge},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015)},
year={2015},
pages={202-211},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005596402020211},
isbn={978-989-758-158-8},
}

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015)
TI - An Ontology-based Methodology for Reusing Data Cleaning Knowledge
SN - 978-989-758-158-8
AU - Almeida, R.
AU - Maio, P.
AU - Oliveira, P.
AU - Barroso, J.
PY - 2015
SP - 202
EP - 211
DO - 10.5220/0005596402020211