Authors: Mario Mezzanzanica ; Roberto Boselli ; Mirko Cesarini and Fabio Mercorio

Affiliation: University of Milan-Bicocca, Italy

Keyword(s): Data Quality, Data Management, Cleansing Algorithms, Model-based Reasoning.

Related Ontology Subjects/Areas/Topics: Data Engineering ; Data Management and Quality ; Data Management for Analytics ; Data Structures and Data Management Algorithms ; Information Quality

Abstract: Data cleansing is growing in importance among both public and private organisations, mainly because of the significant amount of data used to support decision-making processes. This paper shows how model-based verification algorithms (namely, model checking) can help address data cleansing issues, and introduces a new benchmark problem focusing on labour market dynamics. The consistency of the data's evolution is checked against a model defined on the basis of domain knowledge. We then formally introduce the concept of a universal cleanser, i.e. an object that summarises the set of all cleansing actions for each feasible data inconsistency (according to a given consistency model), and provide an algorithm that synthesises it. The universal cleanser can be seen as a repository of corrective interventions useful for developing cleansing routines. We applied our approach to a dataset derived from Italian labour market data and made the whole dataset and the outcomes publicly available to the community, so that the results we present can be shared and compared with other techniques.
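The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or benchmark: the two-state career model, the event names, and the functions `first_inconsistency`, `shortest_repair` and `synthesise_cleanser` are all hypothetical, chosen only to show how a consistency model might be used both to detect an inconsistent event and to precompute a corrective intervention for it. Breadth-first search stands in for the model-checking exploration, and the sketch keeps only one shortest repair per inconsistency rather than the full set of cleansing actions the abstract refers to.

```python
from collections import deque

# Hypothetical toy consistency model (an assumption, not taken from the paper):
# state -> {permitted event: next state}
MODEL = {
    "unemployed": {"start": "employed"},
    "employed": {"cease": "unemployed"},
}
EVENTS = {"start", "cease"}


def first_inconsistency(events, state="unemployed"):
    """Return (index, state, event) for the first event the model forbids, or None."""
    for i, event in enumerate(events):
        if event not in MODEL[state]:
            return i, state, event
        state = MODEL[state][event]
    return None


def shortest_repair(state, event):
    """Breadth-first search for the shortest event sequence that, inserted
    before `event`, reaches a state in which `event` is permitted."""
    queue = deque([(state, [])])
    seen = {state}
    while queue:
        current, path = queue.popleft()
        if event in MODEL[current]:
            return path
        for ev, nxt in MODEL[current].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [ev]))
    return None  # no repair exists in this model


def synthesise_cleanser():
    """Map every (state, forbidden event) pair to one corrective sequence."""
    return {
        (state, event): shortest_repair(state, event)
        for state in MODEL
        for event in sorted(EVENTS)
        if event not in MODEL[state]
    }


cleanser = synthesise_cleanser()
error = first_inconsistency(["start", "start"])  # a second "start" while employed
if error is not None:
    index, state, event = error
    print(f"event {index} ({event!r} in state {state!r}): insert {cleanser[(state, event)]}")
```

Because the lookup table is built once from the model, a cleansing routine only needs to look up the corrective sequence for each inconsistency it meets, which is the "repository of corrective interventions" role described in the abstract.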



Paper citation in several formats:
Mezzanzanica, M.; Boselli, R.; Cesarini, M. and Mercorio, F. (2013). Automatic Synthesis of Data Cleansing Activities. In Proceedings of the 2nd International Conference on Data Technologies and Applications - DATA; ISBN 978-989-8565-67-9; ISSN 2184-285X, SciTePress, pages 138-149. DOI: 10.5220/0004491101380149

author={Mario Mezzanzanica and Roberto Boselli and Mirko Cesarini and Fabio Mercorio},
title={Automatic Synthesis of Data Cleansing Activities},
booktitle={Proceedings of the 2nd International Conference on Data Technologies and Applications - DATA},


JO - Proceedings of the 2nd International Conference on Data Technologies and Applications - DATA
TI - Automatic Synthesis of Data Cleansing Activities
SN - 978-989-8565-67-9
IS - 2184-285X
AU - Mezzanzanica, M.
AU - Boselli, R.
AU - Cesarini, M.
AU - Mercorio, F.
PY - 2013
SP - 138
EP - 149
DO - 10.5220/0004491101380149
PB - SciTePress