Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases

Fatma Abdelhedi, Rym Jemmali, Gilles Zurfluh

2021

Abstract

The exponential growth of collected data, driven by the digital transformation of companies, has pushed databases to evolve towards Big Data. Our work falls within this context and focuses on the mechanisms for extracting datasets from a Data Lake and storing them in a single Data Warehouse. This warehouse then supports decision-support analyses, which are facilitated by the features of NoSQL systems (rich data structures, query language, access performance). This article proposes an extraction mechanism that applies only to the relational databases of the Data Lake. The mechanism relies on an automatic approach based on the Model Driven Architecture (MDA), which provides a set of schema transformation rules formalized with the Query/View/Transformation (QVT) language. Starting from the physical schemas that describe the relational databases, we propose transformation rules that generate the physical model of a Data Warehouse stored on a document-oriented NoSQL system (OrientDB). This paper presents the successive steps of the transformation process, from the meta-modeling of the datasets to the application of the rules and algorithms. We report an experiment based on a case study from the health care field.
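To make the idea of a relational-to-document schema transformation more concrete, here is a minimal Python sketch. It is not the authors' QVT rule set and it does not use any OrientDB API. It assumes two simplified mapping rules chosen for illustration only: each table becomes a document class, and each foreign key column becomes a link to the referenced class. The `Table` and `ForeignKey` types, the `Patient` and `Consultation` tables, and the shape of the output dictionaries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ForeignKey:
    column: str     # referencing column in the source table
    ref_table: str  # name of the referenced table


@dataclass
class Table:
    name: str
    columns: List[str]
    foreign_keys: List[ForeignKey] = field(default_factory=list)


def table_to_class(table: Table) -> Dict[str, Any]:
    """Assumed rule 1: a table becomes a document class.

    Plain columns become properties; foreign key columns become links.
    """
    fk_targets = {fk.column: fk.ref_table for fk in table.foreign_keys}
    properties = [c for c in table.columns if c not in fk_targets]
    links = [
        {"name": column, "target_class": target}
        for column, target in fk_targets.items()
    ]
    return {"class": table.name, "properties": properties, "links": links}


def row_to_document(table: Table, row: Dict[str, Any]) -> Dict[str, Any]:
    """Assumed rule 2: a row becomes a document of the table's class.

    A non-null foreign key value is wrapped as a reference to the target class.
    """
    fk_targets = {fk.column: fk.ref_table for fk in table.foreign_keys}
    document: Dict[str, Any] = {"@class": table.name}
    for column in table.columns:
        value = row.get(column)
        if column in fk_targets and value is not None:
            document[column] = {"ref_class": fk_targets[column], "ref_key": value}
        else:
            document[column] = value
    return document


# Hypothetical health care schema, for illustration only.
patient = Table("Patient", ["id", "name"])
consultation = Table(
    "Consultation",
    ["id", "date", "patient_id"],
    [ForeignKey("patient_id", "Patient")],
)

consultation_class = table_to_class(consultation)
consultation_doc = row_to_document(
    consultation, {"id": 7, "date": "2021-03-01", "patient_id": 42}
)
```

In this sketch the schema-level function and the row-level function are kept separate, mirroring the distinction between transforming a model and migrating the data that conforms to it. A real ingestion process would also need to resolve the wrapped references into actual record links in the target system.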

Paper Citation


in Harvard Style

Abdelhedi F., Jemmali R. and Zurfluh G. (2021). Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases. In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS; ISBN 978-989-758-533-3, SciTePress, pages 64-72. DOI: 10.5220/0010690600003064


in Bibtex Style

@conference{kmis21,
author={Fatma Abdelhedi and Rym Jemmali and Gilles Zurfluh},
title={Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases},
booktitle={Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS},
year={2021},
pages={64-72},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010690600003064},
isbn={978-989-758-533-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 3: KMIS
TI - Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases
SN - 978-989-758-533-3
AU - Abdelhedi F.
AU - Jemmali R.
AU - Zurfluh G.
PY - 2021
SP - 64
EP - 72
DO - 10.5220/0010690600003064
PB - SciTePress