Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases

Fatma Abdelhedi, Rym Jemmali, Gilles Zurfluh

2022

Abstract

There is a growing need to collect and analyze data from different databases. Our work is part of a medical application that must allow health professionals to analyze complex data for decision making. We propose mechanisms to extract data from a data lake and store them in a NoSQL data warehouse. This will then allow decision-support analysis that benefits from the features offered by NoSQL systems (rich data structures, query language, access performance). In this paper, we present a process to ingest data from a data lake into a warehouse. The ingestion consists of (1) transferring the NoSQL DBs extracted from the data lake into a single NoSQL DB (the warehouse), (2) merging so-called "similar" classes, and (3) converting the links into references between objects. An experiment was performed for a medical application.
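The three ingestion steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: databases are modelled as plain dictionaries of collections, and the names `transfer`, `merge_similar`, `links_to_references`, the `similar` mapping, and the `links` table are all hypothetical. How "similar" classes are detected is not stated in the abstract, so the sketch assumes the mapping is supplied by hand. The `$ref`/`$id` shape of the generated references follows the MongoDB DBRef convention, which is also an assumption here.

```python
from collections import defaultdict


def transfer(source_dbs):
    """Step 1: copy every collection of every source DB into one warehouse."""
    warehouse = {}
    for db_name, collections in source_dbs.items():
        for coll_name, docs in collections.items():
            # Prefix with the DB name so same-named collections do not collide.
            warehouse[f"{db_name}.{coll_name}"] = [dict(d) for d in docs]
    return warehouse


def merge_similar(warehouse, similar):
    """Step 2: merge collections declared "similar" into one target class."""
    merged = defaultdict(list)
    for name, docs in warehouse.items():
        # Collections absent from the mapping keep their own name.
        merged[similar.get(name, name)].extend(docs)
    return dict(merged)


def links_to_references(warehouse, links):
    """Step 3: replace a link value (a key of the target) by a reference."""
    for (coll, field), (target, key) in links.items():
        # Index the target class by the linking key.
        index = {d[key]: d["_id"] for d in warehouse[target]}
        for doc in warehouse[coll]:
            if doc.get(field) in index:
                doc[field] = {"$ref": target, "$id": index[doc[field]]}
    return warehouse


# Hypothetical sample data: two source DBs extracted from the data lake.
source_dbs = {
    "db1": {"patients": [{"_id": 1, "ssn": "A1", "name": "Ann"}]},
    "db2": {
        "patient": [{"_id": 2, "ssn": "B2", "name": "Bob"}],
        "visits": [{"_id": 10, "patient_ssn": "A1"}],
    },
}
similar = {
    "db1.patients": "Patient",
    "db2.patient": "Patient",
    "db2.visits": "Visit",
}
links = {("Visit", "patient_ssn"): ("Patient", "ssn")}

wh = links_to_references(merge_similar(transfer(source_dbs), similar), links)
```

With this sample data, the two patient collections end up in a single `Patient` class, and the `patient_ssn` value of the visit is replaced by a reference to the matching patient document.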


Paper Citation


in Harvard Style

Abdelhedi F., Jemmali R. and Zurfluh G. (2022). Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-758-569-2, pages 226-233. DOI: 10.5220/0011068300003179


in Bibtex Style

@conference{iceis22,
author={Fatma Abdelhedi and Rym Jemmali and Gilles Zurfluh},
title={Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2022},
pages={226-233},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011068300003179},
isbn={978-989-758-569-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases
SN - 978-989-758-569-2
AU - Abdelhedi F.
AU - Jemmali R.
AU - Zurfluh G.
PY - 2022
SP - 226
EP - 233
DO - 10.5220/0011068300003179