Paper

Authors: Fred Ferreira and Robson do Nascimento Fidalgo

Affiliation: Center of Informatics (CIn), Federal University of Pernambuco (UFPE), Recife, PE, Brazil

Keyword(s): Data Warehouse, Distributed SQL, NewSQL, HTAP Databases, Data Modeling, Performance Analysis.

Abstract: Data Warehouses (DWs) have become an indispensable asset for companies to support strategic decision-making. In a world where enterprise data grows exponentially, however, new DW architectures are being investigated to overcome the deficiencies of traditional relational Database Management Systems (DBMS), driving a shift towards more modern, cloud-based DW solutions. To enhance efficiency and ease of use, the industry has seen the rise of next-generation analytics DBMSs, such as NewSQL, a hybrid storage class of solutions that support both complex analytical queries (OLAP) and transactional queries (OLTP). We understand that few studies explore whether the way the data is denormalized has an impact on the performance of these solutions to process OLAP queries in a distributed environment. This paper investigates the role of data modeling in the processing time and data volume of a distributed DW. The Star Schema Benchmark was used to evaluate the performance of a Star Schema and a Fully Denormalized Schema in three different market solutions: Singlestore, Amazon Redshift and MariaDB Columnstore in two different memory availability scenarios. Our results show that data denormalization is not a guarantee for improved performance, as solutions performed very differently depending on the schema. Furthermore, we also show that a hybrid-storage (HTAP) NewSQL solution can outperform an OLAP solution in terms of mean execution time.

CC BY-NC-ND 4.0

Paper citation in several formats:
Ferreira, F. and do Nascimento Fidalgo, R. (2024). A Performance Analysis for Efficient Schema Design in Cloud-Based Distributed Data Warehouses. In Proceedings of the 26th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-692-7; ISSN 2184-4992, SciTePress, pages 39-49. DOI: 10.5220/0012546200003690

@conference{iceis24,
author={Fred Ferreira and Robson {do Nascimento Fidalgo}},
title={A Performance Analysis for Efficient Schema Design in Cloud-Based Distributed Data Warehouses},
booktitle={Proceedings of the 26th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2024},
pages={39-49},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012546200003690},
isbn={978-989-758-692-7},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 26th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Performance Analysis for Efficient Schema Design in Cloud-Based Distributed Data Warehouses
SN - 978-989-758-692-7
IS - 2184-4992
AU - Ferreira, F.
AU - do Nascimento Fidalgo, R.
PY - 2024
SP - 39
EP - 49
DO - 10.5220/0012546200003690
PB - SciTePress