Low Level Big Data Compression

Jaime Salvador-Meneses, Zoila Ruiz-Chavez, Jose Garcia-Rodriguez

2018

Abstract

In recent years, several specialized algorithms have been developed to work with categorical information. The performance of these algorithms depends on two important factors: the processing technique (the algorithm) and the representation of the information. Many machine learning algorithms depend on the information being stored in memory, local or distributed, prior to processing. Many current compression techniques do not achieve an adequate balance between compression ratio and decompression speed. In this work we propose a mechanism for storing and processing categorical information using compression at the bit level. The method compresses and decompresses by blocks, so that processing the compressed information resembles processing the original information. The proposed method allows the compressed data to be kept in memory, which drastically reduces memory consumption. The experimental results show a high compression ratio, while block decompression is very efficient. Both factors contribute to building a system with good performance.
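The general idea of bit-level, block-wise storage of categorical values can be illustrated with a minimal sketch. It is not the authors' implementation: the 64-bit block size, the fixed bit width per value, and all function names (bits_needed, pack, unpack_block, unpack) are assumptions chosen for illustration only.

```python
from typing import List

WORD_BITS = 64  # assumption: one block is a single 64-bit word


def bits_needed(num_categories: int) -> int:
    """Bits required to store codes 0..num_categories-1 (at least 1)."""
    return max(1, (num_categories - 1).bit_length())


def pack(codes: List[int], num_categories: int) -> List[int]:
    """Pack integer category codes into 64-bit blocks, lowest slot first."""
    width = bits_needed(num_categories)
    per_word = WORD_BITS // width  # values that fit in one block
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            if not 0 <= code < num_categories:
                raise ValueError(f"code out of range: {code}")
            word |= code << (slot * width)
        words.append(word)
    return words


def unpack_block(word: int, width: int, count: int) -> List[int]:
    """Decode one block independently of every other block."""
    mask = (1 << width) - 1
    return [(word >> (slot * width)) & mask for slot in range(count)]


def unpack(words: List[int], num_categories: int, length: int) -> List[int]:
    """Decode all blocks; the last block may be only partially filled."""
    width = bits_needed(num_categories)
    per_word = WORD_BITS // width
    out = []
    for i, word in enumerate(words):
        count = min(per_word, length - i * per_word)
        out.extend(unpack_block(word, width, count))
    return out


if __name__ == "__main__":
    codes = [0, 1, 2, 3, 4, 0, 1]  # a column with 5 distinct categories
    blocks = pack(codes, num_categories=5)
    print(len(blocks), unpack(blocks, 5, len(codes)) == codes)
```

Under these assumptions, a column with 5 categories needs 3 bits per value, so 21 values fit in one 64-bit block. Because each block decodes on its own, an algorithm can iterate over the compressed column one block at a time without materializing the whole uncompressed column.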

Paper Citation


in Harvard Style

Salvador-Meneses J., Ruiz-Chavez Z. and Garcia-Rodriguez J. (2018). Low Level Big Data Compression. In Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR; ISBN 978-989-758-330-8, SciTePress, pages 353-358. DOI: 10.5220/0007228003530358


in Bibtex Style

@conference{kdir18,
author={Jaime Salvador-Meneses and Zoila Ruiz-Chavez and Jose Garcia-Rodriguez},
title={Low Level Big Data Compression},
booktitle={Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR},
year={2018},
pages={353-358},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007228003530358},
isbn={978-989-758-330-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 1: KDIR
TI - Low Level Big Data Compression
SN - 978-989-758-330-8
AU - Salvador-Meneses J.
AU - Ruiz-Chavez Z.
AU - Garcia-Rodriguez J.
PY - 2018
SP - 353
EP - 358
DO - 10.5220/0007228003530358
PB - SciTePress