Analysis Layer Implementation Method for a Streaming Data Processing System

Aleksey Burdakov, Uriy Grigorev, Andrey Ploutenko, Oleg Ermakov

Abstract

Analysis is an important part of streaming data processing, which is widely used. During analysis, the frequency of occurrence of stream elements and the sum of their values are calculated. Algorithms such as Count-Min Sketch produce a large error when restoring the aggregate if the number of elements is large. This article proposes using a matrix of vectors, each of length 'n'. If the number of distinct elements approaches 'n', the window size is reduced automatically, which allows the aggregate to be stored accurately without losing elements. A SELECT operator for searching the vector array is also proposed; it returns various slices of the aggregated data accumulated over the window. The developed method was compared with Count-Min Sketch data processing in the Analysis Layer. The experiment showed that the method based on the vector matrix reduces memory consumption by more than a factor of two and ensures exact execution of the SELECT statement. Introducing a floating window maintains calculation accuracy and avoids losing records from the stream. For the same query, the error of sketch-based execution reaches 200%.
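The floating-window idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `FloatingWindowAggregator`, the function `select`, and the parameters `capacity` (standing in for 'n') and `max_records` (a nominal window size counted in records) are all hypothetical names chosen for illustration, and a plain dictionary stands in for the paper's vector matrix. The sketch only shows the principle of closing a window early, before a new distinct element would overflow the fixed-size store, so that counts and sums stay exact.

```python
class FloatingWindowAggregator:
    """Exact per-key count and sum over windows that shrink when nearly full.

    Hypothetical illustration; a dict stands in for a vector of length n.
    """

    def __init__(self, capacity, max_records):
        self.capacity = capacity        # assumed analogue of 'n'
        self.max_records = max_records  # assumed nominal window size
        self.current = {}               # key -> [count, value_sum]
        self.records_seen = 0
        self.closed = []                # finished windows, oldest first

    def add(self, key, value):
        # Close the window early if a new distinct key would not fit,
        # so no record is dropped and no aggregate is approximated.
        if key not in self.current and len(self.current) >= self.capacity:
            self.close_window()
        entry = self.current.setdefault(key, [0, 0])
        entry[0] += 1
        entry[1] += value
        self.records_seen += 1
        # Otherwise close the window at its nominal size.
        if self.records_seen >= self.max_records:
            self.close_window()

    def close_window(self):
        if self.current:
            self.closed.append(self.current)
        self.current = {}
        self.records_seen = 0


def select(window, predicate):
    """Return the exact (count, sum) slice for keys matching the predicate."""
    return {k: (c, s) for k, (c, s) in window.items() if predicate(k)}


agg = FloatingWindowAggregator(capacity=2, max_records=100)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    agg.add(key, value)
agg.close_window()  # flush the last, partially filled window

first = select(agg.closed[0], lambda k: k == "a")
```

In this run the third distinct key, "c", arrives when the store already holds two keys, so the first window is closed after three records and "c" starts a new one. A Count-Min Sketch would instead keep a fixed window and let colliding keys share counters, which is where its overestimation comes from.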


Paper Citation


in Harvard Style

Burdakov A., Grigorev U., Ploutenko A. and Ermakov O. (2021). Analysis Layer Implementation Method for a Streaming Data Processing System. In Proceedings of the 6th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS, ISBN 978-989-758-504-3, pages 262-269. DOI: 10.5220/0010465902620269


in Bibtex Style

@conference{iotbds21,
author={Aleksey Burdakov and Uriy Grigorev and Andrey Ploutenko and Oleg Ermakov},
title={Analysis Layer Implementation Method for a Streaming Data Processing System},
booktitle={Proceedings of the 6th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS},
year={2021},
pages={262-269},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010465902620269},
isbn={978-989-758-504-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 6th International Conference on Internet of Things, Big Data and Security - Volume 1: IoTBDS
TI - Analysis Layer Implementation Method for a Streaming Data Processing System
SN - 978-989-758-504-3
AU - Burdakov A.
AU - Grigorev U.
AU - Ploutenko A.
AU - Ermakov O.
PY - 2021
SP - 262
EP - 269
DO - 10.5220/0010465902620269