An Algorithm for Message Type Discovery in Unstructured Log Data

Daniel Tovarňák

2019

Abstract

Log message abstraction is a common way of dealing with the unstructured nature of log data. It refers to the separation of the static and dynamic parts of a log message, so that both parts can be accessed independently and the message can be abstracted into a more structured representation. To facilitate this task, the so-called message types and their corresponding matching patterns must first be discovered; only then can the resulting pattern set be used to pattern-match individual log messages, extract their dynamic information, and impose some structure on them. Because the manual discovery of message types is a tiresome and error-prone process, we have focused our research on data mining algorithms that are able to discover message types in already generated log data. Having identified several deficiencies of the existing algorithms that limit their capabilities, we propose a novel algorithm for message type discovery that addresses these deficiencies.
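
To make the notion of abstraction concrete, the following minimal Python sketch shows only the pattern-matching step described in the abstract, not the discovery algorithm proposed in the paper. It assumes the pattern set is already known; the message type names, the regular expressions, and the function `abstract_message` are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical pattern set: each message type pairs a name with a regular
# expression. Literal text is the static part of the message; named groups
# capture the dynamic part.
MESSAGE_TYPES = {
    "user_login": re.compile(
        r"User (?P<user>\S+) logged in from (?P<ip>\S+)"
    ),
    "conn_closed": re.compile(
        r"Connection to (?P<host>\S+) closed after (?P<seconds>\d+) s"
    ),
}


def abstract_message(message):
    """Return (type_name, variables) for the first matching type, else None."""
    for name, pattern in MESSAGE_TYPES.items():
        match = pattern.fullmatch(message)
        if match:
            # groupdict() holds the dynamic values extracted from the message.
            return name, match.groupdict()
    return None
```

Calling `abstract_message("User alice logged in from 10.0.0.1")` would yield the type name together with a dictionary of the extracted variables, while a message that matches no pattern yields `None`. Writing such patterns by hand for every message type is the tiresome step that discovery algorithms aim to automate.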


Paper Citation


in Harvard Style

Tovarňák D. (2019). An Algorithm for Message Type Discovery in Unstructured Log Data. In Proceedings of the 14th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-379-7, pages 665-676. DOI: 10.5220/0007919806650676


in Bibtex Style

@conference{icsoft19,
author={Daniel Tovarňák},
title={An Algorithm for Message Type Discovery in Unstructured Log Data},
booktitle={Proceedings of the 14th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2019},
pages={665-676},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007919806650676},
isbn={978-989-758-379-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Conference on Software Technologies - Volume 1: ICSOFT
TI - An Algorithm for Message Type Discovery in Unstructured Log Data
SN - 978-989-758-379-7
AU - Tovarňák D.
PY - 2019
SP - 665
EP - 676
DO - 10.5220/0007919806650676