IT-Application Behaviour Analysis: Predicting Critical System States on OpenStack using Monitoring Performance Data and Log Files

Patrick Kubiak, Stefan Rass, Martin Pinzger

Abstract

Recent studies have proposed several ways to optimize the stability of IT-services with an extensive portfolio of processual, reactive or proactive approaches. The goal of this paper is to combine monitored performance data, such as CPU utilization, with discrete data from log files in a joint model to predict critical system states. We propose a systematic method to derive mathematical prediction models, which we experimentally test using a downsized clone of a real life contract management system as a testbed. First, this testbed is used for data acquisition under variable and fully controllable system loads. Next, based on the monitored performance metrics and log file data, we train models (logistic regression and decision trees) that unify both, numeric and textual, data types in a single incident forecasting model. We focus on 1) investigating different cases to identify an appropriate prediction time window, allowing to prepare countermeasures by considering prediction accuracy and 2) identifying variables that appear more likely than others in the predictive models.

Download


Paper Citation