Risk Estimation in Data-driven Fault Prediction for a Biomass-fired Power Plant

Ivan Ryzhikov, Mika Liukkonen, Ari Kettunen and Yrjö Hiltunen

Department of Environmental Science, University of Eastern Finland, Yliopistonranta 1, 70210, Kuopio, Finland

Sumitomo SHI FW Energia OY, Relanderinkatu 2, 78200, Varkaus, Finland

Keywords: Fault Detection, Feature Creation, Data-driven Modeling, Machine Learning, Risk Estimation, Deep Neural Network.

Abstract: In this study, we consider a fault prediction problem for the case when there are no variables by which we could determine that the system is in the fault state. We propose an approach based on constructing an auxiliary variable, which makes it possible to reduce the initial problem to the supervised learning problem of risk estimation. The suggested target variable is an indicator showing how close the system is to the fault, which is why we call it a risk estimation variable. The risk starts growing some time before the actual fault and reaches its highest value at the fault timestamp, but there is a high level of uncertainty for the times when the system has been operating normally. We suggest a specific criterion that takes the uncertainty of risk estimation into account by tuning three weighting coefficients. Finally, the supervised learning problem with the risk variable and the specific criterion can be solved by means of machine learning. This work confirms that data-driven risk estimation can be integrated into digital services to successfully manage plant operational changes and support plant prescriptive maintenance. This was demonstrated with data from a commercial circulating fluidized bed boiler firing various biomass and residues, but the approach is generally applicable to other production plants.

1 INTRODUCTION

The fault prediction problem appears in different industries. In many cases a fault causes serious damage to production or business processes, which leads to a loss of production efficiency and, consequently, money. Companies need extra resources to undo the damage caused by a fault, which is why preventing the fault is better practice. By preventing the fault, we mean having a detection system that indicates when the process is at high risk of a fault and action is needed to prevent it. This situation is typical for some industries, and a critical process fault can often mean a big loss for the company. In (Paltrinieri and Khan, 2016) the importance of risk assessment is considered for chemical industries. Another example is the energy sector, where any unexpected load limitation or shutdown of a power unit can cause considerable economic losses. That is why it is very important to recognize a risky situation early enough to prevent the fault. In the fault detection problem for power plants there is no single performance indicator showing how close the system is to the fault.

The rapidly evolving energy market sets challenges to traditional combustion-based power plants, as it demands efficiency and flexibility in terms of fuel and load range. For example, the share of biomass as an energy source has increased significantly in recent years and is expected to keep increasing. In this paper we consider a real-world problem concentrating on boiler fault prediction in biomass-fired circulating fluidized bed (CFB) power plants. These plants are extremely important and bring not only financial benefits but also benefits for the environment, as they can be used to replace fossil-fuel-based power generation. These plants can utilize challenging fuels such as biomass or waste residues efficiently, but the drawback is that these types of fuel often cause problems such as blockages in the material flow. This especially concerns biomass fractions that include large amounts of alkali metals.

Although the consequences of the blockages are serious, the quality of the fuel still cannot be measured accurately, so the process has to be controlled using observational data coming from other sensors. In this study we applied the proposed approach to find patterns in the system state that occur prior to the fault.

Ryzhikov, I., Liukkonen, M., Kettunen, A. and Hiltunen, Y. Risk Estimation in Data-driven Fault Prediction for a Biomass-fired Power Plant. DOI: 10.5220/0010113104230429. In Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI 2020), pages 423-429. ISBN: 978-989-758-475-6

Most industrial processes are complex, so they cannot be designed to be faultless and cannot be properly modelled in advance. It is also hard to tell which observed variables could be used to detect that something is wrong with the process. Moreover, even process experts cannot always list the states and conditions by which we could identify the situations in which the process could cause a system fault. If we had an adequate mathematical model of the target system, it could be used to predict the future system state from the inputs and previous states, and knowing the future system state, we could predict the fault. However, due to the complexity of the production process there is no such mathematical model. When most of the process characteristics are monitored, though, we have observations that we can use to build data-driven models.

For all these reasons, fault prediction has to be based on analysis of the data that corresponds to stable functioning and the data that precedes the fault. The goal of the prediction system is to identify the patterns that lead the system to the fault. It is important to mention that not all fault prediction problems can be considered initially as regression or classification problems. We consider a case in which we only know the times at which the faults happened, and only a few faults occurred during a comparatively large time interval. Here we need to reduce the initial problem to a regression problem by adjusting the criterion and constructing an auxiliary variable. Then we apply statistical learning methods to the adjusted dataset to build a prediction system.

Machine learning algorithms are widely used to find the relation between the input and the output of a system (Kuhn and Johnson, 2016). There are studies on applying machine learning algorithms to the fault prediction problem as a supervised learning problem, but most of these studies are focused on specific processes. Since fault prediction requires recognition of specific patterns in the data that lead to the system fault, by fault prediction we mean the risk estimation problem. By risk we mean a variable that indicates how dangerous the current situation is. This interpretation is a simplification of the risk definition given by (Kaplan and Garrick, 1981), so we are not estimating the consequences and probabilities. In (Paltrinieri et al., 2019) the machine learning based approach is considered a promising tool for solving risk estimation problems. The difference between the approaches is the following: in that paper there is a variable that can be used for risk estimation, whereas in our case we need to construct it first. Other approaches to fault detection can be based on training a model on labelled observations of the system with and without faults (Bondyra et al., 2018), but these approaches require observations of both regular and fault system states. In this study we are interested in recognizing the pre-fault state instead of the fault state.

In (Rakhshani et al., 2009) the authors consider the fault prediction problem for a power plant boiler, where the risk estimation is based on a dataset with labeled observations. There is a continuous variable that equals its maximum value for normal system states and its minimum value for faults. The risk estimation is the prediction of that variable, and its values become the basis for the fault detection system. Depending on the value, the state can be classified as normal, low fault risk or fault. However, it could be too late to prevent the fault if we detect it only by the time it has happened, and this is the case we consider in this paper. It has also been considered in (Hujanen, 2019), where the problem was reduced to a classification problem and deep neural networks were applied to find a model. In this study we propose a different approach, where the risk is assumed to grow steadily starting from some time prior to the fault. Also, the risk modeling is adjusted according to the uncertainty of the actual risk level for the observations that are not in this pre-fault interval, since there is no prior information on whether these observations are of low or high risk.

In this paper we describe the reduction of the initial fault detection problem, the construction of the risk variable and the adjusted criterion, and the building of data-driven models. We also discuss the possibility of using the risk prediction models to identify relations between different fault cases.

2 RISK ESTIMATION APPROACH

Today's computational resources make it possible to build data-driven solutions based on artificial neural networks and other computationally intensive algorithms (Chollet and Allaire, 2018), (Goodfellow et al., 2016). These methods and their implementations are becoming more important in the era of Industry 4.0 (Brink et al., 2016), when collected data can be analyzed and used in decision-making systems for improving performance.

The considered process state can be characterized by different inputs that correspond to the sensor data

NCTA 2020 - 12th International Conference on Neural Computation Theory and Applications


from the different parts of the boiler plant. Each of these inputs can be described as a time series with a fixed step size: $x(t_1), x(t_2), \ldots, x(t_s)$, observed at the timestamps $t_1, t_2, \ldots, t_s$, where $s$ is the sample size. We also know the $N_f$ times at which the faults happened: $t_f^i$, $i = 1, \ldots, N_f$, so we assume

that there had been some time before that, at which the risk began to grow. This time before the fault is a parameter ∆ of the proposed approach. We put forward the hypothesis that there is no risk at any timestamp other than those within the interval of length ∆ before a fault. We also assume that the risk increases monotonically starting from zero and reaches its maximum value of one at the fault time, so the risk variable can be evaluated by the following function

$$r(t, t_f) = \begin{cases} \dfrac{t - t_f}{\Delta} + 1, & t_f - \Delta \le t \le t_f, \\ 0, & \text{otherwise}, \end{cases} \qquad (1)$$

where $t_f$ is the fault time and ∆ is the parameter.

Since there could be $N_f$ different faults, the risk function for the whole observation time can be evaluated as a sum of the single-fault functions (1):

$$R(t) = \sum_{i=1}^{N_f} r(t, t_f^i). \qquad (2)$$

We assume that there is always a normal system state between different faults, so it is possible to find such ∆ that $\nexists\, i, j,\ i \ne j: |t_f^i - t_f^j| \le \Delta$, so the non-zero intervals of the risk functions do not overlap. According to this approach, we need to find a relation between the system state variables and the risk feature. In this study we assume that the risk increases identically before each of the faults.
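The construction of the risk variable can be sketched in a few lines of Python. This is an illustration only, not the code used in the study (which was implemented in R): it assumes the single-fault risk is a linear ramp from zero at the start of the ∆ interval to one at the fault time, and the function names and fault times are hypothetical.

```python
from datetime import datetime, timedelta


def single_fault_risk(t, fault_time, delta):
    """Risk (1): assumed linear ramp from 0 at fault_time - delta to 1 at fault_time."""
    if fault_time - delta <= t <= fault_time:
        return (t - fault_time) / delta + 1.0
    return 0.0


def total_risk(t, fault_times, delta):
    """Risk (2): sum of the single-fault ramps over all known faults."""
    return sum(single_fault_risk(t, tf, delta) for tf in fault_times)


def ramps_do_not_overlap(fault_times, delta):
    """True if every pair of consecutive faults is more than delta apart."""
    ordered = sorted(fault_times)
    return all(later - earlier > delta for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical fault times, chosen for illustration only.
delta = timedelta(hours=2)
faults = [datetime(2020, 1, 10, 12, 0), datetime(2020, 2, 3, 6, 30)]

risk_90_min_before = total_risk(datetime(2020, 1, 10, 10, 30), faults, delta)
risk_at_fault = total_risk(faults[0], faults, delta)
risk_far_away = total_risk(datetime(2020, 1, 20, 0, 0), faults, delta)
```

With ∆ = 2 hours, an observation taken 90 minutes before a fault lies a quarter of the way into the ramp, so its risk label is 0.25; the label is 1 at the fault time and 0 outside every ∆ interval. The overlap check mirrors the assumption that a normal state always separates consecutive faults.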

We need to split the data into train and test sets to estimate the adequacy of the model and its generalization. Since we work with time series that consist of several intervals corresponding to several faults, we consider two splitting schemes. The first option is to leave the data for one of the faults for the test and to keep the data for the other faults for training. This helps us to understand which faults have similar (or different) patterns corresponding to the risk increase. The second option is to split the data into two subsets, one before some date for training and validation and the other after that date for the test. In that case we can see how well historical data predicts future faults. For validation we used stratification, so both the train and validation sets contain observations from the common process and observations from the interval before the fault.
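The two splitting schemes and the stratified validation split can be sketched as follows. This is a hedged illustration in Python, not the study's implementation: the text does not state how observations are assigned to faults, so the `segment_ids` labelling, the function names and the toy data are all assumptions.

```python
import random


def leave_one_fault_out(segment_ids):
    """Yield (held_out, train_idx, test_idx), holding out one fault segment at a time.

    segment_ids[k] is the index of the fault whose segment contains
    observation k (an assumed labelling, see the lead-in).
    """
    for held_out in sorted(set(segment_ids)):
        train = [k for k, s in enumerate(segment_ids) if s != held_out]
        test = [k for k, s in enumerate(segment_ids) if s == held_out]
        yield held_out, train, test


def chronological_split(timestamps, split_time):
    """Observations before split_time go to train/validation, the rest to test."""
    train = [k for k, t in enumerate(timestamps) if t < split_time]
    test = [k for k, t in enumerate(timestamps) if t >= split_time]
    return train, test


def stratified_validation_split(is_pre_fault, validation_share, seed=0):
    """Split indices so that both parts contain normal and pre-fault observations."""
    rng = random.Random(seed)
    train, validation = [], []
    for flag in (False, True):
        group = [k for k, f in enumerate(is_pre_fault) if f == flag]
        rng.shuffle(group)
        cut = round(len(group) * validation_share)
        validation.extend(group[:cut])
        train.extend(group[cut:])
    return sorted(train), sorted(validation)


# Toy data: 12 observations in three fault segments; the last
# observation of each segment is flagged as pre-fault.
timestamps = list(range(12))
segment_ids = [0] * 4 + [1] * 4 + [2] * 4
is_pre_fault = [False, False, False, True] * 3

folds = list(leave_one_fault_out(segment_ids))
train_idx, test_idx = chronological_split(timestamps, split_time=4)
fit_idx, validation_idx = stratified_validation_split(is_pre_fault, validation_share=1 / 3)
```

Stratifying by the pre-fault flag matters here because pre-fault observations are rare: a purely random validation split could easily contain none of them.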

As a modeling criterion we used the root mean square error

$$E(\tilde{R}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(R(t_i) - \tilde{R}(t_i)\right)^2}, \qquad (3)$$

where $n$ is the test or validation subset size, $R(t_i)$, $i = 1, \ldots, n$, are the risks (2) at the timestamps $t_i$ and $\tilde{R}(t_i)$, $i = 1, \ldots, n$, are the risk estimates given by the model at the same time points. Since we cannot properly estimate the risk for the times when no fault was detected, nor for the time intervals right after a fault, we suggest using specific weights for these errors in the sum (3):

$$E_w(\tilde{R}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} w(t_i) \cdot \left(R(t_i) - \tilde{R}(t_i)\right)^2}, \qquad (4)$$

where $w(t)$ is a weighting function,

$$w(t) = \begin{cases} w_a, & t \in T_a, \\ w_b, & t \in T_b, \\ w_o, & t \in T_o, \end{cases} \qquad (5)$$

and $T_a$ are the time intervals corresponding to states after the faults, $T_b$ are the time intervals before the faults and $T_o$ are the other intervals. Here $w_a$, $w_b$ and $w_o$ are weighting coefficients. These coefficients are used to increase the influence of errors made at the points where the risk was growing and to decrease the influence of risk estimation errors on the time intervals for which the risk value is uncertain.
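A minimal Python sketch of the weighted criterion (4) with the weighting function (5) is given below. It assumes the weighted squared errors are averaged over the subset size before taking the root; the interval labels, the function names and the default weights (10 before a fault, 1 elsewhere) are illustrative assumptions rather than the study's code.

```python
import math


def interval_weight(label, w_after=1.0, w_before=10.0, w_other=1.0):
    """Weighting function (5) for the interval labels 'after', 'before' and 'other'."""
    return {"after": w_after, "before": w_before, "other": w_other}[label]


def weighted_rmse(actual, predicted, weights):
    """Criterion (4): root of the weighted mean of squared risk errors."""
    n = len(actual)
    total = sum(w * (r - p) ** 2 for r, p, w in zip(actual, predicted, weights))
    return math.sqrt(total / n)


# Toy example: two normal observations followed by two pre-fault observations.
actual = [0.0, 0.0, 0.5, 1.0]
predicted = [0.1, 0.0, 0.5, 0.8]
labels = ["other", "other", "before", "before"]

weights = [interval_weight(label) for label in labels]
plain_error = weighted_rmse(actual, predicted, [1.0] * len(actual))  # criterion (3)
weighted_error = weighted_rmse(actual, predicted, weights)           # criterion (4)
```

In this toy example the under-estimated risk right at the fault dominates the weighted error, whereas the small false alarm during normal operation contributes little. This is the trade-off the three coefficients are meant to control.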

The goal of our risk modelling approach is to estimate the risk of the current system state and to observe its dynamics for decision making. It means that we need a model with optimal parameters $\alpha^*$, which is adequate in risk estimation and thus minimizes the criterion (4):

$$\alpha^* = \arg\min_{\alpha} E_w\left(\tilde{R}(t \mid \alpha)\right), \qquad (6)$$

where $\tilde{R}(t \mid \alpha)$ is the model prediction for the parameters $\alpha$. The data-driven model estimates the risk from the process state variables, so

$$\tilde{R}(t \mid \alpha) = M(x(t) \mid \alpha) \qquad (7)$$

and $x(t) \in \mathbb{R}^{m}$, $M: \mathbb{R}^{m} \to \mathbb{R}$.

The fault prediction problem is thus reduced to the minimization problem (4) with the specific weight coefficients (5). The solution of the reduced problem is the set of optimal model parameters (6), which we use to estimate the risk from the system state variables. The risk estimate can then be used for fault prediction and decision making, but this topic is out of the scope of the study. In the next section we discuss how we transform the state variables into risks (7) by solving the regression problem (6).

3 DATA-DRIVEN RISK ESTIMATION

In this study we consider a dataset that is a collection of process variable measurements. We explicitly selected 54 variables which, from the experts' point of view, could be useful for fault detection. The dataset contains 50879 observations and 8 different process faults. We tried different time delta parameters and finally used ∆ = 2 hours. With this value and the observation step size, only 192 observations can be labelled as leading to a fault. Faults typically occur rarely, so there is an imbalance between the number of observations of the normal process state and the number of observations leading to a fault.

We tried different machine learning algorithms, such as lasso regression, random forests, and artificial neural networks with different numbers of layers and perceptrons; the configuration described below proved the most beneficial. In this study we used a deep neural network with 5 layers containing 64, 64, 64, 32 and 1 neurons, respectively. We added dropout for the 2nd and 4th levels: 0.5, 0.5 and 0.25, respectively. We used root mean square propagation (RMSprop) as the learning algorithm with a batch size of 5000 and 100 iterations. We used the Keras framework (Allaire and Chollet, 2018) for modeling, and the application was implemented in R (R Core Team, 2018). The weights (5) for criterion (4) are set as follows: $w_a = w_o = 1$ for the intervals after the faults and for the other intervals, and $w_b = 10$ for the intervals before the faults. The weights were tuned manually; they control the model sensitivity, and the desired sensitivity is determined by the business needs.
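To give a sense of the model size relative to the amount of pre-fault data, the number of trainable parameters can be estimated with the Python sketch below. It rests on assumptions the text does not confirm: that all 54 selected variables are fed directly to the network and that every layer is fully connected with a bias term. Dropout layers add no trainable parameters.

```python
def dense_parameter_count(n_inputs, layer_sizes):
    """Count weights and biases of a fully connected feed-forward network."""
    total = 0
    previous = n_inputs
    for size in layer_sizes:
        total += previous * size + size  # weight matrix plus bias vector
        previous = size
    return total


# Assumed input width (54 selected variables) and the reported layer sizes.
n_parameters = dense_parameter_count(54, [64, 64, 64, 32, 1])
print(n_parameters)  # prints 13953
```

Under these assumptions the network has roughly 14 thousand parameters, far more than the 192 pre-fault observations, which may help explain why regularisation through dropout was used.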

As discussed in the previous section, we used two different cross-validation schemes. First, we use the leave-one-out approach for the faults: we train and validate the model on all the data except that of one fault, which is used for the final test. Accordingly, we solved 8 different regression problems, which correspond to the 8 different faults. Due to the randomness of the learning algorithm, we solved each problem 10 times to evaluate the statistics.

We separated the errors into those the model makes on the common system functioning and those it makes in risk estimation before the fault. The minimum error values for common and risky states are given in Table 1 and Table 2, respectively.
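The separation of errors and the minimum over repeated runs can be sketched in Python as follows. The sketch assumes that the same root mean square error is applied to each part separately, which the text does not state explicitly; the function names and toy predictions are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def split_errors(actual, predicted, is_pre_fault):
    """Return (common_error, pre_fault_error) for one trained model."""
    common = [(a, p) for a, p, f in zip(actual, predicted, is_pre_fault) if not f]
    risky = [(a, p) for a, p, f in zip(actual, predicted, is_pre_fault) if f]
    return rmse(*zip(*common)), rmse(*zip(*risky))


# Toy test set: two normal observations and two pre-fault observations.
actual = [0.0, 0.0, 0.5, 1.0]
is_pre_fault = [False, False, True, True]

# Predictions from two hypothetical training runs of the same model.
runs = [
    [0.1, 0.0, 0.4, 0.9],
    [0.0, 0.0, 0.2, 0.6],
]

errors = [split_errors(actual, run, is_pre_fault) for run in runs]
min_common = min(common for common, _ in errors)
min_pre_fault = min(risky for _, risky in errors)
```

Note that when the two minima are taken independently, as here, they may come from different runs: in this toy case the second run is best on normal operation while the first is best before the fault.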

Table 1: Minimum error in risk estimation of the common process for cases when one fault is left for the test and the others are used to train the model.

Fault case to test    Error minimum (common)
1                     1.746891e-05
2                     0.0003728659
3                     0.0004703251
4                     0.0001684821
5                     4.076825e-05
6                     6.904885e-06
7                     0.0001324624
8                     5.641435e-05

Table 2: Minimum error in risk estimation of the process 2 hours before the fault for cases when one fault is left for the test and the others are used to train the model.

Fault case to test    Error minimum (fault)
1                     0.09567184
2                     0.1083031
3                     0.09283183
4                     0.08412199
5                     0.1146508
6                     0.1078044
7                     0.103839
8                     0.1143047

As one can see in Table 1, the average minimum value is less than 0.0002, but for the 2nd, 3rd and 7th faults the risk estimation for the common state was not stable. This point is illustrated by the risk estimation visualization below. The results in Table 2 show that some faults cannot be predicted by a model trained on the other faults; at least there is no risk increase in the ∆ interval before the test fault.
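The averages quoted for the two tables can be checked with a few lines of Python; the values are copied from Tables 1 and 2, and the variable names are illustrative.

```python
# Minimum errors for the common (normal) state, Table 1, faults 1 to 8.
table_1 = [1.746891e-05, 0.0003728659, 0.0004703251, 0.0001684821,
           4.076825e-05, 6.904885e-06, 0.0001324624, 5.641435e-05]

# Minimum errors in the 2-hour pre-fault interval, Table 2, faults 1 to 8.
table_2 = [0.09567184, 0.1083031, 0.09283183, 0.08412199,
           0.1146508, 0.1078044, 0.103839, 0.1143047]

mean_common = sum(table_1) / len(table_1)
mean_fault = sum(table_2) / len(table_2)
error_ratio = mean_fault / mean_common
```

The mean of Table 1 is about 0.00016, consistent with the bound of 0.0002 stated above, while the mean of Table 2 is about 0.10. The pre-fault errors are therefore several hundred times larger than the errors on normal operation, which is why the two parts are reported separately.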

Since it is hard to differentiate the results by the table values alone, let us visualize the risk estimation for all the considered problems. Figures 1-8 present the risk modeling results for the different fault cases used as the test.

If we compare the Figures and Tables, we can see that some faults are predicted, since we see an increase of the risk near the fault time. This increase happens earlier than expected: not in the ∆ interval prior to the fault. It means that problems of this nature require a specific metric. Finding a metric for estimating the adequacy of risk prediction is a problem in itself. In this study we put forward the hypothesis that the risk grows during the same time interval prior to each fault. In further work we will consider other options for calculating the metric and comparing the modeling results.


Figure 1: Fault risk estimation in case of the 1st fault left for the test.

Figure 2: Fault risk estimation in case of the 2nd fault left for the test.

Figure 3: Fault risk estimation in case of the 3rd fault left for the test.

Figure 4: Fault risk estimation in case of the 4th fault left for the test.

Figure 5: Fault risk estimation in case of the 5th fault left for the test.

Figure 6: Fault risk estimation in case of the 6th fault left for the test.


Figure 7: Fault risk estimation in case of the 7th fault left for the test.

Figure 8: Fault risk estimation in case of the 8th fault left for the test.

Now we consider the second validation scheme, in which we use only one third of the observations for training, leaving the rest of the data for the test. For the same model we get the results presented in Figure 9.

Figure 9: Risk estimation for the case when we use the observations before some date for training and leave the remaining data for the test.

This experiment shows that the system predicts some of the faults, but it also shows some risk increase during the common process. Of course, that could happen because we used only 30% of the data for training, but even that amount of data is enough to demonstrate that the proposed approach is promising. We can see that some of the test faults caused a risk increase, and that this increase is greater than the ones that happen by mistake. To demonstrate the risk estimation right before the faults, we selected only the ∆ intervals and show them in Figure 10 for train and test.

Figure 10: ∆-interval risk estimations.

In this section we examined the proposed approach on a real-world problem of a power plant firing biomass and residue fuels. This approach is useful for analysis of the fault cases and estimation of the risk. We considered two different cross-validation schemes, and both demonstrated promising results.

4 CONCLUSIONS

In this study we proposed an auxiliary risk feature and used it as a target variable for solving the fault prediction problem. This variable represents how close the current situation is to the fault: the higher the risk is, the closer the system is to the fault. To construct this variable, we used the fault datetime and a specific time parameter, the time prior to the fault during which we expect the risk to grow. Another part of the approach is to make errors in risk estimation before the faults more valuable for model learning than the errors on all the other intervals, on which the uncertainty is higher. The weight parameter values should be tuned to provide a suitable balance between errors of both types: predicting high risk when the system runs normally, and predicting no risk when there is a high risk of a fault. This balance should be determined by the business needs and is out of the scope of this study.


The proposed approach was applied to solve the fault detection problem for a CFB-based power plant burning various types of biomass. Such systems are expected to benefit from this kind of risk estimation system, which would make it possible to detect possible process disturbances in advance and buy time for remedial actions aimed at preventing a critical system failure that may eventually lead to a load limitation or an unexpected shutdown. This work confirms that data-driven risk estimation can be integrated into digital services to successfully manage plant operational changes and support plant prescriptive maintenance. This was demonstrated with data from a commercial circulating fluidized bed boiler firing various biomass and residues, but the approach is generally applicable to other production plants. Moreover, data-based digital predictive tools are expected to play a growing role in the future service business within the energy production sector, as customers expect better availability and predictability combined with the requirement to burn cheaper and more challenging fuels. The considered approach is useful in revealing the similarities and differences between the faults and, thus, it is useful for further monitoring of the system state and for fault prediction.

As a modeling approach we utilized deep neural networks, and the results shown in Tables 1 and 2 and in Figures 1-10 demonstrate that the model gives promising results.

We continue the research by applying other classes of models and using lagged inputs. Since the process is continuous and its state can generally be characterized by the states at the previous observation points, a promising option would be to use convolutional and recurrent neural networks.
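As an illustration of what lagged inputs could look like, the Python sketch below concatenates each observation with a fixed number of its predecessors. It is a hypothetical example of the idea, not part of the reported experiments; the function name and toy data are assumptions.

```python
def add_lags(rows, n_lags):
    """Concatenate each observation with its n_lags predecessors.

    rows is a list of feature lists ordered in time with a fixed step.
    The first n_lags observations are dropped because they lack history.
    """
    lagged = []
    for k in range(n_lags, len(rows)):
        features = []
        for lag in range(n_lags + 1):
            features.extend(rows[k - lag])  # current row first, then older rows
        lagged.append(features)
    return lagged


# Toy series with two process variables and four time steps.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
lagged_rows = add_lags(rows, n_lags=2)
```

Each lagged row here holds the current observation followed by the two preceding ones. In practice the lags should not cross gaps in the record, for example shutdown periods, so the series would need to be cut into continuous segments first.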

Future studies involve developing a specific metric that would help to compare model accuracy more precisely. That would allow building an automatic modeling system. Another part of the future studies is related to identification of the risk time interval, since it could be different for each case. The parameter estimation problem can be reduced to a global optimization problem combined with modeling, so that we will be able to find the risk parameters and the corresponding models simultaneously.

REFERENCES

Allaire, J.J., Chollet, F., 2018. keras: R Interface to 'Keras'. R package version 2.2.4. URL https://CRAN.R-project.org/package=keras

Bondyra, A., Gąsior, P., Gardecki, S., Kasiński, A., 2018. Development of the Sensory Network for the Vibration-based Fault Detection and Isolation in the Multirotor UAV Propulsion System. In Proceedings of the 15th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-321-6, pp. 112-119. DOI: 10.5220/0006846801120119

Brink, H., Richards, J., Fetherolf, M., 2016. Real-World Machine Learning. Manning.

Chollet, F., Allaire, J.J., 2018. Deep Learning with R. Manning. ISBN 978-1-61729-554-6.

Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press.

Hujanen, J., 2019. Machine Learning Methods for Early Process Deviation Detection in Circulating Fluidized Bed Boilers. In Proc. of Nordic Flame Days 2019.

Kaplan, S., Garrick, B.J., 1981. On the quantitative definition of risk. Risk Anal., vol. 1, pp. 11-27.

Kuhn, M., Johnson, K., 2016. Applied predictive modeling. Springer.

Paltrinieri, N., Comfort, L., Reniers, G., 2019. Learning about risk: Machine learning for risk assessment. Safety Science, vol. 118, Elsevier, pp. 475-486.

Paltrinieri, N., Khan, F., 2016. Dynamic Risk Analysis in the Chemical and Petroleum Industry: Evolution and Interaction with Parallel Disciplines in the Perspective of Industrial Application. Butterworth-Heinemann. DOI: 10.1016/B978-0-12-803765-2.01001-5

R Core Team, 2018. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org

Rakhshani, E., Sariri, I., Rouzbehi, K., 2009. Application of data mining on fault detection and prediction in Boiler of power plant using artificial neural network. In 2009 International Conference on Power Engineering, Energy and Electrical Drives, Lisbon, Portugal.
