Feature Selection Combined with Neural Network for Diesel Engine Diagnosis

M. Benkaci

and G. Hoblos

1,2

Institut de Recherche en Systèmes Electroniques Embarqués, Avenue Galilée, Saint-Etienne du Rouvray, France

Ecole Supérieure d’Ingénieurs en Génie Électrique, Avenue Galilée, Saint-Etienne du Rouvray, France

Keywords: Leaks Detection, Automotive Diagnosis, Feature Selection, Neural Data Classification, Diesel Air Path.

Abstract: Feature selection is an essential step for data classification in fault detection and diagnosis. In this work, a new approach is proposed that combines a feature selection algorithm with a neural network for leak detection in the diesel engine air path. The Chi2 algorithm is used for feature selection, and a neural network trained with the Levenberg-Marquardt algorithm is used to model the system behaviour. The resulting neural network is then used for leak detection. The model is trained and validated using data generated by xMOD, and the same tool is used for testing. The effectiveness of the proposed approach is illustrated in simulation for a system operating at low speed/load with a very small leak in the air path.

1 INTRODUCTION

In order to reduce the air pollution caused by automotive engines, several regulations have been introduced. The first was proposed by the California Air Resources Board in 1970; it has been continuously updated and has become very strict. Since 1993, the European anti-pollution standards have become steadily more stringent: the authorized emissions of a diesel vehicle, in mg/km, decreased from (NOx = nil, CO = 2720, HC+NOx = 970, PM = 140) in the Euro 1 standard to (NOx = 80, CO = 500, HC+NOx = 170, PM = 5) in the Euro 6 standard. Typically, each fault that increases the emission level must be detected and isolated. Leaks in the intake duct are among the most difficult faults to manage.

The ability of neural networks to approximate nonlinear functions has made them one of the best tools for fault detection and isolation. With the exploration of these artificial intelligence techniques, another class of fault detection and isolation algorithms has appeared. In (Isermann, 1984), Isermann discussed the superior features of neural networks in fault classification and recognition. Sorsa and Costin (Sorsa and Costin, 1993) showed the capability of supervised neural networks, such as the Multilayer Perceptron (MLP) and the Radial Basis Function (RBF) network, to perform effective fault detection and isolation. Another work using an MLP network is presented in (Capriglione et al., 2003). The RBF network was used for on-board fault diagnosis of the air path of a spark ignition engine (Sangha et al., 2006). The leakage problem of a gasoline engine is treated in (Chen, 2011), where a neural network based on the steepest-descent method combined with a back-propagation algorithm is developed to train three detection systems.

Because of the increased complexity of today's engines, which are equipped with a large number of sensors, reducing the acquired data becomes essential. Feature selection is an important step before a data classification task, especially when this task is dedicated to fault detection and isolation (FDI). It refers to the problem of selecting the input features that are most predictive of a given outcome. Feature selection problems arise in all supervised and unsupervised machine learning tasks, including classification, regression, time-series prediction and clustering. Feature selection has three main purposes: reducing the cost of extracting features, improving the classification accuracy, and improving the reliability of the estimated performance.

Many works have used feature selection algorithms for fault detection and diagnosis. In (Christina and Tshlizidzi, 2006), a new approach for intrusion detection and diagnosis is proposed. In (Sugumaran et al., 2007), the authors use a decision tree to identify the best features for the classification task, and a proximal support vector machine to classify efficiently the faults of a roller bearing system.

Benkaci M. and Hoblos G.
Feature Selection Combined with Neural Network for Diesel Engine Diagnosis.
DOI: 10.5220/0004042703170324
In Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics (ICINCO-2012), pages 317-324
ISBN: 978-989-8565-21-1
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

In this paper, a new methodology for detecting small leaks in the diesel air path is developed. To achieve this goal, a new scheme based on neural networks is proposed. The nominal mode (without leak) and the leakage mode, corresponding to several leak diameters, are learned using the Levenberg-Marquardt algorithm. Before the acquired data are used, a feature selection step is applied in order to reduce the complexity of the problem. The main benefit of the proposed approach is that it uses only a selected subset of sensors, which reduces cost. The data for the two modes are generated using the xMOD platform, which is described later.

The paper is organized as follows. The considered problem is presented in Section 2. Section 3 describes the proposed approach in detail: it gives a brief description of the neural network used, whose training combines the steepest-descent and Gauss-Newton methods, and illustrates the main detection scheme. After a brief description of the xMOD tool used for engine data collection, Section 4 presents the results obtained with our approach. These results are then discussed in order to illustrate the effectiveness of the leakage detection.

2 PROBLEM STATEMENT

For several years, anti-pollution standards have become dramatically stricter and the constraints on the automotive industry have become very complex. The main objective of these standards is to reduce the emission level of cars. In the case of diesel engines, there are several pollutants: carbon monoxide, unburned hydrocarbons, nitrogen oxides (NOx) and diesel particulate matter. Usually, the emission level increases with the appearance of faults in diesel engines, more precisely in the diesel air path. These faults can be due to sensor failures, actuator failures or system degradation. In this paper, the last class of failures is considered; more precisely, leakage detection in the diesel air path is studied. In addition to a high emission level, this failure causes several undesired effects, such as:

• a change in the operating points of the air path subsystems,

• incomplete combustion in the cylinders,

• the appearance of smoke and reduced performance.

This type of failure is often confused with the two other types of faults, i.e. sensor or actuator faults; consequently, it is very important to distinguish it from the others.

In addition to the main objective of this paper, the feature selection problem is considered. Today's vehicles are characterized by an increased complexity, reflected in the large and growing number of embedded sensors. Consequently, using a selected subset of the sensor data that is correlated with the considered problem is highly desirable in such applications.

In this work, our main objective is to detect air leaks in the diesel air path regardless of their diameter. Before performing leak detection, we carry out feature selection in order to reduce the data complexity.

It is important to note that, for this application, small leaks are hidden and very difficult to detect when the system is not solicited.

3 PROPOSED APPROACH

Nowadays, neural networks are an essential tool in many research activities on complex industrial systems. An advantage of using a neural network to detect system faults is that it can extract the knowledge contained in the data. Beyond its ability to remember learned information, a neural network is able both to generalize an obtained model and to apply the associative property to the available memory. The error tolerance that characterizes neural networks effectively handles the errors of the model. Additionally, a neural network can perform a nonlinear mapping and learn dynamic behaviours in order to generalize the obtained models.

Generally, the data collected for the detection process are noisy, but the error tolerance of the neural network enables the detection scheme to differentiate the pattern from the noise. This property is a significant advantage in fault detection and isolation problems. Additionally, similar patterns are separated thanks to the generalization property of the neural network.

Leak detection in the intake system is very difficult to achieve, especially when the operating point corresponds to a low load-torque couple. Under these conditions, the compressor in the air path is not solicited by the driver, so the pressure in the intake system is close to atmospheric pressure. This constraint calls for an improved detection system. The proposed system must increase the accuracy of the model, enhance the performance of the vehicle and guarantee the management of small leaks. In this paper, the Levenberg-Marquardt (LM) algorithm (Levenberg, 1944) is proposed to carry out the detection task. The LM algorithm is used to learn the dynamics of the diesel air path. Once the dynamics are modelled, a leak is detected by comparing new measurements with the model established by the neural network. The proposed approach contains two blocks, the training block and the decision block, as shown in Figure 1.

[Figure 1 block diagram. Blocks: Nominal data acquisition; Faulty data acquisition; Feature selection; Training Block (offline); Neural Model; Real Data Acquisition of Selected Sensors (Features); Testing Block (online); Detection Task; Interaction; Decision Block; Decision Making.]

Figure 1: Detection scheme.

The proposed approach is designed to operate in on-line mode; thus, a classical real-data acquisition process is adopted in this work. It is important to remember that this application uses only the sensors chosen by the feature selection algorithm, which summarize the intake behaviour of the diesel air path. The data acquired in this step are sent to the decision block in order to detect the leaks affecting the vehicle.

3.1 Training Block

3.1.1 Feature Selection

In this work, one of the most popular feature selection algorithms is chosen: the Chi2 algorithm (Alexandrov et al., 2001). Chi2 is a simple and general algorithm that ranks features using a discretization process. This algorithm is combined with a neural network classifier to select the features that must be kept.

The Chi2 algorithm is based on the χ² statistic and runs in two stages as follows:

• Stage 1:

1) Set sigLevel to 0.5 for all features;

2) Sort each feature according to its values;

3) Compute the χ² value for every pair of adjacent intervals:

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}} \qquad (1)$$

where:

k: number of classes;

A_ij: number of samples in the ith interval and jth class;

R_i: number of samples in the ith interval;

C_j: number of samples in the jth class;

N: total number of samples;

$$E_{ij} = \frac{R_i \, C_j}{N} \qquad (2)$$

4) Merge the pair of adjacent intervals with the lowest χ² value, until the χ² value of every pair of adjacent intervals exceeds the threshold determined by sigLevel;

This process is repeated with a decreasing sigLevel until the inconsistency rate δ is exceeded in the discretized data.

• Stage 2:

5) Start with sigLevel0, the last value of sigLevel determined in the first stage;

6) Associate sigLevel(i) with each feature i and run the merging;

7) Consistency test:

If inconsistency < δ, merge intervals and decrease sigLevel(i);

Else, if inconsistency > δ, eliminate the ith feature for the next step.
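As an illustration of the χ² computation and of the merging step described above, the following Python sketch computes the statistic for two adjacent intervals from their per-class sample counts, then repeatedly merges the lowest-scoring pair. It is a minimal sketch, not the WEKA implementation: the function names, the `threshold` argument (standing for the χ² critical value associated with the current sigLevel) and the replacement of a zero expected frequency by 0.1 are assumptions made for this example.

```python
def chi2_adjacent(counts_a, counts_b):
    """Chi-square statistic for two adjacent intervals.

    counts_a[j] and counts_b[j] are the numbers of samples of class j
    that fall in the first and in the second interval.
    """
    table = [counts_a, counts_b]
    k = len(counts_a)                                            # number of classes
    row_totals = [sum(row) for row in table]                     # samples per interval
    col_totals = [table[0][j] + table[1][j] for j in range(k)]   # samples per class
    n = sum(row_totals)                                          # total number of samples
    chi2 = 0.0
    for i in range(2):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                expected = 0.1   # assumed convention to avoid dividing by zero
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def merge_intervals(intervals, threshold):
    """Merge the adjacent pair with the lowest chi-square value until
    every adjacent pair exceeds `threshold` (an assumed parameter)."""
    intervals = [list(counts) for counts in intervals]
    while len(intervals) > 1:
        scores = [chi2_adjacent(intervals[i], intervals[i + 1])
                  for i in range(len(intervals) - 1)]
        best = min(range(len(scores)), key=scores.__getitem__)
        if scores[best] > threshold:
            break
        merged = [a + b for a, b in zip(intervals[best], intervals[best + 1])]
        intervals[best:best + 2] = [merged]
    return intervals


# Four initial intervals of one feature, two classes (no leak, leak).
print(merge_intervals([[4, 0], [3, 0], [0, 5], [0, 6]], threshold=2.71))
```

In this toy case the two intervals containing only the first class are merged, as are the two containing only the second class, while the boundary between the classes is kept because its χ² value is high.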

First, the WEKA data mining tool (Witten et al., 2005) is used to perform the Chi2 ranking, and the features are sorted according to their rank. Second, the most important features are selected using the neural network classifier described later. More precisely, the features are eliminated iteratively, from the least important to the most important, and the weight of each eliminated feature is evaluated according to the resulting classification Mean Squared Error (MSE).
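The iterative elimination can be sketched as follows. This is only an illustration under stated assumptions: the acceptance rule (a feature is dropped when the MSE does not get worse), the `tolerance` parameter, the sensor names and the `toy_mse` function, which stands in for training and evaluating the neural network classifier, are all hypothetical and not taken from the paper.

```python
def backward_elimination(ranked_features, evaluate_mse, tolerance=0.0):
    """Try to drop features from least to most important.

    ranked_features: feature names, most important first (e.g. a Chi2 ranking).
    evaluate_mse: callable returning the classification MSE for a feature list.
    A removal is kept only if the MSE does not increase by more than `tolerance`.
    """
    selected = list(ranked_features)
    best_mse = evaluate_mse(selected)
    for feature in reversed(ranked_features):
        if len(selected) == 1:
            break
        candidate = [f for f in selected if f != feature]
        mse = evaluate_mse(candidate)
        if mse <= best_mse + tolerance:
            selected, best_mse = candidate, mse
    return selected, best_mse


def toy_mse(features):
    """Hypothetical stand-in for the classifier: two sensors carry the
    information, the others only add a small amount of error."""
    informative = {"boost_pressure", "air_mass_flow"}
    missing = len(informative - set(features))
    noise = len(set(features) - informative)
    return 0.05 * missing + 0.002 * noise + 0.01


ranking = ["boost_pressure", "air_mass_flow", "egr_position", "coolant_temp"]
selected, mse = backward_elimination(ranking, toy_mse)
print(selected, mse)
```

With a real classifier, `evaluate_mse` would retrain the network on the candidate subset, so each call is expensive; ranking the features first keeps the number of calls linear in the number of features.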

3.1.2 Training

Pattern classification using a neural network aims to determine the class boundaries with the classifier. The training phase of the neural network achieves this goal. In this paper, a gradient-based training algorithm is used; this category of algorithms is the one most commonly used by researchers. It includes the Hessian-based algorithms, which can significantly reduce the convergence time. The Levenberg-Marquardt algorithm belongs to the Hessian-based techniques; it exploits their advantages in the optimization of nonlinear least-squares problems.

The Levenberg-Marquardt algorithm is a well-known optimization technique. It locates the minimum of a function expressed as a sum of squares of nonlinear functions. This algorithm, widely used in several disciplines, is a combination of the steepest-descent and Gauss-Newton methods. The two techniques alternate according to the distance between the current position and the correct one: if the current position is far from the correct one, steepest descent is applied; conversely, if it is close to the correct solution, Gauss-Newton takes over. The steepest-descent technique used in the LM algorithm is slow, but it guarantees convergence. When the current position becomes close to the correct one, the LM algorithm switches to the Gauss-Newton method, which converges rapidly.

For neural network training, the objective function is an error of the type:

$$E = \sum_{i=1}^{p} \sum_{j=1}^{n_o} (y_{ij} - a_{ij})^2 \qquad (3)$$

where y_ij are the real data of the diesel engine, a_ij are the network outputs, p is the total number of samples and n_o is the total number of nodes in the output layer.

In this work, the neural network contains five layers. The first is the input layer, which receives the data from the selected sensors used in this application. The next three layers are the hidden layers, which form the core of the network. The last is the output layer, which generates two signals corresponding to the detection task (“without leak” or “with leak”).
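To make the five-layer structure concrete, the sketch below propagates one sample through an input layer, three hidden layers and a two-node output layer. The paper does not give the layer sizes, the activation functions or the initialization, so the sizes (4 inputs, 8 nodes per hidden layer), the tanh hidden activation, the linear output and the random weights are assumptions chosen only for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, inputs):
    """Propagate one sample: tanh on hidden layers, linear output layer."""
    activation = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activation = pre if is_output else [math.tanh(v) for v in pre]
    return activation


rng = random.Random(0)
sizes = [4, 8, 8, 8, 2]   # assumed: 4 selected sensors, 3 hidden layers, 2 outputs
layers = [init_layer(sizes[i], sizes[i + 1], rng) for i in range(len(sizes) - 1)]
outputs = forward(layers, [0.1, -0.2, 0.3, 0.0])
print(outputs)   # two signals: "without leak" and "with leak" estimates
```

Five layers of nodes correspond to four weight matrices, which is why `layers` holds four entries.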

The steps required to train the neural network with the L-M algorithm in batch mode are the following:

1) Compute the network outputs and evaluate the mean square error over all inputs as in equation (3);

2) Calculate the Jacobian matrix J(x) of the error vector R with respect to x, where x represents the weights and biases of the network;

3) Solve the weight-adaptation equation to obtain Δx. The update Δx of the weight vector is computed as follows:

$$\Delta x = -\left[ J^T(x) J(x) + \mu I \right]^{-1} J^T(x) R \qquad (4)$$

where µ is the training parameter and R is the error vector of size p n_o, whose entries are the differences between the real data y_ij and the network outputs a_ij for sample i and output node j:

$$R = \left[ y_{11} - a_{11}, \; \ldots, \; y_{1 n_o} - a_{1 n_o}, \; \ldots, \; y_{p1} - a_{p1}, \; \ldots, \; y_{p n_o} - a_{p n_o} \right]^T \qquad (5)$$

J^T(x)J(x) is referred to as the (approximate) Hessian matrix.

4) Recalculate the error using x + Δx. If the error is lower than the one calculated in step 1, reduce the training parameter µ by the factor µ_dec, keep x = x + Δx and return to step 1. If the error is not reduced, increase µ by the factor µ_inc and go back to step 3. µ_dec and µ_inc are fixed by the user;

5) The algorithm stops in two cases: when the gradient is less than a predefined value, or when the error is reduced to the error objective.
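The training steps above can be sketched in a few lines of Python. To stay short, the sketch fits a single tanh neuron with two parameters x = (w, b) instead of the five-layer network, so the damped linear system can be solved in closed form. The update is written as Δx = −(JᵀJ + µI)⁻¹JᵀR, with J the Jacobian of the error vector R = y − a; this sign convention, the names `mu_dec` and `mu_inc`, all numerical values and the synthetic data are assumptions of the example, not settings reported in the paper.

```python
import math


def residuals_and_jacobian(x, inputs, targets):
    """Error vector R = y - a and its Jacobian with respect to x = (w, b)."""
    w, b = x
    residuals, jacobian = [], []
    for u, y in zip(inputs, targets):
        a = math.tanh(w * u + b)                 # model output for input u
        slope = 1.0 - a * a                      # derivative of tanh
        residuals.append(y - a)
        jacobian.append((-slope * u, -slope))    # d(y - a)/dw, d(y - a)/db
    return residuals, jacobian


def lm_step(jacobian, residuals, mu):
    """Solve (J^T J + mu I) dx = -J^T R for the two parameters."""
    h00 = sum(j[0] * j[0] for j in jacobian) + mu
    h01 = sum(j[0] * j[1] for j in jacobian)
    h11 = sum(j[1] * j[1] for j in jacobian) + mu
    g0 = sum(j[0] * r for j, r in zip(jacobian, residuals))
    g1 = sum(j[1] * r for j, r in zip(jacobian, residuals))
    det = h00 * h11 - h01 * h01
    return (-(h11 * g0 - h01 * g1) / det, -(h00 * g1 - h01 * g0) / det)


def train_lm(inputs, targets, x0, mu=0.01, mu_dec=0.1, mu_inc=10.0,
             goal=1e-12, min_grad=1e-10, max_iter=200):
    x = tuple(x0)
    residuals, jacobian = residuals_and_jacobian(x, inputs, targets)  # steps 1-2
    error = sum(r * r for r in residuals)
    for _ in range(max_iter):
        grad = [sum(j[c] * r for j, r in zip(jacobian, residuals)) for c in (0, 1)]
        if error <= goal or max(abs(g) for g in grad) <= min_grad:    # step 5
            break
        while mu <= 1e10:
            dx = lm_step(jacobian, residuals, mu)                     # step 3
            trial = (x[0] + dx[0], x[1] + dx[1])
            trial_res, trial_jac = residuals_and_jacobian(trial, inputs, targets)
            trial_error = sum(r * r for r in trial_res)               # step 4
            if trial_error < error:      # accept the step and reduce mu
                x, residuals, jacobian, error = trial, trial_res, trial_jac, trial_error
                mu *= mu_dec
                break
            mu *= mu_inc                 # reject the step and increase mu
        else:
            break                        # mu exceeded its cap without improvement
    return x, error


# Synthetic data generated by a neuron with w = 1.5 and b = -0.5.
inputs = [-2.0 + 0.25 * i for i in range(17)]
targets = [math.tanh(1.5 * u - 0.5) for u in inputs]
x_fit, final_error = train_lm(inputs, targets, x0=(0.1, 0.0))
print(x_fit, final_error)
```

A large µ makes the step short and close to the steepest-descent direction, while a small µ gives a step close to Gauss-Newton, which is the switching behaviour described for the L-M algorithm.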

Generally, the training of a neural network is complex and requires significant computing resources, especially on-line. In this work, training is carried out off-line; the resulting neural model is then used to detect leakage on-line. The adopted neural network reproduces both the nominal behaviour, corresponding to the system without leakage, and the faulty behaviour (occurrence of leakage).

3.2 Decision Block

The decision block is the most essential component of the proposed scheme: it detects leaks in the intake of the air path using the neural model developed in the training step. Direct interactions are established between the detection block and the neural network model in order to estimate the actual state of the system. The decision block works in “detection mode” to distinguish between the “No Leakage” and “Leakage” modes.
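The paper does not detail how the two network outputs are turned into a decision, so the following Python sketch shows one possible rule under explicit assumptions: each sample is assigned to the class whose output is larger, and a majority vote over a window of samples gives the final mode. The function names and the `min_fraction` parameter are hypothetical.

```python
def decide(outputs):
    """Classify one sample from its (no_leak_estimate, leak_estimate) pair."""
    no_leak, leak = outputs
    return "Leakage" if leak > no_leak else "No Leakage"


def decide_window(samples, min_fraction=0.5):
    """Majority vote over a window of samples (assumed rule): report a leak
    when more than `min_fraction` of the samples are classified as leakage."""
    votes = sum(decide(sample) == "Leakage" for sample in samples)
    return "Leakage" if votes > min_fraction * len(samples) else "No Leakage"


window = [(0.1, 0.9), (0.2, 0.7), (0.6, 0.4), (0.05, 1.02)]
print(decide_window(window))
```

Voting over a window is one simple way to avoid raising an alarm on a single noisy sample; the window length and the fraction would have to be tuned on the application data.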

4 APPLICATION

A critical operating mode of the system is considered to illustrate the effectiveness of the proposed approach. This mode corresponds to a low “load/engine speed” couple, where leak detection is not systematically achieved. In this application, data acquisition is carried out using the xMOD software tool.


4.1 xMOD Tool Software

xMOD is a software platform developed at “IFP Energies Nouvelles” that combines an integration environment for heterogeneous models with a virtual experimentation laboratory. These heterogeneous models are generated by different simulation tools, such as Matlab/Simulink, AMESim, Dymola, SimulationX and GT Power. An optimal combination of these tools brings together the advantages of each modelling and simulation tool, and the user can select them freely.

In this work, xMOD is used to simulate the operation of the diesel engine, especially the air path behaviour. The simulation model produced by “IFP Energies Nouvelles” is used, to which a leak model has been added. The diameter of the leak can be adjusted freely. The simulation results can be retrieved and stored in text files.

4.2 MSE Evolution: Selected Features vs. All Features

Before presenting the results obtained with the selected features, a comparison of the MSE for all features and for the selected features is presented in Table 1.

In order to illustrate the advantage of feature selection, the MSE values are shown together with their training run times. In this table, we can first observe that the MSE values obtained with all features are greater than those obtained with the selected features in most cases (27 of the 40 entries). Second, the run time with all features is higher than with the selected features in all but two cases (leaks of 0.7 mm and 0.8 mm at 130 Nm). For example, when the torque is set to 40 Nm, the MSE with all features is greater than the MSE with the selected features for every leak diameter except 0.9 mm. The same conclusion holds for the three remaining torque values, with more exceptions. The detection results below are presented for the selected features case.
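The comparison can be checked directly against the entries of Table 1. The short Python script below counts, for each torque, the leak diameters for which the selected features give a lower MSE and a shorter training run time than all features. The values are transcribed from Table 1, with run times written as minutes:seconds; the variable and function names are illustrative.

```python
TORQUES = (40, 110, 130, 150)   # Nm

# One row per leak diameter (mm). For each torque, in order:
# (MSE all features, run time all, MSE selected features, run time selected).
TABLE_1 = {
    0.1: [(0.208, "14:59", 0.0221, "10:35"), (0.168, "12:40", 0.0166, "10:41"),
          (0.167, "14:50", 0.0109, "8:40"), (0.00547, "15:06", 0.00799, "8:59")],
    0.2: [(0.0449, "16:29", 0.0189, "8:49"), (0.00790, "12:44", 0.0147, "8:55"),
          (0.00988, "13:10", 0.0163, "8:45"), (0.00083, "15:47", 0.000812, "10:15")],
    0.3: [(0.0270, "18:44", 0.0114, "9:19"), (0.00608, "12:52", 0.00871, "9:44"),
          (0.00298, "16:23", 0.00266, "8:45"), (0.00430, "14:17", 0.00165, "10:03")],
    0.4: [(0.0205, "15:24", 0.0204, "9:25"), (0.00239, "14:39", 0.00413, "10:15"),
          (0.00219, "13:55", 0.00401, "10:53"), (0.00273, "13:23", 0.00419, "11:38")],
    0.5: [(0.0186, "18:41", 0.0127, "8:34"), (0.00458, "14:02", 0.00369, "7:57"),
          (0.00156, "12:22", 0.00222, "10:23"), (0.00208, "13:44", 0.00428, "11:25")],
    0.6: [(0.0121, "15:50", 0.00807, "9:30"), (0.00447, "14:17", 0.00315, "9:52"),
          (0.00351, "14:12", 0.00104, "9:34"), (0.00656, "15:00", 0.00221, "11:04")],
    0.7: [(0.0180, "15:57", 0.00404, "9:43"), (0.00669, "16:11", 0.00234, "9:07"),
          (0.00100, "8:02", 0.00102, "8:40"), (0.00601, "14:33", 0.000873, "10:01")],
    0.8: [(0.00867, "17:26", 0.00727, "8:21"), (0.00584, "13:42", 0.00198, "9:54"),
          (0.00099, "7:18", 0.00101, "9:56"), (0.000897, "13:35", 0.00193, "9:52")],
    0.9: [(0.00618, "14:30", 0.00910, "8:56"), (0.00451, "13:32", 0.00415, "9:33"),
          (0.00211, "13:28", 0.00131, "8:58"), (0.00270, "13:36", 0.00087, "11:29")],
    1.0: [(0.00989, "13:11", 0.00638, "9:14"), (0.00217, "15:01", 0.00111, "9:43"),
          (0.00102, "13:10", 0.00099, "5:09"), (0.00392, "13:15", 0.000896, "8:41")],
}


def seconds(text):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, secs = text.split(":")
    return 60 * int(minutes) + int(secs)


def count_wins(table):
    """Count, per torque, the rows where the selected features do better."""
    mse_wins = {torque: 0 for torque in TORQUES}
    time_wins = {torque: 0 for torque in TORQUES}
    for row in table.values():
        for torque, (mse_all, t_all, mse_sel, t_sel) in zip(TORQUES, row):
            mse_wins[torque] += mse_sel < mse_all
            time_wins[torque] += seconds(t_sel) < seconds(t_all)
    return mse_wins, time_wins


mse_wins, time_wins = count_wins(TABLE_1)
print(mse_wins)    # {40: 9, 110: 7, 130: 5, 150: 6}
print(time_wins)   # {40: 10, 110: 10, 130: 8, 150: 10}
```

The counts show that the run-time advantage of the selected features is nearly systematic, while the MSE advantage is clearest at 40 Nm and weaker at 130 Nm.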

4.3 Detection Task Results

The main property of the proposed approach is its detection ability. Here, the neural network learns two classes, “No Leakage mode” and “Leakage mode”. The training set consists of 10000 samples without leak and 10000 samples with leak. We choose three leak diameters: 0.1 mm, 0.4 mm and 0.9 mm.

1) Case1: Leak = 0.1mm:

Figure 2: Engine_speed = 1000 rpm & torque =110 Nm.

Table 1: MSE evolution with torque variation. Each cell gives MSE / training run time (min’sec”); "All" = all features, "Selected" = selected features.

Leak   | 40 Nm All       | 40 Nm Selected  | 110 Nm All      | 110 Nm Selected | 130 Nm All      | 130 Nm Selected | 150 Nm All      | 150 Nm Selected
-------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+----------------
0.1 mm | 0.208/14’59”    | 0.0221/10’35”   | 0.168/12’40”    | 0.0166/10’41”   | 0.167/14’50”    | 0.0109/8’40”    | 0.00547/15’06”  | 0.00799/8’59”
0.2 mm | 0.0449/16’29”   | 0.0189/8’49”    | 0.00790/12’44”  | 0.0147/8’55”    | 0.00988/13’10”  | 0.0163/8’45”    | 0.00083/15’47”  | 0.000812/10’15”
0.3 mm | 0.0270/18’44”   | 0.0114/9’19”    | 0.00608/12’52”  | 0.00871/9’44”   | 0.00298/16’23”  | 0.00266/8’45”   | 0.00430/14’17”  | 0.00165/10’03”
0.4 mm | 0.0205/15’24”   | 0.0204/9’25”    | 0.00239/14’39”  | 0.00413/10’15”  | 0.00219/13’55”  | 0.00401/10’53”  | 0.00273/13’23”  | 0.00419/11’38”
0.5 mm | 0.0186/18’41”   | 0.0127/8’34”    | 0.00458/14’02”  | 0.00369/7’57”   | 0.00156/12’22”  | 0.00222/10’23”  | 0.00208/13’44”  | 0.00428/11’25”
0.6 mm | 0.0121/15’50”   | 0.00807/9’30”   | 0.00447/14’17”  | 0.00315/9’52”   | 0.00351/14’12”  | 0.00104/9’34”   | 0.00656/15’00”  | 0.00221/11’04”
0.7 mm | 0.0180/15’57”   | 0.00404/9’43”   | 0.00669/16’11”  | 0.00234/9’07”   | 0.00100/8’02”   | 0.00102/8’40”   | 0.00601/14’33”  | 0.000873/10’01”
0.8 mm | 0.00867/17’26”  | 0.00727/8’21”   | 0.00584/13’42”  | 0.00198/9’54”   | 0.00099/7’18”   | 0.00101/9’56”   | 0.000897/13’35” | 0.00193/9’52”
0.9 mm | 0.00618/14’30”  | 0.00910/8’56”   | 0.00451/13’32”  | 0.00415/9’33”   | 0.00211/13’28”  | 0.00131/8’58”   | 0.00270/13’36”  | 0.00087/11’29”
1.0 mm | 0.00989/13’11”  | 0.00638/9’14”   | 0.00217/15’01”  | 0.00111/9’43”   | 0.00102/13’10”  | 0.00099/5’09”   | 0.00392/13’15”  | 0.000896/8’41”

[Figure 2 plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.0240.]


Figure 3: Engine_speed = 1000 rpm & torque = 130 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.0103.]

Figure 4: Engine_speed = 1000 rpm & torque = 150 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00746.]

2) Case2: Leak = 0.4mm:

Figure 5: Engine_speed = 1000 rpm & torque = 110 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00612.]

Figure 6: Engine_speed = 1000 rpm & torque = 130 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00392.]

Figure 7: Engine_speed = 1000 rpm & torque = 150 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.0171.]

3) Case3: Leak = 0.9mm:

Figure 8: Engine_speed = 1000 rpm & torque = 110 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00226.]

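Figure 9: Engine_speed = 1000 rpm & torque = 130 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00205.]

Figure 10: Engine_speed = 1000 rpm & torque = 150 Nm.
[Plot: class estimation versus samples (0 to 3500), with the unfaulty class estimation and the leakage class estimation; MSE = 0.00160.]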

Interpretation

Figures 2 to 10 show the effectiveness of the proposed approach: the leak is detected for all the considered diameters. The Mean Squared Error (MSE) values give information about the accuracy of the neural network. From the obtained results, we can first remark that the MSE values increase when the torque decreases. For example, in the first case (Fig. 2 to Fig. 4), where the leak diameter is set to 0.1 mm, the MSE decreases from 0.0240 (2.4%) to 0.00746 (0.7%) when the torque increases from 110 Nm to 150 Nm. This observation can be explained by the fact that the air path system (compressor) works in a reduced operating regime. In other words, at low speed, the mechanical compressor of the air path is not solicited. The same remark applies to cases 2 and 3.

Naturally, a leak is easily detected when it is large, but it becomes very difficult to detect when it is very small. The obtained results show that the proposed approach can effectively address this problem: the leak is detected in all cases, even when its diameter is 0.1 mm (almost negligible leakage).

5 CONCLUSIONS

A leak detection approach for the diesel air path has been developed. The proposed approach contains two blocks: a training block and a decision block. The first is carried out off-line and combines a feature selection algorithm with a neural network trained by Levenberg-Marquardt optimization. The L-M algorithm was chosen for its accuracy and adaptability; it combines two different techniques according to the position of the current solution relative to the best one. The second block uses the neural model obtained in the training phase to detect leaks that appear in the air path system. The detection capability is evaluated using the MSE index.

The proposed approach effectively achieves leak detection, especially in the case of small leaks at critical operating points (low torque/speed couple).

In future work, this approach will be extended to leak characterisation in the diesel engine air path. The final objective of this work is to implement the resulting algorithms on a real diesel engine.

ACKNOWLEDGEMENTS

The authors thank the ANR (Agence Nationale de la Recherche) for its support of the DIVAS research project. The authors particularly thank Philippe Moulin and Mongi Ben Gaid, research engineers at "IFP Energies Nouvelles", for their help and advice.

REFERENCES

Alexandrov, A., Gelbukh, A., and Lozovo, G., 2001. Chi-square Classifier for Document Categorization. 2nd International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City.

Capriglione, D., Liguori, C., Pianese, C., and Pietrosanto, A., 2003. On-line sensor fault detection, isolation and accommodation in automotive engines. IEEE Transactions on Instrumentation and Measurement, Vol. 52(4), pp. 1182-1189.

Chen, P. C., 2011. A novel diagnostic system for gasoline-engine leakage detection. Journal of Automobile Engineering, Vol. 225, Part D, pp. 673-685.

Isermann, R., 1984. Process fault detection based on modelling and estimation methods: a survey. Automatica, Vol. 20(4), pp. 387-404.

Sangha, M., Yu, D. L., and Gomm, J. B., 2006. On-board monitoring and diagnosis for spark ignition air path via adaptive neural networks. Journal of Automobile Engineering, Vol. 220, Part D, pp. 1641-1655.

Sorsa, T., and Costin, H. N., 1993. Application of artificial neural networks in process fault diagnosis. Automatica, Vol. 29(4), pp. 843-849.

Sugumaran, V., Muralidharan, V., and Ramachandran, K. I., 2007. Feature selection using Decision Tree and classification through Proximal Support Vector Machine for fault diagnostics of roller bearing. Mechanical Systems and Signal Processing, Vol. 21, pp. 930-942.

Witten, I. H., and Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2nd edition.

Xu, Z., Xuan, J., Shi, T., and Hu, Y., 2009. Application of a modified fuzzy ARTMAP with feature-weight learning for the fault diagnosis of bearing. Expert Systems with Applications, Vol. 36, pp. 9961-9968.