Feature Selection Combined with Neural Network for Diesel Engine Diagnosis
M. Benkaci¹ and G. Hoblos¹,²
¹ Institut de Recherche en Systèmes Electroniques Embarqués, Avenue Galilée, Saint-Etienne du Rouvray, France
² Ecole Supérieure d’Ingénieurs en Génie Électrique, Avenue Galilée, Saint-Etienne du Rouvray, France
Keywords: Leak Detection, Automotive Diagnosis, Feature Selection, Neural Data Classification, Diesel Air Path.
Abstract: Feature selection is an essential step in data classification for fault detection and diagnosis. In this work, a new approach is proposed that combines a feature selection algorithm with a neural network for leak detection in the diesel engine air path. The Chi2 algorithm is used for feature selection, and a neural network trained with the Levenberg-Marquardt algorithm is used to model the system behaviour. The resulting neural network is then used for leak detection. The model is trained and validated using data generated by xMOD, and the same tool is used for testing. The effectiveness of the proposed approach is illustrated in simulation for a system operating at low speed and load, with a very small leak affecting the air path.
1 INTRODUCTION
In order to reduce the air pollution caused by automotive engines, several regulations have been introduced. The first legislation was proposed by the California Air Resources Board in 1970; it has been continuously updated and has become very strict. Since 1993, marked by the introduction of the Kyoto Protocol, the European anti-pollution standards have become increasingly stringent: the authorized emissions of a diesel vehicle, in mg/km, have decreased from (NOx = nil, CO = 2720, HC+NOx = 970, PM = 140) in the Euro 1 standard to (NOx = 80, CO = 500, HC+NOx = 170, PM = 5) in the Euro 6 standard. Typically, each fault that increases the emission level must be detected and isolated. Leaks in the intake duct are among the most difficult faults to manage.
The ability of neural networks to approximate nonlinear functions makes them one of the best tools for fault detection and isolation. Exploring these artificial intelligence techniques has given rise to another class of fault detection and isolation algorithms. In (Isermann, 1984), Isermann discussed the superior features of neural networks in fault classification and recognition. Sorsa and Costin (Sorsa and Costin, 1993) showed the capability of supervised neural networks, such as the Multilayer Perceptron (MLP) and the Radial Basis Function (RBF) network, to perform effective fault detection and isolation. Another work using an MLP network is presented in (Capriglione et al., 2003). The RBF network was used for on-board fault diagnosis of the air path of a spark ignition engine (Sangha et al., 2006). The leakage problem in gasoline engines is treated in (Chen, 2011), where a neural network based on the steepest-descent method combined with a back-propagation algorithm is developed to train three detection systems.
Because of the increased complexity of today's engines, which carry a large number of sensors, reducing the acquired data becomes essential. Feature selection is an important step before starting a data classification task, especially when this task is dedicated to fault detection and isolation (FDI). It refers to the problem of selecting the input features that are most predictive of a given outcome. Feature selection problems arise in all supervised and unsupervised machine learning tasks, including classification, regression, time-series prediction and clustering. Feature selection pursues three main purposes: reducing the cost of extracting features, improving the classification accuracy, and improving the reliability of the estimated performance.
Many works have used feature selection algorithms for fault detection and diagnosis. In (Christina and Tshlizidzi, 2006), a new approach for intrusion detection and diagnosis is proposed. In (Sugumaran et al., 2007), the authors use a decision tree to identify the best features for the classification task, and then use a proximal support vector machine, characterized by its ability to classify faults efficiently, for a roller bearing system.
Benkaci M. and Hoblos G. Feature Selection Combined with Neural Network for Diesel Engine Diagnosis. DOI: 10.5220/0004042703170324. In Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics (ICINCO-2012), pages 317-324. ISBN: 978-989-8565-21-1. Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
In this paper, a new methodology dealing with the detection of small leaks in the diesel air path is developed. To achieve this goal, a new scheme based on neural network techniques is proposed. A neural network is trained with the Levenberg-Marquardt algorithm on the nominal mode (without leak) and on the leakage mode, for several leak diameters. Before the acquired data are used, a feature selection step is applied in order to reduce the complexity of the problem. The main challenge of the proposed approach is to rely only on a selected subset of sensors, which leads to a reduced cost. The data for the two considered modes are generated using the xMOD platform, which is described later.
The paper is organized as follows. The considered problem is presented in Section 2. Section 3 describes the proposed approach in detail: the neural network used, whose training is based on the steepest-descent and Gauss-Newton methods, is briefly described and the main detection scheme is illustrated. After a brief description of the xMOD tool used to collect engine data, Section 4 presents the results obtained with our approach. These results are then discussed in order to illustrate the effectiveness of the leakage detection.
2 PROBLEM STATEMENT
For several years, anti-pollution standards have become dramatically stricter, and the constraints on the automotive industry have become very complex. The main objective of these standards is to reduce the emission levels of cars. In the case of diesel engines, there are several pollutants: carbon monoxide, unburned hydrocarbons, nitrogen oxides (NOx) and diesel particulate matter. Usually, the emission level increases proportionally with the appearance of faults in diesel engines, more precisely in the diesel air path. These faults can be due to sensor failures, actuator failures or system degradation. In this paper, the last class of failures is considered; more precisely, leakage detection in the diesel air path is studied. In addition to a high emission level, this failure causes multiple undesired effects such as:
- Changes in the operating points of the air path subsystems,
- Incomplete combustion in the cylinders,
- Appearance of smoke and reduced performance.
This type of failure can often be confused with the two other types of faults, i.e. sensor or actuator faults; consequently, it is very important to distinguish it from the others.
In addition to the main objective of this paper, the feature selection problem is considered. Today's vehicles are characterized by an increased complexity, reflected in the large and growing number of embedded sensors. Consequently, using a selected subset of sensor data that is correlated with the considered problem is highly desirable in such applications.
In this work, our main objective is to detect air leaks in the diesel air path regardless of their diameter. Before performing leak detection, we carry out feature selection in order to reduce the data complexity.
It is important to note that, in this application, small leaks are hidden and very difficult to detect because the system is not solicited.
3 PROPOSED APPROACH
Nowadays, the neural network is an essential tool in many research activities on complex industrial systems. An advantage of using a neural network to detect system faults is that it can extract the knowledge contained in the data. Beyond its ability to memorize learned information, a neural network can generalize an obtained model and apply associative memory to the stored information. The error tolerance that characterizes neural networks effectively handles the errors of the model. Additionally, a neural network can perform a nonlinear mapping and learn dynamic behaviours in order to generalize the obtained models.
Generally, the data collected for the detection process are noisy, but the error tolerance of the neural network enables the detection scheme to differentiate the pattern from the noise. This property is a major advantage in fault detection and isolation problems. Additionally, similar patterns are separated thanks to the generalization property of the neural network.
Leak detection in the intake system is very difficult to achieve, especially when the operating point corresponds to a low load-torque couple. In these conditions, the compressor in the air path is not solicited by the driver, so the pressure in the intake system is similar to the atmospheric pressure. This constraint calls for an improved detection system. The proposed system must increase the accuracy of the model, enhance the performance of the vehicle and guarantee the handling of small leaks. In this paper, the Levenberg-Marquardt (LM) algorithm (Levenberg, 1944) is proposed to carry out the detection task. The LM algorithm is used to train a neural network on the dynamics of the diesel air path. Once the dynamics are modelled, the leak is detected by comparing new measurements with the model established by the neural network. The proposed approach contains two blocks, the training block and the decision block, as shown in Figure 1.
[Diagram: an offline training block (nominal data acquisition, faulty data acquisition, feature selection, neural model), an online testing block (real data acquisition of selected sensors/features), and a decision block (NN detection task, decision making); the blocks are linked by interactions.]
Figure 1: Detection scheme.
The proposed approach is designed to operate on-line; thus, a classical real data acquisition process is adopted in this work. It is important to recall that this application only uses the sensors that are selected by the feature selection algorithm and that summarize the intake behaviour of the diesel air path. The data acquired in this step are sent to the decision block in order to detect leaks affecting the vehicle.
3.1 Training Block
3.1.1 Feature Selection
In this work, one of the most popular feature selection algorithms is chosen: the Chi2 algorithm (Alexandrov et al., 2001). Chi2 is a simple and general algorithm that ranks features through a discretization process. This algorithm is combined with a neural network classifier to select the features that must be kept.
The Chi2 algorithm is based on the $\chi^2$ statistic and runs in two stages, as follows:
Stage 1:
1) Set sigLevel to 0.5 for all features;
2) Sort each feature according to its values;
3) Compute the $\chi^2$ value for every pair of adjacent intervals:
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}} \qquad (1)$$
where:
k: number of classes;
$A_{ij}$: number of samples in the ith interval and jth class;
$R_i$: number of samples in the ith interval;
$G_j$: number of samples in the jth class (over the two intervals being compared);
N: total number of samples (over the two intervals being compared);
$E_{ij}$: expected frequency of $A_{ij}$, given by
$$E_{ij} = \frac{R_i \, G_j}{N} \qquad (2)$$
4) Merge the pair of adjacent intervals with the lowest $\chi^2$ value, until the $\chi^2$ value of every pair of adjacent intervals exceeds the threshold determined by sigLevel;
This process is repeated while decreasing sigLevel, until the inconsistency rate δ is exceeded in the discretized data.
Stage 2:
5) Start with sigLevel0, the last value of sigLevel determined in the first stage;
6) Associate a sigLevel(i) with each feature i and run the merging;
7) Consistency test:
If inconsistency < δ, merge intervals and decrease sigLevel(i);
Else, if inconsistency > δ, eliminate the ith feature from the next step.
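The $\chi^2$ computation of step 3 can be illustrated with a minimal sketch. It assumes that the class counts of the two adjacent intervals are already available as a 2 x k array; the function name `chi2_adjacent` and the replacement of a zero expected count by 0.1 are assumptions made for this illustration, not details given in the paper.

```python
import numpy as np

def chi2_adjacent(counts):
    """Chi-square statistic of equation (1) for two adjacent intervals.

    counts is a 2 x k array: counts[i, j] = A_ij, the number of samples
    of class j that fall in interval i.
    """
    A = np.asarray(counts, dtype=float)
    R = A.sum(axis=1, keepdims=True)   # R_i: samples per interval
    G = A.sum(axis=0, keepdims=True)   # G_j: samples per class (both intervals)
    N = A.sum()                        # N: samples in the two intervals
    E = R * G / N                      # E_ij, equation (2)
    # Assumption: a zero expected count is replaced by a small constant
    # so that the division below is always defined.
    E = np.where(E == 0.0, 0.1, E)
    return float(((A - E) ** 2 / E).sum())

same = chi2_adjacent([[5, 5], [5, 5]])     # identical class mix in both intervals
split = chi2_adjacent([[10, 0], [0, 10]])  # classes perfectly separated
print(same, split)
```

Two intervals with the same class mix give a low value and are candidates for merging, whereas intervals that separate the classes give a high value and are kept apart.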
First, the WEKA data mining tool (Witten et al., 2005) is used to perform the Chi2 ranking, and the features are sorted according to their rank. Second, the most important features are selected using the neural network classifier described later. More precisely, the features are eliminated iteratively, from the least important to the most important, and the weight of each eliminated feature is evaluated from the resulting classification Mean Squared Error (MSE).
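The iterative elimination described above might look like the following sketch. It is not the authors' implementation: a linear least-squares model stands in for the neural network classifier to keep the example short, and the names `fit_mse`, `backward_elimination` and the tolerance `tol` are hypothetical. The data are synthetic.

```python
import numpy as np

def fit_mse(X, y):
    """Stand-in classifier: linear least squares; returns the training MSE."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return float(np.mean((Xb @ w - y) ** 2))

def backward_elimination(X, y, ranking, tol):
    """Drop features from least to most important while the MSE stays close.

    ranking lists column indices from most to least important.
    tol (hypothetical) is the largest accepted MSE increase per removal.
    """
    kept = list(ranking)
    base = fit_mse(X[:, kept], y)
    for f in reversed(ranking):
        if len(kept) == 1:
            break
        trial = [c for c in kept if c != f]
        mse = fit_mse(X[:, trial], y)
        if mse - base <= tol:      # the feature carries little weight: drop it
            kept, base = trial, mse
        # otherwise the feature stays in the selected subset
    return kept, base

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # only columns 0 and 1 are informative
kept, mse = backward_elimination(X, y, ranking=[0, 1, 2, 3], tol=0.01)
print(kept, round(mse, 4))
```

In this synthetic case the two uninformative columns are removed and the two informative ones are retained, because removing either of them raises the MSE well above the tolerance.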
3.1.2 Training
Pattern classification with a neural network aims to determine the class boundaries through the classifier. The training phase of the neural network achieves this goal. In this paper, a gradient-based training algorithm is used. This category of algorithms is the most commonly used by researchers. It includes Hessian-based algorithms, which can significantly reduce the convergence time. The Levenberg-Marquardt algorithm belongs to the Hessian-based techniques; it exploits their advantages for the optimization of nonlinear least squares problems.
The Levenberg-Marquardt algorithm is a well-known optimization technique. It locates the minimum of a function expressed as a sum of squares of nonlinear functions. This algorithm, widely used in several disciplines, is a combination of the steepest-descent and Gauss-Newton methods, which act alternately depending on the distance between the current position and the correct one. If the current position is far from the correct one, steepest descent is applied; it is slow, but it guarantees convergence. When the current position becomes close to the correct one, the LM algorithm switches to the Gauss-Newton method, which converges rapidly.
For neural network training, the objective function is an error of the form:
$$E(x) = \sum_{k=1}^{p} \sum_{l=1}^{n_0} (y_{kl} - a_{kl})^2 \qquad (3)$$
where $y_{kl}$ are the real data of the diesel engine, $a_{kl}$ are the network outputs, p is the total number of samples and $n_0$ is the total number of nodes in the output layer.
In this work, the neural network contains five layers. The first layer is the input layer, which receives the data from the selected sensors used in this application. The three following layers are the hidden layers, which form the core of the network. The last layer is the output layer, which generates two signals corresponding to the detection task (without leak or with leak).
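As an illustration of this five-layer structure, the forward pass can be sketched as follows. The paper does not give the hidden-layer sizes, the activation functions or the number of selected sensors, so the values used here (8 nodes per hidden layer, tanh activations, a linear output layer, 5 inputs) and the function names are assumptions.

```python
import numpy as np

def init_network(n_inputs, hidden_sizes=(8, 8, 8), n_outputs=2, seed=0):
    """Random weights for: input layer -> three hidden layers -> 2-node output."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    return [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Propagate samples X (one row per sample) through the network."""
    h = np.asarray(X, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)      # hidden layers (tanh is an assumption)
    W, b = params[-1]
    return h @ W + b                # two outputs: [no-leak signal, leak signal]

params = init_network(n_inputs=5)   # 5 selected sensors, a hypothetical count
out = forward(params, np.zeros((3, 5)))
print(out.shape)
```

Five layers of nodes correspond to four weight matrices, and each input sample yields one pair of output signals.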
The steps required to train the neural network with the LM algorithm in batch mode are the following:
1) Compute the network outputs and evaluate the squared error over all inputs, as in equation (3);
2) Calculate the Jacobian matrix J(x), where x represents the weights and biases of the network; J(x) is taken here as the Jacobian of the network outputs with respect to x;
3) Solve the weight-adaptation equation in order to obtain Δx. The update Δx of the weight vector is computed as follows:
$$\Delta x = \left[ J^T(x) J(x) + \mu I \right]^{-1} J^T(x) R \qquad (4)$$
where µ is the training parameter and R is a vector of size $p n_0$ computed as follows:
$$R^T = \left[ y_{11} - a_{11}, \ldots, y_{1 n_0} - a_{1 n_0}, \ldots, y_{p1} - a_{p1}, \ldots, y_{p n_0} - a_{p n_0} \right] \qquad (5)$$
$J^T(x) J(x)$ is referred to as the Hessian matrix (more precisely, it is its Gauss-Newton approximation).
4) Recalculate the error using x + Δx. If the error is smaller than the one calculated in step 1, reduce the training parameter µ by µ⁻, set x = x + Δx and return to step 1. If the error is not reduced, increase µ by µ⁺ and go back to step 3. µ⁺ and µ⁻ are fixed by the user;
5) The algorithm stops in two cases: when the gradient is less than a predefined value, or when the error has been reduced to the error objective.
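The batch LM loop described in these steps can be sketched on a small problem. This is only an illustration under stated assumptions: a two-parameter exponential model stands in for the neural network, the Jacobian of the model outputs is obtained by finite differences, µ is adapted multiplicatively with hypothetical factors `mu_dec` and `mu_inc`, and the stopping thresholds are arbitrary.

```python
import numpy as np

def model(x, t):
    """Stand-in for the network: a(t) = x[0] * exp(x[1] * t)."""
    return x[0] * np.exp(x[1] * t)

def jacobian(x, t, eps=1e-6):
    """Finite-difference Jacobian of the model outputs with respect to x."""
    J = np.empty((t.size, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (model(x + step, t) - model(x - step, t)) / (2 * eps)
    return J

def levenberg_marquardt(x, t, y, mu=1e-2, mu_dec=0.1, mu_inc=10.0,
                        goal=1e-12, grad_min=1e-10, max_iter=100):
    err = np.sum((y - model(x, t)) ** 2)            # step 1: squared error
    for _ in range(max_iter):
        R = y - model(x, t)                         # residual vector, eq. (5)
        J = jacobian(x, t)                          # step 2
        g = J.T @ R
        if err <= goal or np.linalg.norm(g) < grad_min:  # step 5: stop tests
            break
        while mu < 1e10:
            # step 3: damped Gauss-Newton update, eq. (4)
            dx = np.linalg.solve(J.T @ J + mu * np.eye(x.size), g)
            new_err = np.sum((y - model(x + dx, t)) ** 2)   # step 4
            if new_err < err:                       # accept, relax damping
                x, err, mu = x + dx, new_err, mu * mu_dec
                break
            mu *= mu_inc                            # reject, increase damping
        else:
            break                                   # no acceptable step found
    return x, err

t = np.linspace(0.0, 1.0, 20)
y = model(np.array([2.0, -1.5]), t)                 # noise-free synthetic targets
x_fit, err = levenberg_marquardt(np.array([1.0, 0.0]), t, y)
print(x_fit, err)
```

A large µ makes the update behave like a short steepest-descent step, while a small µ makes it approach the Gauss-Newton step, which matches the switching behaviour described in the text.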
Generally, the training step of a neural network is very complex and requires significant computing resources, especially on-line. In this work, the training is carried out off-line; the resulting neural model is then used to detect leakage on-line. The adopted neural network reproduces both the nominal behaviour, corresponding to the system without leakage, and the faulty behaviour (occurrence of leakage).
3.2 Decision Block
The decision block is an essential component of the proposed scheme: it is where leaks in the intake of the air path are detected using the neural model developed in the training step. Direct interactions are established between the detection block and the neural network model in order to estimate the actual state of the system. The decision block works in “detection mode” to distinguish between the “No Leakage” and “Leakage” modes.
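One possible decision rule is sketched below. The paper does not specify how the two network outputs are turned into a decision, so the rule used here (choose the larger output, then smooth with a sliding majority vote) is an assumption, as are the function name `decide` and the `window` parameter. The network outputs are replaced by synthetic noisy signals.

```python
import numpy as np

def decide(outputs, window=50):
    """Label each sample, then smooth the labels with a sliding majority vote.

    outputs has one row per sample: [no-leak estimate, leak estimate].
    window (hypothetical) is the number of most recent samples in the vote.
    """
    outputs = np.asarray(outputs, dtype=float)
    raw = (outputs[:, 1] > outputs[:, 0]).astype(int)   # 1 means "Leakage"
    smoothed = np.empty_like(raw)
    for i in range(raw.size):
        recent = raw[max(0, i - window + 1): i + 1]
        smoothed[i] = int(recent.mean() > 0.5)
    return smoothed

# Synthetic outputs: 100 nominal samples followed by 100 samples with a leak.
rng = np.random.default_rng(1)
nominal = np.column_stack([np.ones(100), np.zeros(100)])
leaky = np.column_stack([np.zeros(100), np.ones(100)])
noisy = np.vstack([nominal, leaky]) + rng.normal(scale=0.3, size=(200, 2))
labels = decide(noisy)
print(labels[60], labels[-1])
```

The majority vote trades a short detection delay (about half the window) for robustness against isolated misclassified samples.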
4 APPLICATION
A critical operating mode of the system is considered to illustrate the effectiveness of the proposed approach. This mode corresponds to a low load/engine speed couple, for which leak detection is not systematically achieved. In this application, the data acquisition is carried out using the xMOD software tool.
4.1 xMOD Tool Software
xMOD is a software platform developed at IFP Energies Nouvelles that combines an environment for integrating heterogeneous models with a virtual experimentation laboratory. These heterogeneous models are generated by different simulation tools, such as Matlab/Simulink, AMESim, Dymola, SimulationX, GT Power, etc. Combining these tools makes it possible to gather the advantages of each modelling and simulation tool, and the user can freely select them.
In this work, xMOD is used to simulate the operation of the diesel engine, especially the air path behaviour. The simulation model produced by IFP Energies Nouvelles is used, to which a leak model has been added. The diameter of the leak can be freely adjusted. The simulation results can be retrieved and stored in text files.
4.2 MSE Evolution: Selected Features vs. All Features
Before presenting the results obtained with the selected features, a comparison of the MSE obtained with all features and with the selected features is presented in Table 1.
In order to illustrate the advantage of feature selection, the MSE values are shown together with their training run times. First, we observe that the MSE values obtained with all features are greater than those obtained with the selected features in most cases (27 of the 40 comparisons in Table 1). For example, when the torque is set to 40 Nm, the MSE of the all-features case is greater than that of the selected-features case for every leak diameter except 0.9 mm. The same conclusion holds for the three remaining torque values, with some exceptions. Second, we observe that the run time with all features is higher than with the selected features in all cases except two (0.7 mm and 0.8 mm leaks at 130 Nm). The detection results are presented for the selected-features case.
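As a worked example of this comparison, the following sketch computes the MSE ratio and the training time saved for the 0.1 mm row of Table 1. The helper `to_seconds` is hypothetical, and the run times are typed with ASCII quote marks instead of the typographic marks used in the table.

```python
def to_seconds(run_time):
    """Convert a run time written as M'SS" (as in Table 1) to seconds."""
    minutes, seconds = run_time.rstrip('"').split("'")
    return 60 * int(minutes) + int(seconds)

# (torque in Nm, all-features entry, selected-features entry), 0.1 mm leak row.
row_01mm = [
    (40, (0.208, "14'59\""), (0.0221, "10'35\"")),
    (110, (0.168, "12'40\""), (0.0166, "10'41\"")),
    (130, (0.167, "14'50\""), (0.0109, "8'40\"")),
    (150, (0.00547, "15'06\""), (0.00799, "8'59\"")),
]

for torque, (mse_all, t_all), (mse_sel, t_sel) in row_01mm:
    saved = to_seconds(t_all) - to_seconds(t_sel)   # seconds saved by selection
    ratio = mse_all / mse_sel                       # > 1: selection lowers the MSE
    print(f"{torque} Nm: MSE ratio {ratio:.2f}, time saved {saved} s")
```

For this row, the ratio is above 1 at 40, 110 and 130 Nm and below 1 at 150 Nm, which is one of the exceptions mentioned in the text.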
4.3 Detection Task Results
The main property of the proposed approach is its detection ability. Here, the neural network is trained on two classes, “No Leakage mode” and “Leakage mode”. The training set consists of 10000 samples without leak and 10000 samples with leak. We choose three leak diameters: 0.1 mm, 0.4 mm and 0.9 mm.
1) Case1: Leak = 0.1mm:
Figure 2: Engine_speed = 1000 rpm & torque =110 Nm.
Table 1: MSE evolution with torque variation (each cell: MSE / training run time in min’sec”; All = all features, Sel. = selected features).
| Leak | 40 Nm All | 40 Nm Sel. | 110 Nm All | 110 Nm Sel. | 130 Nm All | 130 Nm Sel. | 150 Nm All | 150 Nm Sel. |
|---|---|---|---|---|---|---|---|---|
| 0.1 mm | 0.208 / 14’59” | 0.0221 / 10’35” | 0.168 / 12’40” | 0.0166 / 10’41” | 0.167 / 14’50” | 0.0109 / 8’40” | 0.00547 / 15’06” | 0.00799 / 8’59” |
| 0.2 mm | 0.0449 / 16’29” | 0.0189 / 8’49” | 0.00790 / 12’44” | 0.0147 / 8’55” | 0.00988 / 13’10” | 0.0163 / 8’45” | 0.00083 / 15’47” | 0.000812 / 10’15” |
| 0.3 mm | 0.0270 / 18’44” | 0.0114 / 9’19” | 0.00608 / 12’52” | 0.00871 / 9’44” | 0.00298 / 16’23” | 0.00266 / 8’45” | 0.00430 / 14’17” | 0.00165 / 10’03” |
| 0.4 mm | 0.0205 / 15’24” | 0.0204 / 9’25” | 0.00239 / 14’39” | 0.00413 / 10’15” | 0.00219 / 13’55” | 0.00401 / 10’53” | 0.00273 / 13’23” | 0.00419 / 11’38” |
| 0.5 mm | 0.0186 / 18’41” | 0.0127 / 8’34” | 0.00458 / 14’02” | 0.00369 / 7’57” | 0.00156 / 12’22” | 0.00222 / 10’23” | 0.00208 / 13’44” | 0.00428 / 11’25” |
| 0.6 mm | 0.0121 / 15’50” | 0.00807 / 9’30” | 0.00447 / 14’17” | 0.00315 / 9’52” | 0.00351 / 14’12” | 0.00104 / 9’34” | 0.00656 / 15’00” | 0.00221 / 11’04” |
| 0.7 mm | 0.0180 / 15’57” | 0.00404 / 9’43” | 0.00669 / 16’11” | 0.00234 / 9’07” | 0.00100 / 8’02” | 0.00102 / 8’40” | 0.00601 / 14’33” | 0.000873 / 10’01” |
| 0.8 mm | 0.00867 / 17’26” | 0.00727 / 8’21” | 0.00584 / 13’42” | 0.00198 / 9’54” | 0.00099 / 7’18” | 0.00101 / 9’56” | 0.000897 / 13’35” | 0.00193 / 9’52” |
| 0.9 mm | 0.00618 / 14’30” | 0.00910 / 8’56” | 0.00451 / 13’32” | 0.00415 / 9’33” | 0.00211 / 13’28” | 0.00131 / 8’58” | 0.00270 / 13’36” | 0.00087 / 11’29” |
| 1.0 mm | 0.00989 / 13’11” | 0.00638 / 9’14” | 0.00217 / 15’01” | 0.00111 / 9’43” | 0.00102 / 13’10” | 0.00099 / 5’09” | 0.00392 / 13’15” | 0.000896 / 8’41” |
[Figure 2 plot: class estimation versus samples (0 to 3500), showing the unfaulty class estimation and the leakage class estimation; MSE = 0.0240.]
Figure 3: Engine_speed = 1000 rpm & torque =130 Nm.
Figure 4: Engine_speed = 1000 rpm & torque =150 Nm.
2) Case2: Leak = 0.4mm:
Figure 5: Engine_speed = 1000 rpm & torque =110 Nm.
Figure 6: Engine_speed = 1000 rpm & torque =130 Nm.
Figure 7: Engine_speed = 1000 rpm & torque =150 Nm.
3) Case3: Leak = 0.9mm:
Figure 8: Engine_speed = 1000 rpm & torque =110 Nm.
[Plots for Figures 3 to 8: class estimation versus samples (0 to 3500), each showing the unfaulty class estimation and the leakage class estimation. MSE values annotated on the six plots, in extraction order: 0.0103, 0.00746, 0.00612, 0.00392, 0.0171, 0.00226.]
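Figure 9: Engine_speed = 1000 rpm & torque =130 Nm.
Figure 10: Engine_speed = 1000 rpm & torque =150 Nm.
[Plots for Figures 9 and 10: class estimation versus samples (0 to 3500), each showing the unfaulty class estimation and the leakage class estimation; annotated MSE values, in extraction order: 0.00205 and 0.00160.]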
Interpretation
Figures 2 to 10 show the effectiveness of the proposed approach: the leak is detected for all considered diameters. The Mean Squared Error (MSE) values give information about the accuracy of the neural network. From the obtained results, we can first remark that the MSE values increase when the torque decreases. For example, in the first case (Fig. 2 to Fig. 4), where the leak diameter is set to 0.1 mm, the MSE decreases from 0.0240 (2.4%) to 0.00746 (0.7%) when the torque increases from 110 Nm to 150 Nm. This observation can be explained by the fact that the air path system (compressor) works in a reduced operating regime. In other words, at low speed, the mechanical compressor of the air path is not solicited. The same remark applies to cases 2 and 3.
Naturally, a leak is easily detected when it is large, but it becomes very difficult to detect when it is very small. The obtained results show that the proposed approach can effectively address this problem: the leak is detected in all cases, even when its diameter is 0.1 mm (an almost negligible leak).
5 CONCLUSIONS
A leak detection approach for the diesel air path has been developed. The proposed approach contains two blocks: a training block and a decision block. The first one runs off-line and combines a feature selection algorithm with a neural network based on Levenberg-Marquardt optimization. The LM algorithm was chosen for its accuracy and adaptability; it combines two different techniques depending on the position of the current solution relative to the best one. The second block uses the neural model obtained in the training phase to detect leaks that appear in the air path system. The detection capability is evaluated using the MSE index.
The proposed approach effectively achieves leak detection, especially in the case of small leaks at critical operating points (low torque/speed couple).
In future work, this approach will be extended to leak characterisation in the diesel engine air path. The final objective of this work is to implement the obtained algorithms on a real diesel engine.
ACKNOWLEDGEMENTS
The authors thank the ANR (Agence Nationale de la Recherche) for its support of the DIVAS research project. The authors particularly thank Philippe Moulin and Mongi Ben Gaid, research engineers with "IFP Energies Nouvelles", for their help and advice.
REFERENCES
Alexandrov, A., Gelbukh, A., and Lozovo, G., 2001. Chi-square Classifier for Document Categorization. In 2nd International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City.
Capriglione, D., Liguori, C., Pianese, C., and Pietrosanto, A., 2003. On line sensor fault detection, isolation and accommodation in automotive engines. IEEE Transactions on Instrumentation and Measurement. Vol. 52(4), pp. 1182-1189.
Chen, P. C., 2011. A novel diagnostic system for gasoline-engine leakage detection. Journal of Automobile Engineering. Vol. 225, Part D, pp. 673-685.
Isermann, R., 1984. Process fault detection based on modelling and estimation methods: a survey. In Automatica. Vol. 20(4), pp. 387-404.
Sangha, M., Yu, D. L., and Gomm, J. B., 2006. On-Board monitoring and diagnosis for spark ignition air path via adaptive neural networks. Journal of Automobile Engineering. Vol. 220, Part D, pp. 1641-1655.
Sorsa, T., and Costin, H. N., 1993. Application of artificial neural networks in process fault diagnosis. In Automatica. Vol. 29(4), pp. 843-849.
Sugumaran, V., Muralidharan, V., and Ramachandran, K. I., 2007. Feature selection using Decision Tree and classification through Proximal Support Vector Machine for fault diagnostics of roller bearing. Mechanical Systems and Signal Processing, Vol. 21, pp. 930-942.
Witten, I. H., and Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann. 2nd edition.
Xu, Z., Xuan, J., Shi, T., and Hu, Y., 2009. Application of a modified fuzzy ARTMAP with feature-weight learning for the fault diagnosis of bearing. Expert Systems with Applications, Vol. 36, pp. 9961-9968.