A Deep Learning based Approach for Biometric Recognition using
Hybrid Features
Mrityunjay Kumar¹, Arvind Kumar Tiwari²
¹,²Department of CSE, Sultanpur (India)
Keywords: Biometrics, Biometric Recognition, Identification Techniques, Multimodal Identification, Deep Learning,
CNN.
Abstract: Biometric authentication and identification is one of the most important and challenging problems in the current era of computer technology. The goal of new technical developments is to make tasks easier and life smoother. It is therefore important to develop an efficient computational method that recognizes and identifies biometrics with minimal time delay. This paper proposes a CNN-based multimodal biometric identification system using feature fusion of three biometric traits: faces, fingerprints, and iris. PCA and the wavelet transform (WT) are used for feature extraction and feature fusion, respectively. The accuracy of the proposed approach is about 96.67% on the fused features of the three traits. The proposed approach provides better accuracy than the existing method in the literature.
1 INTRODUCTION
Biometrics is the method of identifying an individual using physical or behavioral features such as fingerprints, face, gait, and voice. Biometric systems rest on the permanence and uniqueness of the physiological and behavioral traits of individuals. Technological development in the modern era is very fast, and this creates a need for faster, more secure, and more reliable identification systems in a variety of settings: airports, international border crossings, law enforcement agencies, commercial places such as banks and other businesses, and the delivery of benefits under government social service schemes. Biometrics can handle large identity management systems, and for this reason biometric identification systems have few real competitors. Using biometrics for identification is not a completely novel idea: there is evidence that it was in use, in some form, thousands of years ago, although not in the form known today. About 50 years ago, IBM first proposed that a remote computer system could be used to identify a human (Jain, 2007).
Evidence of biometrics goes back far more than a few centuries. In some ancient caves with paintings estimated to be about 31,000 years old, hand marks left beside the paintings appear to have served as the artists' signatures, much as modern painters sign their work to prove authorship. There is also evidence that fingerprints were used to identify individuals around 500 B.C.: business transactions in the Babylonian civilization were recorded on clay tablets that carried fingerprints (Maio, 2004).
Kumar, M. and Tiwari, A.
A Deep Learning based Approach for Biometric Recognition using Hybrid Features.
DOI: 10.5220/0010567900003161
In Proceedings of the 3rd International Conference on Advanced Computing and Software Engineering (ICACSE 2021), pages 273-282
ISBN: 978-989-758-544-9
Copyright © 2022 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
Biometric recognition is the most popular tool for human identification and verification in the modern era for many reasons, the most obvious being performance, reliability, real-time computability, and security. Biometric traits are generally thought to be unique and permanent. In reality, although the traits are sufficiently unique, biometric systems are not sufficiently reliable with respect to the permanence of human biometric traits, whether behavioral or physiological. Numerous studies have demonstrated that common biometric traits degrade over time. For this reason, identification or recognition based on biometrics cannot be trusted fully. In most cases the Genuine Acceptance Rate (GAR) is not 100%, and there is always some False Acceptance Rate (FAR). There is therefore always room for a better model than an existing identification/verification system based on biometric recognition.
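As a rough illustration of the two rates named above, the following sketch computes GAR and FAR from similarity scores. The score lists, the threshold, and the function name `acceptance_rates` are hypothetical and are not taken from the paper; the sketch assumes that a higher score means a better match.

```python
def acceptance_rates(genuine_scores, impostor_scores, threshold):
    """Return (GAR, FAR) as fractions for a similarity-score threshold.

    GAR: share of genuine attempts whose score reaches the threshold.
    FAR: share of impostor attempts whose score reaches the threshold.
    """
    accepted_genuine = sum(1 for s in genuine_scores if s >= threshold)
    accepted_impostor = sum(1 for s in impostor_scores if s >= threshold)
    return (accepted_genuine / len(genuine_scores),
            accepted_impostor / len(impostor_scores))


# Made-up scores, for illustration only.
genuine = [0.91, 0.85, 0.78, 0.95, 0.40]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]
gar, far = acceptance_rates(genuine, impostor, threshold=0.6)
print(f"GAR = {gar:.0%}, FAR = {far:.0%}")  # prints: GAR = 80%, FAR = 20%
```

Raising the threshold lowers FAR but also lowers GAR, which is why the two rates are usually reported together.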
In the modern era, traits such as fingerprints, faces, iris, palm prints, hand geometry, DNA, and voice samples may be used for biometric authentication. A biometric recognition system that uses only a single trait for identification or verification has a high failure rate because the chosen trait can change. For example, suppose we develop a system that uses fingerprints to identify a group of people with different kinds of jobs, some of whom are manual laborers. People who work in rugged conditions show a high rate of change in their fingerprints, which increases the failure rate of the system. Similarly, systems based on face or iris suffer from problems of their own.
Biometric traits may be used in different configurations: a system that uses only one trait for identification is generally referred to as uni-modal, and a system that combines two or more traits is referred to as multimodal. Systems that use multiple biometric traits are more reliable than systems with a single trait. If the fingerprints of a person change substantially because of his or her working conditions, the face and iris are still available for identification. Similarly, if there is a major change in the face, the fingerprints and iris remain available, and so on. Several reasons make a multimodal biometric system more reliable; a few of them are:
1. A result obtained from a combination of multiple traits is more acceptable than a result from a single-trait system.
2. If a person somehow loses one trait, a multimodal system can still verify his or her identity with the help of the remaining traits.
3. A multimodal system provides high security against forgery, because spoofing becomes more difficult for a person who tries to enter the system by claiming a registered identity.
This paper develops a multimodal identification system that uses a feature fusion technique. The traits used are faces, fingerprints, and iris. To increase the efficiency of the system we exploit intra-class variation by using five different poses of the face, five different fingerprints, and two iris images (one of each eye). Fig.1 presents a summary of the proposed model.
The model uses a deep learning based feature fusion technique for identification; more specifically, it uses a convolutional neural network (CNN).
2 RELATED WORKS
Jain et al. (1996) described a two-stage on-line verification system based on fingerprints. The first stage is minutia extraction and the second is minutia matching. For minutia extraction they used a fast and reliable algorithm that improves on the algorithm of Ratha et al., and for minutia matching they developed an alignment-based elastic matching algorithm. It matches the stored template directly against the input image without an exhaustive search, and it can cope with nonlinear deformations and inexact pose transformations between fingerprints. The method was efficient in terms of both reliability and time complexity: the average verification time was reduced to about eight seconds on a SPARC 20 workstation (Jain, 1996).
Van der Putte et al. (2000) examined how biometric systems based on fingerprints can be fooled. They divide counterfeiting into two categories: duplication with the co-operation of the fingerprint owner (as may happen with attendance monitoring systems) and duplication without the co-operation of the owner (in applications that involve authentication, e.g. PDS systems). Fingerprint data can easily be stored on smart cards, and such a card can be read through a solid-state fingerprint scanner. In both categories a duplicate of the fingerprint can be created. In the first case, a wafer-thin silicone dummy is used to take a sample of the fingerprint, which is then used when required. In the second case, duplicate copies of fingerprints can be made with the help of a storage device attached to the scanner. It is also possible to lift fingerprint samples from a surface by using stamp-type materials (Van, 2000).
Liu et al. (2001) presented a face recognition method, the Enhanced Fisher Classifier (EFC), which combines shape and texture features. Shape is given by the face geometry, while texture is given by shape-free normalized images. The dimensionality of the shape and texture spaces is reduced by PCA, and an enhanced Fisher linear discriminant model is used to improve generalization. The great benefit of the method is that it achieves an accuracy of 98.5% using just 25 features (Liu, 2001).
Blanz and Vetter (2003) presented a face recognition mechanism that works across varying poses and illuminations. Handling a wide range of pose variations and illumination levels requires simulating image formation in 3D space, for which computer graphics is used. The method is evaluated on three views: front, side, and profile. The front view performed best, with a success rate of 95%, whereas the profile view had the lowest success rate, 89% (Blanz, 2003).
Daugman (2004) presented a study of how iris recognition works and how it performs. The author briefly examined the problem of finding the eye region in an image, developing the concepts and the appropriate equations. Later in the paper the author presented a summary of the execution speed of the various operations in the process: the XOR comparison of two Iris Codes takes the least time, 10 microseconds, while demodulation and Iris Code creation takes the most, 102 milliseconds (Daugman, 2004).
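The XOR comparison mentioned above can be sketched as a normalized Hamming distance between two binary codes. This is a minimal illustration that assumes each Iris Code is held as a Python integer together with a mask marking its usable bits; the bit patterns and the function name `hamming_distance` are made up for the example, and real Iris Codes are far longer.

```python
def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fraction of disagreeing bits among the bits that both masks mark as valid."""
    valid = mask_a & mask_b                  # bits usable in both codes
    n_valid = bin(valid).count("1")
    if n_valid == 0:
        raise ValueError("the two codes share no valid bits")
    disagreeing = (code_a ^ code_b) & valid  # XOR marks the differing bits
    return bin(disagreeing).count("1") / n_valid


# Two hypothetical 8-bit codes; one bit of the second code is masked out.
a, b = 0b10110010, 0b10011010
distance = hamming_distance(a, b, mask_a=0b11111111, mask_b=0b11110111)
print(round(distance, 3))  # prints: 0.143
```

A distance near 0 suggests the same eye, while two unrelated codes are expected to disagree on roughly half of their bits.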
Daugman (2006) examined the randomness and uniqueness of Iris Codes, using 200 billion iris pair comparisons. The paper is helpful for estimating false matches in iris recognition on large databases. Daugman developed his own algorithm for the purpose, known as the Daugman algorithm, and found that over 1 million comparisons there is at most 1 false match (Daugman, 2006).
Shams et al. (2016) presented an experimental multimodal biometric identification system based on face, iris, and fingerprints. The work used the SDUMLA-HMT database, in which the data are stored as images. The images are preprocessed with Canny edge detection and the circular Hough transform. Local Binary Pattern with Variance (LBPV) histograms are then used for feature extraction, and the separately extracted features are fused together. Feature reduction is also accomplished with LBPV histograms. A combined Learning Vector Quantization classifier is used for classification and matching. The system achieved a GAR of 99.50% with a minimum elapsed time of 24 seconds (Shams, 2016).
Choi et al. (2015) presented a multimodal biometric authentication system based on face and gesture. A gesture is represented by the sequence of frames from one pose to another, and the system can take faces and gestures from video. The HOG descriptor is used to represent gestures, and 4-fold cross validation is used for validation. The performance of the multimodal system using face and gesture is about 97.59%-99.36%. The whole work was performed on a self-made database of 80 videos from 20 different subjects (Choi, 2015).
Khoo et al. (2018) presented a multimodal biometric system based on iris and fingerprints that uses feature-level fusion for model development. Indexing-First-One (IFO) hashing and integer value mapping are used for this purpose. The CASIA-V3 iris database and the FVC 2002 fingerprint database are used in model development. The main reason for using the IFO hash function is its ability to survive attacks such as SHA and ARM. The equal error rate (EER) reported in the paper is 0.3842 for iris, 0.9308 for fingerprints, and 0.8 for the IFO hash function. No elapsed time is reported (Khoo, 2018).
Ammour et al. (2017) presented a biometric identification system based on face and iris. Face recognition is performed by three methods: the discrete cosine transform (DCT), PCA, and PCA in DCT. Iris recognition is also performed by three methods: Hough, Snake, and distance regularized level set (DRLS). They used the ORL and CASIA-V3-Interval datasets for their experiments, and fusion is applied at the matching score level. The face recognition rate is 91% with PCA, 94% with DCT, and 93% with PCA in DCT, with recognition times of 0.055 s, 2.623 s, and 3.012 s respectively. The iris recognition rate is 81% with Hough, 87% with Snake, and 80% with DRLS, with recognition times of 15.82 s, 15.78 s, and 16.52 s respectively. In the multimodal system, Z-score normalization gives the highest recognition rate, 98% (Ammour, 2017).
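To make the idea of Z-score normalization before score-level fusion concrete, here is a small sketch. It assumes that each matcher returns one similarity score per enrolled candidate and that the normalized scores are combined with a simple sum rule; the sum rule, the sample scores, and the function names are assumptions for illustration and are not taken from the cited paper.

```python
from statistics import mean, pstdev


def z_score(scores):
    """Rescale scores to zero mean and unit standard deviation."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(face_scores, iris_scores):
    """Sum-rule fusion of two matchers after Z-score normalization."""
    return [f + i for f, i in zip(z_score(face_scores), z_score(iris_scores))]


# Hypothetical scores of three enrolled candidates from two matchers
# that report on very different scales.
face = [0.2, 0.9, 0.4]
iris = [30.0, 80.0, 20.0]
fused = fuse_scores(face, iris)
best = max(range(len(fused)), key=fused.__getitem__)
print("best candidate index:", best)
```

Normalizing first keeps the matcher with the larger numeric range from dominating the fused score.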
Parkavi et al. (2017) presented a biometric identification system based on two traits, fingerprint and iris. Separate templates for fingerprints and iris are obtained by minutiae matching and edge detection, and decision-level score fusion is applied for decision making. They achieved an accuracy of 97%, but the dataset size and the time complexity are not reported (Parkavi, 2017).
Sultana et al. (2017) presented a multimodal biometric system based on face, fingerprint, and a very rarely used trait, social behavior. The social behavioral trait is obtained from the social media platforms a user is associated with and is combined with the traditional traits, faces and fingerprints. The key idea is that two people with similar social behavior profiles have very little chance of also having similar face and ear features, and vice versa. The accuracy of the model is about 92%, while aspects related to time complexity are not discussed (Sultana, 2017).
Gunasekran et al. (2019) presented a multimodal biometric recognition system that applies deep learning to faces, fingerprints, and iris. Images are taken from the CASIA dataset, and the contourlet transform is used for preprocessing. Histograms are computed, and weighted rank-level fusion is applied to combine the key features. Deep learning is used for matching. The system achieved up to 96% accuracy when the dataset size reached 500. Its running time is in the millisecond range, with a maximum of 49.2 milliseconds. The key outcome of this work is that the deep learning based approach performs better as the dataset grows, while the running time increases only slightly (Gunasekran, 2019).
Cheniti et al. (2017) presented a multimodal biometric system based on face and fingerprint that uses symmetric sum-based biometric score fusion. Score-level fusion is tested on two partitions of the NIST-BSSR1 database: the NIST-Multimodal database and the NIST-Fingerprint database. A GAR of 99.8% is obtained with the S-sum generated by the Schweizer & Sklar t-norm (Cheniti, 2017).
Zhang et al. (2017) proposed a multi-task, multivariate model for biometric recognition using low-rank and joint sparse representations. The experiments used three datasets, WVU, UMDAA-01, and Pascal-Sentence, in non-weighted and weighted categories. The work is based on multiple traits and is not specialized to particular traits: it uses different combinations of biometric traits from the different datasets, such as fingerprint, iris, and face. The model's performance varies across datasets, by about 18%. Its recognition rate is 99.80% for both the non-weighted and weighted categories on the WVU dataset; 89.51% and 90.45% for non-weighted and weighted, respectively, on UMDAA-01; and 81.48% and 82.72% for non-weighted and weighted, respectively, on Pascal-Sentence (Zhang, 2017).
3 PROPOSED WORK
This section presents a deep learning based approach that uses a CNN for multimodal biometric identification with feature fusion. The method applies deep learning to feature-fused images of multiple traits: it uses PCA for feature extraction, the inverse wavelet transform for feature fusion, and a CNN for classification and matching. The biometric traits used in this work are faces, fingerprints, and iris from the SDUMLA-HMT database (Yin, 2011). The proposed work integrates a series of steps: preprocessing, feature extraction, feature fusion, training and testing, and decision making. A general overview is presented in Fig.1 and a corresponding flow chart in Fig.2.
Figure 1: An overview of proposed Model
3.1 Preprocessing
The images obtained from sensors cannot be used directly, for several reasons. The primary task is to identify the portion of the image on which the subsequent operations will be performed. Preprocessing generally includes localization, segmentation, and normalization. The objective of these operations is to detect the regions of interest, find the patterns in the images, and resize all images to a uniform size so that the same operations can be applied to every sample. Resizing is necessary when the samples come from different input devices, because the image that contains the extracted features may vary in size with the size of the input image, and this causes problems in matching. The general tasks involved in preprocessing are depicted in Fig.3.
Figure 2: A flow chart of proposed Model
Figure 3: General tasks involved in Pre-processing
3.1.1 Fingerprints
RGB-to-gray conversion is applied to the input fingerprint image if it is a color image; if the input is already a black and white image, this operation is not required. A mask operation is then applied to the image to isolate its meaningful area: the mask selects the blocks of the image on which further operations are to be performed. The result of masking is shown in Fig.4.
Figure 4: Result of Masking on a Fingerprint Image (panels: Original Image, Mask)
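The two preprocessing steps described for fingerprints can be sketched as follows. The paper does not state how the mask is computed, so this sketch assumes a common block-variance criterion (blocks with ridge structure have high variance, flat background has low variance) and the conventional luminance weights for the gray conversion; the block size, the threshold, and the function names are hypothetical.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to luminance values (assumed weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def block_mask(gray, block=2, var_threshold=100.0):
    """Mark as 1 every block whose intensity variance exceeds the threshold."""
    height, width = len(gray), len(gray[0])
    mask = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            ys = range(top, min(top + block, height))
            xs = range(left, min(left + block, width))
            pixels = [gray[y][x] for y in ys for x in xs]
            mu = sum(pixels) / len(pixels)
            variance = sum((p - mu) ** 2 for p in pixels) / len(pixels)
            if variance > var_threshold:
                for y in ys:
                    for x in xs:
                        mask[y][x] = 1
    return mask


# Left block alternates dark and bright (ridge-like); right block is flat.
gray = [[0, 255, 10, 10],
        [255, 0, 10, 10]]
print(block_mask(gray))  # prints: [[1, 1, 0, 0], [1, 1, 0, 0]]
```

Later steps would then only look at pixels where the mask is 1.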
3.1.2 Faces
The desired area of the input face image is detected by the Viola-Jones algorithm. The main reason for using the Viola-Jones algorithm is that it detects faces within sub-windows of the input image, whereas a standard face detection algorithm tries to detect faces in the whole input image, which is time consuming. RGB-to-gray conversion is applied to the cropped image to eliminate the hue and saturation information while preserving luminance. The effects of preprocessing on face images are depicted in Fig.5.
Figure 5: Effect of pre-processing on Face Images (a) Original Image, (b) Localized Image, (c) RGB to Gray Image
3.1.3 Iris
RGB-to-gray conversion is applied to the input iris images if they are color images; for black and white iris images this operation is not required. A black hole search is used to localize the desired portion of the iris image. A masking operation is performed to spot the pupil inside the iris image, and normalization is performed to increase the intensity of the pixels in the spotted area. The effect of pre-processing on iris images is depicted in Fig.6.
Figure 6: Effect of pre-processing on Iris Images (a) Original Image, (b) Localization, (c) Masking, (d) Normalization
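The paper does not spell out the black hole search, so the following is only one plausible reading of it: the pupil is treated as the darkest region of the grayscale image, located by thresholding and taking the centroid of the dark pixels. The threshold value, the area-based radius estimate, and the function name `find_pupil` are assumptions made for this sketch.

```python
import math


def find_pupil(gray, dark_threshold=50):
    """Return (center_y, center_x, radius) of the dark region, or None."""
    dark = [(y, x)
            for y, row in enumerate(gray)
            for x, value in enumerate(row)
            if value < dark_threshold]
    if not dark:
        return None
    center_y = sum(y for y, _ in dark) / len(dark)
    center_x = sum(x for _, x in dark) / len(dark)
    # Radius of the circle whose area equals the number of dark pixels.
    radius = math.sqrt(len(dark) / math.pi)
    return center_y, center_x, radius


# Synthetic 5x5 image: bright background with a dark 3x3 "pupil" in the middle.
image = [[200] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 10
print(find_pupil(image))
```

On real images the dark pixels of eyelashes and shadows would also pass the threshold, so a practical version would need an additional filtering step.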
3.2 Feature Extraction
After pre-processing, Principal Component Analysis (PCA) is applied to the resulting images for feature extraction. PCA is used because of its simplicity and performance. It is a statistical method that finds a new, smaller set of variables that retains most of the information in the original ones.
A large data set consists of many variables that are interrelated in some way, so its dimensionality is very high. PCA reduces the dimensionality of the data set while retaining as much of its variance as possible, so that the reduced data still serve the original purpose of the data set. This is achieved by transforming to a new set of variables, referred to as principal components. These variables are uncorrelated and ordered so that the first few retain most of the variation present in all of the original variables.
Suppose we have a set of $n$ random variables and $v$ is the vector of these variables. If $n$ is very large, it is not feasible to analyze the $n$ variances and the $\frac{1}{2}n(n-1)$ covariances. It is then wise to look for a new set of fewer than $n$ variables that retains most of the variance represented by the original set.
To achieve this, we first look for a linear function $f_1'(v)$ of the vector $v$ that has maximum variance, where $f_1$ is a vector of $n$ constants $f_{11}, f_{12}, \ldots, f_{1n}$ and $'$ denotes the transpose, i.e.
$$f_1'(v) = f_{11}v_1 + f_{12}v_2 + \cdots + f_{1n}v_n$$
Continuing this, we look for a linear function $f_2'(v)$, which is uncorrelated with $f_1'(v)$, with maximum variance, and so on. In this way, at the $k$-th stage a linear function $f_k'(v)$ is obtained that has maximum variance; $f_k'(v)$ is the $k$-th principal component. We could iterate this up to $n$ principal components, but in general it is observed that we never have to go all the way to $n$: most of the variance in $v$ is captured by the first $i$ principal components, where $i \le n$ (Jolliffe, 2003) (Takane, 2016) (Sanguansat, 2012).
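The search for the first coefficient vector $f_1$ can be illustrated with a short pure-Python sketch. It assumes the standard result that $f_1$ is the leading eigenvector of the sample covariance matrix and approximates it by power iteration; the toy data, the iteration count, and the function names are hypothetical, and a practical implementation would normally rely on a numerical library instead.

```python
def covariance_matrix(samples):
    """Sample covariance matrix of a list of equally long observation rows."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(cov, iterations=200):
    """Approximate the unit vector f that maximizes the variance f' C f."""
    d = len(cov)
    f = [1.0] * d  # start vector; must not be orthogonal to the solution
    for _ in range(iterations):
        g = [sum(cov[i][j] * f[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in g) ** 0.5
        f = [x / norm for x in g]
    variance = sum(f[i] * cov[i][j] * f[j] for i in range(d) for j in range(d))
    return f, variance


# Toy data in which the second variable is always half of the first one.
data = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]
cov = covariance_matrix(data)
f1, captured = first_principal_component(cov)
total = cov[0][0] + cov[1][1]
print([round(x, 3) for x in f1], round(captured / total, 3))
```

Because the two toy variables are perfectly correlated, a single principal component captures essentially all of the variance, which is the situation in which the reduction from $n$ variables to a few components pays off most.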
3.2.1 Fingerprint
Feature extraction from a fingerprint consists of locating the minutiae points in the input image; this operation is shown in Fig.7 (a). Once the minutiae points are located, we store only those points and ignore all other points of the image. A resulting feature image is depicted in Fig.7 (c).
Figure 7: Feature Extraction From Fingerprints (a) Detection of Minutiae Points, (b) Original Image, (c) Bifurcation Points
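The paper does not say how the minutiae points are located. One widely used technique, shown here purely as an assumption, is the crossing-number test on a thinned, binary ridge skeleton: a ridge pixel whose eight neighbours show one 0-to-1 transition pair is a ridge ending, and one with three such pairs is a bifurcation. The tiny skeleton and the function names below are made up for the illustration.

```python
# The eight neighbours of a pixel, visited in circular order.
RING = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def crossing_number(skeleton, y, x):
    """Half the number of 0/1 changes met while walking around pixel (y, x)."""
    ring = [skeleton[y + dy][x + dx] for dy, dx in RING]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2


def find_minutiae(skeleton):
    """Return (ridge_endings, bifurcations) as lists of (y, x) positions."""
    endings, bifurcations = [], []
    for y in range(1, len(skeleton) - 1):
        for x in range(1, len(skeleton[0]) - 1):
            if skeleton[y][x] != 1:
                continue
            cn = crossing_number(skeleton, y, x)
            if cn == 1:
                endings.append((y, x))
            elif cn == 3:
                bifurcations.append((y, x))
    return endings, bifurcations


# A Y-shaped ridge: two branches meet at the center and continue downwards.
skeleton = [[0, 0, 0, 0, 0],
            [0, 1, 0, 1, 0],
            [0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0],
            [0, 0, 0, 0, 0]]
endings, bifurcations = find_minutiae(skeleton)
print(endings, bifurcations)  # prints: [(1, 1), (1, 3), (3, 2)] [(2, 2)]
```

Keeping only the returned coordinates corresponds to storing the minutiae points and ignoring all other points of the image.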
3.2.2 Faces
PCA is applied to the pre-processed face images for feature extraction. Eigen faces, which capture a small set of essential characteristics of the face image, are computed from the input image. Once the Eigen values of the face image are computed, its projection onto the new set of dimensions is found. A resulting image with extracted features is shown in Fig.8 (b).
Figure 8: Representation of an Eigen Face of an Input Face Image (a) Original Image, (b) Eigen Face Image
3.2.3 Iris
The effect of PCA on an iris image is depicted in Fig.9.
Figure 9: Representation of an iris image to resulting feature image (a) Original Image, (b) Feature Extracted Image
3.3 Feature Fusion
The wavelet transform is a mechanism that cuts up data, operators, or functions into different frequency components and then analyzes each component with a resolution matched to its scale. The aim of a pattern recognition system is to obtain the best possible classification performance, and better classification is obtained if the feature set is optimal. The three main fusion strategies are:
1. Information or data fusion: more meaningful raw data is produced by combining the data obtained from various sources.
2. Feature fusion: the extracted feature set contains irrelevant and redundant features. If two features are of a similar type, one of them is redundant and only one needs to be kept. A feature is irrelevant if it does not correlate strongly with the class information. The aim of feature fusion is to obtain a better feature set by fusing features, which is then given to a classifier to obtain the final result.
3. Decision fusion: a set of classifiers, which may be the same or different, is used to provide an unbiased and better result.
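Of the three strategies, decision fusion is the simplest to show in a few lines. The sketch below assumes that each classifier outputs one identity label and that the labels are combined by majority vote; the voting rule, the labels, and the function name are illustrative assumptions, since the paper itself uses feature fusion and does not describe a decision-fusion rule.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label and the share of classifiers voting for it."""
    label, votes = Counter(predictions).most_common(1)[0]
    return label, votes / len(predictions)


# Hypothetical outputs of three classifiers (e.g. face, fingerprint, iris).
label, agreement = majority_vote(["id_7", "id_3", "id_7"])
print(label, round(agreement, 2))  # prints: id_7 0.67
```

Feature fusion, in contrast, combines the inputs before any classifier is run, so only one classifier is needed afterwards.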
The wavelet transform (WT) is used for feature fusion. WT is an efficient method of image fusion that can combine images from different sensors and sensing environments. The wavelet transform of each image is computed first. The transform contains different types of bands, such as high-high, high-low, and low-high, at various scales. To make these uniform, the average of the computed transform values is taken. The max rule is then applied to select the larger value, because a larger absolute transform coefficient corresponds to a sharper brightness change in the image, which is a salient feature of the image (Li, 1995) (Andra, 2002) (Mangai, 2010) (Hubbard, 1998).
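A minimal sketch of wavelet-based fusion is given below. It makes several assumptions that the paper does not confirm: a single-level, unnormalized Haar transform (pairwise averages and differences), images with even height and width, averaging of the low-frequency band, and the max-absolute rule for the three detail bands. All function names are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def haar_rows(img):
    """One Haar step along every row: pairwise averages and differences."""
    low = [[(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    high = [[(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in img]
    return low, high


def haar2d(img):
    """Single-level 2D transform; returns the low-low band and three detail bands."""
    low, high = haar_rows(img)
    ll, lh = (transpose(b) for b in haar_rows(transpose(low)))
    hl, hh = (transpose(b) for b in haar_rows(transpose(high)))
    return ll, lh, hl, hh


def inverse_rows(low, high):
    """Undo haar_rows: each (average, difference) pair gives back two pixels."""
    return [[v for a, d in zip(lo, hi) for v in (a + d, a - d)]
            for lo, hi in zip(low, high)]


def inverse_haar2d(ll, lh, hl, hh):
    low = transpose(inverse_rows(transpose(ll), transpose(lh)))
    high = transpose(inverse_rows(transpose(hl), transpose(hh)))
    return inverse_rows(low, high)


def average(values):
    return sum(values) / len(values)


def max_abs(values):
    return max(values, key=abs)


def fuse(images):
    """Fuse equally sized images in the wavelet domain and transform back."""
    per_image = [haar2d(img) for img in images]
    fused = []
    for band in range(4):
        coeffs = [bands[band] for bands in per_image]
        combine = average if band == 0 else max_abs  # max rule on detail bands
        fused.append([[combine([c[y][x] for c in coeffs])
                       for x in range(len(coeffs[0][0]))]
                      for y in range(len(coeffs[0]))])
    return inverse_haar2d(*fused)


flat = [[5, 5], [5, 5]]    # no brightness change
edge = [[0, 10], [0, 10]]  # sharp vertical edge
print(fuse([flat, edge]))  # prints: [[0.0, 10.0], [0.0, 10.0]]
```

In the example the fused image keeps the edge, because the max rule prefers the larger detail coefficient, while the averaged low-frequency band preserves the overall brightness level.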
Effects of feature fusion using wavelet transform
on fingerprints, faces, and iris images are shown in
Fig.10, Fig.11, and Fig.12 respectively.
3.4 Training and Testing
The performance of a machine learning system depends on how well the system is trained. A well-trained system is more sensitive in detecting errors. First the system is trained on the data set, and then tests are performed to measure its performance. How a system behaves on unseen data determines its performance: if the system successfully predicts the class of an unseen sample, its performance improves; otherwise it produces an error. Deep learning is applied here for training and testing the proposed model; more specifically, a Convolutional Neural Network (CNN) is used. The reason for using a deep learning method is that neural networks are a powerful technology for the classification of visual inputs. The most important practice is to get a training set that is as large as possible (Simard, 2003).
Recognition is performed with a CNN in the proposed work. A CNN is a very powerful tool for character, speech, and visual recognition, the last of which includes both image and video recognition. A CNN is composed of many processing layers, which give it the power to observe an object in fine detail. It uses a hierarchy of layers to extract features, where the output of one layer becomes the input of the next layer in the hierarchy. There are four key concepts behind convolutional neural networks: the use of multiple layers, local connections, shared weights, and pooling.
Figure 10: Fingerprint Feature Fusion (a) Input Fingerprint Images, (b) Image Representing Fused Features for all 5 fingerprint samples
Figure 11: Representation of Fused Feature Eigen Image of Input Face Images (a) Input Face Images of Varying Gestures, (b) Image Representation of Eigen Faces of all Face Inputs
Figure 12: Representation of Fused Feature Image of Input Eye Images (a) Input Image of Left Eye, (b) Input Image of Right Eye, (c) Image Representing Fused Features of Both Eye Images
Many natural signals are compositions of hierarchies, and deep neural networks exploit this property. In these hierarchies, higher-level features are obtained by composing lower-level ones. In images, local combinations of edges form specific patterns, patterns assemble into parts, and parts form objects. Pooling allows the representation in the next layer to vary very little when elements in the previous layer vary in position and appearance.
Visual neuroscience distinguishes simple cells and complex cells. In a CNN the convolution layer is inspired by simple cells and the pooling layer by complex cells. Convolution and pooling layers make up the first few layers of a CNN. The units of a convolution layer are organized in feature maps, and each unit is connected to a local patch of the feature maps of the previous layer. The pooling layer merges features that are semantically similar.
The method used by a CNN is very similar to the visual cortex of animals. The image is processed as small independent portions, generally termed visual fields. Each visual field is processed by separate neurons, which are stacked in layers. Some of the most commonly used layers are the convolution layer, pooling layer, locally connected layer, and fully connected layer (Le, 2015) (Krizhevsky, 2012) (Dũng, 2014).
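The roles of local connections, shared weights, and pooling can be seen in a very small forward pass. The sketch below is not the network used in this work: the kernel values, the image, and the layer sizes are invented, and the "convolution" is implemented as the sliding-window product commonly used in CNNs (without flipping the kernel).

```python
def convolve2d(image, kernel):
    """Slide one shared kernel over the image (valid positions only)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the strongest response in each non-overlapping size x size window."""
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]


# 5x5 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(convolve2d(image, kernel))  # 4x4 map, local connections
pooled = max_pool(feature_map)                 # 2x2 map after pooling
print(pooled)
```

Every output unit of `convolve2d` looks only at a 2x2 patch (local connection) and uses the same four kernel values (shared weights), while `max_pool` reports that the edge is present in a window without recording its exact column.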
4 RESULTS AND DISCUSSION
The experiments use the SDUMLA-HMT database. It contains biometric samples of 106 persons with 5 traits per person: face, fingerprint, iris, finger vein, and gait. It includes 61 males and 45 females aged between 17 and 31. Of the 5 biometric traits in the database, only 3 are of interest here: face, fingerprint, and iris. From the database we created a separate subset of these 3 traits for all 106 persons, which includes 5 random face samples with different gestures, 5 fingerprints, and 2 iris images (left and right) per person. Hence this subset contains a total of 1272 images.
Of the 106 entries in the database, 100 persons are registered, and 3 fused feature images are stored for each of them: 1 fused image of all 5 face gestures, 1 fused image of all 5 fingerprint samples, and 1 fused image of the two iris samples. Of the 100 sets, 80 are used for training and 20 for testing. When the CNN was trained and tested, its accuracy was 96.67% and its error 3.33%. The elapsed time was approximately 0.77 seconds for testing, where 1 epoch is used, and 0.55 and 0.33 seconds for training, where 2 epochs are used.
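The image counts quoted in this section can be checked with a few lines of arithmetic. The variable names are only for illustration; the numbers are the ones stated in the text.

```python
persons = 106
raw_per_person = {"face": 5, "fingerprint": 5, "iris": 2}

images_per_person = sum(raw_per_person.values())  # 12 raw images per person
total_raw = persons * images_per_person           # size of the selected subset
fused_per_person = len(raw_per_person)            # one fused image per trait

registered = 100
fused_registered = registered * fused_per_person  # fused images that are stored
fraction_kept = fused_per_person / images_per_person

print(total_raw, fused_registered, fraction_kept)  # prints: 1272 300 0.25
```

The last value is consistent with the statement in the discussion that feature fusion leaves about 25% of the original number of images for the deep learning stage.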
The CNN loss is plotted against the number of epochs multiplied by the number of batches. Since the number of epochs in testing is 1, the graph reduces to a plot of CNN loss against the number of batches. The vertical axis represents the CNN loss and the horizontal axis the number of batches. The graph is shown in Fig.13.
Figure 13: Plot of CNN Loss (Vertical) vs No. of Batches (Horizontal)
Table 1 compares the results with recent existing multimodal systems. The table has 5 columns, and four key parameters are selected for comparison: traits, method, accuracy, and elapsed time. Most multimodal systems use three traits, but a few with two traits also exist. Using only two traits may give better time and recognition rates, but reliability may suffer in some cases. The accuracy of some existing models is slightly better than the accuracy achieved in this work, for two reasons. The first is the use of different datasets: the literature survey shows that the recognition rate of a given method varies widely from one dataset to another. The second is the kind of method used: a method with a high recognition rate may take much longer than a deep learning method. The deep learning method is much faster than the other methods, as can be seen in Table 1. The only limitation of deep learning is that it requires large data sets to perform well, and a fused model gives the network a much smaller dataset than the original one. Here we used a large data set, but feature fusion reduces its size by 75%, so deep learning is applied to only 25% of the original number of images. Nevertheless, the proposed work performs better than existing models that are based on deep learning, and the elapsed time of the deep learning approach is far better than that of the other methods.
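Table 1: Result comparison with existing multimodal systems.
| Reference                                   | Traits                  | Method                  | Accuracy        | Elapsed Time (s)  |
|---------------------------------------------|-------------------------|-------------------------|-----------------|-------------------|
| Shams M., Tolba A, & Sarhan S (2016)        | Face, Fingerprint, Iris | LBPV                    | 99.50%          | 24                |
| Choi H. & Park H. (2015)                    | Face, Gesture           | 4-Fold Cross Validation | 97.59%-99.36%   | --                |
| Khoo Y. H., et al. (2018)                   | Iris, Fingerprint       | IFO-Hashing             | 99.2%           | --                |
| Ammour B, Bouden T, & Amira-Biad S (2017)   | Face, Iris              | Snake, DCT, Z-Score     | 87%, 94%, 98%   | 2.623, 15.82, --  |
| This Work                                   | Face, Fingerprint, Iris | PCA, WT, Deep Learning  | 96.67%          | 0.77              |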
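To make the elapsed-time comparison in Table 1 concrete, the short sketch below computes how many times longer each reported time is than the 0.77 s of this work. The times are copied from Table 1; the dictionary labels are only illustrative names, and the ratios should be read as indicative, since the times were reported by different authors on different datasets.

```python
# Elapsed times in seconds, copied from Table 1.
# Entries without a reported time ("--") are omitted.
elapsed = {
    "Shams et al. (2016), LBPV": 24.0,
    "Ammour et al. (2017), first reported time": 2.623,
    "Ammour et al. (2017), second reported time": 15.82,
}
this_work = 0.77  # seconds, "This Work" row of Table 1

# Ratio of each reported time to the time of this work.
ratios = {name: seconds / this_work for name, seconds in elapsed.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} times the elapsed time of this work")
```

Under these figures the LBPV system takes roughly 31 times as long as the proposed approach, and the two times reported by Ammour et al. are roughly 3.4 and 20.5 times as long.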
5 CONCLUSIONS
This paper proposed a biometric identification
system based on a multimodal approach using a
feature fusion technique. Identifying a person using
only a single trait is not always optimal because of
problems such as physical loss of traits, medical
reasons or other reasons. Using a multimodal system
reduces that risk in comparison with unimodal
systems by providing an extra set of information.
Here, three of the most frequently used human
traits, namely faces, fingerprints and iris, have
been used for biometric recognition. In this paper
PCA and WT were used for feature extraction and
feature fusion respectively. This paper proposed a
CNN based model with fused features of three
biometric traits to recognize the biometrics. The
proposed approach achieved an accuracy of up to
96.67%.
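As a rough illustration of wavelet-based feature fusion, the sketch below fuses three equal-length feature vectors with a one-level Haar transform. It is not the implementation used in this work: the Haar wavelet, the fusion rule (average the approximation coefficients, keep the detail coefficient of largest magnitude), the function names and the sample vectors are all assumptions chosen for the example, and the vectors are assumed to have already been reduced by PCA.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar DWT of an even-length list: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward: rebuilds the original samples."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


def fuse(vectors):
    """Fuse equal-length, even-length feature vectors in the Haar domain.

    Assumed rule: average the approximation coefficients and keep, at each
    position, the detail coefficient with the largest magnitude.
    """
    coeffs = [haar_forward(v) for v in vectors]
    n = len(coeffs[0][0])
    approx = [sum(a[i] for a, _ in coeffs) / len(coeffs) for i in range(n)]
    detail = [max((d[i] for _, d in coeffs), key=abs) for i in range(n)]
    return haar_inverse(approx, detail)


# Hypothetical feature vectors, one per trait.
face = [0.9, 0.1, 0.4, 0.4]
fingerprint = [0.2, 0.6, 0.5, 0.3]
iris = [0.3, 0.3, 0.8, 0.0]

fused = fuse([face, fingerprint, iris])
print(fused)
```

The fused vector has the same length as a single input vector, so the three per-trait vectors are replaced by one vector that can be passed to the CNN.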
REFERENCES
Ammour B, Bouden T, & Amira-Biad S (2017, October).
Multimodal biometric identification system based on
the face and iris. In 2017 5th International Conference
on Electrical Engineering-Boumerdes (ICEE-B) (pp.
1-6). IEEE.
Andra K, Chakrabarti C, & Acharya T (2002). A VLSI
architecture for lifting-based forward and inverse
wavelet transform. IEEE transactions on signal
processing, 50(4), 966-977.
Blanz, V, & Vetter T. (2003). Face recognition based on
fitting a 3D morphable model. IEEE Transactions on
pattern analysis and machine intelligence, 25(9), 1063-
1074.
Cheniti M., Boukezzoula N. E. & Akhtar Z. (2017).
Symmetric sum-based biometric score fusion. IET
Biometrics, 7(5), 391-395.
Choi H. & Park H. (2015). A multimodal user
authentication system using faces and gestures.
BioMed research international, 2015.
Daugman J (2006). Probing the uniqueness and
randomness of Iris Codes: Results from 200 billion iris
pair comparisons. Proceedings of the IEEE, 94(11),
1927-1935.
Daugman J (2004). How iris recognition works.
University of Cambridge.
Dũng PV (2014). Multiple Convolution Neural Networks
for an Online Handwriting Recognition System.
Gunasekaran K., Raja J., & Pitchai R. (2019). Deep
multimodal biometric recognition using contourlet
derivative weighted rank fusion with human face,
fingerprint and iris images. Automatika, 60(3), 253-
265.
Hubbard BB (1998). The world according to wavelets: the
story of a mathematical technique in the making. AK
Peters/CRC Press.
Jain A, & Hong L (1996, August). On-line fingerprint
verification. In Proceedings of the 13th
International Conference on Pattern Recognition
(Vol. 3, pp. 596-600). IEEE.
Jain AK, Flynn P, & Ross AA (Eds.). (2007). Handbook
of biometrics. Springer Science & Business Media.
Jolliffe IT (2003). Principal component analysis.
Technometrics, 45(3), 276.
Khoo Y. H., Goi B. M., Chai T. Y., Lai Y. L., & Jin Z.
(2018, June). Multimodal biometrics system using
feature-level fusion of iris and fingerprint. In
Proceedings of the 2nd International Conference on
Advances in Image Processing (pp. 6-10).
Krizhevsky A, Sutskever I, & Hinton GE (2012).
Imagenet classification with deep convolutional neural
networks. In Advances in neural information
processing systems (pp. 1097-1105).
Le QV (2015). A tutorial on deep learning part 2:
Autoencoders, convolutional neural networks and
recurrent neural networks. Google Brain, 1-20.
Li H, Manjunath BS, & Mitra SK (1995). Multisensor
image fusion using the wavelet transform. Graphical
models and image processing, 57(3), 235-245.
Liu C, & Wechsler H (2001). A shape-and texture-based
enhanced Fisher classifier for face recognition. IEEE
transactions on image processing, 10(4), 598-608.
Maio D, Maltoni D, Cappelli R, Wayman JL, & Jain AK
(2004, July). FVC2004: Third fingerprint verification
competition. In International Conference on Biometric
Authentication (pp. 1-7). Springer, Berlin,
Heidelberg.
Mangai UG, Samanta S, Das S, & Chowdhury PR (2010).
A survey of decision fusion and feature fusion
strategies for pattern classification. IETE Technical
review, 27(4), 293-307.
Parkavi R, Babu K C, & Kumar J A (2017, January).
Multimodal biometrics for user authentication. In 2017
11th International Conference on Intelligent Systems
and Control (ISCO) (pp. 501-505). IEEE.
Sanguansat P (Ed.) (2012). Principal Component Analysis:
Multidisciplinary Applications. BoD–Books on
Demand.
Shams M., Tolba A, & Sarhan S (2016). Face, iris, and
fingerprint multimodal identification system based on
local binary pattern with variance histogram and
combined learning vector quantization. Journal of
Theoretical & Applied Information Technology, 89(1).
Simard P. Y., Steinkraus D. & Platt J. C. (2003, August).
Best practices for convolutional neural networks
applied to visual document analysis. In ICDAR (Vol. 3,
No. 2003).
Sultana M, Paul P P, & Gavrilova M L (2017). Social
behavioral information fusion in multimodal
biometrics. IEEE Transactions on Systems, Man, and
Cybernetics: Systems, 48(12), 2176-2187.
Takane Y (2016). Constrained principal component
analysis and related techniques. Chapman and
Hall/CRC.
Van der Putte T, & Keuning J. (2000). Biometrical
fingerprint recognition: don’t get your fingers burned.
In Smart Card Research and Advanced Applications
(pp. 289-303). Springer, Boston, MA.
Yin Y, Liu L, & Sun X (2011, December). SDUMLA-
HMT: a multimodal biometric database. In Chinese
Conference on Biometric Recognition (pp. 260-268).
Springer, Berlin, Heidelberg.
Zhang H., Patel V. M. & Chellappa R. (2017). Low-rank
and joint sparse representations for multi-modal
recognition. IEEE Transactions on Image Processing,
26(10), 4741-4752.