On the Statistical Independence of Parametric Representations in
Biometric Cryptosystems: Evaluation and Improvement
Riccardo Musto, Emanuele Maiorana, Ridvan Salih Kuzu, Gabriel Emile Hine and Patrizio Campisi
Roma Tre University, Rome, Italy
Keywords:
Biometric Recognition, Template Protection, Biometric Cryptosystems, Statistical Independence, Finger Vein.
Abstract:
Biometric recognition is nowadays employed in several real-world applications to automatically authenticate
legitimate users. Nonetheless, using biometric traits as personal identifiers raises many privacy and security
issues, which do not affect traditional approaches to automatic people recognition. In order to cope with
such concerns, and to guarantee the required level of security to the employed biometric templates, several
protection schemes have been designed and proposed. The robustness against possible attacks brought to
such approaches has been typically investigated under the assumption that the employed biometric represen-
tations comprise mutually independent coefficients. Unfortunately, the parametric representations adopted in
most biometric recognition systems commonly consist of strongly correlated features, which may therefore be unsuitable for use in biometric cryptosystems, since they would lower the achievable security. In this
paper we propose a framework for evaluating the statistical independence of features employed in biometric
recognition systems. Furthermore, we investigate the feasibility of improving the mutual independence of rep-
resentations defined through deep learning approaches by resorting to architectures involving autoencoders,
and evaluate the characteristics of the novel templates through the introduced metrics. Tests using templates derived from finger-vein patterns are performed to evaluate the introduced framework for statistical independence and the proposed template generation strategies.
1 INTRODUCTION
Biometric recognition systems rely on personal char-
acteristics to define unique identifiers, through which
a user can be automatically recognized and granted
physical or logical access to specific goods or services
(Jain et al., 2011). Thanks to the improved comfort
and security this technology offers, with respect to
traditional recognition techniques using passwords or
tokens, biometric applications are frequently encoun-
tered in our daily lives, from using fingerprint or face
to unlock a mobile device, to employing iris patterns
to enter restricted areas in airports.
Nevertheless, it has to be remarked that, along
with several advantages, the use of biometric traits for
recognition purposes also brings many potential secu-
rity and privacy issues. Actually, in case a biometric
trait is compromised, as it may happen if covertly ac-
quired or stolen by an attacker, it cannot be revoked
or reissued, since it is an intrinsic and permanent characteristic of its owner. If the compromised trait is employed in different applications, none of them is secure anymore. Since the number of biometric traits
a subject can exploit is limited, losing the possibility
of relying on one of them is a serious limitation. As
far as privacy is concerned, biometric traits could re-
veal sensitive information about their owners, which
could be exploited for purposes not related to recog-
nition and potentially discriminatory. Exploiting the
uniqueness of biometric traits, it could be also pos-
sible to track the activities of a subject across multi-
ple applications. Actually, the EU General Data Pro-
tection Regulation (GDPR) states that biometric traits
are sensitive and personal data, and should be there-
fore processed ensuring adequate levels of security.
For all the aforementioned reasons, it is therefore
extremely important to design and implement biomet-
ric template protection (BTP) schemes, with the aim
of preserving the identifiers stored in every database,
guaranteeing the desired properties of renewability,
security, and performance (Nandakumar and Jain,
2015). BTP approaches have been typically cate-
gorized into two major classes: cancelable biomet-
rics (Patel et al., 2015) and biometric cryptosystems
(Rathgeb and Uhl, 2011). The former methods typi-
cally employ non-invertible functions, whose defining
parameters may be made publicly available or not, to
transform the original representations into protected
ones. The latter ones either directly extract binary
keys from the considered biometric traits, or adopt
key-binding approaches, exploiting hashing protocols
to bind biometric templates with binary cryptographic
keys. Biometric cryptosystems typically rely on some
public information, known as helper data, during the
recognition process.
It is worth mentioning that the robustness of BTP methods relying on cancelable biometrics, against attacks trying to recover the original information from the transformed one, is commonly analyzed in a rather superficial manner, due to the difficulty of quantitatively evaluating the achievable non-invertibility, and to the heterogeneity of the proposed methods, which makes it arduous to define general metrics with which to quantify the provided security. Conversely, rigorous evaluations of biometric cryptosystem security have often been provided in the literature (Simoens et al., 2009), with in-depth information
theoretic studies specifically dedicated to key-binding
approaches, trying to evaluate the amount of informa-
tion leaked by the stored helper data about the original
secret sources (Ignatenko and Willems, 2015).
Yet, it has to be observed that all the theoretical analyses so far carried out to evaluate the security of biometric cryptosystems have assumed ideal characteristics for the biometric templates to be protected.
Specifically, it is typically assumed that the biomet-
ric representations employed in key-binding schemes
consist of mutually independent features, a hypothesis which allows deriving conclusions regarding a single coefficient and automatically extending them to the whole set of available features. Unfortunately, real-world data are not characterized by such an ideal property, and the security loss of a biometric cryptosystem grows as the available data deviate from this ideal condition. It is therefore required to
perform proper investigations to evaluate the extent of
the differences between ideal and real qualities of the
employed biometric representations, having in mind
their exploitation in biometric cryptosystems. Fur-
thermore, it would be desirable to design strategies
for generating biometric representations having prop-
erties as close as possible to the ideal ones, while pre-
serving their discriminative capabilities.
Within the context of biometric cryptosystems, the
present paper addresses the aforementioned aspects,
performing an analysis regarding the statistical inde-
pendence of the features comprised within biometric
templates. Specifically, a novel framework, through
which it is possible to derive quantitative evaluations
on the independence of the considered representa-
tions, is proposed in Section 2, and applied to ana-
lyze templates obtained applying deep learning strate-
gies to biometric data. In more detail, finger-vein patterns are considered as biometric traits in this study.
Furthermore, approaches relying on autoencoders are
employed with the aim of improving the non-ideal
characteristics of the considered templates, as de-
scribed in Section 3. The usability of the generated
templates within a biometric cryptosystem is eval-
uated considering a recently-introduced key-binding
scheme with zero-leakage capabilities (Hine et al.,
2017), summarized in Section 4. The statistical inde-
pendence of the considered representations is tested
through the proposed metrics in Section 5, where
their effectiveness in terms of security and recogni-
tion rates is also evaluated. Conclusions are eventu-
ally drawn in Section 6.
2 FRAMEWORK FOR THE ANALYSIS OF STATISTICAL INDEPENDENCE
The proposed framework for the evaluation of statisti-
cal independence in biometric templates relies on the
Hilbert-Schmidt Independence Criterion (HSIC) sta-
tistical test (Gretton et al., 2007), described in Section
2.1. The information gained by running HSIC tests is then processed by methods derived from graph the-
ory (Bondy and Murty, 2008), as detailed in Section
2.2, to provide quantitative metrics about the inde-
pendence of the considered coefficients. In the fol-
lowing discussion, it is assumed that a dataset D of
biometric templates, collected from u subjects for a
total of n samples, each expressed as a feature vector
with length m, is available for the conducted statisti-
cal analysis. The considered data are arranged as an
n × m matrix, with each row being a biometric tem-
plate.
2.1 HSIC Statistical Test
Given two random variables $X$ and $Y$, the HSIC test estimates the squared Hilbert-Schmidt norm of the population of interest, that is, $\mathrm{HSIC}(P_{xy}, \mathcal{F}, \mathcal{G})$, where $P_{xy}$ is the joint distribution of $Z = (X, Y)$, and $\mathcal{F}$ and $\mathcal{G}$ are two reproducing kernel Hilbert spaces (RKHS). The test is a mapping

$\mathcal{J}(Z): (\mathcal{X} \times \mathcal{Y})^n \mapsto \{0, 1\}$, (1)

with null and research hypotheses defined as

$H_0: P_{xy} = P_x P_y$, $\quad H_1: P_{xy} \neq P_x P_y$, (2)

that is, the null hypothesis corresponds to the independence of the $X$ and $Y$ variables.
Figure 1: Visual depiction of the proposed framework for statistical independence evaluation.
The desired statistic can be estimated from an observed sample $Z = (X, Y)$ in a biased version as:

$\mathrm{HSIC}_b(Z) = \frac{1}{n^2}\,\mathrm{trace}(KHLH)$, (3)

that is, computing the sum of the elements on the main diagonal of a square matrix obtained as the product of the $n \times n$ matrices $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}$, $K$, and $L$, with the latter two having as elements:

$K(i,j) := \exp(-\sigma_x^{-2}\,\|x_i - x_j\|^2)$, $\quad L(i,j) := \exp(-\sigma_y^{-2}\,\|y_i - y_j\|^2)$, $\quad i, j = 1, \ldots, n$. (4)
Having set the desired value of the significance level $\alpha$ for the upper bound of the type I error, the asymptotic distribution of the empirical estimate $\mathrm{HSIC}_b(Z)$ is derived under $H_0$, and the $(1-\alpha)$ quantile of this distribution, denoted $\delta(Z)$, can be used as a threshold to determine the test outcome. Specifically, in case $\mathrm{HSIC}_b(Z) < \delta(Z)$, it is not possible to reject the null hypothesis $H_0$, and the two random variables are assumed to be independent.
In our framework, the HSIC test is performed for every possible pair of features in the considered $m$-dimensional biometric representation. An $m \times m$ square and symmetric binary independence matrix $I_H$ can thus be obtained as:

$I_H[i,j] = \begin{cases} 1 & \text{if } \mathrm{HSIC}_b(F_i, F_j) < \delta(F_i, F_j) \\ 0 & \text{otherwise,} \end{cases}$ (5)

where $F_i$ and $F_j$ represent any two features in the set of the $m$ available ones, with samples lying in an $\mathbb{R}^n$ space.
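As an illustration, the following NumPy sketch estimates the biased statistic of Eq. (3) with the Gaussian kernels of Eq. (4), and builds the independence matrix of Eq. (5). It is a simplified rendition under stated assumptions rather than a reference implementation: the bandwidths $\sigma_x$ and $\sigma_y$ are set with the common median heuristic, the threshold $\delta$ is estimated with a permutation test instead of the asymptotic approximation of (Gretton et al., 2007), and all function names are hypothetical.

```python
import numpy as np

def gram(x, sigma):
    # Gaussian (RBF) Gram matrix for a 1-D sample vector x
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_b(x, y):
    """Biased HSIC estimate (1/n^2) trace(KHLH) of Eq. (3)."""
    n = len(x)
    # median heuristic for the kernel bandwidths (an assumption)
    sx = np.median(np.abs(x[:, None] - x[None, :])) + 1e-12
    sy = np.median(np.abs(y[:, None] - y[None, :])) + 1e-12
    K, L = gram(x, sx), gram(y, sy)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

def independence_matrix(F, alpha=0.025, n_perm=200, seed=0):
    """I_H of Eq. (5); delta is the (1 - alpha) quantile of a
    permutation null instead of the asymptotic distribution."""
    rng = np.random.default_rng(seed)
    n, m = F.shape
    I_H = np.zeros((m, m), dtype=int)     # diagonal stays 0 (no self-loops)
    for i in range(m):
        for j in range(i + 1, m):
            stat = hsic_b(F[:, i], F[:, j])
            null = [hsic_b(F[:, i], rng.permutation(F[:, j]))
                    for _ in range(n_perm)]
            delta = np.quantile(null, 1.0 - alpha)
            I_H[i, j] = I_H[j, i] = int(stat < delta)
    return I_H
```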
2.2 Graph Theory
Concepts stemming from graph theory are employed
to derive quantitative metrics expressing the degree
of independence of the coefficients in the considered
biometric representation. The computed binary independence matrix $I_H$ can in fact be interpreted as an adjacency matrix $A_G$, thus defining an undirected independence graph $G$, with edges connecting nodes associated with independent coefficients. The overall process performed to derive the desired independence graphs is visually depicted in Figure 1.
In the present paper we consider three different metrics which can be computed from any undirected graph $G = (V, E, \psi)$ having nodes $V$, edges $E$, and incidence function $\psi$, namely the normalized edge count, the normalized maximum clique size, and the normalized degree centrality.
2.2.1 Normalized Edge Count
This metric simply computes the number of edges in
the considered graph G, normalized with respect to
the maximum number of edges in a complete graph
with the same number $m$ of nodes as $G$:

$\mathrm{NEC}_m = \frac{1}{m(m-1)} \sum_{i,j} I_H[i,j]$. (6)
The obtained value can be interpreted as the percentage of independent feature pairs. Yet, despite its simplicity, this metric is weak and not very informative. As a matter of fact, a value of $\mathrm{NEC}_m = 0.9$ does not mean that 90% of the features are mutually statistically independent, but only that the independence matrix $I_H$ contains 90% of unitary entries.
2.2.2 Normalized Maximum Clique Size
A clique of $G$ is defined as a complete subgraph $G'$ of $G$, such that every two distinct nodes in the clique are adjacent. The clique is said to be maximal if it is not a subset of another clique, and maximum if it has the largest number of nodes. In the context considered here, the size $S$ of the maximum clique of $G$ is employed as a metric for biometric feature independence, once normalized with respect to the largest possible value, thus obtaining $\mathrm{NMCS}_m = S/m$.
Since a clique represents a complete subgraph,
this metric gives an effective measure of indepen-
dence: the features in the maximum clique will be
actually mutually independent. However, this met-
ric presents some criticalities. First, the number of
elements of the maximum clique may not be large,
especially as the number of features increases, return-
ing extremely low independence values that are not
suitable for comparison. Moreover, in the event that
independence is extremely high, the search for this
clique may require a long processing time. Finally,
for a given graph there may be multiple maximum
cliques with the same number of elements, making it
difficult to understand which of them is the best one.
2.2.3 Normalized Degree Centrality
Degree centrality (Freeman, 1978) is a metric which
can be associated to each node i in a graph G, by
computing as d
G
(i) the number of edges incident to
the node itself, that is, its number of connections,
thus quantifying its importance within the graph. The
more important a node is, the more easily it will be
crossed by the information flow. A normalized value
of degree centrality can be computed dividing it by
the maximum feasible degree of the graph, thus ob-
taining
NDC
m
(i) =
d
G
(i)
m 1
. (7)
In the ideal case of having all the m available
features mutually independent, the maximum clique
would correspond to the graph G itself, and every
node would have a value of normalized degree cen-
trality equal to 1. In the following, when reporting
this metric for the experimental tests described in Sec-
tion 5, the values computed for each node are orga-
nized in a descending order to form a curve, thus bet-
ter representing the deviation from the ideal scenario
of the features under examination. In more detail, the more the node centrality curve deviates from the ideal one, with values only at 1, the less independent the features are. Such a representation also gives an indication of the number of most important nodes, that is, the features independent of most other coefficients.
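All three metrics can then be computed from the independence matrix with standard graph tooling. The following sketch relies on the networkx library; the function name and the use of an exact maximal-clique enumeration are our own choices, not part of the original paper:

```python
import numpy as np
import networkx as nx

def graph_metrics(I_H):
    """NEC_m (Eq. 6), NMCS_m, and the descending NDC_m curve (Eq. 7)
    from the binary independence matrix I_H of Eq. (5)."""
    m = I_H.shape[0]
    # interpret I_H as the adjacency matrix of an undirected graph
    G = nx.from_numpy_array(np.triu(I_H, k=1))
    nec = 2.0 * G.number_of_edges() / (m * (m - 1))
    # exact maximum clique via maximal-clique enumeration; as noted in
    # Section 2.2.2, this may be slow when independence is very high
    S = len(max(nx.find_cliques(G), key=len))
    nmcs = S / m
    # normalized degree centrality, sorted in descending order
    ndc = sorted((d / (m - 1) for _, d in G.degree()), reverse=True)
    return nec, nmcs, ndc
```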
3 BIOMETRIC REPRESENTATIONS
The proposed framework for statistical independence
evaluation has been tested on biometric representa-
tions obtained applying deep learning strategies to
finger-vein patterns. In more detail, the baseline sys-
tem here exploited is the one described in (Kuzu et al.,
2020a), where finger-vein images have been pro-
cessed through a convolutional neural network (CNN)
derived from DenseNet-161 (Huang et al., 2018), with
the addition of a custom set of final layers. In order to
generate representations suitable for verification tasks
in open-set conditions, the loss function employed for
training relies on a cross-entropy function with ad-
ditive angular margin penalty (AAMP) (Deng et al.,
2019). An additional configuration has been here
taken into account as baseline, exploiting ResNext-
101 (Xie et al., 2017) instead of DenseNet-161, while
keeping the same custom set of final layers, and the
same size for the generated representations, compris-
ing 1024 coefficients.
In order to verify whether it is possible to im-
prove the independence of the features generated by
the aforementioned baseline approaches, we have also
evaluated the effectiveness of adding autoencoders
in cascade to the employed networks. An analo-
gous approach has already been proposed in (Kuzu et al., 2020b); yet, the purpose there was to improve the achievable recognition performance, while here the goal is to generate biometric representations with more independent features. The ratio-
nale behind the proposed approach resides in the fact
that autoencoders are typically employed to automat-
ically learn efficient encodings of the processed data,
with the aim of reducing the dimensionality of the
input representation while keeping all its informative
content. While pursuing this target, it is also possible to force the learned representations to assume useful properties, such as sparseness. Within the context of the present research, the objective is therefore to design an autoencoder able to increase the inner independence of the treated representations.
As autoencoder, we have considered the same
densely-connected convolutional autoencoder (DC-
CAE) proposed in (Kuzu et al., 2020b), consisting
of a total of 55 layers, with an input layer receiving
representations with 1024 coefficients and an inner
encoding producing 256 features, and trained it with
different loss functions. In more detail, the employed
DCCAE has been trained with the aim of minimizing
a loss defined as

$\mathcal{L} = \mathcal{L}_R + \beta \cdot \mathcal{L}_S$, (8)

where $\mathcal{L}_R$ represents the reconstruction loss, computed through the cosine dissimilarity

$\mathcal{L}_R = \frac{1}{B} \sum_{i=1}^{B} \left[ 1 - \cos(\mathbf{f}_i, \hat{\mathbf{f}}_i) \right]$, (9)

being $\mathbf{f}_i$ the $i$-th feature representation generated by the baseline CNN, $\hat{\mathbf{f}}_i$ its counterpart reconstructed by the autoencoder, and $B$ the employed batch size.
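A possible PyTorch rendering of Eqs. (8) and (9) is sketched below; the function names are hypothetical, and the value of $\beta$ is only a placeholder to be tuned as discussed in Section 5:

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(f, f_hat):
    # Eq. (9): batch average of the cosine dissimilarity 1 - cos(f_i, f^_i)
    return (1.0 - F.cosine_similarity(f, f_hat, dim=1)).mean()

def total_loss(f, f_hat, l_s, beta=0.1):
    # Eq. (8): reconstruction term plus weighted regularization term L_S
    # (beta = 0.1 is a hypothetical value, not taken from the paper)
    return reconstruction_loss(f, f_hat) + beta * l_s
```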
In order to evaluate whether distinct approaches
may have different effectiveness for the sought tar-
get of improving independence, three different loss
functions have been considered here for the component $\mathcal{L}_S$, namely the Kullback-Leibler divergence (KLD), the spectral restricted isometry property (SRIP) (Bansal et al., 2018), and DeCov (Cogswell et al., 2016).
The $\mathcal{L}_S$ term based on KLD is defined as:

$\mathcal{L}_S^{KLD} = \sum_{h \in \mathbb{L}} \sum_{j=1}^{N^{(h)}} D_{KL}\left( \rho \,\|\, \hat{\rho}_j^{(h)} \right)$, $\quad \hat{\rho}_j^{(h)} = \frac{1}{B} \sum_{i=1}^{B} a_j^{(h)}(\mathbf{f}_i)$, (10)

where $a_j^{(h)}$ is the $j$-th activation output of the $h$-th hidden layer of the DCCAE when $\mathbf{f}_i$ is fed as input to the DCCAE, with $j = 1, \ldots, N^{(h)}$, being $N^{(h)}$ the number of activation units in the $h$-th hidden layer, and $\rho \in [0, 1]$ is the sparsity parameter. The set $\mathbb{L}$ represents the layers of the DCCAE dedicated to the inner encoder and decoder, with $\mathbb{L} = \{26, \ldots, 29\}$ for the DCCAE described in (Kuzu et al., 2020b).
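As a sketch, the KLD term could be computed in PyTorch as follows, assuming the activations of the layers in $\mathbb{L}$ are collected as a list of $(B, N^{(h)})$ tensors with values in $(0, 1)$, e.g., sigmoid outputs; the value of $\rho$ shown is hypothetical:

```python
import torch

def kld_sparsity_loss(activations, rho=0.05, eps=1e-8):
    """L_S^KLD of Eq. (10); `activations` holds one (B, N_h) tensor per
    layer in the set L (the inner encoder/decoder layers)."""
    loss = 0.0
    for a in activations:
        # rho_hat_j: mean activation of unit j over the batch, Eq. (10)
        rho_hat = a.mean(dim=0).clamp(eps, 1.0 - eps)
        # KL divergence between Bernoulli(rho) and Bernoulli(rho_hat_j)
        loss = loss + (rho * torch.log(rho / rho_hat)
                       + (1.0 - rho) * torch.log((1.0 - rho) / (1.0 - rho_hat))).sum()
    return loss
```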
The SRIP regularization forces the weights of the network to be near-orthogonal, and it is computed on the weights of each convolutional layer of the proposed DCCAE:

$\mathcal{L}_S^{SRIP} = \sum_{h=1}^{55} \sigma\left( W^{(h)\top} W^{(h)} - I \right)$, (11)

where $W^{(h)}$ is a matrix with the weights of the $h$-th layer, $I$ is the identity matrix, and $\sigma(\cdot)$ is the spectral norm, defined as the largest singular value of its argument.
The DeCov loss pushes the network to learn non-
redundant representations by minimizing the cross-
covariance of hidden activations through a regulariza-
tion operation. Considering the $h$-th layer of the employed DCCAE generating the inner encodings, and its activations $a_j^{(h)}$, the desired cross-covariance $C$ is obtained computing, for all the possible pairs of activations $j$ and $k$,

$C[j,k] = \frac{1}{B} \sum_{i=1}^{B} \left( a_j^{(h)}(\mathbf{f}_i) - \mu_j \right) \left( a_k^{(h)}(\mathbf{f}_i) - \mu_k \right)$, (12)

being $\mu_j$ the sample mean of activation $j$ over the batch, that is,

$\mu_j = \frac{1}{B} \sum_{i=1}^{B} a_j^{(h)}(\mathbf{f}_i)$. (13)

The DeCov loss is then computed as:

$\mathcal{L}_S^{DeCov} = \frac{1}{2} \left( \|C\|_F^2 - \|\mathrm{diag}(C)\|_2^2 \right)$, (14)

where $\|\cdot\|_F$ is the Frobenius norm.
4 CONSIDERED BIOMETRIC CRYPTOSYSTEM
Figure 2: Considered zero-leakage biometric cryptosystem from (Hine et al., 2017).

In order to verify whether representations generated as described in Section 3 could actually be employed in a biometric cryptosystem while guaranteeing proper recognition and security performance, tests have been conducted considering the zero-leakage BTP scheme presented in (Hine et al., 2017), that is, a code-offset approach inspired by the digital modulation paradigm. As a reference, this method is graphically depicted in Figure 2.
In brief, it is assumed that a secret binary key and a biometric representation $\mathbf{f} \in \mathbb{R}^m$ are available during enrolment. A point-wise function $t(\cdot)$ is applied to each coefficient of $\mathbf{f}$ in order to derive, from each of them, a variable with a probability density function following a raised cosine distribution, described by a roll-off parameter $\gamma \in [0, 1]$. As shown in (Hine et al., 2017), such a transformation guarantees that the stored helper data cannot reveal any information about the employed cryptographic key, thus implementing a zero-leakage protection scheme. The roll-off $\gamma$ of the employed raised cosine distribution determines the privacy achievable by the scheme, that is, the minimum reconstruction error an attacker incurs when trying to estimate the input biometric sample, given the stored helper data. Specifically, the privacy of the scheme increases with the use of larger values of $\gamma$. On the other hand, as shown in (Hine et al., 2017), large $\gamma$ values reduce the capacity of the considered BTP scheme, that is, the maximum size of the cryptographic key which can be bound to the employed biometric template, thus setting the security of the whole system.

The stored helper data is generated from the input key and biometric template adopting a quantization index modulation (QIM) approach, as $\mathbf{z} = [t(\mathbf{f}) - \mathbf{q}]_{2\pi}$, where $\mathbf{q} \in \{0, 2\pi/P, \ldots, (P-1)2\pi/P\}^m$, with $P \in \mathbb{N}^+$, is a set of symbols belonging to a phase-shift keying (PSK) constellation of size $P$, obtained encoding the input binary key with an error correcting code.
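The QIM steps can be sketched as follows, assuming the coefficients have already been mapped through $t(\cdot)$ and the key has already been error-correction encoded into PSK symbol indices; this is an illustration of the mechanism under these assumptions, not the reference implementation of (Hine et al., 2017):

```python
import numpy as np

def enroll(t_f, key_symbols, P=4):
    """Helper data z = [t(f) - q] mod 2*pi, with one PSK symbol
    (an integer in {0, ..., P-1}) bound to each coefficient."""
    q = key_symbols * 2.0 * np.pi / P        # PSK constellation phases
    return np.mod(t_f - q, 2.0 * np.pi)

def inverse_qim(t_f_new, z):
    """Recognition side: noisy phase estimates of the bound symbols;
    a soft LLR-based decoder would then recover the binary key."""
    return np.mod(t_f_new - z, 2.0 * np.pi)
```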
The reverse process is performed during recognition, with an inverse QIM applied to the combination of the stored helper data $\mathbf{z}$ and the newly acquired biometric representation $\hat{\mathbf{f}}$, after which a soft decoding process, with decisions based on log-likelihood ratio (LLR) criteria, is performed with the purpose of recovering the originally employed cryptographic key. In case $\mathbf{f}$ and $\hat{\mathbf{f}}$ are similar, the hash of the recovered key corresponds to that of the secret one, and the subject can thus be recognized as a legitimate user.

Table 1: Comparative analysis of statistical independence of biometric representations obtained using DenseNet-161 as baseline network. Best results reported in bold.

Metric    Baseline   ICA     DCCAE-KLD   DCCAE-SRIP   DCCAE-DeCov
NEC_m     0.152      0.410   0.254       0.250        0.690
NMCS_m    6          9       7           8            19
5 EXPERIMENTAL TESTS
Tests have been performed on finger-vein biometric
traits from the SDUMLA database (Yin et al., 2011),
comprising data from 106 subjects, with three sam-
ples taken from each of three fingers of both hands.
The available data have been divided into two disjoint
datasets of equal size, one used to train the consid-
ered deep learning architectures and the other one to
perform the required evaluations. The employed net-
works have been trained using stochastic gradient de-
scent with momentum (SGDM) and a batch size of
128. The hyperparameters of the DCCAE have been
set, for each considered loss $\mathcal{L}_S$, with the aim of guar-
anteeing the best achievable performance in terms of
independence of the generated representations.
The independence metrics proposed in Section
2 have been computed performing HSIC tests with
significance level α = 2.5%. A comparative analy-
sis has been conducted in order to evaluate the ef-
fectiveness of the methods in Section 3 to generate
representations with improved independence, which
could be properly used in biometric cryptosystems.
In more detail, in addition to comparing the repre-
sentations generated through the proposed DCCAEs
against those obtained using only the considered base-
line CNNs, we have also taken into account transfor-
mations of the original coefficients obtained through
independent component analysis (ICA) (Hyvarinen
and Oja, 2000). The considered FastICA approach
applies an orthogonal rotation to prewhitened data
in order to maximize a measure of non-Gaussianity,
used as a proxy for statistical independence.
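For reference, such a transformation can be obtained with scikit-learn as sketched below; the matrix names are hypothetical and the hyperparameter values are illustrative:

```python
from sklearn.decomposition import FastICA

# F_train, F_test: (n, m) template matrices (hypothetical names)
ica = FastICA(whiten="unit-variance", max_iter=1000, random_state=0)
ica.fit(F_train)                    # learn the unmixing on training templates
F_test_ica = ica.transform(F_test)  # transformed coefficients for evaluation
```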
Table 2: Comparative analysis of statistical independence of biometric representations obtained using ResNext-101 as baseline network. Best results reported in bold.

Metric    Baseline   ICA     DCCAE-KLD   DCCAE-SRIP   DCCAE-DeCov
NEC_m     0.187      0.256   0.221       0.214        0.415
NMCS_m    8          8       7           7            10

The computed $\mathrm{NEC}_m$ and $\mathrm{NMCS}_m$ independence metrics are reported in Tables 1 and 2, respectively for biometric representations obtained when using the DenseNet-161 and ResNext-101 baseline networks. The $\mathrm{NDC}_m$ metrics are instead reported in Figure 3, which shows the behaviors obtained for all the considered nodes (coefficients), in order to better illustrate the deviation from the ideal condition with all values set at 1. It is possible to observe that the proposed approaches relying on KLD and SRIP are able
to provide an improvement in terms of $\mathrm{NEC}_m$ and $\mathrm{NDC}_m$ with respect to the use of representations obtained through the baseline networks. The $\mathrm{NMCS}_m$ is improved when using DenseNet-161 as baseline. Yet, an ICA transformation guarantees an even further independence improvement. Training a DCCAE with the DeCov loss represents the best approach to enhance the independence of the considered representations, with notable improvements in terms of $\mathrm{NEC}_m$, $\mathrm{NMCS}_m$, and $\mathrm{NDC}_m$ for templates obtained through both DenseNet-161 and ResNext-101. The gains are especially significant for templates generated with the DenseNet-161 network, with a notable increase of independence with respect to ICA too.
In addition to the analysis of the proposed approaches in terms of independence, other aspects relevant to the design of biometric cryptosystems are also evaluated. The obtained recognition results, in terms of false rejection rate (FRR) and false acceptance rate (FAR), are shown by the detection error tradeoff (DET) curves of Figure 4, and summarized in terms of equal error rate (EER) in Table 3. It is impor-
tant to mention that, in this work, the DCCAEs have
been trained to maximize feature independence, dif-
ferently from (Kuzu et al., 2020b) where the networks
have been trained to maximize the achievable recog-
nition performance. Therefore, the DCCAEs here
employed cannot improve the recognition capabilities
offered by the baseline CNN networks as in (Kuzu
et al., 2020b). Actually, the use of the SRIP and KLD losses guarantees results similar to those of the baseline networks, while exploiting the DeCov loss worsens the recognition performance achievable with DenseNet-161. As commonly observed in biometric cryptosystems, an improvement in terms of security (here expressed through the achieved independence) is obtained at the cost of a worsening in terms of recognition performance, with security and recognition involved in a trade-off relationship.
Table 3: EERs (in %) obtained with the considered biometric representations.

Network        Baseline   ICA     DCCAE-KLD   DCCAE-SRIP   DCCAE-DeCov
DenseNet-161   0.044      0.586   0.252       0.126        1.072
ResNext-101    0.168      0.230   0.419       0.439        0.209

Figure 3: Normalized degree centrality (NDC_m) computed for the considered representations. (a): DenseNet-161 used as baseline; (b): ResNext-101 used as baseline.

Figure 4: DET curves reporting the recognition performance achievable with the considered representations. (a): DenseNet-161 used as baseline; (b): ResNext-101 used as baseline.

Figure 5: Average embedding capacity achievable with the considered representations. (a): DenseNet-161 used as baseline; (b): ResNext-101 used as baseline.

The achievable security is also evaluated by com-
puting the capacities of the employed representations,
reported in Figure 5 in terms of the average number
of bits of a cryptographic key that can be embedded
within the coefficients of the employed representa-
tion. As detailed in (Hine et al., 2017) and mentioned
in Section 4, the achievable capacity depends on the
roll-off parameter γ of the raised cosine distribution
employed in the considered biometric cryptosystem.
The DeCov loss significantly improves the embedding
capacity with respect to the baseline system relying
on DenseNet-161. It is also worth noting that using an
ICA transformation worsens the capacity achievable
with representations derived with both DenseNet-161
and ResNext-101. The KLD and SRIP loss functions
lead to similar results in terms of channel capacity, with limited improvements over the baseline system, observed only for ResNext-101.
6 CONCLUSIONS
In this paper we have proposed a framework to quan-
titatively evaluate the statistical independence of fea-
tures employed within biometric cryptosystems, and
used it to analyze the effectiveness of using cascade
networks including DCCAEs to produce discrimina-
tive and independent biometric representations. Since
each of the proposed metrics has its own criticalities, it
is recommended to evaluate a given biometric repre-
sentation considering all of them, rather than drawing
conclusions based on only one. Further developments
could be studied in order to design other metrics with
more informative content. For instance, it would be
possible to define a new centrality measure, possibly
taking into account the difference between the statis-
tics and the thresholds computed by an HSIC test,
and associating greater relevance to pairs of features with greater differences. This way, evaluations could be performed taking into account the significance of each test, and not only its binary output. Such a metric could also be employed to define effective feature
selection strategies based on statistical independence.
Furthermore, it would be highly desirable to design
novel approaches to automatically learn how to gen-
erate biometric representations with both discrimina-
tive and independence characteristics. To this aim, the
proposed metrics could be integrated within the loss
functions employed during the learning process.
REFERENCES
Bansal, N., Chen, X., and Wang, Z. (2018). Can we gain
more from orthogonality regularizations in training
deep cnns? In Proceedings of the 32nd Interna-
tional Conference on Neural Information Processing
Systems, NIPS’18, page 4266–4276, Red Hook, NY,
USA. Curran Associates Inc.
Bondy, J. and Murty, U. (2008). Graph Theory. Springer
Publishing Company, Incorporated, 1st edition.
Cogswell, M., Ahmed, F., Girshick, R., Zitnick, L., and Batra, D. (2016). Reducing overfitting in deep networks by decorrelating representations. In International Conference on Learning Representations (ICLR).
Deng, J., Guo, J., Xue, N., and Zafeiriou, S. (2019). ArcFace: Additive angular margin loss for deep face recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Freeman, L. C. (1978). Centrality in social networks con-
ceptual clarification. Social Networks, 1(3):215–239.
Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf,
B., and Smola, A. (2007). A kernel statistical test of
independence. In Proceedings of the 2007 Conference
on Advances in Neural Information Processing Sys-
tems.
Hine, G., Maiorana, E., and Campisi, P. (2017). A zero-
leakage fuzzy embedder from the theoretical formula-
tion to real data. IEEE Transactions on Information
Forensics and Security, 12(7):1724–1734.
Huang, G., Liu, Z., van der Maaten, L., and Weinberger,
K. Q. (2018). Densely connected convolutional net-
works.
Hyvarinen, A. and Oja, E. (2000). Independent component
analysis: Algorithms and applications. Neural Net-
works, 13(4-5):411–430.
Ignatenko, T. and Willems, F. (2015). Fundamental lim-
its for privacy-preserving biometric identification sys-
tems that support authentication. IEEE Trans. on In-
formation Theory, 61(10):5583–5594.
Jain, A., Ross, A., and Nandakumar, K. (2011). Introduction to Biometrics. Springer.
Kuzu, R., Maiorana, E., and Campisi, P. (2020a). Loss func-
tions for cnn-based biometric vein recognition. In Eu-
ropean Signal Processing Conference (EUSIPCO).
Kuzu, R. S., Maiorana, E., and Campisi, P. (2020b). Vein-
based biometric verification using densely-connected
convolutional autoencoder. IEEE Signal Processing
Letters, 27:1869–1873.
Nandakumar, K. and Jain, A. K. (2015). Biometric tem-
plate protection: Bridging the performance gap be-
tween theory and practice. IEEE Signal Processing
Magazine, 32(5):88–100.
Patel, V. M., Ratha, N. K., and Chellappa, R. (2015). Cance-
lable biometrics: A review. IEEE Signal Processing
Magazine: Special Issue on Biometric Security and
Privacy, 32(5):54–65.
Rathgeb, C. and Uhl, A. (2011). A survey on biometric
cryptosystems and cancelable biometrics. EURASIP
Journal on Information Security, 3:1–25.
Simoens, K., Tuyls, P., and Preneel, B. (2009). Privacy
weaknesses in biometric sketches. In 2009 30th IEEE
Symposium on Security and Privacy, pages 188–203.
Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K. (2017). Aggregated residual transformations for deep neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Yin, Y., Liu, L., and Sun, X. (2011). Sdumla-hmt: A mul-
timodal biometric database. In Sun, Z., Lai, J., Chen,
X., and Tan, T., editors, Biometric Recognition, pages
260–268, Berlin, Heidelberg. Springer Berlin Heidel-
berg.