Evolutional Normal Maps: 3D Face Representations for 2D-3D Face Recognition, Face Modelling and Data Augmentation

Michael Danner¹,², Thomas Weber², Patrik Huber¹,³, Muhammad Awais¹, Matthias Raetsch² and Josef Kittler¹
¹Centre for Vision, Speech & Signal Processing, University of Surrey, Guildford, U.K.
²ViSiR, Reutlingen University, Reutlingen, Germany
³Department of Computer Science, University of York, York, U.K.
patrik.huber@york.ac.uk
Keywords: Deep Learning, Visual Understanding, Machine Vision, Pattern Recognition, 2D/3D Face Recognition, Local Descriptors, Normal-Vector-Map Representation.
Abstract: We address the problem of 3D face recognition based on either 3D sensor data, or on a 3D face reconstructed from a 2D face image. We focus on 3D shape representation in terms of a mesh of surface normal vectors. The first contribution of this work is an evaluation of eight different 3D face representations and their multiple combinations. An important contribution of the study is the proposed implementation, which allows these representations to be computed directly from 3D meshes, instead of point clouds. This enhances their computational efficiency. Motivated by the results of the comparative evaluation, we propose a 3D face shape descriptor, named Evolutional Normal Maps, that assimilates and optimises a subset of six of these approaches. The proposed shape descriptor can be modified and tuned to suit different tasks. It is used as input for a deep convolutional network for 3D face recognition. An extensive experimental evaluation using the Bosphorus 3D Face, CASIA 3D Face and JNU-3D Face datasets shows that, compared to state-of-the-art methods, the proposed approach is better in terms of both computational cost and recognition accuracy.
1 INTRODUCTION
Face recognition and matching are important technologies for many application scenarios, including identity verification, public security, human-computer interaction, person tracking and re-identification for process monitoring such as passenger progression through airports, secondary authentication for mobile devices, and indexing into large multimedia archives of media and entertainment companies. During the last decade, face biometrics research has been dominated by 2D face recognition. This is primarily a consequence of the recent advent of deep neural networks, which can learn face image representations that are largely invariant to nuisance factors such as illumination and pose changes. However, since the first computer-assisted face recognition system of Kanade (Kanade, 1977) half a century ago, there has always been interest in 3D face recognition as the ultimate target technology, one that has the capacity to disentangle face skin texture from the 3D shape of this intrinsically 3D object, and to use these two sources of biometric information in the most productive way.
The interest in 3D face recognition is evident not only from the continuing research on this topic in the literature (Kittler et al., 2005), but also from the advances in the development of 3D capture systems. 3D-capable cameras based on binocular stereo vision and time-of-flight (ToF) technology are becoming more and more affordable. For example, Apple have been shipping user-facing depth cameras in their consumer mobile phones for a number of years now, and Sony have started developing next-generation 3D sensors with ToF technology. Clearly, the quality of images captured by such user devices is not comparable to the output of high-resolution 3D scanners and decreases rapidly with increasing distance, so that the working distance is typically just a few metres. However, if the lighting conditions are difficult, and if the subject is unconstrained in terms of facial expression and occlusions, then 3D recognition can be more effective than using purely 2D RGB images. Moreover, 3D sensing is also more robust to spoofing attacks.
Our own interest in 3D face recognition is from the point of view of 3D assisted 2D face recognition. Sinha et al. (Sinha et al., 2006) showed that photometric and shape cues are used equally by humans to recognise faces. The motivation for the 3D assisted 2D face recognition approach is the disentanglement of shape and texture, achieved by 3D face model fitting. The fitted 3D face should offer better control and aggregation of the different sources of biometric information (shape and texture), as well as suppression of the effects of illumination, expression and pose. The 3D assisted 2D face recognition approach can also benefit from the availability of large 2D face databases, which are essential for effective machine learning.
In comparison with 2D, progress in 3D face recognition has been hampered not only by expensive sensor hardware and a lack of training data, but also by the incongruence of 3D face representations in the form of 3D meshes with the data structures that enable efficient processing by convolutional neural networks. However, this problem has recently been overcome by means of graph neural networks (Scarselli et al., 2009), and by remapping the 3D face onto a 2D image structure such as an isomap or Laplacian map (Feng et al., 2018; Kittler et al., 2018). The additional problem of errors caused by reconstructing a 3D face from its 2D projection has also recently been mitigated (Danner et al., 2019). Thus we are reaching the point where the vision of 3D assisted 2D face recognition is becoming realistic.
This paper is concerned with 3D face recognition in the context of 3D face biometrics per se, or as the ultimate step in 3D assisted 2D face recognition. Our approach involves mapping a 3D face mesh into 2D for CNN-based matching. We confine our interest to the 3D shape information only, and investigate how the face shape should be represented and in what form it should be provided to the neural network. We explore a number of alternatives to the raw 3D measurement information, and propose a novel representation, called the evolutional normal map, which is shown to be very effective from the point of view of 3D face shape recognition. We compare it with a number of existing representation methods, and demonstrate on several 3D databases that it delivers impressive recognition accuracy.
The rest of the paper is organised as follows: Section 2 reviews the related work and recent approaches to 3D face representations and 3D assisted 2D face recognition. Section 3 discusses normal vector maps as an alternative to raw 3D measurement information and introduces our proposed Evolutional Normal Maps. Section 4 presents the results of an extensive evaluation of the proposed system and its comparison with state-of-the-art methods. We conduct experiments with the Bosphorus 3D Face (Alyüz et al., 2008), CASIA 3D Face (Zhong et al., 2008) and JNU-3D (Koppen et al., 2018) datasets to benchmark the compared 3D representations. Conclusions and future work are presented in Section 5.
2 RELATED WORK
3D Face Recognition. A conventional 3D face recognition approach comprises methods such as 3D face landmarking, 3D face registration and facial feature extraction. 3D face landmarking locates the geometric positions of reference points on the face. 3D face registration aims to register 3D face scans in a common coordinate system so that the alignment of facial features can be carried out in a consistent manner. Extracting facial features means creating a distinctive face representation that should fully describe each 3D face scan. Kakadiaris et al. (Kakadiaris et al., 2017) already demonstrated the effectiveness of a 3D-2D framework for face recognition; its main advantages are that it is more practical than 3D-3D matching and more accurate than 2D-2D face recognition.
Face Reconstruction. Dou et al. (Dou et al., 2017) proposed monocular 3D facial shape reconstruction from 2D facial images that is more effective than other approaches. Huber et al. (Huber et al., 2016) also showed the power of 3D Morphable Face Models in computer vision and the widely applicable reconstruction of a 3D face from a single 2D image. These 3D Morphable Face Models can be used for pose estimation, analysis and recognition, and also for facial landmark detection and tracking.
Depth Images and Point Clouds. Depth images, point clouds and 3D meshes have emerged as important tools in biometrics and face recognition research. According to Kakadiaris et al. (Kakadiaris et al., 2017), existing frameworks for face recognition vary across approaches (e.g. model-based, data-driven and perceptual) and facial data domains (e.g. images, point clouds, depth maps). The main benefits of 3D facial scans such as depth maps are their insensitivity to ambient influences and to skin colour, factors which can lead to missing details in 2D face images.
ToF and Stereo Cameras. Structured-light RGB-D and time-of-flight (ToF) cameras are used in 3D perception tasks at close distances. While RGB-D cameras benefit from high resolution and frame rate, ToF cameras have the capability to operate outdoors and perceive details (Alenyà et al., 2014).
Local Descriptors. For 3D face recognition tasks, local descriptors have become established owing to their robustness to variations in illumination and facial expression, and have achieved remarkable recognition rates. Much work is inspired by local descriptors such as the Radon transform (Jafari-Khouzani and Soltanian-Zadeh, 2005), textons (Lazebnik et al., 2005) and Local Binary Patterns (LBP) (Ahonen et al., 2004). Thanks to its high performance, low computational complexity and flexibility, LBP has a huge number of improved and extended successors (Pietikäinen et al., 2011; Huang et al., 2006; Ouamane et al., 2017). High-order local pattern descriptors, such as the local derivative pattern on normal maps, capture more detailed information by encoding various distinctive spatial relationships (Soltanpour and Wu, 2017; Soltanpour and Wu, 2019; Zhang et al., 2010). LBPs and other local descriptors are not discussed further in this work, both because of the huge number of variations and because they perform a lossy conversion of the 3D shape information, which makes further use in descriptor-based face modelling difficult. However, they can easily be applied to the evolutional normal maps to further improve face recognition performance.
Normal-Vector Map Representations. 3D polygonal surfaces can be represented through their corresponding normal map, a two-dimensional array which stores the mesh normal vectors as the RGB components of a colour image. Abate et al. (Abate et al., 2005) calculated the difference of two normal maps and achieved remarkable face recognition results based on these difference maps. In 3D face recognition, mapping the x, y, z components of normal maps to RGB for visualisation is common practice (Kakadiaris et al., 2006). Gilani et al. (Gilani and Mian, 2018; Gilani et al., 2018) used depth images and normal vectors, with azimuth and elevation, in the form of RGB images for training a deep convolutional neural network.
3 METHODOLOGY
The values of the normal map matrix are determined by a partial mapping that takes the x and y coordinates to unit normal vectors. The horizontal and vertical coordinates define the resolution of the normal map and the range of this mapping. In addition, the normal map matrix elements can be represented by other 2D parameterisations, such as the spherical, angle-based domain. The normal vectors can be computed from the positions of the 3D points and their neighbours, whether the surface is given as a point cloud, a 3D mesh or a range image (Daoudi et al., 2013). Representing the 3D face shape on the uniform square matrix required by a convolutional neural network means that all normal maps have the same resolution.

Figure 1: Each triangle is projected into the x,y-plane. The enclosed pixels are determined and the normal vector components are assigned to these pixels.

Figure 2: Portrait picture and normal maps created by conventional methods on the Bosphorus 3D Face Database.
The surface normals contain more detailed and robust information than depth images for 3D data (Li et al., 2014). Our objective is to explore discriminative facial 3D representations in order to apply them to 3D facial machine vision tasks. We propose a refined feature extraction using the surface normals that provides richer and more distinctive information than other 2D mesh representations.
3.1 Surface Normal Map
Figure 3: Evolutional normal maps on the Bosphorus 3D Face Database.

This work is inspired by recent algorithms (Mohammadzade and Hatzinakos, 2013; Li et al., 2014; Emambakhsh and Evans, 2017) in which surface normals have been applied to 3D face recognition. These works follow the normal surface estimation methods presented by Klasing et al. (Klasing et al., 2009). Although surface normal estimation is good practice for point clouds, it is not necessary for popular 3D face databases and registered scan scenarios: a 3D mesh consists of vertices and faces from which the normal vectors can be calculated exactly, instead of estimated.
First, we have to ensure the mesh is registered to the x,y-plane and define the space in which the vertices are located. The vertices' x,y-coordinates are mapped to the resulting normal map by a transformation that assigns coordinates from the domain of the mesh to the domain of the normal matrix.
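As an illustration, here is a minimal sketch of such a transformation, assuming the registered mesh is held as a NumPy vertex array; the uniform-scaling choice and the 300-pixel default are our own assumptions, not specified in the paper.

```python
import numpy as np

def mesh_to_pixel_transform(vertices, resolution=300):
    """Build a mapping from mesh x,y-coordinates to pixel coordinates.

    vertices: (N, 3) array of registered 3D vertex positions.
    Returns a function converting (x, y) mesh coordinates to integer
    pixel coordinates in [0, resolution).
    """
    mins = vertices[:, :2].min(axis=0)
    extent = (vertices[:, :2].max(axis=0) - mins).max()
    scale = (resolution - 1) / extent  # uniform scale preserves the aspect ratio

    def to_pixel(xy):
        return np.clip(((np.asarray(xy) - mins) * scale).astype(int),
                       0, resolution - 1)

    return to_pixel
```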
Then each polygon in the mesh is split into triangles, if necessary. Each triangle consists of three vertex points $P = \{p_i, p_j, p_k\}$, where $p_n \in \mathbb{R}^3$. Two vectors are determined as $v_0 = p_j - p_i$ and $v_1 = p_k - p_i$. Then, a unit normal vector $\hat{n}$ can be calculated by

$$\hat{n} = \frac{n}{\|n\|} = \frac{v_0 \times v_1}{\|v_0\|\,\|v_1\|\,|\sin\theta|} \qquad (1)$$

where $\hat{n} = [\hat{n}_x, \hat{n}_y, \hat{n}_z]^T$.
Furthermore, each triangle is projected into the x,y-plane and the enclosed pixels are determined, taking into account the targeted image resolution. The process is illustrated in Figure 1.
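Below is a compact sketch of equation (1) together with the rasterisation step of Figure 1, reusing the to_pixel mapping sketched above; the bounding-box-plus-barycentric inside test is our own stand-in for whichever scanline routine is actually used.

```python
import numpy as np

def rasterise_normals(vertices, triangles, to_pixel, resolution=300):
    """Fill a (resolution, resolution, 3) normal map from a 3D mesh.

    vertices:  (N, 3) float array, mesh registered to the x,y-plane.
    triangles: (M, 3) int array of vertex indices per triangle.
    to_pixel:  mapping from mesh (x, y) to pixel coordinates
               (see mesh_to_pixel_transform above).
    """
    nmap = np.zeros((resolution, resolution, 3))
    for i, j, k in triangles:
        p_i, p_j, p_k = vertices[i], vertices[j], vertices[k]
        n = np.cross(p_j - p_i, p_k - p_i)            # n = v0 x v1
        norm = np.linalg.norm(n)
        if norm == 0:                                  # skip degenerate triangles
            continue
        n_hat = n / norm                               # unit normal, eq. (1)

        # Project the triangle into the x,y-plane and test every pixel in
        # its bounding box with barycentric coordinates (cf. Figure 1).
        a, b, c = [to_pixel(p[:2]).astype(float) for p in (p_i, p_j, p_k)]
        lo = np.minimum(np.minimum(a, b), c).astype(int)
        hi = np.maximum(np.maximum(a, b), c).astype(int)
        det = (b[1]-c[1])*(a[0]-c[0]) + (c[0]-b[0])*(a[1]-c[1])
        if det == 0:                                   # zero-area projection
            continue
        for x in range(lo[0], hi[0] + 1):
            for y in range(lo[1], hi[1] + 1):
                w0 = ((b[1]-c[1])*(x-c[0]) + (c[0]-b[0])*(y-c[1])) / det
                w1 = ((c[1]-a[1])*(x-c[0]) + (a[0]-c[0])*(y-c[1])) / det
                if w0 >= 0 and w1 >= 0 and (1.0 - w0 - w1) >= 0:
                    nmap[y, x] = n_hat                 # rows index y, columns x
    return nmap
```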
Azimuth and Elevation. Given the surface unit normal vector $\hat{n} = (\hat{n}_x, \hat{n}_y, \hat{n}_z)$ at a point, the azimuth angle $\alpha$ is defined as the angle between the positive x-axis and the projection of $\hat{n}$ onto the x,y-plane. The elevation angle $\varphi$ is the angle between $\hat{n}$ and its orthogonal projection onto the x,y-plane; it is positive when going toward the positive z-axis from the x,y-plane.

$$\alpha = \tan^{-1}\frac{\hat{n}_z}{\hat{n}_x} \qquad (2)$$

$$\varphi = \tan^{-1}\frac{\hat{n}_y}{\sqrt{\hat{n}_x^2 + \hat{n}_z^2}} \qquad (3)$$
For each pixel, the normal vector components are stored in 2D matrices, resulting in five images for the x, y, z, azimuth and elevation values. Optionally, the minimum and maximum of each matrix can be determined and the histogram values stretched to the full range.

Pixels in these matrices with no assigned values are considered background and are set to the background value, which is 0 by default. Finally, the matrices N_x, N_y, N_z are the surface normal maps for the x, y and z dimensions, respectively.
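For equations (2) and (3), a possible vectorised implementation is sketched below; the paper does not state how the singularity at n̂_x = 0 is handled, so the four-quadrant arctan2 is assumed here.

```python
import numpy as np

def azimuth_elevation(nmap):
    """Derive azimuth (eq. 2) and elevation (eq. 3) maps from a normal map.

    nmap: (H, W, 3) array of unit normals; all-zero pixels are background.
    """
    nx, ny, nz = nmap[..., 0], nmap[..., 1], nmap[..., 2]
    azimuth = np.arctan2(nz, nx)                        # eq. (2)
    elevation = np.arctan2(ny, np.sqrt(nx**2 + nz**2))  # eq. (3)
    background = (nmap == 0).all(axis=-1)
    azimuth[background] = 0.0    # background keeps the default value 0
    elevation[background] = 0.0
    return azimuth, elevation
```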
3.2 Evolutional Normal Maps
We now have several methods for the calculation and estimation of normals, and the corresponding algorithms for a depth map and multiple normal maps. The question is: can the normal maps, and representations based on them, be improved further without loss of information? Applying local descriptors leads to increased histogram information, which is beneficial to face recognition, but 3D shape information is lost. Based on the observation that increasing histogram information is helpful, the minimum and maximum values of the normal component matrices are instead used to stretch them to the full range. Additionally, on the matrices with normal components x and y, the absolute value is used, which doubles the gradients from bright to dark. These techniques are lossless and can easily be reversed.

Another way to manipulate the histogram values is to invert the values of the matrix, swapping bright and dark grey tones. In the experimental evaluation, diverse operations are applied to demonstrate that this is a valuable operation on normal maps.
Figure 4: Example ENM set of images from Bosphorus 3D (left) and CASIA 3D (right). 1st row, left to right: Absolute-Values-Normal-X-map, Absolute-Values-Normal-Y-map, Normal-Z-map, combination of avN_x, avN_y, N_z. 2nd row, left to right: Normal-X-map, Normal-Y-map, Normal-Z-map, combination of N_x, N_y, N_z. 3rd row, left to right: depth map, Azimuth-Normal-map, Elevation-Normal-map, combination of depth map, N_a, N_e.

It should be emphasised that the methods mentioned represent only part of the possibilities for improving normal maps. This is why we call the set of 3D representations 'evolutional' normal maps: there are many parameters to adjust the outcome, and they can be flexibly adapted to many facial 3D machine vision tasks, as the sketch after this list illustrates for a few of them:
(a) Calculate N_x, N_y, N_z, N_a, N_e, the depth map, or any combination of these.
(b) Invert the greyscale of the normal map.
(c) Darken or lighten normal maps.
(d) Calculate real or absolute normal values.
(e) Adjust the image resolution.
(f) Rotate the mesh horizontally and vertically for data augmentation.
(g) Remain accessible for local descriptors.
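As promised above, a minimal sketch of the histogram stretching, absolute-value and inversion operations, assuming the maps are float arrays; the function and variable names are ours.

```python
import numpy as np

def stretch(m):
    """Stretch values to the full [0, 1] range; reversible given the
    stored minimum and maximum."""
    lo, hi = m.min(), m.max()
    return (m - lo) / (hi - lo) if hi > lo else m

def evolve(n_x, n_y, n_z, invert=False):
    """One example ENM variant: absolute x/y components (operation (d)),
    stretched to the full range, optionally inverted (operation (b))."""
    maps = [stretch(np.abs(n_x)),   # |.| doubles the bright-to-dark gradients
            stretch(np.abs(n_y)),
            stretch(n_z)]
    if invert:
        maps = [1.0 - m for m in maps]   # swap bright and dark grey tones
    return np.stack(maps, axis=0)        # (3, H, W) input, i.e. avN_xyz
```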
4 EXPERIMENTAL EVALUATION
The datasets used in this work are CASIA, Bosphorus and JNU-3D. The CASIA 3D FaceV1 Database contains 4,674 scans of 123 subjects, where each subject is captured with more than 35 different expressions and poses. The Bosphorus 3D Face Database comprises 105 identities and 4,666 scans and is popular because of its rich repertoire of expressions; besides 3D face recognition, this database is often used for expression recognition and facial action unit detection. The JNU-3D data set consists of 774 3D faces and is used for augmenting the 3D database and for accuracy tests.
Evaluation Protocol. We evaluate the depth image and the normal maps in terms of face recognition accuracy. For the benchmark training, the normal map size is fixed to 300×300, since previous experiments showed this to be an efficient resolution with respect to both recognition rate and training duration.
In total, we evaluate the accuracy in 16 different experiments, shown in fig. 4 and fig. 5. As the first run of the experiment we define the baseline, the "pure" depth map D, followed by the 8 different normal maps: x-dimension N_x and avN_x, y-dimension N_y and avN_y, z-dimension N_z and avN_z, azimuth angle N_a and elevation angle N_e. Accordingly, the dimension of the network input layer is (1, 300, 300).
Figure 5: Multiple combinations of input images for the face recogniser, from single-channel grey scale images up to the six-channel D + avN_xy + N_zae.
In the second run we evaluate the combination of avN_x, avN_y, avN_z, called avN_xyz; the combination of N_x, N_y, N_z, called N_xyz; and the combination of D, N_a, N_e, called D + N_ae, also shown in fig. 4. Accordingly, the dimension of the network input layer is (3, 300, 300).
The final run includes the combination of the depth map and the five normal maps, called D + N_xyzae. Accordingly, the dimension of the network input layer is (6, 300, 300). This input to the face recognition network is shown in fig. 5.
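Assembling these configurations into network tensors is then a simple channel stack; a sketch under the assumption that the individual maps are precomputed 300×300 arrays:

```python
import numpy as np

def build_input(maps):
    """Stack 300x300 maps into a (C, 300, 300) network input tensor.

    maps: list of 2D arrays, e.g. [D] for a single-channel run,
    [avN_x, avN_y, avN_z] for a three-channel run such as avN_xyz,
    or [D, N_x, N_y, N_z, N_a, N_e] for the six-channel D + N_xyzae.
    """
    return np.stack(maps, axis=0).astype(np.float32)
```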
Each representation is trained on our face recognition network for 60 epochs on the Bosphorus 3D Database, CASIA 3D and JNU-3D, respectively. On Bosphorus, all images are used, including facial expressions and partial occlusions. The first neutral sample of each subject is used as the gallery and the remaining scans as the probes (neutral vs. all).
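A sketch of this neutral-vs-all rank-1 protocol, assuming embeddings produced by the recognition network and cosine similarity as the matching score (the distance metric is not specified in the paper):

```python
import numpy as np

def rank1_accuracy(gallery, gallery_ids, probe, probe_ids):
    """Rank-1 recognition rate under the neutral-vs-all protocol.

    gallery: (G, d) embeddings, one neutral scan per subject.
    probe:   (P, d) embeddings of all remaining scans.
    gallery_ids, probe_ids: subject labels as NumPy arrays.
    """
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    p = probe / np.linalg.norm(probe, axis=1, keepdims=True)
    nearest = (p @ g.T).argmax(axis=1)   # closest gallery entry per probe
    return float((gallery_ids[nearest] == probe_ids).mean())
```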
Figure 6: Recognition rate on CASIA 3D data set.

Figure 7: Recognition rate on Bosphorus 3D Face.

Since we want to evaluate the pure 3D representations, we neither apply any data augmentation methods to the training data nor use local binary pattern layers. Indeed, the experiments show that by combining normal maps we are able to achieve recognition rates equal to or better than previous work that relies on extensive augmentation and adapted layers.
A secondary evaluation, using the CASIA 3D Database, is carried out under the same conditions, with the first scan of each subject as the gallery. Finally, the JNU-3D data set is used for accuracy tests on small networks.
Results. To illustrate recognition efficiency, the Cumulative Match Characteristic (CMC) curves on CASIA 3D are presented in fig. 6. The results in table 1 show that our ENM achieves the best performance. High recognition rates on CASIA and Bosphorus are scored even without data augmentation or the use of local binary patterns or similar descriptors. Table 2 presents the performance of the proposed method in rank-1 recognition of the 3D faces when training with augmented data sets. It achieves the best performance on CASIA 3D and measures up to FR3DNet on Bosphorus 3D with a 100% rank-1 recognition rate. The superiority of the proposed ENM is also reflected in fig. 7.

Table 1: Rank-1 recognition rate [%] on Bosphorus and CASIA 3D.

                 Bosphorus   CASIA
depth map           48.6      37.4
N_x                 78.1      63.4
N_y                 74.3      60.2
N_z                 87.6      83.7
N_a                 85.7      82.1
N_e                 77.1      69.9
D + N_ae            91.4      92.7
ENM avN_y           81.9      74.8
ENM avN_xyz         89.5      93.5
ENM 6-layer         94.3      95.9

Table 2: Rank-1 recognition rate [%] on Bosphorus and CASIA 3D with augmented data sets.

                     Bosphorus   CASIA
GoogleNet RGB           63.4      85.9
Resnet152 RGB            7.1      52.9
VGG-Face RGB            96.4      94.1
GoogleNet 3D            26.8      50.8
Resnet152 3D             3.8      25.3
VGG-Face 3D             48.1      72.0
D + N_ae FR3DNet       100.0      99.7
ENM avN_xyz            100.0      97.3
ENM 6-layer            100.0      99.8

(GoogleNet: Szegedy et al., 2015; Resnet152: He et al., 2016; VGG-Face: Parkhi et al., 2015; FR3DNet: Gilani and Mian, 2018.)
5 CONCLUSIONS AND FUTURE WORK
We addressed the problem of 3D face shape representation with a focus on 2D normal maps. These maps are various functions of the 3D surface normals, defined on a 3D face mesh and mapped systematically onto a 2D image. We showed that these normal maps can be computed very efficiently from the 3D shape mesh.
An extensive comparative evaluation of multiple variants of these normal maps and their combinations has been carried out using face recognition accuracy as a measure of their effectiveness. Motivated by the results of the comparative study, we proposed a 3D face shape descriptor, referred to as Evolutional Normal Maps, that assimilates, modifies and optimises a subset composed of six of these normal map representations. The proposed descriptor has been extensively evaluated on three benchmark 3D face datasets with very promising results. The descriptor is computationally efficient and, most importantly, it outperforms the state-of-the-art methods in 3D face recognition.
The proposed Evolutional Normal Maps have many potential applications apart from 3D face recognition. For instance, they can be used for data augmentation and in generative adversarial networks to render and develop synthetic 3D data. Future plans also include their use for 3D face reconstruction and in 3D face analysis for emotion, pose and age estimation.
ACKNOWLEDGEMENTS

Part of the research in this paper uses the CASIA-3D FaceV1 collected by the Chinese Academy of Sciences Institute of Automation (CASIA).

This work is partially supported by a grant of the BMWi ZIM program, no. ZF4029424HB9.
REFERENCES
Abate, A. F., Nappi, M., Ricciardi, S., and Sabatino, G.
(2005). Fast 3d face recognition based on normal
map. In Proceedings of the 2005 International Con-
ference on Image Processing, ICIP 2005, Genoa,
Italy, September 11-14, 2005, pages 946–949. IEEE.
Ahonen, T., Hadid, A., and Pietikäinen, M. (2004). Face recognition with local binary patterns. In Pajdla, T. and Matas, J., editors, Computer Vision - ECCV 2004, 8th European Conference on Computer Vision, Prague, Czech Republic, May 11-14, 2004. Proceedings, Part I, volume 3021 of Lecture Notes in Computer Science, pages 469-481. Springer.
Alenyà, G., Foix, S., and Torras, C. (2014). Using tof and RGBD cameras for 3d robot perception and manipulation in human environments. Intelligent Service Robotics, 7(4):211-220.
Alyüz, N., Gökberk, B., Dibeklioglu, H., Savran, A., Salah, A. A., Akarun, L., and Sankur, B. (2008). 3d face recognition benchmarks on the bosphorus database with focus on facial expressions. In Schouten, B. A. M., Juul, N. C., Drygajlo, A., and Tistarelli, M., editors, Biometrics and Identity Management, First European Workshop, BIOID 2008, Roskilde, Denmark, May 7-9, 2008. Revised Selected Papers, volume 5372 of Lecture Notes in Computer Science, pages 57-66. Springer.
Danner, M., Raetsch, M., Huber, P., Awais, M., Feng, Z.,
and Kittler, J. (2019). Texture-based 3D face recog-
nition using deep neural networks for unconstrained
human-machine interaction. In Proceedings of the
15th International Joint Conference on Computer Vi-
sion, Imaging and Computer Graphics Theory and
Applications VISAPP 2020. SCITEPRESS.
Daoudi, M., Srivastava, A., and Veltkamp, R. (2013). 3D
Face Modeling, Analysis and Recognition. Wiley.
Dou, P., Shah, S. K., and Kakadiaris, I. A. (2017). End-to-
end 3d face reconstruction with deep neural networks.
In 2017 IEEE Conference on Computer Vision and
Pattern Recognition, CVPR 2017, Honolulu, HI, USA,
July 21-26, 2017, pages 1503–1512. IEEE Computer
Society.
Emambakhsh, M. and Evans, A. N. (2017). Nasal patches
and curves for expression-robust 3d face recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 39(5):995–
1007.
Feng, Z.-H., Huber, P., Kittler, J., Hancock, P., Wu, X.-J., Zhao, Q., Koppen, P., and Rätsch, M. (2018). Evaluation of dense 3D reconstruction from 2D face images in the wild. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pages 780-786. IEEE.
Gilani, S. Z. and Mian, A. (2018). Learning from mil-
lions of 3d scans for large-scale 3d face recogni-
tion. In 2018 IEEE Conference on Computer Vision
and Pattern Recognition, CVPR 2018, Salt Lake City,
UT, USA, June 18-22, 2018, pages 1896–1905. IEEE
Computer Society.
Gilani, S. Z., Mian, A. S., Shafait, F., and Reid, I. (2018).
Dense 3d face correspondence. IEEE Trans. Pattern
Anal. Mach. Intell., 40(7):1584–1598.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In 2016 IEEE Con-
ference on Computer Vision and Pattern Recognition,
CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016,
pages 770–778. IEEE Computer Society.
Huang, Y., Wang, Y., and Tan, T. (2006). Combining
statistics of geometrical and correlative features for 3d
face recognition. In Chantler, M. J., Fisher, R. B.,
and Trucco, E., editors, Proceedings of the British
Machine Vision Conference 2006, Edinburgh, UK,
September 4-7, 2006, pages 879–888. British Machine
Vision Association.
Huber, P., Hu, G., Tena, J. R., Mortazavian, P., Koppen, W. P., Christmas, W. J., Rätsch, M., and Kittler, J. (2016). A multiresolution 3D morphable face model and fitting framework. In Magnenat-Thalmann, N., Richard, P., Linsen, L., Telea, A., Battiato, S., Imai, F. H., and Braz, J., editors, Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016) - Volume 4: VISAPP, Rome, Italy, February 27-29, 2016, pages 79-86. SciTePress.
Jafari-Khouzani, K. and Soltanian-Zadeh, H. (2005). Radon
transform orientation estimation for rotation invariant
texture analysis. IEEE Trans. Pattern Anal. Mach. In-
tell., 27(6):1004–1008.
Kakadiaris, I. A., Passalis, G., Toderici, G., Murtuza, M. N.,
and Theoharis, T. (2006). 3d face recognition. In
Chantler, M. J., Fisher, R. B., and Trucco, E., edi-
tors, Proceedings of the British Machine Vision Con-
ference 2006, Edinburgh, UK, September 4-7, 2006,
pages 869–878. British Machine Vision Association.
Kakadiaris, I. A., Toderici, G., Evangelopoulos, G., Pas-
salis, G., Chu, D., Zhao, X., Shah, S. K., and Theo-
haris, T. (2017). 3d-2d face recognition with pose and
illumination normalization. Comput. Vis. Image Un-
derst., 154:137–151.
Kanade, T. (1977). Computer recognition of human faces.
Interdisciplinary Systems Research, 47.
Kittler, J., Hilton, A., Hamouz, M., and Illingworth, J. (2005). 3d assisted face recognition: A survey of 3d imaging, modelling and recognition approaches. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR Workshops 2005, San Diego, CA, USA, 21-23 September, 2005, page 114. IEEE Computer Society.
Kittler, J., Koppen, P., Kopp, P., Huber, P., and Rätsch, M. (2018). Conformal mapping of a 3D face representation onto a 2D image for CNN based face recognition. In 2018 International Conference on Biometrics, ICB 2018, Gold Coast, Australia, February 20-23, 2018, pages 124-131. IEEE.
Klasing, K., Althoff, D., Wollherr, D., and Buss, M. (2009).
Comparison of surface normal estimation methods for
range sensing applications. In 2009 IEEE Interna-
tional Conference on Robotics and Automation, ICRA
2009, Kobe, Japan, May 12-17, 2009, pages 3206–
3211. IEEE.
Koppen, W. P., Feng, Z., Kittler, J., Awais, M., Christmas,
W. J., Wu, X., and Yin, H. (2018). Gaussian mixture
3d morphable face model. Pattern Recognit., 74:617–
628.
Lazebnik, S., Schmid, C., and Ponce, J. (2005). A sparse
texture representation using local affine regions. IEEE
Trans. Pattern Anal. Mach. Intell., 27(8):1265–1278.
Li, H., Huang, D., Morvan, J., Chen, L., and Wang,
Y. (2014). Expression-robust 3d face recognition
via weighted sparse representation of multi-scale and
multi-component local normal patterns. Neurocom-
puting, 133:179–193.
Mohammadzade, H. and Hatzinakos, D. (2013). Iterative
closest normal point for 3d face recognition. IEEE
Trans. Pattern Anal. Mach. Intell., 35(2):381–397.
Ouamane, A., Boutellaa, E., Bengherabi, M., Taleb-Ahmed,
A., and Hadid, A. (2017). A novel statistical and mul-
tiscale local binary feature for 2d and 3d face verifica-
tion. Comput. Electr. Eng., 62:68–80.
Parkhi, O. M., Vedaldi, A., and Zisserman, A. (2015).
Deep face recognition. In Xie, X., Jones, M. W., and
Tam, G. K. L., editors, Proceedings of the British Ma-
chine Vision Conference 2015, BMVC 2015, Swansea,
UK, September 7-10, 2015, pages 41.1–41.12. BMVA
Press.
Pietikäinen, M., Hadid, A., Zhao, G., and Ahonen, T. (2011). Computer Vision Using Local Binary Patterns, volume 40 of Computational Imaging and Vision. Springer.
Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and
Monfardini, G. (2009). The graph neural network
model. IEEE Trans. Neural Networks, 20(1):61–80.
Sinha, P., Balas, B. J., Ostrovsky, Y., and Russell, R.
(2006). Face recognition by humans: Nineteen results
all computer vision researchers should know about.
Proceedings of the IEEE, 94(11):1948–1962.
Soltanpour, S. and Wu, Q. J. (2019). Weighted ex-
treme sparse classifier and local derivative pattern for
3d face recognition. IEEE Trans. Image Process.,
28(6):3020–3033.
Soltanpour, S. and Wu, Q. M. J. (2017). High-order local
normal derivative pattern (LNDP) for 3d face recogni-
tion. In 2017 IEEE International Conference on Im-
age Processing, ICIP 2017, Beijing, China, September
17-20, 2017, pages 2811–2815. IEEE.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S. E.,
Anguelov, D., Erhan, D., Vanhoucke, V., and Rabi-
novich, A. (2015). Going deeper with convolutions.
In IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2015, Boston, MA, USA, June 7-
12, 2015, pages 1–9. IEEE Computer Society.
Zhang, B., Gao, Y., Zhao, S., and Liu, J. (2010). Local
derivative pattern versus local binary pattern: Face
recognition with high-order local pattern descriptor.
IEEE Trans. Image Process., 19(2):533–544.
Zhong, C., Sun, Z., and Tan, T. (2008). Learning efficient
codes for 3d face recognition. In Proceedings of the
International Conference on Image Processing, ICIP
2008, October 12-15, 2008, San Diego, California,
USA, pages 1928–1931. IEEE.