Authors:
Carter Ung, Pranav Mantini, and Shishir Shah
Affiliation:
Department of Computer Science, University of Houston, Houston, TX, U.S.A.
Keyword(s):
Face Recognition, Computer Vision.
Abstract:
In unconstrained environments, extreme pose variations of the face are a long-standing challenge for person identification systems. The natural occlusion of key facial landmarks at such poses is a notable cause of performance degradation in face recognition models. Pose-invariant models are data-hungry and require training data with large pose variations to achieve comparable accuracy when recognizing faces from extreme viewpoints. However, data collection is expensive and time-consuming, resulting in a scarcity of facial datasets with large pose variations for model training. In this study, we propose a training framework to enhance pose-invariant face recognition by identifying the minimum number of poses needed to train deep convolutional neural network (CNN) models, enabling higher accuracy at minimum cost for training data. We deploy ArcFace, a state-of-the-art recognition model, as a baseline to evaluate model performance in a probe-gallery matching task across groups of facial poses categorized by pitch and yaw Euler angles. We train and evaluate ArcFace on varying pose bins and measure rank-1 accuracy to observe how recognition accuracy is affected. Our findings reveal that: (i) training on a group of poses at -45°, 0°, and 45° yaw angles achieves uniform rank-1 accuracy across all yaw poses, (ii) recognition performance is better with negative pitch angles than with positive pitch angles, and (iii) training with image augmentations such as horizontal flips results in similar or better performance, further reducing the required yaw poses to a frontal and a 3/4 view.
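As an illustration of the evaluation described in the abstract, the following is a minimal sketch of rank-1 accuracy in a probe-gallery matching task, together with a helper that groups faces into yaw bins. It is not the authors' implementation: the use of cosine similarity, the 45° bin width, and the names `rank1_accuracy` and `yaw_bin` are assumptions made for this example.

```python
import numpy as np


def rank1_accuracy(probe_emb, probe_ids, gallery_emb, gallery_ids):
    """Fraction of probes whose most similar gallery embedding has the same identity."""
    # L2-normalize so that the dot product equals cosine similarity (assumed metric).
    p = probe_emb / np.linalg.norm(probe_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    # Index of the best-matching gallery entry for each probe.
    nearest = np.argmax(p @ g.T, axis=1)
    return float(np.mean(gallery_ids[nearest] == probe_ids))


def yaw_bin(yaw_deg, width=45.0):
    """Snap a yaw angle in degrees to the nearest bin center (hypothetical 45-degree bins)."""
    return int(np.round(yaw_deg / width) * width)


# Toy data: two gallery identities and three probes.
gallery_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
gallery_ids = np.array([0, 1])
probe_emb = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
probe_ids = np.array([0, 1, 1])

acc = rank1_accuracy(probe_emb, probe_ids, gallery_emb, gallery_ids)
```

In practice the embeddings would come from the trained recognition model, and the accuracy would be computed separately for the probes falling in each pose bin.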