Towards Pose-free Tracking of Non-rigid Face using Synthetic Data

Ngoc-Trung Tran; Fakhreddine Ababsa; Maurice Charbit

doi:10.5220/0005179300370044

2015

Abstract

Non-rigid face tracking has seen many advances in recent years, but most empirical experiments are restricted to near-frontal faces. This paper introduces a robust framework for pose-free tracking of non-rigid faces. Our method consists of two phases: training and tracking. In the training phase, a large synthesized database is built offline to train landmark appearance models using linear Support Vector Machines (SVMs). In the tracking phase, a two-step approach is proposed. The first step, initialization, uses 2D SIFT matching between the current frame and a set of adaptive keyframes to estimate the rigid parameters. The second step obtains the whole set of parameters (rigid and non-rigid) using a heuristic method based on pose-wise SVMs. The combination of these aspects makes our method work robustly for rotations of up to 90° about the vertical axis. Moreover, our method appears to be robust even in the presence of fast movements and tracking losses. Compared with other published algorithms, our method offers a very good compromise between rigid and non-rigid parameter accuracy. The results are promising in terms of both pose estimation (average error below 4° on the BUFT dataset) and landmark tracking precision (5.8 pixel error, compared with 6.8 for one state-of-the-art method, on the Talking Face video). These results highlight the potential of using synthetic data to track non-rigid faces in unconstrained poses.
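As a rough illustration of the training phase and of pose-wise model selection, the following Python sketch trains one linear SVM per yaw bin on toy data and picks the model whose bin is nearest to an estimated yaw. Everything here is an assumption made for illustration: the function names, the yaw bins, the 2-D toy descriptors standing in for features of rendered patches, and the Pegasos-style solver (the abstract only states that linear SVMs are used, not how they are trained or indexed by pose).

```python
import random


def train_linear_svm(samples, labels, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss.

    A constant 1.0 is appended to every sample, so the bias is learned as
    (and regularized like) one extra weight. Returns (weights, bias).
    """
    rng = random.Random(seed)
    data = [list(x) + [1.0] for x in samples]
    w = [0.0] * len(data[0])
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            x, y = data[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            shrink = 1.0 - 1.0 / t  # equals 1 - eta * lam with eta = 1 / (lam * t)
            w = [shrink * wj for wj in w]
            if margin < 1.0:  # sample violates the margin: take a hinge step
                eta = 1.0 / (lam * t)
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w[:-1], w[-1]


def synth_descriptors(n, seed):
    """Toy stand-in for descriptors of synthesized patches (hypothetical data)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        # "on-landmark" patch descriptor
        samples.append([rng.uniform(1.0, 3.0), rng.uniform(1.0, 3.0)])
        labels.append(1)
        # "background" patch descriptor
        samples.append([rng.uniform(-3.0, -1.0), rng.uniform(-3.0, -1.0)])
        labels.append(-1)
    return samples, labels


def train_pose_bank(yaw_bins):
    """Train one linear SVM per yaw bin (the bins are an assumed discretization)."""
    bank = {}
    for k, yaw in enumerate(yaw_bins):
        samples, labels = synth_descriptors(20, seed=k)
        bank[yaw] = train_linear_svm(samples, labels, seed=k)
    return bank


def nearest_pose_bin(yaw_deg, yaw_bins):
    """Pick the bin center closest to the estimated yaw, in degrees."""
    return min(yaw_bins, key=lambda c: abs(c - yaw_deg))


def svm_score(model, x):
    """Signed decision value w . x + b of a trained linear SVM."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


bank = train_pose_bank([-90, -45, 0, 45, 90])
model = bank[nearest_pose_bin(30.0, list(bank))]
print(svm_score(model, [2.0, 2.0]) > 0.0)
```

In a real tracker the yaw passed to `nearest_pose_bin` would come from the rigid parameters estimated in the initialization step, and the descriptors would be computed from image patches around each landmark rather than drawn at random.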

Paper Citation

in Harvard Style

Tran N., Ababsa F. and Charbit M. (2015). Towards Pose-free Tracking of Non-rigid Face using Synthetic Data. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-758-077-2, pages 37-44. DOI: 10.5220/0005179300370044

in Bibtex Style

@conference{icpram15,
author={Ngoc-Trung Tran and Fakhreddine Ababsa and Maurice Charbit},
title={Towards Pose-free Tracking of Non-rigid Face using Synthetic Data},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2015},
pages={37-44},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005179300370044},
isbn={978-989-758-077-2},
}

in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - Towards Pose-free Tracking of Non-rigid Face using Synthetic Data
SN - 978-989-758-077-2
AU - Tran N.
AU - Ababsa F.
AU - Charbit M.
PY - 2015
SP - 37
EP - 44
DO - 10.5220/0005179300370044