DETECTOR OF FACIAL LANDMARKS LEARNED BY THE STRUCTURED OUTPUT SVM

Michal Uřičař, Vojtěch Franc, Václav Hlaváč

2012

Abstract

In this paper we describe a detector of facial landmarks based on Deformable Part Models. We treat landmark detection as an instance of the structured output classification problem and propose to learn the parameters of the detector from data using the Structured Output Support Vector Machine algorithm. In contrast to previous work, the objective function of the learning algorithm is directly related to the performance of the resulting detector, which is controlled by a user-defined loss function. The resulting detector runs in real time on a standard PC, is simple to implement, and can be easily modified to detect a different set of landmarks. We evaluate the performance of the proposed landmark detector on the challenging “Labeled Faces in the Wild” (LFW) database. The empirical results demonstrate that the proposed detector is consistently more accurate than two public-domain implementations based on Active Appearance Models and Deformable Part Models. We provide an open-source implementation of the proposed detector and manual annotations of the facial landmarks for all images in the LFW database.
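The idea of tying the learning objective to a user-defined loss can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the tiny candidate grid, the scores, and the brute-force search over configurations are all assumptions made for illustration (a practical part-based detector would typically use dynamic programming over a tree of landmarks). The sketch scores a landmark configuration as a sum of per-landmark appearance terms and pairwise deformation terms, and evaluates the margin-rescaled structured hinge loss commonly used in structured output SVM training.

```python
from itertools import product
import math

def score(unary, pairwise, edges, y):
    """Score of configuration y (one candidate index per landmark):
    appearance terms plus deformation terms on connected landmark pairs."""
    s = sum(unary[i][y[i]] for i in range(len(y)))
    s += sum(pairwise[(i, j)][y[i]][y[j]] for (i, j) in edges)
    return s

def mean_deviation(positions, y, y_true):
    """Hypothetical user-defined loss: mean Euclidean distance between
    predicted and ground-truth landmark positions."""
    total = sum(math.dist(positions[a], positions[b]) for a, b in zip(y, y_true))
    return total / len(y)

def all_configs(n_landmarks, n_candidates):
    """Every assignment of a candidate position to every landmark (brute force)."""
    return product(range(n_candidates), repeat=n_landmarks)

def predict(unary, pairwise, edges, n_candidates):
    """Highest-scoring configuration."""
    return max(all_configs(len(unary), n_candidates),
               key=lambda y: score(unary, pairwise, edges, y))

def structured_hinge(unary, pairwise, edges, positions, y_true):
    """Margin-rescaled structured hinge loss:
    max_y [loss(y, y_true) + score(y)] - score(y_true)."""
    worst = max(
        mean_deviation(positions, y, y_true) + score(unary, pairwise, edges, y)
        for y in all_configs(len(unary), len(positions))
    )
    return worst - score(unary, pairwise, edges, y_true)

# Toy data (assumed): two landmarks, three candidate pixel positions.
positions = [(0, 0), (3, 4), (6, 8)]
unary = [[2.0, 0.5, 0.0],   # appearance scores of landmark 0 at each candidate
         [0.0, 0.5, 2.0]]   # appearance scores of landmark 1 at each candidate
edges = [(0, 1)]
pairwise = {(0, 1): [[0.0] * 3 for _ in range(3)]}  # no deformation cost here
y_true = (0, 2)

y_hat = predict(unary, pairwise, edges, len(positions))
hinge = structured_hinge(unary, pairwise, edges, positions, y_true)
```

In this toy setting the prediction coincides with the ground truth, yet the hinge value is still positive: the surrogate asks the correct configuration to outscore every other one by a margin equal to that configuration's loss, which is how the chosen loss function shapes what the learner optimizes.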



Paper Citation


in Harvard Style

Uřičař M., Franc V. and Hlaváč V. (2012). DETECTOR OF FACIAL LANDMARKS LEARNED BY THE STRUCTURED OUTPUT SVM. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012), ISBN 978-989-8565-03-7, pages 547-556. DOI: 10.5220/0003863705470556


in Bibtex Style

@conference{visapp12,
author={Michal Uřičař and Vojtěch Franc and Václav Hlaváč},
title={DETECTOR OF FACIAL LANDMARKS LEARNED BY THE STRUCTURED OUTPUT SVM},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012)},
year={2012},
pages={547-556},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003863705470556},
isbn={978-989-8565-03-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2012)
TI - DETECTOR OF FACIAL LANDMARKS LEARNED BY THE STRUCTURED OUTPUT SVM
SN - 978-989-8565-03-7
AU - Uřičař M.
AU - Franc V.
AU - Hlaváč V.
PY - 2012
SP - 547
EP - 556
DO - 10.5220/0003863705470556