Robust Face Identification with Small Sample Sizes using Bag of Words and Histogram of Oriented Gradients

Mahir Faik Karaaba, Olarik Surinta, L. R. B. Schomaker, Marco A. Wiering

Abstract

Face identification under small sample conditions is an active research area. When only a few reference samples are available, the central challenge in building a robust face identification algorithm is to exploit the training data well enough to obtain a model with a low generalization error. In this paper we propose combining the histogram of oriented gradients (HOG) with the bag of words (BOW) approach for robust face identification from few training examples. In this HOG-BOW method, many sub-images are first randomly cropped from every image and passed to the HOG feature extractor, yielding many different feature vectors. These feature vectors are then given to a K-means clustering algorithm, whose centroids serve as a codebook. The codebook is applied with a sliding window to compute feature vectors for all training and test images. Finally, the feature vectors are fed into an L2 support vector machine to learn a linear model that classifies the test images. To assess the effectiveness of our method, we also experimented with two other feature extraction algorithms: HOG and the scale invariant feature transform (SIFT). All methods are compared on two well-known face image datasets with one to three training examples per person. The experimental results show that the HOG-BOW algorithm clearly outperforms the other methods.
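
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single-cell HOG descriptor, the hard-assignment histogram, the plain Lloyd-style K-means, and all names and parameter values (`hog_descriptor`, `random_patches`, `kmeans`, `bow_feature`, patch size 8, stride 4, 16 centroids) are assumptions chosen for illustration, and random arrays stand in for face images.

```python
import numpy as np

def hog_descriptor(patch, n_bins=9):
    """Simplified single-cell HOG (assumption): magnitude-weighted histogram
    of unsigned gradient orientations, L2-normalised."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (np.linalg.norm(hist) + 1e-8)

def random_patches(image, n_patches, size, rng):
    """Randomly crop square sub-images from one image."""
    h, w = image.shape
    ys = rng.integers(0, h - size + 1, n_patches)
    xs = rng.integers(0, w - size + 1, n_patches)
    return [image[y:y + size, x:x + size] for y, x in zip(ys, xs)]

def kmeans(X, k, rng, n_iter=20):
    """Plain Lloyd iterations; the returned centroids act as the codebook."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters unchanged
                centroids[j] = X[labels == j].mean(0)
    return centroids

def bow_feature(image, centroids, size, stride):
    """Slide a window over the image, assign each window's HOG descriptor to
    its nearest centroid, and return the normalised histogram of assignments
    (hard assignment is an assumption of this sketch)."""
    h, w = image.shape
    descs = np.array([hog_descriptor(image[y:y + size, x:x + size])
                      for y in range(0, h - size + 1, stride)
                      for x in range(0, w - size + 1, stride)])
    dists = ((descs[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    counts = np.bincount(dists.argmin(1), minlength=len(centroids)).astype(float)
    return counts / counts.sum()

rng = np.random.default_rng(0)
train_images = [rng.random((32, 32)) for _ in range(4)]  # stand-ins for faces
patches = [p for img in train_images for p in random_patches(img, 50, 8, rng)]
descriptors = np.array([hog_descriptor(p) for p in patches])
codebook = kmeans(descriptors, k=16, rng=rng)
features = np.array([bow_feature(img, codebook, size=8, stride=4)
                     for img in train_images])
print(features.shape)
```

Each row of `features` would then be passed to the linear classifier. One possible choice for that last step, again an assumption rather than the authors' setup, is scikit-learn's `LinearSVC` with `loss="squared_hinge"`.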



Paper Citation


in Harvard Style

Karaaba M., Surinta O., Schomaker L. and Wiering M. (2016). Robust Face Identification with Small Sample Sizes using Bag of Words and Histogram of Oriented Gradients. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016) ISBN 978-989-758-175-5, pages 582-589. DOI: 10.5220/0005722305820589


in Bibtex Style

@conference{visapp16,
author={Mahir Faik Karaaba and Olarik Surinta and L. R. B. Schomaker and Marco A. Wiering},
title={Robust Face Identification with Small Sample Sizes using Bag of Words and Histogram of Oriented Gradients},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={582-589},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005722305820589},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - Robust Face Identification with Small Sample Sizes using Bag of Words and Histogram of Oriented Gradients
SN - 978-989-758-175-5
AU - Karaaba M.
AU - Surinta O.
AU - Schomaker L.
AU - Wiering M.
PY - 2016
SP - 582
EP - 589
DO - 10.5220/0005722305820589