Toward a Real Time View-invariant 3D Action Recognition

Mounir Hammouche, Enjie Ghorbel, Anthony Fleury, Sébastien Ambellouis

2016

Abstract

In this paper, we propose a novel human action recognition method, robust to viewpoint variation, which combines skeleton- and depth-based action recognition approaches. To this end, we first build several base classifiers that independently predict the action performed by a subject. We then propose two efficient combination strategies that take into account skeleton accuracy and human body orientation: the first relies on a fuzzy switcher, while the second combines a fuzzy switcher with aggregation. Moreover, we introduce a new algorithm for estimating human body orientation. For evaluation, we created a new public multi-view 3D action dataset recorded from three viewpoint angles (30°, 0°, -30°). The experimental results show that an efficient combination of base classifiers improves both the accuracy and the computational efficiency of human action recognition.
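To make the combination idea more concrete, the sketch below illustrates, in Python, how a fuzzy switcher could route the decision between a skeleton-based and a depth-based classifier according to skeleton reliability and estimated body orientation. It is only a minimal sketch of the general idea: the membership function, the confidence measure, the thresholds, and the names (triangular, fuzzy_switch) are illustrative assumptions, not the authors' implementation or parameters.

# Minimal sketch of a fuzzy switcher between a skeleton-based and a
# depth-based base classifier, driven by skeleton reliability and the
# estimated body orientation. All names and thresholds are illustrative
# placeholders, not the method described in the paper.
import numpy as np


def triangular(x, a, b, c):
    # Triangular membership function on [a, c], peaking at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzy_switch(skeleton_scores, depth_scores, skeleton_confidence, orientation_deg):
    # skeleton_scores, depth_scores : per-class scores of the two base classifiers.
    # skeleton_confidence           : value in [0, 1] describing skeleton quality.
    # orientation_deg               : estimated body orientation (0 = frontal view).

    # Membership of the pose in a "frontal" fuzzy set: frontal views favour
    # the skeleton-based classifier, oblique views the depth-based one.
    frontal = triangular(orientation_deg, -45.0, 0.0, 45.0)

    # The weight of the skeleton branch grows with both skeleton confidence
    # and frontal membership; the depth branch takes the complement.
    w_skeleton = skeleton_confidence * frontal
    w_depth = 1.0 - w_skeleton

    combined = w_skeleton * np.asarray(skeleton_scores) + w_depth * np.asarray(depth_scores)
    return int(np.argmax(combined))


# Toy usage: three action classes, oblique view (30°) with a noisy skeleton.
skel = [0.2, 0.5, 0.3]
depth = [0.6, 0.3, 0.1]
print(fuzzy_switch(skel, depth, skeleton_confidence=0.4, orientation_deg=30.0))

With a low skeleton confidence and an oblique view, the depth-based scores dominate the combined decision, which is the kind of behaviour the switching strategy is meant to capture.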



Paper Citation


in Harvard Style

Hammouche M., Ghorbel E., Fleury A. and Ambellouis S. (2016). Toward a Real Time View-invariant 3D Action Recognition. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISION4HCI, (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 745-754. DOI: 10.5220/0005843607450754


in Bibtex Style

@conference{vision4hci16,
author={Mounir Hammouche and Enjie Ghorbel and Anthony Fleury and Sébastien Ambellouis},
title={Toward a Real Time View-invariant 3D Action Recognition},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISION4HCI, (VISIGRAPP 2016)},
year={2016},
pages={745-754},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005843607450754},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISION4HCI, (VISIGRAPP 2016)
TI - Toward a Real Time View-invariant 3D Action Recognition
SN - 978-989-758-175-5
AU - Hammouche M.
AU - Ghorbel E.
AU - Fleury A.
AU - Ambellouis S.
PY - 2016
SP - 745
EP - 754
DO - 10.5220/0005843607450754