A Priori Data and A Posteriori Decision Fusions for Human Action Recognition

Julien Cumin, Grégoire Lefebvre

Abstract

In this paper, we tackle the challenge of human action recognition using multiple data sources by mixing a priori data fusion and a posteriori decision fusion. Our strategy combines three main classifiers (Dynamic Time Warping, Multi-Layer Perceptron and Siamese Neural Network) with several decision fusion methods (Voting, Stacking, Dempster-Shafer Theory and Possibility Theory). On two databases, MHAD (Ofli et al., 2013) and ChAirGest (Ruffieux et al., 2013), it outperforms state-of-the-art results, with best average correct classification rates of 99.85% ± 0.53 and 96.40% ± 3.37 respectively under a leave-one-subject-out protocol.
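As a rough illustration of two ideas named in the abstract, the sketch below shows a simple majority vote over the labels returned by several classifiers, and a generator of leave-one-subject-out splits. It is only a minimal sketch under stated assumptions: the function names, the tie-breaking rule (the earliest-listed classifier wins) and the example labels are hypothetical and are not taken from the paper, whose exact voting scheme is not described on this page.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier labels by majority vote.

    Assumption: ties are broken in favour of the label predicted by the
    earliest classifier in the list (not specified by the paper).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def leave_one_subject_out(subjects):
    """Yield (training_subjects, held_out_subject) pairs, one per subject."""
    unique = sorted(set(subjects))
    for held_out in unique:
        train = [s for s in unique if s != held_out]
        yield train, held_out


if __name__ == "__main__":
    # Hypothetical labels from three classifiers for one test sample.
    print(majority_vote(["jump", "wave", "jump"]))
    # Hypothetical subject identifiers attached to recorded samples.
    for train, test in leave_one_subject_out([1, 2, 3, 1]):
        print(train, test)
```

In a leave-one-subject-out evaluation, each subject's recordings serve once as the test set while the classifiers are trained on all remaining subjects, and the reported score is the average over these rounds.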

References

  1. Akl, A. and Valaee, S. (2010). Accelerometer-based gesture recognition via dynamic-time warping, affinity propagation, & compressive sensing. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
  2. Berlemont, S., Lefebvre, G., Duffner, S., and Garcia, C. (2015). Siamese neural network based similarity metric for inertial gesture classification and rejection. IEEE International Conference on Automatic Face and Gesture Recognition.
  3. Cao, C., Zhang, Y., and Lu, H. (2015). Multi-modal learning for gesture recognition. In Multimedia and Expo (ICME), 2015 IEEE International Conference on, pages 1-6.
  4. Chen, C., Jafari, R., and Kehtarnavaz, N. (2015). Improving human action recognition using fusion of depth camera and inertial sensors. IEEE Transactions on Human-Machine Systems, 45(1):51-61.
  5. Cho, S.-J., Choi, E., Bang, W.-C., Yang, J., Sohn, J., Kim, D. Y., Lee, Y.-B., and Kim, S. (2006). Two-stage Recognition of Raw Acceleration Signals for 3-D Gesture-Understanding Cell Phones. In Lorette, G., editor, Tenth International Workshop on Frontiers in Handwriting Recognition.
  6. Fauvel, M., Chanussot, J., and Benediktsson, J. A. (2007). Decision fusion for hyperspectral classification. John Wiley & Sons, New York, NY, USA.
  7. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA data mining software: An update. In SIGKDD Explorations, volume 11.
  8. IBM, Zikopoulos, P., and Eaton, C. (2011). Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill Osborne Media.
  9. Kumar, A. and Raj, B. (2015). Unsupervised fusion weight learning in multiple classifier systems. CoRR arXiv.
  10. Lefebvre, G., Berlemont, S., Mamalet, F., and Garcia, C. (2015). Inertial gesture recognition with BLSTM-RNN. In Artificial Neural Networks, volume 4 of Springer Series in Bio-/Neuro-informatics, pages 393-410. Springer International Publishing.
  11. Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., and Bajcsy, R. (2013). Berkeley MHAD: A comprehensive Multimodal Human Action Database. In IEEE Workshop on Applications of Computer Vision, pages 53-60.
  12. Pylvänäinen, T. (2005). Accelerometer Based Gesture Recognition Using Continuous HMMs. In Pattern Recognition and Image Analysis, volume 3522 of Lecture Notes in Computer Science, chapter 77, pages 413-430. Berlin, Heidelberg.
  13. Ruffieux, S., Lalanne, D., and Mugellini, E. (2013). ChAirGest: A Challenge for Multimodal Mid-air Gesture Recognition for Close HCI. In Proceedings of the 15th ACM International Conference on Multimodal Interaction, ICMI '13, pages 483-488.
  14. Smets, P. (1990). The combination of evidence in the transferable belief model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(5):447-458.
  15. Vemulapalli, R., Arrate, F., and Chellappa, R. (2014). Human action recognition by representing 3d skeletons as points in a Lie group. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  16. Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2):241-259.
  17. Wu, J., Pan, G., Zhang, D., Qi, G., and Li, S. (2009). Gesture recognition with a 3-d accelerometer. In Ubiquitous Intelligence and Computing, volume 5585 of Lecture Notes in Computer Science, pages 25-38. Springer Berlin Heidelberg.
  18. Xia, L., Chen, C.-C., and Aggarwal, J. (2012). View invariant human action recognition using histograms of 3d joints. In IEEE Computer Vision and Pattern Recognition Workshops (CVPRW), pages 20-27.
  19. Yin, Y. and Davis, R. (2013). Gesture spotting and recognition using salience detection and concatenated hidden Markov models. In Proceedings of the 15th ACM on International Conference on Multimodal Interaction, ICMI '13, pages 489-494.
  20. Zhou, F. and De la Torre Frade, F. (2012). Generalized time warping for multi-modal alignment of human motion. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Paper Citation


in Harvard Style

Cumin J. and Lefebvre G. (2016). A Priori Data and A Posteriori Decision Fusions for Human Action Recognition. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 493-500. DOI: 10.5220/0005680204930500


in Bibtex Style

@conference{visapp16,
author={Julien Cumin and Grégoire Lefebvre},
title={A Priori Data and A Posteriori Decision Fusions for Human Action Recognition},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={493-500},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005680204930500},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - A Priori Data and A Posteriori Decision Fusions for Human Action Recognition
SN - 978-989-758-175-5
AU - Cumin J.
AU - Lefebvre G.
PY - 2016
SP - 493
EP - 500
DO - 10.5220/0005680204930500