Recognition of Human Actions using Edit Distance on Aclet Strings

Luc Brun, Pasquale Foggia, Alessia Saggese, Mario Vento

2015

Abstract

In this paper we propose a novel method for human action recognition based on string edit distance. A two-layer representation is introduced in order to exploit the temporal sequence of events. The first layer is a feature vector obtained from depth images. In the second layer, each action is represented as a sequence of symbols, where each symbol corresponds to an elementary action (aclet) and is assigned according to a dictionary defined during the learning phase. The similarity between two actions is then computed as a string edit distance, which allows the system to deal with actions of different lengths as well as different temporal scales. Experiments were carried out on two widely adopted datasets, MIVIA and MHAD, and the results, compared with state-of-the-art approaches, confirm the effectiveness of the proposed method.
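
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the assumption of one feature vector per frame, the nearest-entry quantization, the unit edit costs (the classic Wagner-Fischer recurrence) and the nearest-neighbour decision rule are all assumptions made here for illustration only.

```python
from typing import List, Sequence, Tuple


def quantize(features: Sequence[Sequence[float]],
             dictionary: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature vector to the index of its nearest dictionary entry.

    Assumption: one feature vector per frame, squared Euclidean distance.
    The returned list of indices plays the role of an aclet string.
    """
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(dictionary)), key=lambda k: sqdist(f, dictionary[k]))
            for f in features]


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Edit distance with unit insertion, deletion and substitution costs.

    Assumption: unit costs; the cost function used in the paper is not
    given in the abstract.
    """
    prev = list(range(len(b) + 1))          # distance from empty prefix of a
    for i, sa in enumerate(a, start=1):
        curr = [i]                          # distance to empty prefix of b
        for j, sb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if sa == sb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def classify(query: Sequence[int],
             training: Sequence[Tuple[Sequence[int], str]]) -> str:
    """Return the label of the training string closest to the query.

    Assumption: a plain nearest-neighbour rule, chosen only for brevity.
    """
    return min(training, key=lambda item: edit_distance(query, item[0]))[1]
```

Because the edit distance is defined for strings of unequal length, two executions of the same action performed at different speeds can still be compared directly, which is the property the abstract highlights.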



Paper Citation


in Harvard Style

Brun L., Foggia P., Saggese A. and Vento M. (2015). Recognition of Human Actions using Edit Distance on Aclet Strings. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015), ISBN 978-989-758-090-1, pages 97-103. DOI: 10.5220/0005304700970103


in Bibtex Style

@conference{visapp15,
author={Luc Brun and Pasquale Foggia and Alessia Saggese and Mario Vento},
title={Recognition of Human Actions using Edit Distance on Aclet Strings},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={97-103},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005304700970103},
isbn={978-989-758-090-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - Recognition of Human Actions using Edit Distance on Aclet Strings
SN - 978-989-758-090-1
AU - Brun L.
AU - Foggia P.
AU - Saggese A.
AU - Vento M.
PY - 2015
SP - 97
EP - 103
DO - 10.5220/0005304700970103