Semi-automatic Hand Annotation Making Human-human Interaction Analysis Fast and Accurate

Stijn De Beugher, Geert Brône, Toon Goedemé

2016

Abstract

The detection of human hands is of great importance in a variety of domains, including research on human-computer interaction, human-human interaction, sign language and physiotherapy. Within this field of research, one is interested in relevant items in recordings, such as faces, human bodies or hands. However, this annotation is nowadays mainly done manually, which makes the task extremely time consuming. In this paper, we present a semi-automatic alternative to the manual labeling of recordings. Our system automatically searches for hands in images and asks for manual intervention when the confidence of a detection is too low. Most existing approaches rely on complex and computationally intensive models to achieve accurate hand detections, whereas our approach is based on segmentation techniques, smart tracking mechanisms and knowledge of human pose context. This makes our approach substantially faster than existing approaches. In this paper, we apply our semi-automatic hand detection to the annotation of mobile eye-tracker recordings of human-human interaction. Our system makes the analysis of such data tremendously faster (244×) while maintaining an average accuracy of 93.68% on the tested datasets.
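The core idea of the abstract — run the detector automatically and fall back to a human annotator only when the detection confidence drops below a threshold — can be sketched as follows. This is an illustrative sketch, not the authors' implementation; the names (`Detection`, `annotate`, `CONFIDENCE_THRESHOLD`) and the threshold value are assumptions.

```python
# Hypothetical sketch of a semi-automatic annotation loop: automatic
# detections are accepted when confident enough, and only the remaining
# frames are queued for manual labeling.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Detection:
    bbox: Tuple[int, int, int, int]  # x, y, width, height of the hand region
    confidence: float                # detector score in [0, 1]

CONFIDENCE_THRESHOLD = 0.6  # assumed value; in practice this trades speed for accuracy

def annotate(frames, detector, ask_human):
    """Return per-frame annotations plus the number of manual interventions.

    `detector` maps a frame to a Detection (or None on failure);
    `ask_human` requests a manual annotation for one frame.
    """
    annotations = []
    manual_count = 0
    for frame in frames:
        det = detector(frame)
        if det is not None and det.confidence >= CONFIDENCE_THRESHOLD:
            annotations.append(det)               # trust the automatic detection
        else:
            annotations.append(ask_human(frame))  # confidence too low: manual intervention
            manual_count += 1
    return annotations, manual_count
```

The reported 244× speed-up comes from the fact that `ask_human` is only invoked for the small fraction of frames where the automatic detection is unreliable.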

References

  1. Bo, N., Dailey, M. N., and Uyyanonvara, B. (2007). Robust hand tracking in low-resolution video sequences. In Proc. of IASTED, pages 228-233, Anaheim, CA, USA.
  2. De Beugher, S., Brône, G., and Goedemé, T. (2014). Automatic analysis of in-the-wild mobile eye-tracking experiments using object, face and person detection. In Proc. of VISAPP, pages 625-633.
  3. De Beugher, S., Brône, G., and Goedemé, T. (2015). Semi-automatic hand detection - a case study on real life mobile eye-tracker data. In Proc. of VISAPP, pages 121-129.
  4. Dollár, P., Wojek, C., Schiele, B., and Perona, P. (2012). Pedestrian detection: An evaluation of the state of the art. IEEE Transactions on PAMI, 34(4):743-761.
  5. Eichner, M., Marin-Jimenez, M., Zisserman, A., and Ferrari, V. (2012). 2D articulated human pose estimation and retrieval in (almost) unconstrained still images. International Journal of Computer Vision, 99:190-214.
  6. Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on PAMI, 32(9):1627-1645.
  7. Ferrari, V., Marin-Jimenez, M., and Zisserman, A. (2009). Pose search: Retrieving people using their pose. In Proc. of CVPR, pages 1-8.
  8. Kolsch, M. and Turk, M. (2004). Fast 2D hand tracking with flocks of features and multi-cue integration. In Proc. of CVPRW, page 158, Washington, DC, USA. IEEE Computer Society.
  9. Mittal, A., Zisserman, A., and Torr, P. (2011). Hand detection using multiple proposals. In Proc. of BMVC, pages 75.1-75.11. BMVA Press.
  10. Raheja, J., Chaudhary, A., and Singal, K. (2011). Tracking of fingertips and centers of palm using kinect. In Proc. of CIMSiM, pages 248-252.
  11. Rahim, N. A. A., Kit, C. W., and See, J. (2006). RGB-HCbCr skin colour model for human face detection. In Proc. of M2USIC, Petaling Jaya, Malaysia.
  12. Ren, Z., Yuan, J., Meng, J., and Zhang, Z. (2013). Robust part-based hand gesture recognition using kinect sensor. IEEE Transactions on Multimedia, 15(5):1110-1120.
  13. Shan, C., Tan, T., and Wei, Y. (2007). Real-time hand tracking using a mean shift embedded particle filter. Pattern Recognition, 40(7):1958-1970.
  14. Spruyt, V., Ledda, A., and Philips, W. (2013). Real-time, long-term hand tracking with unsupervised initialization. In Proc. of ICIP, pages 3730-3734.
  15. Stiefmeier, T., Ogris, G., Junker, H., Lukowicz, P., and Troster, G. (2006). Combining motion sensors and ultrasonic hands tracking for continuous activity recognition in a maintenance scenario. In Proc. of ISWC, pages 97-104.
  16. Viola, P. and Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Proc. of CVPR, pages 511-518.
  17. Wang, R. Y. and Popovic, J. (2009). Real-time hand-tracking with a color glove. ACM Transactions on Graphics, 28(3).
  18. Yang, Y. and Ramanan, D. (2011). Articulated pose estimation with flexible mixtures-of-parts. In Proc of CVPR, pages 1385-1392. IEEE.


Paper Citation


in Harvard Style

De Beugher S., Brône G. and Goedemé T. (2016). Semi-automatic Hand Annotation Making Human-human Interaction Analysis Fast and Accurate. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016) ISBN 978-989-758-175-5, pages 552-559. DOI: 10.5220/0005718505520559


in Bibtex Style

@conference{visapp16,
author={Stijn De Beugher and Geert Brône and Toon Goedemé},
title={Semi-automatic Hand Annotation Making Human-human Interaction Analysis Fast and Accurate},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={552-559},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005718505520559},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - Semi-automatic Hand Annotation Making Human-human Interaction Analysis Fast and Accurate
SN - 978-989-758-175-5
AU - De Beugher S.
AU - Brône G.
AU - Goedemé T.
PY - 2016
SP - 552
EP - 559
DO - 10.5220/0005718505520559