Mind the Regularized GAP, for Human Action Classification and Semi-supervised Localization based on Visual Saliency

Marc Moreaux, Natalia Lyubova, Isabelle Ferrané, Frédéric Lerasle

2018

Abstract

This work addresses image classification and the localization of human actions from visual data acquired with RGB sensors. Our approach is inspired by the success of deep learning in image classification. In this paper, we describe our method and how the concept of Global Average Pooling (GAP) applies to semi-supervised class localization. We benchmark it against the Class Activation Mapping approach introduced by Zhou et al. (2016), propose a regularization over the GAP maps to enhance the results, and study whether combining these two ideas yields better classification accuracy. The models are trained and tested on the Stanford 40 Actions dataset (Yao et al., 2011), which depicts people performing 40 different actions such as drinking, cooking, or watching TV. Compared to the aforementioned baseline, our model improves classification accuracy by 5.3 percentage points, achieves a localization accuracy of 50.3%, and substantially reduces the computation needed to retrieve the class saliency from the base convolutional model.
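
To make the GAP/CAM idea concrete, the sketch below shows how per-class activation maps can serve both for classification (by Global Average Pooling of each map into a class score) and for localization (by upsampling the winning class map to image resolution). This is a minimal PyTorch illustration, not the authors' architecture: the backbone, layer sizes, and the L1 penalty standing in for the paper's proposed GAP-map regularization are all assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GapCamNet(nn.Module):
    # Toy conv net whose class scores are global averages of per-class
    # activation maps, in the spirit of CAM (Zhou et al., 2016).
    def __init__(self, num_classes=40):
        super().__init__()
        # Illustrative backbone; the paper builds on a deeper conv model.
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # 1x1 conv producing one activation map per class.
        self.class_maps = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, x):
        maps = self.class_maps(self.features(x))  # (B, num_classes, H/4, W/4)
        logits = maps.mean(dim=(2, 3))            # Global Average Pooling
        return logits, maps                       # maps double as saliency

model = GapCamNet()
x = torch.randn(1, 3, 224, 224)
logits, maps = model(x)
cls = logits.argmax(dim=1)                        # predicted class index
# Upsample the winning class map to image size for localization.
saliency = F.interpolate(maps[:, cls], size=x.shape[-2:],
                         mode="bilinear", align_corners=False)
# Illustrative stand-in for the proposed regularization over the GAP maps
# (the paper's exact regularizer is not reproduced here):
l1_reg = maps.abs().mean()

Because each class has its own activation map, the saliency is read directly from a single forward pass rather than reconstructed from fully-connected weights, which is consistent with the computational saving mentioned in the abstract.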


Paper Citation


in Harvard Style

Moreaux M., Lyubova N., Ferrané I. and Lerasle F. (2018). Mind the Regularized GAP, for Human Action Classification and Semi-supervised Localization based on Visual Saliency. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 4: VISAPP; ISBN 978-989-758-290-5, SciTePress, pages 307-314. DOI: 10.5220/0006548303070314


in Bibtex Style

@conference{visapp18,
author={Marc Moreaux and Natalia Lyubova and Isabelle Ferrané and Frédéric Lerasle},
title={Mind the Regularized GAP, for Human Action Classification and Semi-supervised Localization based on Visual Saliency},
booktitle={Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 4: VISAPP},
year={2018},
pages={307-314},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006548303070314},
isbn={978-989-758-290-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 4: VISAPP
TI - Mind the Regularized GAP, for Human Action Classification and Semi-supervised Localization based on Visual Saliency
SN - 978-989-758-290-5
AU - Moreaux M.
AU - Lyubova N.
AU - Ferrané I.
AU - Lerasle F.
PY - 2018
SP - 307
EP - 314
DO - 10.5220/0006548303070314
PB - SciTePress
ER -