Video Action Classification through Graph Convolutional Networks

Felipe Costa, Priscila Saito, Pedro Bugatti

Abstract

Video classification methods have been evolving through proposals based on end-to-end deep learning architectures. Several works have shown that end-to-end models learn intrinsic video features effectively, especially when compared to handcrafted ones. In general, convolutional neural networks are used for deep learning on videos. However, when applied in such contexts, these vanilla networks usually cannot identify variations based on temporal information. To capture them, memory-based cells (e.g., long short-term memory) or optical flow techniques are used in conjunction with the convolutional process. Despite their effectiveness, those methods neglect global analysis, processing only a small number of frames in each batch during learning and inference. Moreover, they completely ignore the semantic relationships between different videos that belong to the same context. The present work aims to fill these gaps by using information grouping concepts and contextual detection through graph-based convolutional neural networks. The experiments show that our method achieves up to 87% accuracy on a well-known public video dataset.


Paper Citation


in Harvard Style

Costa F., Saito P. and Bugatti P. (2021). Video Action Classification through Graph Convolutional Networks. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, ISBN 978-989-758-488-6, pages 490-497. DOI: 10.5220/0010321304900497


in Bibtex Style

@conference{visapp21,
author={Felipe Costa and Priscila Saito and Pedro Bugatti},
title={Video Action Classification through Graph Convolutional Networks},
booktitle={Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP},
year={2021},
pages={490-497},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010321304900497},
isbn={978-989-758-488-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP
TI - Video Action Classification through Graph Convolutional Networks
SN - 978-989-758-488-6
AU - Costa F.
AU - Saito P.
AU - Bugatti P.
PY - 2021
SP - 490
EP - 497
DO - 10.5220/0010321304900497