TWIN-GRU: Twin Stream GRU Network for Action Recognition from RGB Video

Hajer Essefi, Olfa Ben Ahmed, Christel Bidet-Ildei, Yannick Blandin, Christine Fernandez-Maloigne

Abstract

Human Action Recognition (HAR) is an important task for numerous computer vision applications. Recently, deep learning approaches have shown proficiency in recognizing actions in RGB video. However, existing models rely mainly on global appearance and may underperform in real-world applications, such as sports events and clinical applications. Referring to domain knowledge of how humans perceive action, we hypothesize that observing the dynamics of a 2D human body joint representation extracted from RGB video frames is sufficient to recognize an action in video. Moreover, body joints contain structural information with strong spatial (intra-frame) and temporal (inter-frame) correlations between adjacent joints. In this paper, we propose a psychology-inspired twin-stream Gated Recurrent Unit network for action recognition based on the dynamics of 2D human body joints in RGB videos. The proposed model achieves a classification accuracy of 89.97% in a subject-specific experiment and outperforms the baseline method that fuses depth and inertial sensor data on the UTD-MHAD dataset. The proposed framework is more cost-effective than depth-based 3D skeleton solutions while remaining highly competitive with them, and can therefore be used outside motion capture labs for real-world applications.
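
To make the idea of a twin-stream GRU over 2D joints concrete, the following is a minimal NumPy sketch of a forward pass, not the architecture from the paper. The abstract does not say what each stream receives, so the split used here (one stream on joint positions, one on frame-to-frame joint displacements) is an assumption, as are the names `GRUStream` and `twin_gru_predict`, the 18-joint layout, the hidden size, late fusion by concatenation, and the untrained random weights. The 27 output classes match the number of actions in UTD-MHAD.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUStream:
    """Single-layer GRU that summarizes a (T, input_dim) sequence
    into its final hidden state. Hypothetical helper for illustration."""

    def __init__(self, input_dim, hidden_dim, rng):
        scale = 1.0 / np.sqrt(hidden_dim)
        # Index 0: update gate z, 1: reset gate r, 2: candidate state n.
        self.W = rng.uniform(-scale, scale, (3, input_dim, hidden_dim))
        self.U = rng.uniform(-scale, scale, (3, hidden_dim, hidden_dim))
        self.b = np.zeros((3, hidden_dim))
        self.hidden_dim = hidden_dim

    def forward(self, seq):
        h = np.zeros(self.hidden_dim)
        for x in seq:
            z = sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])
            r = sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])
            n = np.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])
            # Blend the previous state with the candidate state.
            h = (1.0 - z) * n + z * h
        return h


def twin_gru_predict(joints, stream_pos, stream_mot, W_out, b_out):
    """joints: (T, J, 2) array of 2D joint coordinates per frame.
    Returns a probability vector over action classes."""
    T = joints.shape[0]
    positions = joints.reshape(T, -1)      # (T, 2J): per-frame joint layout
    motion = np.diff(positions, axis=0)    # (T-1, 2J): assumed second-stream input
    # Late fusion: concatenate the two stream summaries (an assumption).
    fused = np.concatenate([stream_pos.forward(positions),
                            stream_mot.forward(motion)])
    logits = fused @ W_out + b_out
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()


# Hypothetical sizes: 30 frames, 18 joints, 32 hidden units, 27 classes.
rng = np.random.default_rng(0)
T, J, H, C = 30, 18, 32, 27
stream_pos = GRUStream(2 * J, H, rng)
stream_mot = GRUStream(2 * J, H, rng)
W_out = rng.normal(0.0, 0.1, (2 * H, C))
b_out = np.zeros(C)

joints = rng.random((T, J, 2))             # stand-in for pose-estimator output
probs = twin_gru_predict(joints, stream_pos, stream_mot, W_out, b_out)
print(probs.shape, probs.argmax())
```

In practice the joint coordinates would come from a 2D pose estimator applied to each RGB frame, and the weights would be learned; the sketch only shows how two recurrent streams over the same skeleton sequence can be combined into one prediction.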


Paper Citation


in Harvard Style

Essefi H., Ben Ahmed O., Bidet-Ildei C., Blandin Y. and Fernandez-Maloigne C. (2021). TWIN-GRU: Twin Stream GRU Network for Action Recognition from RGB Video. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 351-359. DOI: 10.5220/0010324703510359


in Bibtex Style

@conference{icaart21,
author={Hajer Essefi and Olfa Ben Ahmed and Christel Bidet-Ildei and Yannick Blandin and Christine Fernandez-Maloigne},
title={TWIN-GRU: Twin Stream GRU Network for Action Recognition from RGB Video},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={351-359},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010324703510359},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - TWIN-GRU: Twin Stream GRU Network for Action Recognition from RGB Video
SN - 978-989-758-484-8
AU - Essefi H.
AU - Ben Ahmed O.
AU - Bidet-Ildei C.
AU - Blandin Y.
AU - Fernandez-Maloigne C.
PY - 2021
SP - 351
EP - 359
DO - 10.5220/0010324703510359