Authors: Luca Ballan (1, 2); Ombretta Strafforello (3, 2) and Klamer Schutte (2)

Affiliations: (1) Department of Math, University of Padova, Italy; (2) Intelligent Imaging, TNO, Oude Waalsdorperweg 63, The Hague, The Netherlands; (3) Delft University of Technology, The Netherlands

Keyword(s): Action Recognition, Region Attention, 3D Convolutional Neural Networks, Video Classification.

Abstract: Long-term activities involve humans performing complex, minutes-long actions. Unlike in traditional action recognition, complex activities are normally composed of a set of sub-actions that can appear in different orders, durations, and quantities. These aspects introduce a large intra-class variability that can be hard to model. Our approach aims to adaptively capture and learn the importance of spatial and temporal video regions for minutes-long activity classification. Inspired by previous work on Region Attention, our architecture embeds the spatio-temporal features from multiple video regions into a compact fixed-length representation. These features are extracted with a specially fine-tuned 3D convolutional backbone. Additionally, driven by the prior assumption that the most discriminative locations in the videos are centered around the human who is carrying out the activity, we introduce an Actor Focus mechanism to enhance the feature extraction in both the training and inference phases. Our experiments show that the Multi-Regional fine-tuned 3D-CNN, topped with Actor Focus and Region Attention, largely improves the performance of baseline 3D architectures, achieving state-of-the-art results on Breakfast, a well-known long-term activity recognition benchmark.
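
The abstract describes embedding features from multiple video regions into a compact fixed-length representation. The sketch below illustrates one generic way such a pooling step can work: softmax-weighted averaging of region feature vectors. It is not the paper's architecture. The function names, the dot-product scoring rule, and the `scoring_vector` parameter are assumptions made for illustration, and in a trained model the scoring weights would be learned rather than fixed by hand.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(region_features, scoring_vector):
    """Pool N region feature vectors (each of length D) into one length-D vector.

    region_features: list of N lists of D floats (hypothetical backbone outputs).
    scoring_vector:  list of D floats (hypothetical attention parameters).
    Returns (pooled, weights): the pooled vector and the per-region weights.
    """
    # One scalar relevance score per region: dot product with the scoring vector.
    scores = [
        sum(f * w for f, w in zip(features, scoring_vector))
        for features in region_features
    ]
    weights = softmax(scores)

    # Weighted sum over regions; the output length is D regardless of N.
    dim = len(scoring_vector)
    pooled = [
        sum(weight * features[d] for weight, features in zip(weights, region_features))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy regions, D = 2
    pooled, weights = attention_pool(regions, [2.0, 0.0])
    print("weights:", weights)
    print("pooled:", pooled)
```

Because the weights sum to 1 and the sum runs over regions, the pooled vector has the same length whether a video contributes three regions or three hundred, which is the fixed-length property the abstract refers to.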

CC BY-NC-ND 4.0

Paper citation in several formats:
Ballan, L.; Strafforello, O. and Schutte, K. (2021). Long-term Behaviour Recognition in Videos with Actor-focused Region Attention. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP; ISBN 978-989-758-488-6; ISSN 2184-4321, SciTePress, pages 362-369. DOI: 10.5220/0010215803620369

@conference{visapp21,
author={Luca Ballan and Ombretta Strafforello and Klamer Schutte},
title={Long-term Behaviour Recognition in Videos with Actor-focused Region Attention},
booktitle={Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP},
year={2021},
pages={362-369},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010215803620369},
isbn={978-989-758-488-6},
issn={2184-4321},
}

TY - CONF
JO - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP
TI - Long-term Behaviour Recognition in Videos with Actor-focused Region Attention
SN - 978-989-758-488-6
IS - 2184-4321
AU - Ballan, L.
AU - Strafforello, O.
AU - Schutte, K.
PY - 2021
SP - 362
EP - 369
DO - 10.5220/0010215803620369
PB - SciTePress