Multimodal Dance Recognition

Monika Wysoczańska; Tomasz Trzciński

doi:10.5220/0009326005580565

2020

Abstract

Video content analysis is still an emerging technology, and most work in this area extends methods from the still-image domain. Dance videos are especially difficult to analyse and recognise because the performed human actions are highly dynamic. In this work, we introduce a multimodal approach to dance video recognition. The proposed method combines visual and audio information by fusing their representations to improve classification accuracy. For the visual part, we focus on motion representation, as it is the key factor in distinguishing dance styles. For the audio representation, we emphasise capturing long-term dependencies, such as tempo, which is a crucial dance discriminator. Finally, we fuse the two modalities using a late fusion approach. We compare our model with the corresponding unimodal approaches through an exhaustive evaluation on the Let’s Dance dataset. Our method yields significantly better results than either single-modality approach. The results not only demonstrate the strength of integrating complementary sources of information in the recognition task, but also indicate the potential of applying multimodal approaches within specific research areas.
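To make the late fusion idea in the abstract concrete, here is a minimal sketch of one common variant: each modality produces its own class scores, and the resulting probabilities are combined by a weighted average. The abstract does not state the exact fusion rule, so the averaging scheme, the `visual_weight` parameter, the function names, and the toy scores below are all illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Convert raw class scores to probabilities (shifted for stability)."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(visual_scores, audio_scores, visual_weight=0.5):
    """Fuse two unimodal predictions by a weighted average of probabilities.

    Assumption: both score lists cover the same classes in the same order.
    `visual_weight` is a hypothetical tuning parameter in [0, 1].
    Returns the fused probabilities and the index of the predicted class.
    """
    p_visual = softmax(visual_scores)
    p_audio = softmax(audio_scores)
    fused = [
        visual_weight * v + (1.0 - visual_weight) * a
        for v, a in zip(p_visual, p_audio)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Toy scores for one clip over three made-up classes: the visual branch is
# nearly undecided, while the audio branch strongly favours class 1.
visual_scores = [0.2, 0.3, 0.1]
audio_scores = [0.1, 3.0, 0.2]
fused_probs, predicted_class = late_fusion(visual_scores, audio_scores)
```

In this toy case the confident audio branch resolves an ambiguous visual prediction, which is the kind of complementarity the abstract attributes to combining modalities.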

Paper Citation

in Harvard Style

Wysoczańska M. and Trzciński T. (2020). Multimodal Dance Recognition. In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP; ISBN 978-989-758-402-2, SciTePress, pages 558-565. DOI: 10.5220/0009326005580565

in Bibtex Style

@conference{visapp20,
author={Monika Wysoczańska and Tomasz Trzciński},
title={Multimodal Dance Recognition},
booktitle={Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP},
year={2020},
pages={558-565},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009326005580565},
isbn={978-989-758-402-2},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 5: VISAPP
TI - Multimodal Dance Recognition
SN - 978-989-758-402-2
AU - Wysoczańska M.
AU - Trzciński T.
PY - 2020
SP - 558
EP - 565
DO - 10.5220/0009326005580565
PB - SciTePress