Multimodal Personality Recognition using Cross-attention Transformer and Behaviour Encoding

Tanay Agrawal, Dhruv Agarwal, Michal Balazia, Neelabh Sinha, François Bremond

2022

Abstract

Personality computing and affective computing have recently gained interest in many research areas. Datasets for the task generally contain multiple modalities, such as video, audio, language and bio-signals. In this paper, we propose a flexible model for the task that exploits all available data. The task involves complex relations; to avoid using a large model specifically for video processing, we propose behaviour encoding, which boosts performance with minimal change to the model. Cross-attention using transformers has become popular recently and is utilised here to fuse the different modalities. Since long-term relations may exist, breaking the input into chunks is undesirable, so the proposed model processes the entire input together. Our experiments show the importance of each of the above contributions.
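As a rough illustration of the cross-attention fusion mentioned in the abstract, the sketch below computes scaled dot-product attention in which queries come from one modality and keys and values from another. It is not the authors' implementation: the function names, the toy feature values, and the omission of learned projection matrices and multiple heads are all simplifying assumptions made for this example.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query (modality A) attends
    over all keys/values (modality B). No learned projections here;
    a real transformer layer would apply them first (assumption)."""
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

# Hypothetical toy features: 2 audio frames attend over 3 video frames
audio = [[1.0, 0.0], [0.0, 1.0]]
video = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(audio, video, video)
```

Because every query attends over the full key sequence, the whole input is processed together rather than in chunks, which is the property the abstract emphasises for capturing long-term relations.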


Paper Citation


in Harvard Style

Agrawal T., Agarwal D., Balazia M., Sinha N. and Bremond F. (2022). Multimodal Personality Recognition using Cross-attention Transformer and Behaviour Encoding. In Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, ISBN 978-989-758-555-5, pages 501-508. DOI: 10.5220/0010841400003124


in Bibtex Style

@conference{visapp22,
author={Tanay Agrawal and Dhruv Agarwal and Michal Balazia and Neelabh Sinha and François Bremond},
title={Multimodal Personality Recognition using Cross-attention Transformer and Behaviour Encoding},
booktitle={Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP},
year={2022},
pages={501-508},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010841400003124},
isbn={978-989-758-555-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP
TI - Multimodal Personality Recognition using Cross-attention Transformer and Behaviour Encoding
SN - 978-989-758-555-5
AU - Agrawal T.
AU - Agarwal D.
AU - Balazia M.
AU - Sinha N.
AU - Bremond F.
PY - 2022
SP - 501
EP - 508
DO - 10.5220/0010841400003124