Pre-Trained Multi-Modal Transformer for Pet Emotion Detection

Run Guo

2022

Abstract

With the rapid development of artificial intelligence and deep learning, emotion recognition and detection based on visual and language information have become possible. These methods help human-computer interaction systems better understand human emotions and intentions and respond accordingly. At the same time, a growing number of families with pets need to pay attention to their pets' emotions so that they can adjust and manage the pets' behaviour in time, and deep learning models have also been used to classify pets' facial expressions. This paper proposes a pre-trained multi-modal transformer emotion detection system. The system is first pre-trained on a human emotion detection dataset that includes speech and facial expression data, and then takes labelled animal voice and expression data as small-sample task data. The approach uses an unlabelled corpus for pre-training, which meets the requirements of adequately training the model parameters and preventing overfitting, and finally uses the learned representations for few-shot tasks. Experimental results show that the proposed multi-modal transformer emotion detection system achieves good classification results on video datasets containing both sound and visual information.


Paper Citation


in Harvard Style

Guo R. (2022). Pre-Trained Multi-Modal Transformer for Pet Emotion Detection. In Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC; ISBN 978-989-758-622-4, SciTePress, pages 574-579. DOI: 10.5220/0011961500003612


in Bibtex Style

@conference{isaic22,
author={Run Guo},
title={Pre-Trained Multi-Modal Transformer for Pet Emotion Detection},
booktitle={Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC},
year={2022},
pages={574-579},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011961500003612},
isbn={978-989-758-622-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC
TI - Pre-Trained Multi-Modal Transformer for Pet Emotion Detection
SN - 978-989-758-622-4
AU - Guo R.
PY - 2022
SP - 574
EP - 579
DO - 10.5220/0011961500003612
PB - SciTePress