
label classification. Furthermore, contrastive learning
methods could be extended by selecting hard negatives
according to the relationships between emotion classes.
Lastly, multimodal data could be incorporated to further
enhance the model's performance.
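One way to make the hard-negative idea concrete is a supervised contrastive loss (in the spirit of Khosla et al., 2020) in which negative pairs drawn from related emotions are up-weighted. The sketch below is illustrative only: the emotion-similarity weights, the function name, and the NumPy formulation are assumptions for exposition, not part of this paper's method.

```python
import numpy as np

def emotion_weighted_supcon(features, labels, emo_sim, temperature=0.1):
    """Supervised contrastive loss in which negatives from related
    emotions (high emo_sim entries) count as harder negatives.

    features: (N, D) utterance embeddings
    labels:   (N,) integer emotion labels
    emo_sim:  (C, C) assumed emotion-similarity matrix in [0, 1]
    """
    # L2-normalize embeddings and compute temperature-scaled similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Positive mask: same emotion label, excluding self-pairs.
    pos = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(pos, 0.0)

    # Weight each negative pair by how related the two emotions are,
    # so confusable emotions act as hard negatives.
    neg_w = emo_sim[np.ix_(labels, labels)] * (1.0 - pos)
    np.fill_diagonal(neg_w, 0.0)

    # Numerically stable log-softmax over positives + weighted negatives.
    logits = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    denom = (exp * (pos + neg_w)).sum(axis=1)
    log_prob = logits - np.log(denom + 1e-12)

    # Average log-probability of positives per anchor, then over the batch.
    pos_count = np.maximum(pos.sum(axis=1), 1.0)
    loss = -(pos * log_prob).sum(axis=1) / pos_count
    return loss.mean()
```

Because related-emotion negatives enlarge the denominator, increasing their weights raises the loss, which pushes the encoder to separate confusable emotions more strongly.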
REFERENCES
Bucila, C., Caruana, R., and Niculescu-Mizil, A. (2006).
Model compression. In Proceedings of the 12th
ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining, KDD ’06, page
535–541, New York, NY, USA. Association for Com-
puting Machinery.
Busso, C., Bulut, M., Lee, C.-C., Kazemzadeh, A., Mower,
E., Kim, S., Chang, J. N., Lee, S., and Narayanan,
S. S. (2008). IEMOCAP: interactive emotional dyadic
motion capture database. Language Resources and
Evaluation, 42(4):335–359.
Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020).
A simple framework for contrastive learning of visual
representations.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.
(2019). BERT: Pre-training of deep bidirectional trans-
formers for language understanding.
Gao, T., Yao, X., and Chen, D. (2021). SimCSE: Sim-
ple contrastive learning of sentence embeddings. In
Moens, M.-F., Huang, X., Specia, L., and Yih, S.
W.-t., editors, Proceedings of the 2021 Conference on
Empirical Methods in Natural Language Processing,
pages 6894–6910, Online and Punta Cana, Dominican
Republic. Association for Computational Linguistics.
Ghosal, D., Majumder, N., Poria, S., Chhaya, N., and Gel-
bukh, A. (2019). DialogueGCN: A graph convolutional
neural network for emotion recognition in conversa-
tion.
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the
knowledge in a neural network.
Hu, D., Wei, L., and Huai, X. (2021). DialogueCRN: Con-
textual reasoning networks for emotion recognition in
conversations. In Zong, C., Xia, F., Li, W., and Nav-
igli, R., editors, Proceedings of the 59th Annual Meet-
ing of the Association for Computational Linguistics
and the 11th International Joint Conference on Nat-
ural Language Processing (Volume 1: Long Papers),
pages 7042–7052, Online. Association for Computa-
tional Linguistics.
Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian,
Y., Isola, P., Maschinot, A., Liu, C., and Krish-
nan, D. (2020). Supervised contrastive learning. In
Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.,
and Lin, H., editors, Advances in Neural Information
Processing Systems, volume 33, pages 18661–18673.
Curran Associates, Inc.
Lee, J. and Lee, W. (2022). CoMPM: Context model-
ing with speaker’s pre-trained memory tracking for
emotion recognition in conversation. In Carpuat, M.,
de Marneffe, M.-C., and Meza Ruiz, I. V., editors,
Proceedings of the 2022 Conference of the North
American Chapter of the Association for Computa-
tional Linguistics: Human Language Technologies,
pages 5669–5679, Seattle, United States. Association
for Computational Linguistics.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D.,
Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov,
V. (2019). RoBERTa: A robustly optimized BERT pre-
training approach. CoRR, abs/1907.11692.
Majumder, N., Poria, S., Hazarika, D., Mihalcea, R., Gel-
bukh, A., and Cambria, E. (2019). DialogueRNN: An
attentive RNN for emotion detection in conversations.
In Proceedings of the Thirty-Third AAAI Conference
on Artificial Intelligence and Thirty-First Innovative
Applications of Artificial Intelligence Conference and
Ninth AAAI Symposium on Educational Advances in
Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19.
AAAI Press.
Poria, S., Hazarika, D., Majumder, N., Naik, G., Cambria,
E., and Mihalcea, R. (2019). MELD: A multimodal
multi-party dataset for emotion recognition in conver-
sations. In Korhonen, A., Traum, D., and Màrquez,
L., editors, Proceedings of the 57th Annual Meeting of
the Association for Computational Linguistics, pages
527–536, Florence, Italy. Association for Computa-
tional Linguistics.
Qu, C., Yang, L., Qiu, M., Croft, W. B., Zhang, Y., and
Iyyer, M. (2019). Bert with history answer embedding
for conversational question answering. In Proceedings
of the 42nd International ACM SIGIR Conference on
Research and Development in Information Retrieval,
SIGIR’19, page 1133–1136, New York, NY, USA.
Association for Computing Machinery.
Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2020).
DistilBERT, a distilled version of BERT: smaller,
faster, cheaper and lighter.
Shen, W., Wu, S., Yang, Y., and Quan, X. (2021). Di-
rected acyclic graph network for conversational emo-
tion recognition. In Zong, C., Xia, F., Li, W., and Nav-
igli, R., editors, Proceedings of the 59th Annual Meet-
ing of the Association for Computational Linguistics
and the 11th International Joint Conference on Nat-
ural Language Processing (Volume 1: Long Papers),
pages 1551–1560, Online. Association for Computa-
tional Linguistics.
Song, X., Huang, L., Xue, H., and Hu, S. (2022). Su-
pervised prototypical contrastive learning for emo-
tion recognition in conversation. In Goldberg, Y.,
Kozareva, Z., and Zhang, Y., editors, Proceedings of
the 2022 Conference on Empirical Methods in Natural
Language Processing, pages 5197–5206, Abu Dhabi,
United Arab Emirates. Association for Computational
Linguistics.
van den Oord, A., Li, Y., and Vinyals, O. (2019). Represen-
tation learning with contrastive predictive coding.
Warner, B., Chaffin, A., Clavié, B., Weller, O., Hallström,
O., Taghadouini, S., Gallagher, A., Biswas, R., Lad-
hak, F., Aarsen, T., Cooper, N., Adams, G., Howard,
J., and Poli, I. (2024). Smarter, better, faster, longer:
A modern bidirectional encoder for fast, memory
efficient, and long context finetuning and inference.
Contrastive Learning for Conversational Emotion Recognition Using Knowledge Enhancement of Large Language Models