A New Benchmark for NLP in Social Sciences: Evaluating the Usefulness of Pre-trained Language Models for Classifying Open-ended Survey Responses

Maximilian Meidinger, Matthias Aßenmacher

Abstract

To evaluate transfer learning models for Natural Language Processing (NLP) on common ground, numerous general-domain benchmark data sets (and collections thereof) have been established over the last few years. The proposed tasks are primarily classification (binary, multi-class), regression, or language generation. However, no benchmark data set for (extreme) multi-label classification relying on full-text inputs has been proposed in the area of social science survey research to date. This constitutes an important gap, as a common data set for algorithm development in this field could lead to more reproducible, sustainable research. We therefore provide a transparent and fully reproducible preparation of the 2008 American National Election Study (ANES) data set, which can be used for benchmark comparisons of different NLP models on the task of multi-label classification. In contrast to other data sets, ours comprises full-text inputs instead of bag-of-words or similar representations. Furthermore, we provide baseline performances of simple logistic regression models as well as performance values for recently established transfer learning architectures, namely BERT (Devlin et al., 2018), RoBERTa (Liu et al., 2019) and XLNet (Yang et al., 2019).
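To illustrate the kind of logistic-regression baseline the abstract refers to, the sketch below sets up a one-vs-rest multi-label text classifier with scikit-learn. The survey responses and topic labels here are made-up placeholders, not the ANES data, and the TF-IDF + one-vs-rest setup is one plausible baseline configuration, not necessarily the exact one used in the paper.

```python
# Minimal multi-label text-classification baseline (illustrative only).
# The texts and labels are invented stand-ins for open-ended survey
# responses; the actual ANES labels and preprocessing are not shown here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy open-ended responses, each with one or more topic labels.
texts = [
    "I worry about the economy and losing my job",
    "Health care costs are far too high",
    "The economy is doing poorly and health insurance is unaffordable",
    "Education funding should be increased",
]
labels = [["economy"], ["health"], ["economy", "health"], ["education"]]

# Turn the label lists into a binary indicator matrix,
# one column per distinct label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Shared TF-IDF features, then one independent logistic regression
# per label (one-vs-rest), which handles the multi-label setting.
clf = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(texts, Y)

# Each prediction is a binary vector over the label set; it can be
# mapped back to label names with mlb.inverse_transform.
pred = clf.predict(["jobs and the economy concern me"])
print(mlb.inverse_transform(pred))
```

Transformer baselines such as BERT would replace the TF-IDF features with contextual embeddings and a sigmoid output layer over the labels, but the evaluation setup (binary indicator matrix, per-label metrics) stays the same.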

Paper Citation


in Harvard Style

Meidinger M. and Aßenmacher M. (2021). A New Benchmark for NLP in Social Sciences: Evaluating the Usefulness of Pre-trained Language Models for Classifying Open-ended Survey Responses. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 866-873. DOI: 10.5220/0010255108660873


in Bibtex Style

@conference{icaart21,
author={Maximilian Meidinger and Matthias Aßenmacher},
title={A New Benchmark for NLP in Social Sciences: Evaluating the Usefulness of Pre-trained Language Models for Classifying Open-ended Survey Responses},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={866-873},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010255108660873},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - A New Benchmark for NLP in Social Sciences: Evaluating the Usefulness of Pre-trained Language Models for Classifying Open-ended Survey Responses
SN - 978-989-758-484-8
AU - Meidinger M.
AU - Aßenmacher M.
PY - 2021
SP - 866
EP - 873
DO - 10.5220/0010255108660873