Paper: Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers?

Authors: Lisa Singh, Rebecca Vanarsdall, Yanchen Wang, and Carole Roan Gresenz

Affiliation: Georgetown University, Washington DC, U.S.A.

Keyword(s): Data Labeling, Reliability, Mechanical Turk, Social Media.

Abstract: For social media-related machine learning tasks, reliable data labelers are important. However, it is unclear whether students or Mechanical Turk workers are the better labelers for these noisy, short posts. This paper compares the reliability of students and Mechanical Turk workers across a series of social media data labeling tasks. For most tasks, the Mechanical Turk workers show stronger agreement than the student workers. When labeling tasks are grouped by difficulty, the Mechanical Turk workers are also more consistent across tasks than the student workers. Both findings suggest that using Mechanical Turk workers to label social media posts yields more reliable labels than using college students.

CC BY-NC-ND 4.0

Paper citation in several formats:
Singh, L.; Vanarsdall, R.; Wang, Y. and Gresenz, C. (2022). Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers? In Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-583-8; ISSN 2184-285X, SciTePress, pages 408-415. DOI: 10.5220/0011278600003269

@conference{data22,
author={Lisa Singh and Rebecca Vanarsdall and Yanchen Wang and Carole Roan Gresenz},
title={Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers?},
booktitle={Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA},
year={2022},
pages={408-415},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011278600003269},
isbn={978-989-758-583-8},
issn={2184-285X},
}

TY - CONF
JO - Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA
TI - Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers?
SN - 978-989-758-583-8
IS - 2184-285X
AU - Singh, L.
AU - Vanarsdall, R.
AU - Wang, Y.
AU - Gresenz, C.
PY - 2022
SP - 408
EP - 415
DO - 10.5220/0011278600003269
PB - SciTePress