WICO Graph: A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets

Daniel Schroeder, Daniel Schroeder, Ferdinand Schaal, Petra Filkukova, Konstantin Pogorelov, Johannes Langguth

Abstract

In the wake of the COVID-19 pandemic, a surge of misinformation has flooded social media and other internet channels, and some of it has the potential to cause real-world harm. To counteract this misinformation, reliably identifying it is a principal problem to be solved. However, the identification of misinformation poses a formidable challenge for language processing systems since the texts containing misinformation are short, work with insinuation rather than explicitly stating a false claim, or resemble other postings that deal with the same topic ironically. Accordingly, for the development of better detection systems, it is not only essential to use hand-labeled ground truth data and extend the analysis with methods beyond Natural Language Processing to consider the characteristics of the participant’s relationships and the diffusion of misinformation. This paper presents a novel dataset that deals with a specific piece of misinformation: the idea that the 5G wireless network is causally connected to the COVID-19 pandemic. We have extracted the subgraphs of 3,000 manually classified Tweets from Twitter’s follower network and distinguished them into three categories. First, subgraphs of Tweets that propagate the specific 5G misinformation, those that spread other conspiracy theories, and Tweets that do neither. We created the WICO (Wireless Networks and Coronavirus Conspiracy) dataset to support experts in machine learning experts, graph processing, and related fields in studying the spread of misinformation. Furthermore, we provide a series of baseline experiments using both Graph Neural Networks and other established classifiers that use simple graph metrics as features. The dataset is available at https://datasets.simula.no/wico-graph..

Download


Paper Citation


in Harvard Style

Schroeder D., Schaal F., Filkukova P., Pogorelov K. and Langguth J. (2021). WICO Graph: A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets.In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 257-266. DOI: 10.5220/0010262802570266


in Bibtex Style

@conference{icaart21,
author={Daniel Schroeder and Ferdinand Schaal and Petra Filkukova and Konstantin Pogorelov and Johannes Langguth},
title={WICO Graph: A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART,},
year={2021},
pages={257-266},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010262802570266},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART,
TI - WICO Graph: A Labeled Dataset of Twitter Subgraphs based on Conspiracy Theory and 5G-Corona Misinformation Tweets
SN - 978-989-758-484-8
AU - Schroeder D.
AU - Schaal F.
AU - Filkukova P.
AU - Pogorelov K.
AU - Langguth J.
PY - 2021
SP - 257
EP - 266
DO - 10.5220/0010262802570266