The Investigation of Progress Related to Harmful Speech Detection Models
Ruikun Wang
2024
Abstract
With the rise of the internet, individuals can express their opinions and converse with others on social platforms. This development, however, has been accompanied by a proliferation of harmful speech on these platforms. Harmful speech poses various dangers, such as fueling conflicts and contributing to social problems. Consequently, the effective detection and regulation of harmful speech has become a widely discussed topic. This paper provides an overview of several existing models for detecting harmful speech. First, it reviews traditional detection methods and introduces three classic approaches: the dictionary method, which identifies harmful vocabulary, and the n-gram and skip-gram methods, which take context into account. It then reviews machine learning methods, covering both traditional machine learning approaches and the recently popular large language models. The analysis finds that traditional detection methods are inexpensive to implement but have limited accuracy because they struggle to understand natural language. Traditional machine learning methods can understand natural language to some extent, but they require significant human and material resources for model training. Meanwhile, models that draw on large language datasets comprehend natural language more fully, which improves accuracy. The paper also points out potential shortcomings of current models, such as inaccurate detection results caused by diverse cultural backgrounds, misidentification of emerging words and of words whose meanings have shifted, and the negative impact of data bias on model performance.
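As an illustration of the traditional methods named in the abstract, the following is a minimal sketch and not the paper's implementation. It assumes that "skip-gram" refers to token pairs with a bounded gap between them rather than the word2vec training objective. The names `LEXICON`, `FLAGGED_BIGRAMS`, `dictionary_flag`, and `ngram_flag` are hypothetical, and the word lists are mild placeholders rather than a real lexicon.

```python
import re

# Hypothetical placeholder lexicon; a real system would use a curated list.
LEXICON = {"idiot", "stupid"}
# Hypothetical flagged word pairs for the context-based check.
FLAGGED_BIGRAMS = {("shut", "up")}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens, k):
    """Return ordered token pairs with at most k tokens skipped between them."""
    return [
        (tokens[i], tokens[j])
        for i in range(len(tokens))
        for j in range(i + 1, min(i + k + 2, len(tokens)))
    ]


def dictionary_flag(text):
    """Dictionary method: flag the text if any token is in the lexicon."""
    return any(token in LEXICON for token in tokenize(text))


def ngram_flag(text, k=0):
    """Context check: flag the text if a listed pair occurs within k skips."""
    return any(pair in FLAGGED_BIGRAMS for pair in skip_bigrams(tokenize(text), k))
```

With `k=0` the pair check reduces to ordinary bigrams, while a larger `k` also catches a listed pair separated by filler words. Neither check understands meaning, which is consistent with the accuracy limits the abstract attributes to traditional methods.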
Paper Citation
in Harvard Style
Wang R. (2024). The Investigation of Progress Related to Harmful Speech Detection Models. In Proceedings of the 1st International Conference on Engineering Management, Information Technology and Intelligence - Volume 1: EMITI; ISBN 978-989-758-713-9, SciTePress, pages 57-61. DOI: 10.5220/0012902100004508
in Bibtex Style
@conference{emiti24,
author={Ruikun Wang},
title={The Investigation of Progress Related to Harmful Speech Detection Models},
booktitle={Proceedings of the 1st International Conference on Engineering Management, Information Technology and Intelligence - Volume 1: EMITI},
year={2024},
pages={57-61},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012902100004508},
isbn={978-989-758-713-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 1st International Conference on Engineering Management, Information Technology and Intelligence - Volume 1: EMITI
TI - The Investigation of Progress Related to Harmful Speech Detection Models
SN - 978-989-758-713-9
AU - Wang R.
PY - 2024
SP - 57
EP - 61
DO - 10.5220/0012902100004508
PB - SciTePress