Spam Filtering in the Modern Era: A Review of Machine Learning, Deep Learning, and System Comparisons

Xinyi Wang

2024

Abstract

Spam poses a great challenge to Internet users, threatening their productivity, data security, and overall user experience. This review examines the current state of spam filtering systems, particularly those using Machine Learning (ML) and Deep Learning (DL) models. A comparative analysis of models such as Naive Bayes, Support Vector Machines (SVM), Random Forest (RF), Bidirectional Encoder Representations from Transformers (BERT), and Convolutional Neural Networks (CNNs) is conducted to assess their effectiveness and limitations in distinguishing spam from legitimate (ham) emails. Additionally, this study highlights emerging challenges, including concept drift and Large Language Model (LLM)-modified spam, which necessitate more adaptive and intelligent filtering solutions. Future trends indicate that spam filtering systems will gradually shift toward Artificial Intelligence (AI)-driven automation and personalization, led by lifelong learning models that continuously learn from new data. Such advancements are essential to counter sophisticated, hard-to-detect spam, such as messages crafted by LLMs, ensuring a safer and more personalized email experience for users.



Paper Citation


in Harvard Style

Wang X. (2024). Spam Filtering in the Modern Era: A Review of Machine Learning, Deep Learning, and System Comparisons. In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-754-2, SciTePress, pages 451-458. DOI: 10.5220/0013526000004619


in Bibtex Style

@conference{daml24,
author={Xinyi Wang},
title={Spam Filtering in the Modern Era: A Review of Machine Learning, Deep Learning, and System Comparisons},
booktitle={Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2024},
pages={451-458},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013526000004619},
isbn={978-989-758-754-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML
TI - Spam Filtering in the Modern Era: A Review of Machine Learning, Deep Learning, and System Comparisons
SN - 978-989-758-754-2
AU - Wang X.
PY - 2024
SP - 451
EP - 458
DO - 10.5220/0013526000004619
PB - SciTePress