A Novel Phishing Email Detection Algorithm based on Multinomial Naive Bayes Classifier and Natural Language Processing

Omar Abdelaziz, Sahana Deb, Rania Hodhod, Lydia Ray

Abstract

Phishing attacks are a type of social engineering attack that tricks users into sharing sensitive and personally identifiable information. With the help of machine learning techniques, attackers are devising more convincing socially engineered messages, making them harder for victims to identify. With about 3.8 billion email users worldwide and the average person receiving more than 100 emails per day, efficient and automatic detection of fraudulent emails is paramount. This paper proposes a novel application of natural language processing (NLP) and a Naïve Bayes (NB) classifier to distinguish legitimate emails from phishing emails. The results show that the Naïve Bayes classifier can detect phishing emails effectively, with accuracies of 96.03% and 97.21% on balanced and imbalanced datasets, respectively.
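
To illustrate the kind of classifier the abstract names, the following is a minimal sketch of a multinomial Naive Bayes text classifier written with the Python standard library only. It is not the authors' implementation: the tokenizer, the Laplace smoothing value `alpha=1.0`, the class name `MultinomialNB`, and the six toy emails are all assumptions invented for illustration, and the abstract does not describe the paper's actual preprocessing or features.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a simple stand-in for NLP preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # assumed smoothing constant, not taken from the paper

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of documents per class
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.class_counts
        }
        return self

    def log_scores(self, text):
        """Return log P(class) + sum of log P(word | class) for each class."""
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / n_docs)
            denom = self.total_words[c] + self.alpha * vocab_size
            for w in tokenize(text):
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch (not from the paper's datasets).
train_texts = [
    "verify your account password immediately",
    "urgent click link to confirm bank account",
    "your account is suspended click here to verify",
    "meeting agenda for the project review on monday",
    "lunch on friday with the team",
    "please review the attached project report",
]
train_labels = ["phishing"] * 3 + ["legitimate"] * 3

model = MultinomialNB().fit(train_texts, train_labels)
print(model.predict("please verify your bank password"))
print(model.predict("project meeting on monday"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word unseen in one class from forcing that class's probability to zero.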

Paper Citation


in Harvard Style

Abdelaziz O., Deb S., Hodhod R. and Ray L. (2020). A Novel Phishing Email Detection Algorithm based on Multinomial Naive Bayes Classifier and Natural Language Processing. In Proceedings of the 1st International Conference on Computing and Emerging Sciences - Volume 1: ICCES, ISBN 978-989-758-497-8, pages 69-73. DOI: 10.5220/0010412600690073


in Bibtex Style

@conference{icces20,
author={Omar Abdelaziz and Sahana Deb and Rania Hodhod and Lydia Ray},
title={A Novel Phishing Email Detection Algorithm based on Multinomial Naive Bayes Classifier and Natural Language Processing},
booktitle={Proceedings of the 1st International Conference on Computing and Emerging Sciences - Volume 1: ICCES},
year={2020},
pages={69-73},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010412600690073},
isbn={978-989-758-497-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Computing and Emerging Sciences - Volume 1: ICCES
TI - A Novel Phishing Email Detection Algorithm based on Multinomial Naive Bayes Classifier and Natural Language Processing
SN - 978-989-758-497-8
AU - Abdelaziz O.
AU - Deb S.
AU - Hodhod R.
AU - Ray L.
PY - 2020
SP - 69
EP - 73
DO - 10.5220/0010412600690073