
using supervised learning algorithms; and (ii) Multi-
class classification using supervised learning algo-
rithms.
The best performance is obtained with Random Forest
for binary classification, and with an ensemble of
five algorithms (i.e., XGBM, GBM, Random Forest,
Neural Networks and SVM) for multi-class
classification, in both cases using supervised
learning techniques.
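As an illustration of how the outputs of several classifiers can be combined into one ensemble prediction, the following is a minimal sketch based on majority voting. The combination rule is an assumption made for this example, since the text above does not state how the five models are combined; the function name `majority_vote` and the class labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigns to sample i.
    Assumption: on a tie, the label voted first (in model order) wins,
    which is how Counter.most_common orders equal counts.
    """
    if not predictions:
        return []
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for i in range(n_samples):
        # Count the labels proposed by every model for sample i.
        votes = Counter(model_preds[i] for model_preds in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models on two connections.
model_outputs = [
    ["normal", "dos"],
    ["normal", "scan"],
    ["scan", "dos"],
]
ensemble_labels = majority_vote(model_outputs)
```

Other combination rules (e.g., averaging predicted class probabilities) are equally plausible; majority voting is shown only because it needs nothing beyond the predicted labels.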
The main limitation of our proposed approach concerns
explainability. While some models (e.g., Logistic
Regression, Random Forest, Naive Bayes) provide
indicators of the predictive power of each variable,
others (e.g., neural networks, SVM, ensembles) are
limited in terms of explainability and
interpretability, preventing us from identifying the
key feature or group of features that contributes
most to classifying the connections. It is, therefore,
not possible to assign weights to each feature based
on its contribution to accurately classifying the data.
The analysis performed in this paper is also limited
to the use of NetFlow data. Although some approaches
suggest the use of datasets that combine Argus with
tools such as Wireshark or Bro IDS (Rabbani et al.,
2021), ours requires the input data feeding the model
to be transformed into NetFlow records. This approach
has several advantages over packet-level data such as
PCAP (packet capture) dumps, e.g., it keeps only
certain information from network packet headers
rather than the whole payload. As a result, the data
can be processed and analyzed more efficiently (since
a single flow can represent thousands of packets),
enabling almost real-time analysis.
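The reduction from packets to flows can be sketched as follows. This is a simplified illustration, not the NetFlow exporter logic itself: it groups packets only by the usual 5-tuple and ignores timeouts and the other key fields that real exporters use. The `Packet` record and the `aggregate_flows` function are hypothetical names introduced for this example.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet header record (no payload kept)."""
    ts: float       # capture timestamp in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    size: int       # packet length in bytes


FlowKey = Tuple[str, str, int, int, str]


def aggregate_flows(packets: Iterable[Packet]) -> Dict[FlowKey, dict]:
    """Collapse packets that share the same 5-tuple into one flow summary."""
    flows: Dict[FlowKey, dict] = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flow = flows.get(key)
        if flow is None:
            # First packet of this flow: start a new summary record.
            flows[key] = {"packets": 1, "bytes": p.size,
                          "first": p.ts, "last": p.ts}
        else:
            # Later packets only update counters and timestamps.
            flow["packets"] += 1
            flow["bytes"] += p.size
            flow["first"] = min(flow["first"], p.ts)
            flow["last"] = max(flow["last"], p.ts)
    return flows


# Three hypothetical packets, two of which belong to the same connection.
sample_packets = [
    Packet(0.0, "10.0.0.1", "10.0.0.2", 1234, 80, "TCP", 100),
    Packet(0.5, "10.0.0.1", "10.0.0.2", 1234, 80, "TCP", 200),
    Packet(0.2, "10.0.0.3", "10.0.0.2", 5555, 22, "TCP", 60),
]
sample_flows = aggregate_flows(sample_packets)
```

However many packets a connection carries, it contributes a single record to the model input, which is why flow-level analysis scales better than packet-level analysis.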
Future work will consider using unsupervised learning
approaches and a training sample with a greater number
of attack instances (especially Ping Scan and Brute
Force) to see if it is possible to improve detection
rates on these classes. Having a balanced dataset
would also help with the low detection rates on these
under-represented classes. In addition, it is
important to evaluate datasets with other network
attacks (e.g., botnet, malware, man-in-the-middle,
phishing) to verify to what extent the developed
models are able to detect attacks different from
those existing in the training dataset.
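One simple way to obtain a balanced training sample is random oversampling of the minority classes. The sketch below is only an illustration of that idea under stated assumptions: it is not a technique evaluated in this paper, the function name `oversample` is hypothetical, and the toy labels stand in for real NetFlow records.

```python
import random
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence, Tuple


def oversample(rows: Sequence[list], labels: Sequence[Hashable],
               seed: int = 0) -> Tuple[List[list], List[Hashable]]:
    """Duplicate minority-class rows at random until all classes match
    the size of the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    if not by_class:
        return [], []

    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw (with replacement) as many extra rows as the class is missing.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy training set: four normal connections and a single ping scan.
toy_rows = [[0], [1], [2], [3], [9]]
toy_labels = ["normal", "normal", "normal", "normal", "ping_scan"]
balanced_rows, balanced_labels = oversample(toy_rows, toy_labels)
class_counts = Counter(balanced_labels)
```

Resampling of this kind should be applied to the training split only, so that the evaluation data keeps the original class distribution.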
REFERENCES
Bex, T. (2021). Comprehensive guide to multiclass
classification metrics. In Towards Data Science. Available at
https://towardsdatascience.com/comprehensive-guide-on-multiclass-classification-metrics-af94cfb83fbd.
Delplace, A., Hermoso, S., and Anandita, K. (2020).
Cyber attack detection thanks to machine learning
algorithms. ArXiv preprint, available at
https://arxiv.org/abs/2001.06309.
Garg, S., Kaur, K., Kumar, N., Kaddoum, G., Zomaya,
A. Y., and Ranjan, R. (2019). A hybrid deep learning
based model for anomaly detection in cloud datacen-
tre networks. IEEE Trans. Netw. Serv. Manag.
Gonzalez-Granadillo, G., Diaz, R., Medeiros, I.,
Gonzalez-Zarzosa, S., and Machnicki, D. (2019). LADS: A live
anomaly detection system based on machine learning
methods. In Security and Cryptography, SECRYPT.
Gu, T., Chen, H., Chang, L., and Li, L. (2019). Intrusion
detection system based on improved ABC algorithm with
tabu search. IEEE Trans. Electr. Electron. Eng., 14.
Khanna, S. (2025). AI in cybersecurity: A comprehensive
review of threat detection and prevention mechanisms.
International Journal of Sustainable Development in
Field of IT, 17.
Mohamed, N. (2023). Current trends in AI and ML for
cybersecurity: A state-of-the-art survey. Cogent
Engineering, 10(2):2272358.
Mubarak, S., Habaebi, M. H., Islam, M. R., Rahman, F.
D. A., and Tahir, M. (2021). Anomaly detection in ICS
datasets with machine learning algorithms. Computer
Systems Science & Engineering.
Osi, A. A., Abdu, M., Muhammad, U., Ibrahim, A., Isma’il,
L. A., Suleiman, A. A., Abdulkadir, H. S., Sada, S. S.,
Dikko, H. G., and Ringim, M. Z. (2020). A classification
approach for predicting COVID-19 patient’s
survival outcome with machine learning techniques.
MedRxiv, the preprint server for health sciences.
Rabbani, M., Wang, Y., Khoshkangini, R., Jelodar, H.,
Zhao, R., Ahmadi, S. B. B., and Ayobi, S. (2021). A
review on machine learning approaches for network
malicious behavior detection in emerging technolo-
gies. Entropy, 23:529.
Rahman, M., Uddin, I., Das, R., Saha, T., Haque, E. S. M.,
Shatu, N. R., and Shafiq, S. I. (2025). Application of
artificial intelligence in detecting and mitigating cyber
threats. International Research Journal of Innovations
in Engineering and Technology (IRJIET), 9:17–26.
Sarker, I. H., Kayes, A. S. M., Badsha, S., Alqahtani, H.,
Watters, P., and Ng, A. (2020). Cybersecurity data
science: an overview from machine learning perspec-
tive. Journal of Big Data, 7(41).
Wiafe, I., Koranteng, F. N., Obeng, E. N., Assyne, N., and
Wiafe, A. (2020). Artificial intelligence for cyber-
security: A systematic mapping of literature. IEEE
Open Access Journal, 8.
AI-Based Anomaly Detection and Classification of Traffic Using Netflow