AI-Based Anomaly Detection and Classification of Traffic Using Netflow
Gustavo Gonzalez Granadillo¹ (https://orcid.org/0000-0003-2036-981X)
and Nesrine Kaaniche² (https://orcid.org/0000-0002-1045-6445)
¹ Schneider Electric, DCR Security Department, Barcelona, Spain
² SAMOVAR, Telecom SudParis, Institut Polytechnique de Paris, France
Keywords:
Anomaly Detection, Network Traffic Behavior, Classification Algorithms, NetFlow.
Abstract:
Anomalies manifest differently in network statistics, making it difficult to develop generalized models for
normal network behaviors and anomalies. This paper analyzes various Machine Learning (ML) and Deep
Learning (DL) algorithms employing supervised techniques for both binary and multi-class classification of
network traffic. Experiments have been conducted using a validated NetFlow-based dataset containing over
31 million incoming and outgoing network connections of an IT infrastructure. Preliminary results indicate
that no single model effectively detects all cyber-attacks. However, selected models for binary and multi-class
classification show promising results, achieving performance levels of up to 99.9% in the best of the cases.
1 INTRODUCTION
The rapid evolution of cyber threats has necessitated
equally advanced defense mechanisms, with artificial
intelligence (AI) emerging as a cornerstone of modern
cybersecurity strategies. Recent scientific research
underscores AI’s dual role: while it empowers attack-
ers to launch sophisticated campaigns, it also provides
defenders with unprecedented tools to detect, ana-
lyze, and mitigate threats (Rahman et al., 2025). Key
conclusions from peer-reviewed studies and industry
analyses reveal that AI-driven detection systems excel
in identifying anomalies, automating responses, and
adapting to dynamic attack vectors. However, chal-
lenges such as explainability, adversarial AI manip-
ulation, ethical concerns, and the need for contin-
uous innovation persist (Khanna, 2025), (Mohamed,
2023).
While AI has demonstrated its ability to reduce
computational complexity, model training time, and
false alarms, there is a notable limitation in the in-
trusion detection domain. Most studies focus on a
limited number of algorithms, with the support vec-
tor machine being the predominant technique utilized
(Wiafe et al., 2020).
In this paper, we propose to analyze an IT net-
work traffic dataset containing Netflow data (e.g., pro-
tocols, source and destination IP, source and destina-
tion port, packets, etc.) aiming to classify legitimate
and malicious traffic. The selected dataset was created
in 2017 by Hochschule Coburg (Coburg University of
Applied Sciences and Arts) in Germany and consists of
4 weeks of labeled internal and external communica-
tions, containing more than 31 million observations of
normal traffic and cybersecurity incidents (e.g., Port
Scans, Ping Scans, Brute Force, and Denial of Ser-
vice). The objective of this work is to model inter-
nal communications to detect anomalies (malicious
communications) with a high degree of accuracy that
complements the information presented by commer-
cial Intrusion Detection Systems (IDSs) and helps in
decision making.
The contributions of this paper are summarized as
follows:
A binary classification analysis using supervised
learning techniques to compare the performance
of eight algorithms, enabling the selection of the
most effective method for classifying legitimate
versus malicious traffic.
A multi-class classification analysis employing
supervised learning techniques to predict not
only whether the traffic is normal or malicious, but
also to identify the specific type of anomaly as-
sociated with the malicious traffic (e.g., Brute Force,
Denial of Service (DoS), Port Scan, or Ping Scan).
A set of experiments has been conducted using
multiple features of flows in order to identify the most
predictive variables and the best approach to detect
anomalies in network traffic using NetFlow data. Re-
sults show a promising approach using Random For-
est for the binary supervised classification approach
and an ensemble of five ML algorithms for the multi-
class supervised classification approach.
The remainder of this paper is structured as fol-
lows. Section 2 presents the methodology used to
analyze the dataset and create the prediction models.
Section 3 describes the experimentation and results
obtained for the binary and multi-class classification
approach using supervised learning techniques. Sec-
tion 4 discusses related work. Finally, conclusions
and perspective for future work are presented in Sec-
tion 5.
2 AI-BASED ANOMALY
DETECTION METHODOLOGY
This section describes the methodology followed
to build, test, and validate our proposed AI-based
anomaly detection models.
2.1 Data Access and Collection
Most of the existing datasets to classify network traf-
fic are old and insufficient in terms of understanding
the behavioral patterns of current cyber-attacks. A
great deal of research is still performed us-
ing DARPA and KDD databases, which are at least
20 years old, and which lack instances related to new
sophisticated attacks (Sarker et al., 2020), (Mubarak
et al., 2021). The dataset used in this research was
created by the Coburg University of Applied Sciences
and Arts in Germany in 2017. The Coburg In-
trusion Detection Data Sets (CIDDS) is a public and
open dataset that can be downloaded from the official
university's website (https://www.hs-coburg.de/forschung/
forschungsprojekte-oeffentlich/informationstechnologie/
cidds-coburg-intrusion-detection-data-sets.html). There
are two datasets avail-
able: CIDDS-001, with over 31 million flows; and
CIDDS-002, with over 16 million flows, both contain-
ing NetFlow data with benign and malicious traffic.
For this analysis, we have selected the CIDDS-
001 dataset which emulates a small business environ-
ment composed of an internal and an external network
that include typical IT clients and servers (e.g., web,
file, mail, backup, etc.). The complete dataset has been
captured during 4 weeks and has more than 31 million
flows (observations) from which 89.66% are benign
(flows labeled as normal) and 10.34% are malicious
(flows labeled as attacker or victim). In addition, ma-
licious flows are further labeled as BruteForce, DoS,
PingScan, or PortScan, as shown in Table 1.
Table 1: CIDDS-001 Dataset.
Class Week 1 Week 2 Week 3 Week 4
Normal 7010897 8515329 6349783 6175897
DoS 1252127 1706900 0 0
PortScan 183511 82404 0 0
PingScan 3359 2731 0 0
BruteForce 1626 336 0 0
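As an illustration of how this dataset can be loaded and the distribution of Table 1 reproduced, the following is a minimal Python/pandas sketch; the file paths are placeholders, and the column names (class, attackType) follow the public CIDDS-001 layout.

```python
import pandas as pd

# Illustrative paths to the weekly CIDDS-001 traffic captures; adjust them to
# the location where the dataset has been extracted.
WEEK_FILES = [f"CIDDS-001/traffic/OpenStack/CIDDS-001-internal-week{i}.csv"
              for i in range(1, 5)]

frames = []
for path in WEEK_FILES:
    df = pd.read_csv(path)
    df.columns = [c.strip() for c in df.columns]  # strip any header padding
    frames.append(df)
flows = pd.concat(frames, ignore_index=True)

# Benign vs. malicious split: 'normal' flows are benign, while 'attacker' and
# 'victim' flows correspond to malicious activity.
print(flows["class"].value_counts(normalize=True))

# Per-attack counts comparable to Table 1 (benign flows carry no attack type).
print(flows["attackType"].value_counts())
```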
2.2 Data Preparation and Analysis
In order to clean the dataset and prepare it for its us-
age during the training and testing phases, we have
performed the following actions:
The Flows column has been removed as all obser-
vations present the same value.
For the binary classification, attackType, at-
tackID, and attackDescription have been re-
moved; for the multi-class classification, the class
column has been removed.
Src IP Addr and Dst IP Addr have been removed
since these features may change in future obser-
vations.
New features have been created and some cate-
gorical features have been transformed into numeric
values before they were used as input to the learning
algorithms. The following transformations have been
performed in the selected dataset:
The class variable has been transformed into a bi-
nary feature with 0 representing the normal cate-
gory and 1 representing an attack.
The Bytes variable presents some observations
in a non-numeric format (e.g., 1.0 M, which
corresponds to 1 million bytes); this variable has
been transformed into its corresponding numeric
value (e.g., 1.0 M = 1,000,000).
Flags and Proto variables have been transformed
into their numeric values (e.g., .....F= 1, .A..S.=18,
TCP=6, UDP=17).
Date first seen has been split into four columns
(Day, Hour, Minute, Second); since the data has
been captured in the same month and year, these two
fields were not included in the transformation.
Two new features have been created: (i) Pack-
ets speed, the number of packets divided by the
duration; and (ii) Bytes speed, the number of
bytes divided by the duration.
After the transformation, all numeric variables
have been standardized by using the mean and stan-
dard deviation of the whole dataset. The resulting val-
ues range between -3 and 3.
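A minimal pandas sketch of these preparation steps is given below, assuming the CIDDS-001 column names (Flows, class, attackType, Proto, Flags, Bytes, Date first seen, Duration, Packets, Src Pt, Dst Pt, etc.). The flag and protocol encodings follow the examples above; the helper names, the handling of zero-duration flows, and the renaming of the port columns are our own illustrative choices rather than the exact original pipeline.

```python
import numpy as np
import pandas as pd

SUFFIX = {"K": 1e3, "M": 1e6, "G": 1e9}
FLAG_BITS = {"U": 32, "A": 16, "P": 8, "R": 4, "S": 2, "F": 1}
PROTO_NUMBERS = {"TCP": 6, "UDP": 17, "ICMP": 1, "GRE": 47}  # IANA numbers

def bytes_to_numeric(value) -> float:
    """Turn entries such as '1.0 M' into plain byte counts (1.0 M -> 1000000)."""
    if isinstance(value, str):
        value = value.strip()
        if value and value[-1].upper() in SUFFIX:
            return float(value[:-1]) * SUFFIX[value[-1].upper()]
    return float(value)

def flags_to_numeric(flags) -> int:
    """Map a flag string such as '.A..S.' to its TCP flag value (ACK+SYN = 18)."""
    return sum(FLAG_BITS.get(c, 0) for c in str(flags).upper())

def prepare(df: pd.DataFrame, binary: bool = True) -> pd.DataFrame:
    out = df.copy()
    # Constant column and IP addresses are dropped, as described above.
    out = out.drop(columns=["Flows", "Src IP Addr", "Dst IP Addr"], errors="ignore")
    out = out.rename(columns={"Src Pt": "SrcPt", "Dst Pt": "DstPt"})

    # Target: 0/1 for the binary task, the attack type for the multi-class task.
    if binary:
        out["target"] = (out["class"].str.strip() != "normal").astype(int)
    else:
        out["target"] = out["attackType"].str.strip()
    out = out.drop(columns=["class", "attackType", "attackID",
                            "attackDescription"], errors="ignore")

    # Categorical-to-numeric transformations.
    out["Bytes"] = out["Bytes"].map(bytes_to_numeric)
    out["Flags"] = out["Flags"].map(flags_to_numeric)
    out["Proto"] = out["Proto"].str.strip().map(PROTO_NUMBERS).fillna(0)

    # Split the timestamp into Day / Hour / Minute / Second.
    ts = pd.to_datetime(out.pop("Date first seen"))
    out["Day"], out["Hour"] = ts.dt.day, ts.dt.hour
    out["Minute"], out["Second"] = ts.dt.minute, ts.dt.second

    # Derived speed features; zero-duration flows would otherwise divide by zero.
    duration = out["Duration"].replace(0, np.nan)
    out["Packets_speed"] = (out["Packets"] / duration).fillna(0)
    out["Bytes_speed"] = (out["Bytes"] / duration).fillna(0)

    # Standardize the numeric predictors with the dataset mean and std.
    numeric = out.columns.difference(["target"])
    std = out[numeric].std().replace(0, 1)  # guard against constant columns
    out[numeric] = (out[numeric] - out[numeric].mean()) / std
    return out
```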
2.3 Model Tuning and Training
For binary classification we selected a sample of
10,486 observations (of which 88.2% are normal
flows and 11.8% are attack flows). For the multi-class
classification we selected a sample of week 2 contain-
ing 13,613 observations (of which 79.9% belong
to the normal category and 21.1% belong to the at-
tack category). The latter is further divided into DoS,
Brute Force, Port Scan, and Ping Scan.
During the training process, all algorithms are
tuned to obtain the parameters and hyper-parameters
that best fit the model. Logistic Regression has no
parameters to tune, but it has been used to identify
the best set of predictor variables through the clas-
sic stepwise AIC and BIC methods. Several
tests have been performed with various sets of vari-
ables, to which we applied cross-validation with 4 groups
and 5 repetitions. As a result, the best model suggests
the use of the following variables: Tos, Flags, Du-
ration, Bytes speed, SrcPt, Packets, Hour, and Pack-
ets speed.
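The stepwise AIC/BIC search itself is typically run with R tooling (the hyper-parameter names used throughout Section 3 suggest caret-style packages); as a rough, non-authoritative Python analogue, the same 4-group, 5-repetition protocol for scoring the selected variables could look as follows, reusing the prepared frame and feature names from the previous sketch.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Variables retained by the stepwise AIC/BIC search reported above.
SELECTED = ["Tos", "Flags", "Duration", "Bytes_speed",
            "SrcPt", "Packets", "Hour", "Packets_speed"]

def evaluate_logistic(df):
    """AUC of a logistic regression on the selected variables under the same
    4-group / 5-repetition cross-validation used during tuning."""
    X, y = df[SELECTED], df["target"]
    cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=5, random_state=42)
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                             cv=cv, scoring="roc_auc")
    return scores.mean(), scores.std()
```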
2.4 Performance Evaluation
A variety of metrics are used to evaluate and analyze
the performance of classification algorithms. Accu-
racy, precision, recall, and F1 scores are among the
most popular performance indicators currently used
in the literature (Osi et al., 2020). In addition, we have
evaluated the anomaly detection rate of each classifi-
cation model. The anomaly detection rate is defined as the
ratio of the total number of correctly classified nega-
tive instances (i.e., anomalous connections due to at-
tacks) over the total number of negative predictions.
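A minimal sketch of these metrics, written with the convention used later in Table 2 (benign traffic is the positive class, attacks are the negative class), is shown below; the function name is ours.

```python
def evaluate_binary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, F1, and the anomaly detection rate, i.e.
    the share of negative (attack) predictions that are truly attacks."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "anomaly_detection_rate": tn / (tn + fn),
    }

# Example with the Random Forest confusion matrix reported in Table 2.
print(evaluate_binary(tp=28_030_675, fp=29_312, fn=21_231, tn=3_206_715))
```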
3 EXPERIMENTATION AND
RESULTS
This section describes the algorithms used for the bi-
nary and multi-class classification problems studied,
as well as the main results obtained during the testing
phase.
3.1 Binary Classification Using
Supervised Learning Techniques
We have selected eight algorithms covering various
supervised learning techniques, e.g., regression, de-
cision trees, artificial neural networks, support vector
machine, clustering, and Bayesian networks.
Logistic Regression (LR): The best model
uses the following eight variables: Tos, Flags,
Duration, Bytes speed, SrcPt, Packets, Hour,
Packets speed, and provides a failure rate of 0.0360
and an Area Under the Curve (AUC) of 0.9411.
Artificial Neural Networks (NNET): The best
model is composed of 15 artificial neurons, a
decay of 0.01 and max iterations maxit of 100.
It uses model averaging, where the same neural
network model is fit using different random
number seeds. The winning model provides a
failure rate of 0.098 and an AUC of 0.9930.
Random Forest (RF): The best model uses
mtry=7, ntree=150, and sampsize=7000, resulting
in a failure rate of 0.0098 and an AUC of 0.9930
(see the sketch after this list).
Gradient Boosting (GBM): The best model
uses n.trees=3000, interaction.depth=2, shrink-
age=0.1, and n.minobsinnode=20, resulting in a
failure rate of 0.0016 and an AUC of 0.9992.
eXtreme Gradient Boosting (XGBM): The
best model uses min_child_weight=5, eta=0.1,
and nrounds=3000, resulting in a failure rate
of 0.0020 and an AUC of 0.9995. Using this
algorithm, variables such as Tos and Hour have a
very low prediction power.
Support Vector Machine (SVM): The best
model uses C=100, resulting in a failure rate of
0.0353 and an AUC of 0.9437.
K-Nearest Neighbor (KNN): The best model
uses k=1, resulting in a failure rate of 0.0040 and
an AUC of 0.9911.
Naïve Bayes (NB): The best model uses usekernel=TRUE,
fL=0, and adjust=1, resulting in a failure
rate of 0.0917 and an AUC of 0.9874.
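The tuning parameters listed above (mtry, ntree, sampsize, n.trees, interaction.depth, usekernel, etc.) correspond to R packages typically driven through caret. As a hedged illustration only, the two strongest tree-based configurations could be instantiated with approximately equivalent scikit-learn / xgboost settings as follows; the parameter mapping is approximate and not the authors' exact setup.

```python
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Approximate mapping of the tuned R parameters: mtry -> max_features,
# ntree -> n_estimators, sampsize -> max_samples, eta -> learning_rate,
# nrounds -> n_estimators.
rf = RandomForestClassifier(n_estimators=150, max_features=7,
                            max_samples=7000, n_jobs=-1, random_state=42)

xgbm = XGBClassifier(n_estimators=3000, learning_rate=0.1, min_child_weight=5,
                     objective="binary:logistic", n_jobs=-1, random_state=42)

# Usage, with X_train / y_train taken from the prepared sample of Section 2.3:
# rf.fit(X_train, y_train); y_pred = rf.predict(X_test)
```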
All models have gone through a second cross-
validation process with 4 groups and 10 repetitions.
Table 2 presents the confusion matrix and the results
obtained for the accuracy, precision, recall, and F1-
score metrics, as well as the average prediction time
per million observations.
The best model in terms of prediction of both be-
nign and malicious traffic is the Random Forest. We
Table 2: Testing Results for Binary Classification using Supervised Learning Techniques.
Metrics LR NNET RF GBM XGBM SVM KNN NB
TP 27,486,170 28,014,392 28,030,675 28,028,887 28,026,051 27,272,577 27,920,149 27,922,964
FP 451,444 468,968 29,312 68,923 46,655 366,613 2,353,599 483,397
FN 565,736 37,514 21,231 23,019 25,855 779,329 131,757 128,942
TN 2,784,583 2,767,059 3,206,715 3,167,104 3,189,372 2,869,414 882,428 2,752,630
Accuracy 0.9675 0.9838 0.9984 0.9971 0.9977 0.9634 0.9206 0.9804
Precision 0.9838 0.9835 0.9990 0.9975 0.9983 0.9867 0.9223 0.9830
Recall 0.9798 0.9987 0.9992 0.9992 0.9991 0.9722 0.9953 0.9954
F1-Score 0.9818 0.9910 0.9991 0.9984 0.9987 0.9794 0.9574 0.9892
Anom Det. 0.8311 0.9866 0.9934 0.9928 0.9920 0.7864 0.8701 0.9553
Pred Time (s) 4.74 11.68 4.79 37.76 3.92 2.84 70.86 569.27
have built some ensemble models, but as the improve-
ment is almost insignificant, we decided to discard them
from the analysis and avoid more complex models.
In general, all models provide a good performance,
with scores ranging from 92.06% to 99.92%. The best
models are those using decision tree algorithms (i.e.,
RF, GBM, and XGBM).
Regarding the prediction time, we have noticed
that some algorithms are much faster than others.
For these evaluations, we have used a computer with
an Intel(R) Core(TM) i7-10870H processor, 2.20GHz
and 16GB of RAM with a Windows 11 64-bit Operat-
ing System. Additionally, it has an NVIDIA GeForce
RTX 3060 graphics card with 6GB of display memory
(and 8GB of shared memory).
The model created with the Random Forest algo-
rithm, not only offers the best prediction results, but
also performs the predictions very quickly (less than
five seconds per million observations). The fastest of
all algorithms is SVM (2.84 seconds per million ob-
servations) and the slowest is Naïve Bayes (9.49 min-
utes per million observations). The Random Forest
model has been selected as the winner for the binary
classification analysis using supervised learning tech-
niques.
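For completeness, one simple way to obtain a comparable prediction-time figure (seconds per million observations, as reported in Table 2) is sketched below; the exact measurement procedure used in the paper is not detailed, so this is only an illustrative approximation.

```python
import time

def seconds_per_million(model, X) -> float:
    """Scale the wall-clock prediction time of a fitted model to seconds per
    one million observations, the unit used in Table 2."""
    start = time.perf_counter()
    model.predict(X)
    elapsed = time.perf_counter() - start
    return elapsed * 1_000_000 / len(X)
```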
3.2 Multi-Class Classification Using
Supervised Learning Techniques
During this phase, we tuned and trained six algo-
rithms with cross-validation of 4 groups and 5 rep-
etitions using all 14 variables present in the dataset.
From the list of previous supervised learning algo-
rithms we discarded logistic regression, as this is not
a binary classification problem; and Naïve Bayes, as
the prediction time was too high.
Artificial Neural Networks (NNET): The best
model is composed of 20 artificial neurons, a
decay of 0.01, and maxit of 100. It uses model
averaging and provides an accuracy of 0.9264.
Random Forest (RF): The best model uses
mtry=6, ntree=300, and sampsize=default,
resulting in an accuracy of 0.9949. The most
predictive variables are DstPt, SrcPt, Flags,
Packets, and Bytes.
Gradient Boosting (GBM): The best model
uses n.trees=500, interaction.depth=2, shrink-
age=0.1, and n.minobsinnode=20, resulting in
an accuracy of 0.9958. The most predictive
variables are DstPt, Duration, Flags, Bytes,
Packets speed, Packets, SrcPt, and Proto.
eXtreme Gradient Boosting (XGBM): The
best model uses min_child_weight=5, eta=0.1,
and nrounds=1000, resulting in an accuracy
of 0.9959. The most predictive variables are
DstPt, Flags, Bytes, Packets, Duration, SrcPt,
Packets speed, Proto, and Tos (see the sketch
after this list).
Support Vector Machine (SVM): The best
model uses C=500, resulting in an accuracy of
0.9458.
K-Nearest Neighbor (KNN): The best model
uses k=1, resulting in an accuracy of 0.9696.
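As a sketch of the multi-class setup, under the same assumptions as the earlier examples (a prepared frame with a string target column, and parameter names mapped approximately from their R equivalents), the XGBM configuration above could be tuned as follows.

```python
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

def score_multiclass_xgbm(df) -> float:
    """Cross-validated accuracy (4 groups, 5 repetitions) of the multi-class
    XGBM configuration: eta -> learning_rate, nrounds -> n_estimators."""
    X = df.drop(columns=["target"])
    y = LabelEncoder().fit_transform(df["target"])  # normal and attack classes
    model = XGBClassifier(n_estimators=1000, learning_rate=0.1,
                          min_child_weight=5, objective="multi:softprob",
                          n_jobs=-1, random_state=42)
    cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=5, random_state=42)
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
```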
For the testing phase, we have evaluated the
dataset from weeks 1 and 2 (as the other two weeks
do not present anomalous observations). The metrics
used in this analysis include Precision, Recall and F1-
Score (Bex, 2021), as well as the time spent by each
model in making predictions. Figures 1 and 2 show
the preliminary results obtained in this phase.
As previously shown, the models that offer the
highest performance in detecting most of the attacks
are those using decision trees algorithms (i.e., RF,
GBM, and XGBM). However, the detection of Ping
Scans is very low (57% in the best of the cases for the
F1-score), with algorithms (e.g., NNET) that fail to
detect this category entirely. This is very likely due to
the fact that there are not enough observations of this
Figure 1: Precision.
Figure 2: Average Prediction Time.
class in the training dataset. Similarly, the detection
of brute force attacks reaches 77% in the best of the
cases for the F1-score, with algorithms (e.g., SVM)
that fail to detect this category entirely.
In terms of prediction time, the fastest algorithm
is the SVM, which performs predictions at a rate of
4.66 seconds per million observations. The slowest
algorithm is the KNN, with a rate of 2.56 minutes per
million observations.
Aiming to improve the detection rate for the Ping
Scan and Brute Force categories, we de-
veloped ensemble models using the majority voting
method, where predictions are based on the class pre-
dicted by the majority of models. Since we have six
models, we built ensembles of two, three, four, five,
and six models. The winning model in this phase is
an ensemble composed of five models (i.e., XGBM,
GBM, RF, NNET, and SVM). This ensemble im-
proves the prediction of Ping Scans to 62.17% and
Brute Force attacks to 83.67% while keeping the pre-
diction of all other classes higher than 97%.
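A hedged sketch of such a majority-voting ensemble, built from scikit-learn / xgboost counterparts of the five retained models (parameters mapped approximately from the tuned values of Section 3.2, not the authors' exact implementation), is shown below.

```python
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.metrics import classification_report
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

def build_voting_ensemble() -> VotingClassifier:
    """Hard (majority) voting over the five retained models; hyper-parameters
    echo Section 3.2 under an approximate R-to-Python mapping
    (decay -> alpha, interaction.depth -> max_depth, mtry -> max_features)."""
    members = [
        ("xgbm", XGBClassifier(n_estimators=1000, learning_rate=0.1,
                               min_child_weight=5, n_jobs=-1, random_state=42)),
        ("gbm", GradientBoostingClassifier(n_estimators=500, max_depth=2,
                                           learning_rate=0.1,
                                           min_samples_leaf=20, random_state=42)),
        ("rf", RandomForestClassifier(n_estimators=300, max_features=6,
                                      n_jobs=-1, random_state=42)),
        ("nnet", make_pipeline(StandardScaler(),
                               MLPClassifier(hidden_layer_sizes=(20,),
                                             alpha=0.01, max_iter=100))),
        ("svm", SVC(C=500)),
    ]
    return VotingClassifier(estimators=members, voting="hard", n_jobs=-1)

# Usage, with prepared multi-class samples (label-encoded targets) from weeks 1-2:
# ensemble = build_voting_ensemble().fit(X_train, y_train)
# print(classification_report(y_test, ensemble.predict(X_test), digits=4))
```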
4 RELATED WORK
AI-based techniques have been widely proposed
in the literature as a viable approach for network
anomaly detection. Delplace et al. (Delplace et al.,
2020) present Machine Learning (ML) and Deep
Learning (DL) models for the detection of botnets us-
ing Netflow data sets. The results show that the Ran-
dom Forest (RF) classifier performs the best in 13 dif-
ferent scenarios with an accuracy greater than 95%.
One of our previous works (Gonzalez-Granadillo
et al., 2019) uses the One-Class SVM algorithm
to classify network traffic as anomalous and normal
based on unlabeled NetFlow data. Such approaches
provide promising results, although they are limited
to the type of dataset used in the training and testing
processes.
Several researchers (Garg et al., 2019), (Gu et al.,
2019) developed ML models to improve the perfor-
mance of Intrusion Detection Systems. The focus has
been on developing optimized features and improving
classifiers to reduce false alarms. However, although
the results are outstanding (reaching more than 99%
in terms of accuracy), most of these works have been
tested and validated against outdated datasets, such as
DARPA (1998), and KDD (1999), for which the char-
acteristics and volume of attacks have changed signif-
icantly since that time.
Current research (Rabbani et al., 2021) proposes
to perform feature extraction using collection
tools such as Bro IDS and Argus, or to complement
the features extracted with NetFlow. Although the au-
thors extracted 49 features with their approach, Ran-
dom Forest reached optimal performance with 11 fea-
tures, whereas Logistic Regression did it with 18 fea-
tures. Results demonstrated more than 99% of detec-
tion accuracy and 93.6% of categorization accuracy,
with the inability to categorize three out of eight at-
tacks due to feature similarities and unbalanced data.
Our work differs from previous works in that
it defines new features to be used for evaluating
the performance of several ML algorithms for binary
and multi-class detection using supervised learning
techniques. In total, we have evaluated eight algo-
rithms and compared them against potential ensem-
bles to identify the best model for classifying network
traffic. To overcome the dataset issues, we used a
larger and more recent dataset developed for training
and evaluation against several types of cyber-attacks.
5 CONCLUSION AND FUTURE
WORK
In this paper we have analyzed multiple machine
learning algorithms to detect network traffic anoma-
lies (i.e., Brute-Force attacks, DoS attacks, Port
Scans, and Ping Scans) based on the analysis of a va-
riety of NetFlow features. Two main analyses have
been carried out in this paper: (i) Binary classification
using supervised learning algorithms; and (ii) Multi-
class classification using supervised learning algo-
rithms.
The best performance is obtained with the Ran-
dom Forest, in the binary classification using super-
vised learning techniques; and an ensemble of five al-
gorithms (i.e., XGBM, GBM, Random Forest, Neural
Networks and SVM) in the multi-class classification
using supervised learning techniques.
The main limitation of our proposed approach is in
terms of explainability. While some models (e.g., Lo-
gistic Regression, Random Forest, Naive Bayes) pro-
vide indicators of the prediction power of each vari-
able, some others (e.g., neural networks, SVM, en-
sembles) are limited in terms of explainability / in-
terpretability, preventing us from identifying the key
feature or group of features that contributes most to
classifying the connections. It is, therefore, not pos-
sible to assign weights to each feature based on their
contribution to accurately classify the data.
The analysis performed in this paper is also lim-
ited to the use of NetFlow data. Although some ap-
proaches suggest the use of datasets that combine Ar-
gus with tools such as Wireshark or Bro IDS (Rab-
bani et al., 2021), our approach requires the input data
feeding the model to be provided as NetFlow records.
There are several advantages of this approach over
packet-level data, such as PCAP (packet capture)
dumps, e.g., keeping only certain information from the
network packet headers and not the whole payload.
Therefore, the process-
ing and analysis of the data yields more interesting
performance results (since a single flow can represent
thousands of packets) enabling almost real-time anal-
ysis.
Future work will consider using unsupervised
learning approaches and a training sample with a
greater number of attack instances (especially Ping
Scan and Brute Force) to see if it is possible to im-
prove detection rates on these classes. Having a more
balanced dataset could also mitigate this issue. In addi-
tion, it is important to evaluate datasets with other
network attacks (e.g., botnet, malware, man-in-the-
middle, phishing, etc.) to verify to what extent
the developed models are able to detect attacks differ-
ent from those existing in the training dataset.
REFERENCES
Bex, T. (2021). Comprehensive guide to multiclass classifi-
cation metrics. In Towards Data Science. Available at
https://towardsdatascience.com/comprehensive-
guide-on-multiclass-classification-metrics-
af94cfb83fbd.
Delplace, A., Hermoso, S., and Anandita, K. (2020).
Cyber attack detection thanks to machine learn-
ing algorithms. ArXiv preprint, available at
https://arxiv.org/abs/2001.06309.
Garg, S., Kaur, K., Kumar, N., Kaddoum, G., Zomaya,
A. Y., and Ranjan, R. (2019). A hybrid deep learning
based model for anomaly detection in cloud datacen-
tre networks. IEEE Trans. Netw. Serv. Manag.
Gonzalez-Granadillo, G., Diaz, R., Medeiros, I., Gonzalez-
Zarzosa, S., and Machnicki, D. (2019). Lads: A live
anomaly detection system based on machine learning
methods. In Security and Cryptography, SECRYPT.
Gu, T., Chen, H., Chang, L., and Li, L. (2019). Intrusion de-
tection system based on improved ABC algorithm with
tabu search. IEEE Trans. Electr. Electron. Eng., 14.
Khanna, S. (2025). AI in cybersecurity: A comprehensive
review of threat detection and prevention mechanisms.
International Journal of Sustainable Development in
Field of IT, 17.
Mohamed, N. (2023). Current trends in AI and ML for cyber-
security: A state-of-the-art survey. Cogent Engineer-
ing, 10(2):2272358.
Mubarak, S., Habaebi, M. H., Islam, M. R., Rahman, F.
D. A., and Tahir, M. (2021). Anomaly detection in ICS
datasets with machine learning algorithms. Computer
Systems Science & Engineering.
Osi, A. A., Abdu, M., Muhammad, U., Ibrahim, A., Isma’il,
L. A., Suleiman, A. A., Abdulkadir, H. S., Sada, S. S.,
Dikko, H. G., and Ringim, M. Z. (2020). A classi-
fication approach for predicting COVID-19 patient’s
survival outcome with machine learning techniques.
MedRxiv, the preprint server for health sciences.
Rabbani, M., Wang, Y., Khoshkangini, R., Jelodar, H.,
Zhao, R., Ahmadi, S. B. B., and Ayobi, S. (2021). A
review on machine learning approaches for network
malicious behavior detection in emerging technolo-
gies. Entropy, 23:529.
Rahman, M., Uddin, I., Das, R., Saha, T., Haque, E. S. M.,
Shatu, N. R., and Shafiq, S. I. (2025). Application of
artificial intelligence in detecting and mitigating cyber
threats. International Research Journal of Innovations
in Engineering and Technology (IRJIET), 9:17–26.
Sarker, I. H., Kayes, A. S. M., Badsha, S., Alqahtani, H.,
Watters, P., and Ng, A. (2020). Cybersecurity data
science: an overview from machine learning perspec-
tive. Journal of Big Data, 7(41).
Wiafe, I., Koranteng, F. N., Obeng, E. N., Assyne, N., and
Wiafe, A. (2020). Artificial intelligence for cyber-
security: A systematic mapping of literature. IEEE
Access, 8.