Comparing Support Vector Machine and Neural Network Classifiers
of CVE Vulnerabilities
Grzegorz J. Blinowski (https://orcid.org/0000-0002-0869-2828), Paweł Piotrowski and Michał Wiśniewski
Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warszawa, Poland
Keywords: Internet of Things, Internet of Things Security, System Vulnerability Classification, CVE, Neural Networks,
Support Vector Machine.
Abstract: The Common Vulnerabilities and Exposures (CVE) database is the largest publicly available source of
structured data on software and hardware vulnerabilities. In this work, we analyze the CVE database in the
context of IoT device and system vulnerabilities. We employ and compare support vector machine (SVM)
and neural network (NN) algorithms on a selected subset of the CVE database to classify vulnerability records
in this framework. Our scope of interest consists of records that describe vulnerabilities of potential IoT
devices of different types, such as home appliances, SCADA (industry) devices, mobile controllers,
networking equipment and others. The purpose of this work is to develop and test an automated system for
recognizing IoT vulnerabilities, to compare two classification methods (SVM and NN), and to find
an optimal timeframe for the training (historical) data.
1 INTRODUCTION AND
BACKGROUND
1.1 IoT Applications and Architecture:
An Outline
IoT can be most broadly defined as an interconnection
of various uniquely addressable objects through
communication protocols. It can also be described as a
communication system paradigm in which the objects
of everyday life (or industrial devices), equipped with
microcontrollers, network transmitters, and suitable
protocol stacks that allow them to communicate with
one another and (via ubiquitous cloud infrastructure)
with users, become an integral part of the Internet
environment (Atzori et al., 2010). The scope of IoT
deployments is wide and covers areas such as (Da Xu
et al., 2014; Al-Fuqaha et al., 2015) home appliances
("smart homes"), smart cities, smart environments
(monitoring), smart agriculture and farming, smart
electricity grids, smart manufacturing and industrial
security and sensing as IIoT (Industrial IoT), and smart
healthcare. In this work, we will consider an IoT model
compatible with the reference architecture model
proposed by the EU FP7 IoT-A project (EU FP7, 2007)
and the IoT-A tree structure (Bauer et al., 2013) that
consists of three major levels:
Perception and execution layer
Network layer
Cloud or application layer.
1.2 Security Issues with IoT Systems
In this work we will focus on threats against IoT
systems, which occur when a flaw in an IoT device or
application, on the perception, network or cloud level,
is exploited by a hacker, and the device or application
is compromised, i.e. full or limited access to
functions and data is gained by an attacker.
In (Ling et al., 2018), the authors have proposed
five "dimensions" of IoT security: hardware,
operating system/firmware, software, networking,
and data. The majority of security problems emerging
in today’s IoT systems result directly from buggy,
incomplete, or outdated software and hardware
implementations in the perception layer, especially in
home and office appliances and in industrial systems.
Typical vulnerabilities in this layer emerge from
common stack or heap overrun in legacy software,
weak (and often built-in) passwords (such was the
case in the famous Mirai botnet (Antonakakis et al.,
2017)), and faulty pairing and binding
implementation. Major protocol design flaws
(such as Heartbleed and DROWN (Durumeric et al.,
2014; Aviram et al., 2016)) are a much rarer cause
of vulnerabilities but do occur. The most
common vulnerabilities are summarized in the
OWASP Top 10 list (OWASP 2021).
1.3 Scope of This Work and Related
Research
In this work, we propose a classification of device-
related (i.e. not “pure software”) vulnerability data for
IoT and IIoT equipment. We have divided
vulnerability descriptors from the Common
Vulnerability and Exposures (CVE) public database
into 7 distinct categories, including home equipment,
SCADA devices, and networking systems. The
database samples were hand-classified by us based on
our expert knowledge. We used this data to train
neural network (NN) and support vector machine
(SVM) classifiers to predict categories of “new”
vulnerabilities – for example, data from the year 2018
was used to classify 2019’s data, etc. The rationale
behind such predictions is to prevent and mitigate
threats resulting from new vulnerabilities, as when a
new vulnerability or exploit is discovered, it is often
critical to learn its scope by automatic means as fast
as possible. This is a difficult task given the size of
the database and the rate of its growth – each day tens
of new records are added to the CVE database alone.
AI-based classification tools operating on principles
similar to the ones proposed by us can be used to
filter incoming vulnerability data with respect to a
given organization's network and user infrastructure.
Such solutions are part of software tools supporting
SOCs (Security Operations Centers), and also
emerging SOAR solutions (Security Orchestration,
Automation and Response).
Prior research on automatic analysis and
classification of vulnerability databases includes the
following: models and methodologies of categorizing
vulnerabilities from the CVE database according to
their security types based on Bayesian networks
(Wang and Guo, 2010; Na et al., 2016); in (Neuhaus
and Zimmermann, 2010) topic models were used to
analyze security trends in the CVE database with no
prior (expert) knowledge; and Huang et. al. (Huang et
al., 2019) proposed an automatic classification of
records from the Network Vulnerability Database
(NVD) based on a NN the authors compared their
model to Bayes and k-nearest neighbor models and
found it superior. Inspired by the above mentioned
research we have decided to employ two most
promising methods: SVM and neural networks. All of
the research cited above focused on categorizing the
software aspect of vulnerabilities, such as SQL
injection, race condition, and command shell injection.
In our previous related work (Blinowski and
Piotrowski, 2020), we discussed CVE classification
with an SVM. Here, we extend this research to include
a neural network classifier. To the best of our
knowledge, no prior work has been done on
categorizing the impacted equipment, system, or
device – and our work tries to address this gap.
This paper is organized as follows: in section 2,
we describe the contents and structure of the CVE
database and NVD records. In section 3, we introduce
our proposed classes of IoT devices and we briefly
discuss SVM and NN classifier methods and the
measures used by us to test the quality of the
classifiers. In section 4, we present the results of the
classification. Our work is summarized in section 5.
2 STRUCTURE AND CONTENTS
OF CVE DATABASE
2.1 The CVE Database
In this work, we used an annotated version of the
CVE database known as the NVD, which is hosted by
the National Institute of Standards and Technology
(NIST). The NVD is created on the basis of
information provided by the CVE database hosted at
MITRE (Mitre, 2020). CVE assigns identifiers (CVE-
IDs) to publicly announced vulnerabilities. NIST
augments the CVE database with information such as
structured product names and versions, and also maps
the entries to CWE names. The NVD feed is provided
in both XML and JSON formats, structured as
year-by-year files, as a single whole-database file, and as an
incremental feed reflecting the current year’s
vulnerabilities.
The NVD record fields relevant for further discussion are as follows:
entry contains the record ID as issued by MITRE; the id has the form CVE-yyyy-nnnnn (e.g. CVE-2017-3741) and is commonly used in various other databases, documents, etc. to refer to a given vulnerability;
vuln identifies the software and hardware products affected by a vulnerability – this record contains the description of a product and follows the specification of the Common Platform Enumeration (CPE) standard (see section 2.2);
vuln:cvss and cvss:base_metrics describe the scope and impact of the vulnerability; this data allows the real-world consequences of the vulnerability to be identified;
vuln:summary holds a short informal description of the vulnerability.
2.2 Common Platform Enumeration
(CPE)
CPE is a formal naming scheme for identifying
applications, hardware devices, and operating
systems. CPE is part of the Security Content
Automation Protocol (SCAP) standard (NIST, 2020),
The CPE naming scheme is based on a set of
attributes called Well-Formed CPE Name (WFN)
compatible with the CPE Dictionary format (NIST
CPE, 2020). The following attributes are part of this
format: part, vendor, product, version, update,
edition, language, software edition, target software,
target hardware, and other (not all attributes are
always present in the CPE record; very often “update”
and the attributes that follow are omitted).
The CVE database uses the URI format for CPE, and
we will only discuss this format. For example, in the
CPE record cpe:/h:d-link:dgs-1100-05:-, the
attributes are as follows: part:h (indicating a
hardware device), vendor:d-link,
product:dgs-1100-05; the version and the following
attributes are not provided. In the NIST CVE records,
logical expressions built from CPE descriptors are
used to indicate sets of affected software and/or hardware
platforms. An example is given in Figure 1.
<vuln:vulnerable-configuration id="http://nvd.nist.gov/">
  <cpe-lang:fact-ref name="cpe:/o:d-link:dgs-1100_firmware:1.01.018"/>
  <cpe-lang:logical-test operator="OR" negate="false">
    <cpe-lang:fact-ref name="cpe:/h:d-link:dgs-1100-05:-"/>
    <cpe-lang:fact-ref name="cpe:/h:d-link:dgs-1100-10mp:-"/>
  </cpe-lang:logical-test>
</vuln:vulnerable-configuration>
Figure 1: A vulnerable configuration record from CVE – a
logical expression built from CPE identifiers.
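To make the attribute decomposition concrete, the short sketch below splits a CPE 2.2 URI such as the one above into named attributes. It is a minimal illustration of the naming scheme, not part of any NVD tooling; the helper name and the restriction to the first seven URI components are our own simplifications.

# Minimal, illustrative parser for CPE 2.2 URIs, e.g. "cpe:/h:d-link:dgs-1100-05:-".
# Only the first seven URI components are handled; "-" (not applicable) and empty
# (unspecified) fields are dropped, which matches the omissions described above.
CPE_ATTRS = ["part", "vendor", "product", "version", "update", "edition", "language"]

def parse_cpe_uri(cpe: str) -> dict:
    if not cpe.startswith("cpe:/"):
        raise ValueError("not a CPE 2.2 URI")
    fields = cpe[len("cpe:/"):].split(":")
    return {name: value for name, value in zip(CPE_ATTRS, fields)
            if value not in ("", "-")}

print(parse_cpe_uri("cpe:/h:d-link:dgs-1100-05:-"))
# {'part': 'h', 'vendor': 'd-link', 'product': 'dgs-1100-05'}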
3 CVE DATA CLASSIFICATION
AND ANALYSIS
3.1 Data Selection
From the NVD database, we extracted only records
marked in their CPE descriptor as “hardware”. We
assumed that they potentially reflected IoT and IIoT
connected devices from the perception or network
layer. The exact criterion was as follows: if any of the
descriptors in the vuln:vulnerable-configuration
section contained the “part” attribute set to “h”, then
the record was selected for further consideration. We
also narrowed down the timeframe to data from the
years 2010–2019 (data from the first quarter of 2019
was taken into account).
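As an illustration of this selection step, the following sketch filters a yearly NVD feed for records that reference at least one hardware CPE. It assumes the NVD JSON 1.1 feed layout (file name nvdcve-1.1-<year>.json); the XML feed we actually used carries the same information, and the helper names are ours.

import json

def cpe_uris(node):
    # Recursively yield all CPE URIs from a configuration node and its children.
    for match in node.get("cpe_match", []):
        yield match.get("cpe23Uri", "")
    for child in node.get("children", []):
        yield from cpe_uris(child)

def is_hardware(item):
    # In CPE 2.3 URIs the "part" attribute is the third colon-separated field.
    nodes = item.get("configurations", {}).get("nodes", [])
    return any(uri.split(":")[2:3] == ["h"]
               for node in nodes for uri in cpe_uris(node))

with open("nvdcve-1.1-2018.json", encoding="utf-8") as f:   # assumed file name
    feed = json.load(f)

hardware_records = [item for item in feed["CVE_Items"] if is_hardware(item)]
print(len(hardware_records), "hardware-related records")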
We grouped the selected records into 7 distinct
classes:
H – home and SOHO devices; routers, on-line
monitoring.
S – SCADA and industrial systems,
automation, sensor systems, non-home IoT
appliances, cars and vehicles (subsystems), and
medical devices.
E – enterprise and service provider (SP)
hardware (routers, switches, enterprise Wi-Fi,
and networking) – the network level of IoT
infrastructure.
M – mobile phones, tablets, smart watches, and
portable devices – these constitute the
“controllers” of IoT systems.
P – PCs, laptops, PC-like computing
appliances, and PC servers (controllers).
A – other, non-home appliances: enterprise
printers and printing systems, copy machines,
non-customer storage, and multimedia
appliances.
The rationale behind such classification is the
following: from the security point of view, the key
distinction for an IoT component is the scope of its
application (home use, industrial use, network layer,
etc.). The number of classes was purposefully low –
we were limited by the description of the available
data, so it would have been difficult to use a finer-
grained classification. Additionally, it would not be
practical to introduce too many classes with a small
number of members, because the automatic
classification quality would suffer. Table 1 shows the
total number of records and the number per class for the
2010–2019 (Q1) timespan. The NVD database is
distributed as XML and JSON feeds. In addition,
there is an on-line search interface. The database, as
of the beginning of 2020, contains over 120 000
records in total, and the number of records usually
increases year by year. The NVD database is neither
completely consistent nor free of errors. For example,
two problems are a lack of CPE identifier (in approx.
900 records) and inconsistencies with the CPE
dictionary (approx. 100 000 CPEs). Binding between
the vulnerability description and the product
concerned may also be problematic. Product names
containing non-ASCII or non-European characters
also pose a problem, as they are recoded to ASCII
often inconsistently or erroneously. Essentially, it is
impossible to extract data relating to web servers,
home routers, IoT home appliances, security cameras,
cars, SCADA systems, etc. without a priori
knowledge of products and vendors.
Table 1: Number of NVD records per year – total and per class.

Year      Total    A    C    E    H    M    P    S
2010        185    6    2  113   28   21    2   13
2011        148    4    2  107   21    6    4    4
2012        288    7    2  180   27   15    9   48
2013        417    8    7  236   84   14    4   64
2014        391    3    0  189  102   16    0   81
2015        386    3    2  174   99   40    8   60
2016        463    6   13  150   75   80    7  132
2017        813    1   26  151  371   94   21  150
2018       1629   64  206  258  582  103   26  390
Q1 2019     400    6   16  193   83   22    7   73
3.2 Data Analysis Methodology
We tested two types of classifiers: (1) a linear SVM
(Vapnik, 1998) and (2) a neural network (NN). The same set of
data, i.e. selected attributes of “hardware”
vulnerability records extracted from the NVD
database, was used for training. The feature vector
contained: vendor name, product name and other
product data from CPE (if supplied), and CWE
vulnerability description.
The steps of the process of building a classifier
are the following: 1. pre-processing input data
(removal of stop-words, lemmatization, etc.); 2.
feature extraction, i.e. conversion of text data to
vector space via bag-of-words format; and 3. training
the linear SVM or NN. The length of the feature
vector varied from 1998 to 9911 depending on the
training data time period.
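A minimal sketch of steps 1 and 2 is given below, assuming NLTK's English stop-word list and WordNet lemmatizer together with scikit-learn's CountVectorizer; the two toy documents stand in for CVE summaries combined with CPE and CWE tokens.

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text):
    # Step 1: lower-case, tokenize, drop stop-words and pure punctuation, lemmatize.
    tokens = nltk.word_tokenize(text.lower())
    return " ".join(lemmatizer.lemmatize(t) for t in tokens
                    if any(ch.isalnum() for ch in t) and t not in stop_words)

docs = [
    "d-link dgs-1100 firmware allows remote attackers to bypass authentication",
    "buffer overflow in a SCADA controller web interface",
]
# Step 2: convert the cleaned text to a bag-of-words feature matrix.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform([preprocess(d) for d in docs])
print(X.shape)   # (number of records, length of the feature vector)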
SVM Classifier. We used a standard linear SVM,
which computes the maximum margin hyperplane
that separates the positive and negative examples in
feature space. With the SVM method, the decision
boundary is not only uniquely specified, but statistical
learning theory shows that it yields lower expected
error rates when used to classify previously unseen
examples (Vapnik, 1998; Liu et al., 2010), i.e. it gives
good results when classifying new data. We used
Python with NLTK to pre-process the text data, and
the SVM and classification-quality metric routines
from the scikit-learn library.
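The sketch below shows the SVM branch as a scikit-learn pipeline over bag-of-words features. The four toy records and their class labels ("H" for home, "S" for SCADA) are placeholders for the hand-labelled NVD data, and the default LinearSVC settings stand in for our actual configuration.

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

train_texts = [
    "d-link router firmware authentication bypass",       # home device (H)
    "tp-link soho router remote code execution",          # home device (H)
    "siemens simatic plc denial of service",              # SCADA (S)
    "schneider modicon controller buffer overflow",       # SCADA (S)
]
train_labels = ["H", "H", "S", "S"]

# Bag-of-words vectorizer followed by a linear maximum-margin classifier.
model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)
print(model.predict(["netgear home router hardcoded password"]))  # expected: ['H']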
NN Classifier. The number of network inputs is equal
to the size of the feature vector, and the number of outputs
is equal to the number of classes. In the last step of data
preparation, we used the term frequency-inverse
document frequency (TF-IDF) technique for data
representation. In TF-IDF, we “reward” the words
that occur frequently in a given document but are rare
in others; we applied TF-IDF to all analyzed text tokens
together. We also used hyperparameter optimization
to tune the NN: 0, 1 or 2 hidden layers, with
16, 32 or 64 neurons per layer, and 50–150
optimization algorithm rounds (epochs). We also adjusted
other network parameters, namely the type of the
weight-optimizer algorithm, the neuron dropout ratio, and
some others. For this we employed the GridSearchCV
package. The classifier was implemented in Python with
the scikit-learn, TensorFlow and Keras libraries and run in
Google Colaboratory. As the volume of training and
validation data is relatively low, we used K-fold
cross-validation to optimize the network.
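A hedged sketch of the NN construction and grid search is shown below. It assumes TensorFlow 2.x with the legacy tf.keras scikit-learn wrapper (scikeras is the current alternative); the feature-vector length, class count and grid values mirror the description above, while variable names and the commented-out fit call are placeholders.

from sklearn.model_selection import GridSearchCV
from tensorflow import keras
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

N_FEATURES = 5000   # TF-IDF feature-vector length (1998-9911 depending on the period)
N_CLASSES = 7       # number of device classes

def build_model(hidden_layers=1, units=32, dropout=0.2, optimizer="adam"):
    model = keras.Sequential([keras.layers.InputLayer(input_shape=(N_FEATURES,))])
    for _ in range(hidden_layers):                     # 0, 1 or 2 hidden layers
        model.add(keras.layers.Dense(units, activation="relu"))
        model.add(keras.layers.Dropout(dropout))
    model.add(keras.layers.Dense(N_CLASSES, activation="softmax"))
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

clf = KerasClassifier(build_fn=build_model, verbose=0)
param_grid = {
    "hidden_layers": [0, 1, 2],
    "units": [16, 32, 64],
    "dropout": [0.0, 0.2],
    "optimizer": ["adam", "rmsprop"],
    "epochs": [50, 100, 150],
}
# K-fold cross-validation (cv=5) compensates for the small volume of training data.
search = GridSearchCV(clf, param_grid, cv=5, scoring="f1_weighted")
# search.fit(X_tfidf, y)   # X_tfidf: TF-IDF matrix, y: integer class labels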
3.3 Classification Measures
To benchmark the classification results, we used two
standard measures: precision and recall. We defined
precision (eq. (1)) as the ratio of true positives to the
sum of true positives and false positives; we defined
recall (eq. (2)) as the ratio of true positives to the sum
of true positives and false negatives (elements
belonging to the current category but not classified as
such). Finally, as a concise measure, we used the F1
score (eq. (3)). The F1 score can be interpreted as a
weighted average of precision and recall, where an F1
score reaches its best value at 1 and worst at 0.
precision = TP / (TP + FP)    (1)

recall = TP / (TP + FN)    (2)

F1 = 2 · (precision · recall) / (precision + recall)    (3)
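The sketch below computes eqs. (1)–(3) with scikit-learn, using the support-weighted averaging that is also reported in Figures 3 and 4; the label vectors are toy values, not results from our experiments.

from sklearn.metrics import precision_recall_fscore_support

y_true = ["H", "H", "S", "E", "H", "S"]   # placeholder ground-truth classes
y_pred = ["H", "S", "S", "E", "H", "H"]   # placeholder classifier output

# average="weighted" weights each class by its support (number of true instances).
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)
print(f"precision={prec:.2f} recall={rec:.2f} F1={f1:.2f}")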
4 CLASSIFICATION RESULTS
We tested both classifiers on historical data in one-
year intervals. We took into account data from the
timeframe of 2010–2019. For example, to classify
data from 2018, we used records from the following
ranges: 2010–2017, 2011–2017, and finally only
from 2017. The size of the training and testing data is
given in Table 1. Due to limited space, below we will
present the results of classification for data from 2018
and 2019.
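The evaluation loop can be summarized by the sketch below; it assumes the records are grouped per year in a dictionary mapping a year to (texts, labels), and build_classifier stands for either the SVM pipeline or the NN wrapper described in section 3.2.

def evaluate_year(records, test_year, build_classifier):
    # Classify the records of `test_year` with models trained on every
    # historical window [start, test_year - 1], e.g. 2010-2017, 2011-2017, ..., 2017.
    test_texts, test_labels = records[test_year]
    scores = {}
    for start in range(2010, test_year):
        train_texts, train_labels = [], []
        for year in range(start, test_year):
            texts, labels = records[year]
            train_texts.extend(texts)
            train_labels.extend(labels)
        clf = build_classifier()
        clf.fit(train_texts, train_labels)
        scores[(start, test_year - 1)] = clf.score(test_texts, test_labels)
    return scores

# Example: scores = evaluate_year(records, 2018, build_classifier)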
Figure 2: NN and SVM classification of records from 2018 (confusion matrices; top row – SVM, bottom row – NN). Left – training data from 2017, right – training data from 2012 to 2017.
In Figure 2, we show the confusion matrices for
the classification of 2018’s records trained using data
ranging from 2012 to 2017 and using data from 2017
alone. From a good classifier, we would expect the
majority of records to be on the diagonal. For the
SVM classifier (top row of Figure 2), we can state that
a larger set of training data (i.e. going back further in
time) reduces the quality of the classification. For
example, for "H" class, we have 489 records
classified correctly (84% recall value) when data
from only 2017 is used for training, but only 262
(45%) when we train using a dataset from 2012–2017.
Classification results from other periods (Blinowski
and Piotrowski, 2020) confirm this trend. For NNs
(bottom row of Figure 2), the case is different: there
is no obvious bias towards better classification results
obtained from shorter or longer historical data.
However, if we analyze weighted precision, recall,
and F1 for NN (Figure 3), we can conclude that using
more historical data increases the quality of the
classification.

Figure 3: Weighted precision, recall, and F1 for 2018 records based on training data from 2010–2017, 2011–2017, etc. Top – SVM, bottom – NN.

Further, from the results shown in Figure 3, we
can conclude that the NN classifier gives slightly
better results in terms of all measures. For NN, the
measures are almost stable, with F1 ranging from
58% to 61%, and a trend toward better results with more
historical data is visible. For SVM, on the other
hand, the conclusion is more complex: there is an
improvement in classification measures if data from
2016–2017 is used, but classification degrades if
older data is taken into account. Similar conclusions
can be drawn for the classification of data from 2019
(Figure 4). Again, the NN classifier gives better and more stable
results than SVM (F1 in the range of 69%–74%), and
using more (older) training data is almost always
beneficial. For SVM, F1 varies from 63% to 72%. In
general, NN gives slightly better classification results
than SVM. We should also note that results presented
in Figure 3 and Figure 4 show precision, recall, and
F1 measures weighted with support (the number of
true instances for each label).
We refer the reader to the full datasets
of this study (http://www.ii.pw.edu.pl/~gjb/
CVE_IoT2020/results.zip), where we show precision
and recall rates of up to 80% for strongly populated
categories and of approx. 50% or lower for less
numerous categories.

Figure 4: Weighted precision, recall, and F1 for 2019 records based on data from 2010–2018, 2011–2018, etc. Top – SVM, bottom – NN.
5 SUMMARY
We have proposed a system of automatic
classification of IoT device vulnerabilities listed in
the public CVE/NVD databases. We have divided
vulnerability records into 7 distinct categories
relating to the devices’ field of usage. The hand-
classified database samples were used to train SVM
and NN classifiers to predict categories of “new”
vulnerabilities. The purpose of the classification was
to predict, prevent, and mitigate unknown threats
resulting from newly discovered vulnerabilities.
Given the size and the rate of growth of the public
vulnerability database and the requirement for a rapid
response to new data, this is a task that cannot be done
by hand and must be automated. We attained
weighted classification precision and recall rates of
55%–70%, with better measures for the NN
classifier. These are not ideal results, and in practice
they would require further human intervention
(verification and possibly reclassification). The
problem lies with the data itself – neither CVE nor
CPE provides enough specific data for the NN or SVM
to discern record categories. An additional
vulnerability ontology should be introduced to extend
the information currently provided. This should
include more precise vendor and model data. A
similar conclusion, not directly related to IoT
security, was suggested in (Syed, 2016), where the
authors propose a unified cybersecurity ontology that
incorporates heterogeneous data and knowledge
schemas from different security systems. It is also
worth mentioning that our method can also be used
on numerous other on-line vulnerability databases
such as those managed by companies (e.g. Microsoft
Security Advisories, Tipping Point Zero Day
Initiative, etc.), national CERTs, or professionals’
forums (e.g. Exploit-DB and others). It may also be
worthwhile to integrate information from various
databases – this should increase the precision of the
classification and is a topic of our further research.
REFERENCES
Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari,
M., & Ayyash, M. (2015). Internet of things: A survey
on enabling technologies, protocols, and applications.
IEEE communications surveys & tutorials, 17(4), 2347-
2376.
Antonakakis, M., April, T., Bailey, M., Bernhard, M.,
Bursztein, E., Cochran, J., ... & Zhou, Y. (2017).
Understanding the mirai botnet. In 26th {USENIX}
security symposium ({USENIX} Security 17) (pp. 1093-
1110).
Atzori, L., Iera, A., & Morabito, G. (2010). The internet of
things: A survey. Computer networks, 54(15), 2787-
2805.
Aviram, N., Schinzel, S., Somorovsky, J., Heninger, N.,
Dankel, M., Steube, J., ... & Shavitt, Y. (2016).
{DROWN}: Breaking {TLS} Using SSLv2. In 25th
{USENIX} Security Symposium ({USENIX} Security
16) (pp. 689-706).
Bauer, M., Bui, N., Jardak, C., & Nettsträter, A. (2013). The
IoT ARM reference manual. Enabling Things to Talk,
213.
Blinowski, G. J., & Piotrowski, P. (2020, June). CVE based
classification of vulnerable IoT systems. In
International Conference on Dependability and
Complex Systems (pp. 82-93). Springer, Cham.
Da Xu, L., He, W., & Li, S. (2014). Internet of things in
industries: A survey. IEEE Transactions on industrial
informatics, 10(4), 2233-2243
Durumeric, Z., Li, F., Kasten, J., Amann, J., Beekman, J.,
Payer, M., ... & Halderman, J. A. (2014, November).
The matter of heartbleed. In Proceedings of the 2014
conference on internet measurement conference (pp.
475-488).
EU FP7 (2007). The 7th Framework Programme funded
European Research and Technological Development
from 2007 until 2013; Internet of Things and Future
Internet Enterprise Systems; https://ec.europa.eu/
transport/themes/research/fp7_en, last accessed:
01.03.2021
Huang, G., Li, Y., Wang, Q., Ren, J., Cheng, Y., & Zhao,
X. (2019). Automatic classification method for
software vulnerability based on deep neural network.
IEEE Access, 7, 28291-28298.
Ling, Z., Liu, K., Xu, Y., Gao, C., Jin, Y., Zou, C., ... &
Zhao, W. (2018). Iot security: An end-to-end view and
case study. arXiv preprint arXiv:1805.05853
Liu, Z., Lv, X., Liu, K., & Shi, S. (2010, March). Study on
SVM compared with the other text classification
methods. In 2010 Second international workshop on
education technology and computer science (Vol. 1, pp.
219-222). IEEE.
MITRE. (2020). CVE Common Vulnerabilities and
Exposures database, https://cve.mitre.org/, last
accessed: 02.01.2020
Na, S., Kim, T., & Kim, H. (2016, November). A study on
the classification of common vulnerabilities and
exposures using naïve bayes. In International
Conference on Broadband and Wireless Computing,
Communication and Applications (pp. 657-662).
Springer, Cham.
Neuhaus, S., & Zimmermann, T. (2010, November).
Security trend analysis with CVE topic models. In 2010
IEEE 21st International Symposium on Software
Reliability Engineering (pp. 111-120). IEEE.
NIST. (2020). Security Content Automation Protocol v 1.3,
https://csrc.nist.gov/projects/security-content-
automation-protocol/, Created December 07, 2016,
Updated August 07, 2020, last accessed 02.01.2021.
NIST CPE. (2020). Official Common Platform
Enumeration (CPE) Dictionary, https://csrc.nist.gov/
Projects/Security-Content-Automation-Protocol/Speci
fications/cpe, Created December 07, 2016, Updated
August 07, 2020, last accessed 02.01.2021.
OWASP Top Ten Project. (2021). https://owasp.org/www-
project-top-ten/; last accessed: 01.03.2021
Syed, Z., Padia, A., Finin, T., Mathews, L., & Joshi, A.
(2016). UCO: A unified cybersecurity ontology. UMBC
Student Collection.
Vapnik, V. (1998). Statistical learning theory, New York.
NY: Wiley.
Wang, J. A., & Guo, M. (2010, April). Vulnerability
categorization using Bayesian networks. In
Proceedings of the sixth annual workshop on cyber
security and information intelligence research (pp. 1-
4).