Machine Learning based Malware Traffic Detection on IoT Devices using
Summarized Packet Data
Masataka Nakahara, Norihiro Okui, Yasuaki Kobayashi and Yutaka Miyake
KDDI Research, Inc., 3–10–10, Iidabashi, Chiyoda-ku, Tokyo, Japan
Keywords:
IoT Security, Anomaly Detection, Machine Learning.
Abstract:
As the number of IoT (Internet of Things) devices increases, countermeasures against cyberattacks caused
by IoT devices become more important. Although mechanisms to prevent malware infection of IoT devices
are important, such prevention is becoming harder due to sophisticated infection steps and the lack of
computational resources for security software on IoT devices. Therefore, detecting malware infection of
devices is also important to suppress the spread of malware. As the types of IoT devices and malware
increase, advanced anomaly detection technology such as machine learning is required to find malware-infected
devices. Because IoT devices cannot analyze their own behavior using machine learning due to limited
computing resources, such analysis should be executed at gateway devices to the Internet. This paper proposes
an architecture for detecting malware traffic using summarized statistical data of packets instead of whole
packet information. As this proposal only uses information on the amount of traffic and the destination
addresses for each IoT device, it can reduce the storage space taken up by data and can analyze a number of
IoT devices with low computational resources. We performed malware traffic detection on the proposed
architecture using the machine learning algorithms Isolation Forest and K-means clustering, and show that high
accuracy can be achieved with the summarized statistical data. In the evaluation, we collected statistical
data from 26 IoT devices (9 categories), and obtained the result that the data size required for analysis is
reduced by over 90% while keeping high accuracy.
1 INTRODUCTION
The number of electronic devices that can be con-
nected to the Internet (Internet of Things) is rapidly
increasing, with 8.4 billion devices to be connected
by 2020 and 20.4 billion devices to be connected by
2022 (Hassija et al., 2019). While IoT devices pro-
vide useful functions in people’s lives, security mea-
sures are not sufficient compared to personal comput-
ers or smartphones, which have been the main termi-
nals connected to the Internet so far. For example, in
2016, a malware botnet consisting of 2.5 million IoT
devices was constructed by Mirai, and a DDoS (Dis-
tributed Denial of Service) attack occurred. In addi-
tion, Mirai's source code has been published, and at-
tacks by variant malware based on it have appeared
one after another.
As countermeasures for threats to IoT devices, se-
curity vendors provide various solutions. These solu-
tions have functions for preventing infection, such as
preventing intrusion attempts and preventing access to
malicious sites using a blacklist. On the other hand,
existing solutions have limitations. For example, in
the case of security software for a home network, it
can only detect DoS attacks using TCP packets, and
those using UDP cannot be detected. Also, it can-
not detect communications that attempt to spread in-
fection like host scans or communications with C&C
(Command and Control) servers that give commands
to infected devices. As described above, there are
still many problems in detecting malware infection of
IoT devices, and many studies for improving accuracy
have been conducted.
One of the reasons why it is difficult to detect
malware infections in IoT devices is the limitation
of the processing performance of the device (Zhang
et al., 2014; Nguyen et al., 2019). Conventional de-
vices such as PCs, servers, and smartphones are well
protected by anti-virus software and intrusion detec-
tion systems (Bekerman et al., 2015; Canfora et al.,
2015). On the other hand, IoT devices have low pro-
cessing power, so it is difficult to detect infections in
real time on each device using conventional methods.
Therefore, it is necessary to have a mechanism that
collects and analyzes packets not on the IoT device
itself but on other equipment. For example, packet ag-
gregation and analysis may be performed at the home
gateway through which IoT device packets flow to
the Internet (Nguyen et al., 2019; Santoso and Vun,
2015). However, the home gateway cannot execute
high-load processing such as packet payload analy-
sis due to restrictions on its processing performance.
Given these constraints, it is desirable to detect malware in-
fection of IoT devices while keeping the load on the
home gateway low.
In this paper, an anomaly detection system that
satisfies such requirements is proposed. The proposed
system converts packets passing through the home
gateway into statistical data, sends them to the analy-
sis server, and detects anomalies at the analysis server.
Sending all packets that pass through the home gate-
way to the analysis server, including the payload, is
not realistic from the viewpoint of processing load
and traffic. However, statistical data is lightweight
and can be sent to the analysis server. The remaining
issue is whether anomalies can be detected accurately
based only on statistical data.
Therefore, in this paper, we focused on the behav-
ior of IoT devices before and after malware infection.
By learning the behavior of IoT devices before mal-
ware infection, we could detect malware traffic be-
havior. We then evaluated the anomaly detection per-
formance.
The contributions of this paper are as follows:
• A system architecture applicable to home network
anomaly detection.
• Anomaly detection using lightweight data and an
unsupervised algorithm.
• Experiments on various types of IoT devices and
malware.
2 BACKGROUND
In this section, as background to this paper, we in-
troduce trends in IoT security and malware, and re-
lated work.
2.1 IoT Security
Currently, IoT devices are spreading rapidly and the
number of devices is increasing. Compared to PCs
or smartphones, which have been the main devices
connected to the Internet, there are many different
types of IoT devices.
Therefore, security measures need to be taken from
various viewpoints (Zhang et al., 2014; Hassan et al.,
2019). For example, in the case of a smart camera
installed in the home, measures against data leakage
are important from the viewpoint of privacy protec-
tion. For devices equipped with Android OS, it is
necessary to deal with Android vulnerabilities as well
as software vulnerabilities. In addition, the common
problem with many devices is authentication. There
are many cases where passwords are broken because
the user has not changed the simple password that
was set at the time of shipment. Above all, malware
is considered the biggest
threat. IoT devices often have lower processing capa-
bility than existing devices, and it may not be possible
to secure the resources necessary for the original pro-
cessing due to the operation of malware. Also, as rep-
resented by the large-scale DDoS attack by Mirai in
2016, various types of malware are currently appear-
ing and becoming threats. Therefore, many malware
detection methods have been studied.
2.2 Malware
The first IoT malware was Linux.Darlloz, discov-
ered by Symantec in 2013 (Zhang et al., 2014).
Linux.Darlloz is a worm that spreads by accessing
randomly generated IP addresses using a list of com-
monly used IDs and passwords and downloading
samples of itself. Not only did Linux-based surveil-
lance cameras become infected, making privacy pro-
tection problems apparent, but variants targeting other
OSs such as Android subsequently emerged, which
increased the importance and urgency of malware
countermeasures in IoT security.
The most representative malware for IoT is Mirai,
which appeared in 2016 (Kolias et al., 2017). Like
Darlloz, Mirai uses a list of frequently used IDs and
passwords to spread and construct a botnet that oper-
ates in response to commands from the C&C server.
The IoT device that becomes part of the botnet re-
ceives instructions from the C&C server and conducts
further infection spread and DDoS attacks. In October
2016, attacks on websites with 620Gbps traffic and
attacks on cloud hosting services with 1.1Tbps traffic
occurred almost simultaneously. Later, Mirai’s source
code was published, and variants such as Hajime also
appeared. While Mirai has a centralized architecture,
Hajime has a distributed one, making its malicious
behavior more difficult to detect. As countermea-
sures against malware infection, manual measures
such as frequently changing passwords are also taken,
but they do not scale to the ever-increasing number of
IoT devices, so many automatic countermeasures are
being considered.
2.3 Related Works
There are two types of malware countermeasures in
IoT security: a method to prevent infection and a
method to detect behavior after infection. Of these,
the security products currently distributed for home
networks focus on preventing infection. For example, if
the user tries to access a malicious site such as a mal-
ware download destination, it can be blocked. On the
other hand, detection of behavior after infection is not
supported by currently distributed security products.
In addition, malware infection routes are diverse, so
it is difficult to take countermeasures just by prevent-
ing infection, and quick detection after infection is
required. Therefore, much research on behavior
detection after infection has been carried out.
For example, (Mizuno et al., 2017) distinguish be-
nign and malware traffic from HTTP headers using
machine learning such as SVM (Support Vector Ma-
chine), Random Forest, and deep learning. In (Su
et al., 2018), a DDoS attack is detected using a CNN
(Convolutional Neural Network) by translating bina-
ries into 8-bit sequences and grayscale images. In
(Alam and Vuong, 2013), Mirai scan, infection, and
attack are detected using Random Forest and AdaBoost. (Meidan et al.,
2017) recognizes which IoT device sends captured
data using Random Forest and GBM (Gradient Boost-
ing Machine). Furthermore, there are many studies of
anomaly detection with supervised machine learning
algorithms such as (Madeira and Nunes, 2016; Hasan
et al., 2019; Doshi et al., 2018; Zolanvari et al., 2019;
Kumar and Lim, 2019; Ding and Fei, 2013).
As introduced above, there is much research
that analyzes IoT device communication by machine
learning, but most of it uses whole packets or super-
vised learning, which requires anomalies in the training data.
On the other hand, for anomaly detection on a home
network, it is difficult to use whole packets includ-
ing the payload because of the restricted processing
resources of the home gateway. If the user's devices
are not infected by malware, anomalous packets are
not included in the training period, so unsupervised
learning is required. In order to detect anomalies with
unsupervised learning, we need to focus on benign
packets and learn their features. IoT device behavior
is relatively limited, so learning normal behavior for
each device enables us to detect anomalies without a
supervisor.
Considering the above, we propose an anomaly
detection system for the home network under the fol-
lowing conditions:
• Whole packets are not used.
• Unsupervised learning is used.
• Normal behavior is learned for each device.
3 PROPOSED SYSTEM
This section presents the proposed system for home
network anomaly detection. Figure 1 shows an
overview of the proposed system.
3.1 Home Gateway
IoT devices are connected to the home network,
and each device sends packets to the public network
through the home gateway. For example, a certain
camera device sends the state of the home to an ex-
ternal server at regular intervals, and the state of the
home can be confirmed from outside through a smart-
phone. Separately, it communicates to check for
firmware updates.
munications go through the home gateway, so if the
home gateway has a packet monitoring function, de-
tecting an anomaly in the behavior of IoT devices is
possible.
However, as described in Section 1, it is impos-
sible to inspect all the packets in detail on the home
gateway due to resource restriction. Therefore, the
proposed system aggregates statistical data of pack-
ets, and reduces the processing load of the home gate-
way. An example of the statistical data is shown in
Table 1.
During one statistical cycle, all packets with the
same source and destination MAC address, IP ad-
dress, and port number are aggregated into a single
record. This record is called the statistical record. In
a statistical record, “count” shows the number of ag-
gregated packets, and “length” shows the summation
of the length of the aggregated packets. In this example,
the statistical cycle is 10 seconds. The record on the
first line indicates that two packets totaling 294 bytes were
sent in 10 seconds for one combination of source and
destination. As the records on the 4th and 5th lines
show, packets sent in the same statistical period but
to another address or port become a separate
record.
The statistical cycle can be adjusted considering
the processing load of the home gateway. The shorter
the statistical cycle becomes, the more the processing
load increases, but anomalies can be detected earlier.
In this paper, the statistical cycle is set to 10 seconds
so that attacks such as DoS and host scans can be
detected early.
In the proposed system, these statistical packet
records are sent to the analysis server in every sta-
tistical period. The analysis server detects anomalies
for each received record, and if an anomaly record
is found, notification is sent to the home gateway. If
the home gateway receives an anomaly notification, it
sends a notification to the user or stops communication
of the device, and so forth.
Figure 1: System model.
Table 1: An example of statistical information of packets.
Timestamp | mac src | mac dst | ip src | ip dst | port src | port dst | protocol | count | length
2019-06-30 15:00:10 | a1:b1:c1:d1:e1:f1 | a3:b3:c3:d3:e3:f3 | 192.168.1.121 | 239.255.255.xxx | 48993 | 1900 | udp | 2 | 294
2019-06-30 15:00:10 | a2:b2:c2:d2:e2:f2 | a4:b4:c4:d4:e4:f4 | 192.168.1.122 | 239.255.255.yyy | 48993 | 1900 | udp | 1 | 76
2019-06-30 15:00:10 | a2:b2:c2:d2:e2:f2 | a4:b4:c4:d4:e4:f4 | 192.168.1.122 | 239.255.255.yyy | 8883 | 53612 | tcp | 1 | 91
2019-06-30 15:00:20 | a2:b2:c2:d2:e2:f2 | a4:b4:c4:d4:e4:f4 | 192.168.1.122 | 239.255.255.yyy | 8883 | 53612 | tcp | 1 | 156
2019-06-30 15:00:20 | a2:b2:c2:d2:e2:f2 | a5:b5:c5:d5:e5:f5 | 192.168.1.122 | 239.255.255.zzz | 48993 | 1900 | udp | 1 | 173
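The aggregation performed at the home gateway can be illustrated with a short sketch. The following Python code is not the authors' implementation but a minimal example, under an assumed Packet representation, of how captured packets could be folded into statistical records of the form shown in Table 1.

```python
from collections import defaultdict
from dataclasses import dataclass

# Illustrative packet representation; the field names are assumptions,
# not the format used by the authors' gateway.
@dataclass
class Packet:
    timestamp: float      # epoch seconds
    mac_src: str
    mac_dst: str
    ip_src: str
    ip_dst: str
    port_src: int
    port_dst: int
    protocol: str         # "tcp" or "udp"
    length: int           # bytes

def summarize(packets, cycle=10):
    """Aggregate packets into statistical records per cycle (seconds).

    Packets that share the same cycle window, MAC addresses, IP addresses,
    port numbers and protocol are folded into one record holding a packet
    count and a total length, as in Table 1.
    """
    records = defaultdict(lambda: [0, 0])  # key -> [count, length]
    for p in packets:
        window = int(p.timestamp // cycle) * cycle
        key = (window, p.mac_src, p.mac_dst, p.ip_src, p.ip_dst,
               p.port_src, p.port_dst, p.protocol)
        records[key][0] += 1
        records[key][1] += p.length
    return [key + tuple(v) for key, v in sorted(records.items())]
```

Because only the key fields, a count, and a total length are kept, the output grows with the number of distinct flows per cycle rather than with the number of packets, which is what makes the records lightweight enough to forward to the analysis server.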
3.2 Analysis Server
The analysis server receives statistical data processed
by the home gateway as described in Section 3.1, and
detects anomalous packets. Since the analysis server
is deployed outside the home network, such as in
the cloud, heavy-load processing that cannot be per-
formed by the home gateway can be performed there.
Thus, we use a machine learning approach for
anomaly detection on the analysis server.
For a certain period after the device is connected
to the home network, the home gateway sends statis-
tical data to the analysis server as training data. At
this point, we assume that the device is not infected
by malware during the training period and only the
device’s original communication is performed. Using
training data, the analysis server learns benign com-
munication of the device, and creates the model. Con-
crete features and learning algorithms are described in
Section 4.
After the training period, the home gateway
sends statistical data for detection, and the analysis
server judges whether the received data are benign or
anomalous. If anomaly records are found, it notifies
the home gateway.
4 DETECTION METHOD
This section presents anomaly detection on the analy-
sis server which is described in Section 3.
4.1 Feature Vector
As described in Section 3, in this paper, we use ma-
chine learning for anomaly detection on the analysis
server. For machine learning, we need to generate fea-
ture vectors from statistical data sent from the home
gateway. In this paper, a 21-dimensional feature vector is
generated from the destination IP address, number of
packets, protocol, etc. included in the statistical data.
Descriptions of the features are shown in Table 2.
Thresholds of "Num of packets" and "Length of
packets" have several variations. The number of
packets is classified as below the average, below
average+σ (where σ is the standard deviation), below
average+2σ, below average+3σ, or more than
average+3σ. The length thresholds are set in the
same way.
In order to clarify the features of each record, each
feature was encoded as a binary value of 0 or 1. For
example, in the case of Dest IP, a method of linking
one number to one address is also conceivable. How-
ever, the linked number and the address value have no
relation, so it leads to erroneous learning. Therefore,
we used one-hot encoding with the definitions shown
in Table 2.
Table 2: Feature vectors.
Feature | Definition
Dest IP | The destination IP address is included in the dictionary data.
Dest IP (24bit) | The first 24 bits of the destination IP are included in the dictionary data.
Dest port | The destination port is included in the dictionary data.
Dest IP & port pair | The pair of destination IP and port is included in the dictionary data.
Well known port | The destination port number is below 1024.
Protocol | The protocol is TCP.
Has response | There is a record whose source IP & port pair equals this record's destination IP & port pair.
Response count | The number of response packets is larger than that of the record.
Response length | The length of the response packets is larger than that of the record.
Has similar packet | There are packets that differ only in the destination port or source port.
Num of packets | The number of packets is below a threshold.
Length of packets | The length of packets is below a threshold.
Outbound | The record is from the internal network to the outside.
As for Dest IP or port, we created a list of desti-
nations during a particular period, and if the IP in one
record is included in the list, Dest IP of the record is
set to 1, otherwise 0. Here, the list is called the dictio-
nary data, and the period used to create the dictionary
data is called the dictionary period. The dictionary
period is different from the training term of machine
learning; it is a part of the training term or a period be-
fore the training term. By using the dictionary period,
the training can take into account variations of the
destinations during the training term, which contributes
to reducing the false detection rate. For example,
in the case of a smart speaker, it may communicate
with a new destination because of new usage even if
the communication is benign. Without the dictionary
period, such benign communication with a new desti-
nation could be judged as an anomaly. However, with
the dictionary period, communication with a new des-
tination can be learned as benign behavior, because the
training term contains both communications that are
included in the dictionary data and communications
that are not. Thus
we adopted the dictionary period for generating fea-
tures.
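To illustrate how such features could be derived in practice, the following sketch builds dictionary data from the records of an assumed dictionary period and encodes a subset of the binary features of Table 2. The record layout and the helper names (build_dictionary, count_thresholds, encode) are illustrative assumptions, not the authors' code.

```python
import statistics

def build_dictionary(dict_records):
    """Collect destinations observed during the dictionary period."""
    ips = {r["ip_dst"] for r in dict_records}
    prefixes = {".".join(r["ip_dst"].split(".")[:3]) for r in dict_records}  # first 24 bits
    ports = {r["port_dst"] for r in dict_records}
    pairs = {(r["ip_dst"], r["port_dst"]) for r in dict_records}
    return ips, prefixes, ports, pairs

def count_thresholds(train_records):
    """Average and standard deviation of the packet count, used for thresholds."""
    counts = [r["count"] for r in train_records]
    return statistics.mean(counts), statistics.pstdev(counts)

def encode(record, dictionary, mean, sigma):
    """Encode one statistical record into a partial binary feature vector."""
    ips, prefixes, ports, pairs = dictionary
    ip, port, cnt = record["ip_dst"], record["port_dst"], record["count"]
    features = [
        int(ip in ips),                                # Dest IP
        int(".".join(ip.split(".")[:3]) in prefixes),  # Dest IP (24bit)
        int(port in ports),                            # Dest port
        int((ip, port) in pairs),                      # Dest IP & port pair
        int(port < 1024),                              # Well known port
        int(record["protocol"] == "tcp"),              # Protocol is TCP
    ]
    # Five threshold features for the number of packets
    # (below the average, +1 sigma, +2 sigma, +3 sigma, above +3 sigma).
    bounds = [mean, mean + sigma, mean + 2 * sigma, mean + 3 * sigma]
    features += [int(cnt <= b) for b in bounds] + [int(cnt > bounds[-1])]
    return features
```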
4.2 Classifier
Based on the features generated above, the analysis
server distinguishes between benign and anomalous
statistical records. One method adopted in this paper
is Isolation Forest. Isolation Forest builds an ensem-
ble of trees for a given data set, then anomalies are
detected as instances which have short average path
lengths on the trees (Liu et al., 2008). The depth in a
tree corresponds to the number of partitions required to
separate an instance. Anomalous instances should be far from
other instances, so a smaller number of partitions is
needed and the path length on the tree is short.
tion Forest focuses on the property whereby anoma-
lies have attribute-values that are very different from
those of normal instances. Therefore, it matches our
approach that creates a model from the normal behav-
ior of IoT devices.
Isolation Forest can detect anomalies in test data
even if no anomaly data is included in the training
term. Many other algorithms need anomaly data in the
training term because they classify test data based on
the classes that appeared in the training term. Further-
more, as Isolation Forest works at high speed, we
adopted it in this paper.
Isolation Forest has the following parameters: the
sub-sampling size and the number of trees. Accord-
ing to (Liu et al., 2008), even if the number of trees
is larger than 10, the average path length, which de-
termines the anomaly score, does not change greatly,
so we set the number of trees to 10. As for the sub-
sampling size, we used 10, 50, 90 and 100% of the
statistical records. The case of 90% showed the best
performance, so the results shown in the following
sections are those for 90%.
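A minimal sketch of this setup with scikit-learn's IsolationForest is shown below; the parameters follow the values reported in this section (10 trees, 90% sub-sampling), while the feature matrices are random placeholders standing in for the encoded statistical records.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Placeholders for the 21-dimensional binary feature vectors:
# X_train holds benign records from the training term,
# X_test holds records observed after the training term.
X_train = np.random.randint(0, 2, size=(1000, 21))
X_test = np.random.randint(0, 2, size=(200, 21))

clf = IsolationForest(
    n_estimators=10,                      # number of trees (Section 4.2)
    max_samples=int(0.9 * len(X_train)),  # 90% sub-sampling size
    random_state=0,
)
clf.fit(X_train)

# predict() returns -1 for anomalies and 1 for inliers.
pred = clf.predict(X_test)
anomaly_records = np.where(pred == -1)[0]
```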
4.3 Clustering
As for unsupervised learning, clustering is still one of
the popular approaches for classification tasks. Clus-
tering groups similar data into partitions called
clusters, and several algorithms have been proposed.
We selected K-Means clustering (MacQueen et al.,
1967), which is one of the most popular clustering
methods, and applied it to anomaly detection.
In the training phase, clusters of normal data are
created from training data that contains only benign
data. Since packet contents differ from device to de-
vice, cluster configurations such as the size of each
cluster or the number of clusters also differ from de-
vice to device. The training phase is shown in Figure
2.
In the case of K-Means, the number of clusters
needs to be set as a parameter before training,
and we use the silhouette score (Rousseeuw,
1987) to determine the best number of clusters.
Figure 2: Training phase of K-Means.
After the training phase, we calculate the distance
between each data point and the center of its nearest
cluster. One of these distances is indicated by
d in Figure 2. The distance corresponding to a certain
value of cumulative probability is used as the threshold
to separate benign data from anomaly data. This value
also differs from device to device and is indicated by
P in Figure 3.
Figure 3: Threshold of Distance.
In the test phase, the distance between each data
point and the center of its nearest cluster is
calculated. If the distance is larger than the threshold,
the data is regarded as anomaly data.
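The training and test phases described above can be sketched with scikit-learn as follows. The candidate range of cluster counts and the cumulative-probability value p are illustrative assumptions; the paper does not specify them.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def fit_kmeans(X_train, k_candidates=range(2, 41)):
    """Pick the cluster count with the best silhouette score and fit K-Means."""
    best_k = max(k_candidates,
                 key=lambda k: silhouette_score(
                     X_train,
                     KMeans(n_clusters=k, random_state=0, n_init=10)
                     .fit_predict(X_train)))
    return KMeans(n_clusters=best_k, random_state=0, n_init=10).fit(X_train)

def distance_threshold(model, X_train, p=0.99):
    """Distance covering a cumulative probability p of the training data
    (the value P in Figure 3); p is an assumed example value."""
    d = np.min(model.transform(X_train), axis=1)  # distance to nearest center
    return np.quantile(d, p)

def detect(model, X_test, threshold):
    """A record farther from its nearest cluster center than the threshold
    is regarded as an anomaly."""
    d = np.min(model.transform(X_test), axis=1)
    return d > threshold
```

Because the cluster configuration and the distance threshold are computed per device, each device ends up with its own model, matching the per-device training described in Section 2.3.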
5 PERFORMANCE EVALUATION
This section presents how to evaluate the performance
of the proposed anomaly detection system and its re-
sults.
5.1 Evaluation Data
In order to evaluate the performance of anomaly de-
tection, both benign and anomaly communication are
required. In this section, we describe how to prepare
each data.
5.1.1 Benign Data
For benign data, we operated actual IoT devices about
twice a week and captured their packets. For exam-
ple, in the case of a smart camera, we took or browsed
pictures through a smartphone, and in the case of a
smart speaker, we asked about the weather at the cur-
rent location or asked for translations of some words. The
categories of the IoT devices are shown in Table 3.
Table 3: IoT device list.
Category No. of devices
Smart camera 7
Smart speaker 5
IoT device controller 4
Door phone 3
Environment sensor 2
Light 2
Cleaner 1
Smart TV 1
Remote controller 1
Total 26
In our experiment environment, there are some
gateways. Some of these gateways are associated with
an IoT device and translate the IoT device's
specific communication protocol into IP packets.
These gateways are categorized together with the original device.
For example, the category “Door phone” includes the
actual door phone and gateway for the door phone.
Other gateways integrate communications of several
IoT devices, and they are categorized as the “IoT de-
vice controller”.
For our evaluation, one-month data was extracted
from the captured packets, and processed into statis-
tical records as described in Section 3. Among them,
the statistical records for the first two weeks were
used as training data, and the remainder were com-
bined with the anomaly communications as test data.
The dictionary period described in Section 4 is set to
the first week of the training term.
Here, the size of the raw packet data is 42 GB, in-
cluding the packets of 26 devices for one month. After
processing into statistical records, the data size was
reduced to 2.9 GB. We thus succeeded in reducing the
data size by 93%.
5.1.2 Anomaly Data
As anomaly data, we created packets and their sta-
tistical records that simulate the major behaviors of
malware in IoT devices: communication with a C&C
server, host scan, and DoS attack. The simulated mal-
ware behaviors are shown in Table 4.
Table 4: Malware list.
Type       Cycle       No. of records
C&C        0.33 [sec]  56506
C&C        1 [min]     43287
C&C        1 [hour]    721
C&C        12 [hour]   61
C&C        24 [hour]   31
Type       No. of dest per sec   No. of records
Host scan  100                   120000
Host scan  200                   120000
Host scan  500                   120000
Host scan  1000                  120000
Host scan  3000                  120000
Type       No. of packets per sec   No. of records
DoS        100                      103740
DoS        500                      103740
DoS        1000                     103740
DoS        1500                     103740
DoS        3000                     103740
In order to evaluate the performance of detection
in detail, we created 5 patterns for each type of behav-
ior so that the detection difficulty differed.
As for C&C, the cycle of communication is var-
ied. For a host scan, the number of scan targets is
varied. The values in Table 4 show the num-
ber of targets per second. As for DoS, the number
of packets is varied.
In each evaluation, one of these 15 types
of malware behaviors is selected and mixed into the
benign communication of the test data, and it is deter-
mined whether the system can detect the anomaly cor-
rectly or not.
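As an illustration of what such simulated records could look like, the sketch below generates host-scan-like statistical records in the record format of Section 3.1. The destination range, destination port, and packet length are assumptions; the paper does not give these details.

```python
import random

def host_scan_records(start, duration=600, dests_per_sec=100, cycle=10,
                      src_ip="192.168.1.121"):
    """Generate statistical records imitating a host scan.

    Each scanned destination becomes its own record (one packet, small
    length), so the number of records grows with the number of
    destinations per second times the statistical cycle.
    """
    records = []
    for window in range(start, start + duration, cycle):
        for _ in range(dests_per_sec * cycle):
            dst = "10.%d.%d.%d" % (random.randint(0, 255),
                                   random.randint(0, 255),
                                   random.randint(1, 254))
            records.append({
                # Port 23 and a 60-byte packet are assumed values for
                # illustration only.
                "timestamp": window, "ip_src": src_ip, "ip_dst": dst,
                "port_src": random.randint(49152, 65535), "port_dst": 23,
                "protocol": "tcp", "count": 1, "length": 60,
            })
    return records
```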
5.2 Evaluation Flow
The evaluation flow is shown in Figure 4.
Figure 4: Evaluation flow.
First, normal communications of the IoT devices
and malware communications are prepared as de-
scribed above. Next, we extracted all communica-
tions of one device and all communications of one
malware from the input data. At this time, 0 is as-
signed to the record of normal communication and 1
is assigned to the record of abnormal communication
as correct labels for training and accuracy evaluation.
Then, we created features using the IP address,
protocol, and so on, as described in Section 4. After
that, these data were divided into training data and
test data, and a model was generated from the train-
ing data. Malware communications were then mixed
into the test data, creating a state where the device
was infected with malware. The test data was input to
the model, and it was judged whether each statistical
record was benign or anomalous.
Next, the malware to be mixed was changed to
another type and the above operation was performed.
For example, if C&C communication with a commu-
nication cycle of 0.33 s was targeted, C&C commu-
nication with another cycle or a host scan was
mixed instead.
all the devices to be evaluated, and the detection re-
sults were obtained.
The detection results were obtained in the form of
a confusion matrix. Here, the anomaly to be detected
is Positive, and benign is Negative. For the evalua-
tion metric, we used TPR (True Positive Rate), FPR
(False Positive Rate), and MCC (Matthew’s Correla-
tion Coefficient). MCC is a metric used for prediction
accuracy evaluation when the ratio of normal data and
abnormal data is unbalanced (Matthews, 1975). In
this evaluation, we used many types of devices and
malwares, and in many cases, the number of data is
unbalanced. For example, in the case of a device with
a small number of communications, the ratio of ab-
normal communication becomes large, so even if all
communications are judged abnormal, the accuracy
increases. Therefore, we used MCC instead of preci-
sion or accuracy. MCC takes a value of 1 if the predic-
tion results are all correct and -1 if they are all incor-
rect. Using TP (True Positive), FN (False Negative),
FP (False Positive), TN (True Negative) in the confu-
sion matrix, these metrics are defined as the following
equations:
TPR = \frac{TP}{TP + FN}

FPR = \frac{FP}{FP + TN}

MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
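For reference, the three metrics can be computed directly from the confusion-matrix counts. The small helper below follows the definitions above; the example values are those of the confusion matrix reported later in Table 5.

```python
import math

def rates(tp, fn, fp, tn):
    """TPR, FPR and MCC from confusion-matrix counts."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tpr, fpr, mcc

# Confusion matrix of Table 5:
# TP=56506, FN=0, FP=1587, TN=115714 -> TPR=1.0, FPR=0.014, MCC=0.98
print(rates(56506, 0, 1587, 115714))
```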
5.3 Evaluation Result for Isolation
Forest
First, we mixed C&C communication with a cycle
of 0.33 seconds into the traffic of one smart camera.
The result of anomaly detection is shown in Table 5.
Table 5: Result of C&C detection from one smart camera.
                   Prediction: Malware   Prediction: Benign
Answer: Malware    56506                 0
Answer: Benign     1587                  115714
The number of records of the original device is
117301, and that of malware is 56506. Here, TPR is
1 and FPR is 0.014. As the MCC is 0.980, it indicates
a high detection accuracy even when data imbalance
is taken into account.
Next, the above evaluation was performed on 15
types of malware and 26 devices. The results of aver-
aging TPR and FPR over the devices for each mal-
ware type are shown in Table 6.
Table 6: Result of detection for all devices.
Type       Cycle   TPR     FPR
C&C        0.33s   1.000   0.078
C&C        1m      0.965   0.078
C&C        1h      0.965   0.078
C&C        12h     0.977   0.078
C&C        1d      0.965   0.078
Type       No. of dest per sec   TPR     FPR
Host scan  100                   1.000   0.078
Host scan  500                   1.000   0.078
Host scan  1000                  1.000   0.078
Host scan  1500                  1.000   0.078
Host scan  3000                  1.000   0.078
Type       No. of packets per sec   TPR     FPR
DoS        100                      1.000   0.078
DoS        200                      1.000   0.078
DoS        500                      1.000   0.078
DoS        1000                     1.000   0.078
DoS        3000                     1.000   0.078
Time Evaluation. Anomaly detection should finish
within the statistical cycle. In this paper, the statis-
tical cycle is 10 seconds, so detection should finish
within 10 seconds. We therefore measured the detection
time for data in which the normal communication of
each device was mixed with host scan communica-
tion with 3000 destinations per second. As the tar-
get, we chose the 10-second window with the highest
number of statistical records during the test period.
Considering that the statistical record counts communi-
cations with different destinations as different records,
the host scan produces the largest number of records
among the malware behaviors. The number of target
records is 30,027 on average per device. As a result,
the maximum time to detect for one device was 0.37
seconds, and total time for all devices was 3.22 sec-
onds, indicating the possibility of detection within the
statistical cycle.
Accuracy Evaluation. As for TPR, the results for the
host scan and DoS are 1, so all malware behaviors are
detected. As for C&C, TPR is over 0.95 but did not reach 1.
As for FPR, all the results are the same, because
the benign records are common. The FPR is 0.078, which
is higher than the result in Table 5. The average
MCC was 0.755, because there are many false pos-
itives on some devices. This raises the concern that a suc-
cessful model can be generated only under particular con-
ditions. Therefore, in order to evaluate each device, the
detection results of all malware were aggregated for
each device type. Average TPR and FPR are shown
in Table 7.
Table 7: Result of malware detection for each device type.
Device type No. of dev TPR FPR
Smart camera 7 1.000 0.033
Smart speaker 5 1.000 0.163
IoT device controller 4 0.945 0.062
Door phone 3 1.000 0.066
Environment sensor 2 0.999 0.055
Light 2 1.000 0.108
Cleaner 1 1.000 0.034
Smart TV 1 1.000 0.128
Remote controller 1 1.000 0.037
This result shows that malware can be detected
on many devices with high TPR, but smart speakers
and the smart TV have higher FPR. These devices have
many patterns of behavior, so it is difficult to distin-
guish malware from new benign behavior.
5.4 Evaluation Result for K-means
Clustering
Subsequently, we performed the same evaluation as
above with K-means clustering. The results of aver-
aging TPR and FPR over the 26 devices for each of the
15 types of malware are shown in Table 8.
Table 8: Result of detection for all devices (K-means).
Type       Cycle   TPR     FPR
C&C        0.33s   1.000   0.003
C&C        1m      0.999   0.003
C&C        1h      0.999   0.003
C&C        12h     0.997   0.003
C&C        1d      0.995   0.003
Type       No. of dest per sec   TPR     FPR
Host scan  100                   0.856   0.003
Host scan  500                   0.856   0.003
Host scan  1000                  0.856   0.003
Host scan  1500                  0.856   0.003
Host scan  3000                  0.856   0.003
Type       No. of packets per sec   TPR     FPR
DoS        100                      1.000   0.003
DoS        200                      1.000   0.003
DoS        500                      1.000   0.003
DoS        1000                     1.000   0.003
DoS        3000                     1.000   0.003
Time Evaluation. Similar to the Isolation Forest case,
we measured the time needed to detect the records of a
10-second window.
The required time depends on the number of clusters: in
the case of 40 clusters, the total time for detection of
all devices was 8.4 seconds. The average number of
clusters used here is 34, so it is also possible to detect
malware within the statistical cycle.
Accuracy Evaluation. FPR is 0.003, which is much
better than the result of Isolation Forest.
As for TPR, the result for DoS is 1, but the host scan
result is lower than that of Isolation Forest. This can
be attributed to some devices having records with
length and count similar to the host scan packets. Both
benign and anomaly records use a well-known port.
Therefore, the distance between benign and anomaly
records is shortened. On the other hand, for C&C,
TPR is over 0.99 even in the case of longer cycles.
The distance between C&C and benign records seems
to be long.
The average MCC is 0.816, which is higher than
that of Isolation Forest. The MCC of Isolation Forest
was low in C&C detection, but K-means succeeded
in detecting C&C with almost no false positives, so its
MCC is higher.
Next, the aggregated result of all malware detec-
tion for each device type is shown in Table 9.
Table 9: Result of malware detection for each device type
(K-means).
Device type No. of dev TPR FPR
Smart camera 7 0.915 0.002
Smart speaker 5 0.917 0.007
IoT device controller 4 1.000 0.0005
Door phone 3 0.972 0.0006
Environment sensor 2 0.958 0.0001
Light 2 0.958 0.0007
Cleaner 1 1.000 0.0002
Smart TV 1 1.000 0.025
Remote controller 1 1.000 0.0004
For most device types, the FPR is lower than
0.001; only the smart TV exceeds 0.01. As de-
scribed above, the smart TV has many patterns of behav-
ior, so it is difficult to distinguish malware from benign behavior.
Finally, a comparison of the results of Isola-
tion Forest and K-means is summarized in Table 10.
Table 10: Comparison of the two methods.
                     Isolation Forest   K-means
TPR                  O                  X
FPR                  X                  O
MCC                  X                  O
C&C detection        X                  O
Host scan detection  O                  X
DoS detection        O                  O
Speed                O                  O
Isolation Forest was better in TPR. On the con-
trary, FPR and MCC were much better in K-means.
C&C was difficult for Isolation Forest, but K-means
showed good detection. However, Isolation Forest
could detect the host scan better than K-means. Both
methods could detect the DoS attack. The speed of both
methods was sufficient for detection within the statistical
cycle.
6 CONCLUSIONS
In this paper, we proposed a system that detects ab-
normalities of IoT devices in a home network by
sending statistical information from the home gate-
way to an analysis server. Although the informa-
tion used for anomaly detection is reduced in the
statistical records, we confirmed that anomalies
of many devices in the experiment can be detected.
The proposed system reduced the data size required
for analysis by over 90% and was still able to achieve
high accuracy with Isolation Forest and K-means
clustering. Future issues include improved detection
performance for more devices such as smart speakers
or smart TVs, improved detection performance for de-
vices with little data during the learning period, and
countermeasures when devices are infected with mal-
ware during the learning period. The evaluation of the
load and latency on the home gateway is also impor-
tant. Additionally, the IoT traffic dataset we used in
this paper includes limited use cases. If we use a more
realistic dataset, such as one collected from several
actual homes, our proposed system becomes more
significant.
REFERENCES
Alam, M. S. and Vuong, S. T. (2013). Random forest
classification for detecting android malware. In 2013
IEEE International Conference on Green Computing
and Communications and IEEE Internet of Things and
IEEE Cyber, Physical and Social Computing, pages
663–669. IEEE.
Bekerman, D., Shapira, B., Rokach, L., and Bar, A. (2015).
Unknown malware detection using network traffic
classification. In 2015 IEEE Conference on Communi-
cations and Network Security (CNS), pages 134–142.
IEEE.
Canfora, G., De Lorenzo, A., Medvet, E., Mercaldo, F.,
and Visaggio, C. A. (2015). Effectiveness of opcode
ngrams for detection of multi family android malware.
In 2015 10th International Conference on Availability,
Reliability and Security, pages 333–340. IEEE.
Ding, Z. and Fei, M. (2013). An anomaly detection ap-
proach based on isolation forest algorithm for stream-
ing data using sliding window. IFAC Proceedings Vol-
umes, 46(20):12–17.
Doshi, R., Apthorpe, N., and Feamster, N. (2018). Ma-
chine learning ddos detection for consumer internet of
things devices. In 2018 IEEE Security and Privacy
Workshops (SPW), pages 29–35.
Hasan, M., Islam, M. M., Zarif, M. I. I., and Hashem, M.
(2019). Attack and anomaly detection in iot sensors in
iot sites using machine learning approaches. Internet
of Things, 7:100059.
Hassan, W. H. et al. (2019). Current research on internet of
things (iot) security: A survey. Computer Networks,
148:283–294.
Hassija, V., Chamola, V., Saxena, V., Jain, D., Goyal, P., and
Sikdar, B. (2019). A survey on iot security: Applica-
tion areas, security threats, and solution architectures.
IEEE Access, 7:82721–82743.
Kolias, C., Kambourakis, G., Stavrou, A., and Voas, J.
(2017). Ddos in the iot: Mirai and other botnets. Com-
puter, 50(7):80–84.
Kumar, A. and Lim, T. J. (2019). Edima: Early detection
of iot malware network activity using machine learn-
ing techniques. In 2019 IEEE 5th World Forum on
Internet of Things (WF-IoT), pages 289–294.
Liu, F. T., Ting, K. M., and Zhou, Z.-H. (2008). Isolation
forest. In 2008 Eighth IEEE International Conference
on Data Mining, pages 413–422. IEEE.
MacQueen, J. et al. (1967). Some methods for classification
and analysis of multivariate observations. In Proceed-
ings of the fifth Berkeley Symposium on Mathematical
Statistics and Probability, volume 1, pages 281–297.
Oakland, CA, USA.
Madeira, R. and Nunes, L. (2016). In 2016 Eleventh Inter-
national Conference on Digital Information Manage-
ment (ICDIM), pages 145–150.
Matthews, B. W. (1975). Comparison of the predicted and
observed secondary structure of t4 phage lysozyme.
Biochimica et Biophysica Acta (BBA)-Protein Struc-
ture, 405(2):442–451.
Meidan, Y., Bohadana, M., Shabtai, A., Guarnizo, J. D.,
Ochoa, M., Tippenhauer, N. O., and Elovici, Y.
(2017). Profiliot: A machine learning approach for
iot device identification based on network traffic anal-
ysis. In Proceedings of the Symposium on Applied
Computing, pages 506–509. ACM.
Mizuno, S., Hatada, M., Mori, T., and Goto, S. (2017).
Botdetector: A robust and scalable approach toward
detecting malware-infected devices. In 2017 IEEE
International Conference on Communications (ICC),
pages 1–7. IEEE.
Nguyen, T. D., Marchal, S., Miettinen, M., Fereidooni, H.,
Asokan, N., and Sadeghi, A.-R. (2019). DÏoT: A
federated self-learning anomaly detection system for
iot. In 2019 IEEE 39th International Conference on
Distributed Computing Systems (ICDCS), pages 756–
767. IEEE.
Rousseeuw, P. (1987). Silhouettes: A graphical aid to the in-
terpretation and validation of cluster analysis. J. Com-
put. Appl. Math., 20(1):53–65.
Santoso, F. K. and Vun, N. C. (2015). Securing iot for smart
home system. In 2015 International Symposium on
Consumer Electronics (ISCE), pages 1–2. IEEE.
Su, J., Vasconcellos, V. D., Prasad, S., Daniele, S., Feng,
Y., and Sakurai, K. (2018). Lightweight classifica-
tion of iot malware based on image recognition. In
2018 IEEE 42nd Annual Computer Software and Ap-
plications Conference (COMPSAC), volume 2, pages
664–669. IEEE.
Zhang, Z.-K., Cho, M. C. Y., Wang, C.-W., Hsu, C.-W.,
Chen, C.-K., and Shieh, S. (2014). Iot security:
Ongoing challenges and research opportunities. In
2014 IEEE 7th International Conference on Service-
Oriented Computing and Applications, pages 230–
234. IEEE.
Zolanvari, M., Teixeira, M. A., Gupta, L., Khan, K. M.,
and Jain, R. (2019). Machine learning-based network
vulnerability analysis of industrial internet of things.
IEEE Internet of Things Journal, 6(4):6822–6834.