Email Image Spam Classification based on ResNet Convolutional Neural Network

Vít Listík (1, a), Jan Šedivý (2, b) and Václav Hlaváč (2, c)

(1) Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics, Prague 6, Technická 2, Czech Republic

(2) Czech Institute of Informatics, Robotics and Cybernetics, Prague 6, Jugoslávských partyzánů 1580/3, Czech Republic

Keywords: Spam, Email, ResNet, Image, Classification, Convolutional Neural Network.

Abstract: The problem of email image spam classification has been known since 2005. There are several approaches to this task; lately, they use convolutional neural networks (CNN). We propose a novel approach to the image spam classification task. Our approach is based on CNN and transfer learning, namely ResNet v1 used for semantic feature extraction and a one-layer feedforward neural network for classification. We show that this approach can achieve state-of-the-art performance on publicly available datasets: a 99% F1-score on two datasets, (Dredze et al., 2007) and Princeton, and a 96% F1-score on the combination of these datasets. Due to the availability of GPUs, this approach may be used for just-in-time classification in anti-spam systems handling huge amounts of emails. We have also observed that the mentioned publicly available datasets are no longer representative. We overcame this limitation by using a much richer dataset from one week of real traffic of the freemail provider Email.cz. The training data annotation was created by user labeling of the emails. Image spam (and image ham even more) raises privacy issues. We overcame them by publishing extracted feature vectors with associated classes (instead of the images themselves). This data does not violate privacy. We have published the Email.cz image spam dataset v1 via the AcademicTorrents platform and propose a system that achieves up to a 96% F1-score with the presented model architecture on this novel dataset. Providing our dataset to the community may help others solve similar tasks.

1 INTRODUCTION

Spam emails (unsolicited bulk messages) cost email providers and users a huge amount of time and money. Anti-spam systems try to lower those losses by separating email traffic into wanted and unwanted messages (commonly called ham and spam). Anti-spam techniques have to evolve because the methods used by spammers evolve too. One example of a relatively recent technique is image spam appearing in email communication (email image spam, also referred to as image spam throughout this work).

The image spam problem in emails was specified in (Aradhye et al., 2005) and (Wu et al., 2005). The problem was defined as an email content obfuscation method against anti-spam filters, which did not use information from images attached to emails. At that time, it was hard to process information contained in images, mainly because of limited processing power and insufficient algorithms. The problem of image spam in its original form is not as urgent as before because anti-spam solutions are now able to detect it, which makes it less efficient for attackers.

(a) https://orcid.org/0000-0002-1907-6334

(b) https://orcid.org/0000-0003-0626-2303

(c) https://orcid.org/0000-0002-8472-3147

However, it remains an interesting research domain because the task is getting harder (Carpinteiro et al., 2017), (Shang and Zhang, 2016). The first reason is that emails containing images are now much more common, both among legitimate emails and, unfortunately, among spam emails. Another reason is that the available processing power is now much greater. Consequently, more complex techniques may be used on both sides of the barricade. The final reason is that most email providers no longer classify only into spam and ham classes. They use more classes, e.g. Newsletter or Forums. These conditions suggest that image information may be helpful for those emerging tasks.

See https://wiki.apache.org/spamassassin/FuzzyOcrPlugin for an example of image spam detection in an anti-spam solution.

Listík, V., Šedivý, J. and Hlaváč, V.
Email Image Spam Classification based on ResNet Convolutional Neural Network.
DOI: 10.5220/0008956704570464
In Proceedings of the 6th International Conference on Information Systems Security and Privacy (ICISSP 2020), pages 457-464
ISBN: 978-989-758-399-5; ISSN: 2184-4356

1.1 Public Datasets

As stated above, the characteristics of image spam and email communication have changed significantly. However, the only commonly used publicly available dataset is over a decade old (Dredze et al., 2007). The biggest obstacle to publishing an image spam dataset is the private nature of email contents. Other datasets used in publications are listed in Sec. 3.1. Those datasets are commonly not available. It is not because they were never published, but because the publication medium (most commonly a server of the publisher) is no longer available.

2 THE APPROACHED TASK

In this contribution, we approach email image spam classification. Image spam may hide obfuscated text or URLs in images instead of plain text (which is common for regular spam). Such harmful images may contain inappropriate content, unsolicited marketing campaigns, and text. The format of HTML emails containing only embedded images and no text is very common for newsletters, which may be either legitimate or unsolicited. When those emails contain no information other than the images and traffic information (headers, sender IP), we suppose that the information extracted from images may improve anti-spam solutions significantly.

Spam emails make up 45% of email traffic globally. At Email.cz, it was measured that 25% of emails are spam. The lower share is caused by the fact that Email.cz classifies newsletter emails into a separate class. Spam and newsletter categories combined constitute up to 75% of the traffic. It was also measured at Email.cz that 75% of emails contain images embedded in HTML and 13% of emails contain image attachments. This means that 88% of emails contain at least one image. Measurements for all the categories are shown in Table 1. These measurements demonstrate that spam and newsletter emails contain even more images than personal emails.

Note on newsletters: emails containing advertisements, commonly sent in bulk.

Table 1: Statistics of emails containing images measured at Email.cz. Image means that the email contained at least one image. HTML means images linked from the HTML content of the email. Attach means images in attachments. News means newsletter. Categories are based on the anti-spam filter currently used at Email.cz.

         All       Spam     News     Inbox
Emails   100.00%   25.10%   52.59%   22.60%
Image    88.15%    95.81%   98.99%   54.37%
HTML     75.39%    65.46%   95.65%   39.17%
Attach   12.76%    30.35%   3.33%    15.21%

2.1 Dataset

Due to the lack of a representative dataset for this task (problems with datasets are described in Sec. 1.1 and available datasets are described in Sec. 3.1), we decided to create a novel dataset as part of this work. We established the following requirements for the dataset:

• The resulting dataset should be recent and based on real email traffic

• Images have to be labeled into the following classes: {spam, ham, advertisement}

• The data have to be anonymized because of their private nature

• It should contain enough samples for training Neural Networks (NN) or other statistical models

• It should be published on a reliable platform

3 RELATED WORK

In this section, we describe publicly available datasets and commonly used methods for email image spam classification.

3.1 Publicly Available Datasets

Based on (Biggio et al., 2011), which is an overview of the datasets and methods used until 2011, the commonly used and still available dataset is (Dredze et al., 2007). The other commonly used available dataset is (Cormack and Lynam, 2005). A combination of several public datasets is also used. A different approach is to use a custom (non-public) dataset.

We studied approaches that appeared after 2011. They follow the pattern described in (Biggio et al., 2011). Most of the approaches use custom datasets and (Dredze et al., 2007).

Dredze 2007

(Dredze et al., 2007) is the most commonly used publicly available dataset for image spam classification. This dataset is unique because it contains both spam and ham samples. The Dredze 2007 dataset is separated into three parts described in Table 2.

This corpus was created by image extraction from emails. Only images attached to emails were used. Images were detected based on the file extension. SpamArchive is still available. One part (called public) contains spam only. The authors also collected personal emails from 10 email accounts from 10 different domains over one month and extracted both spam and ham images. This part is called private or personal.

Table 2: Numbers based on (Dredze et al., 2007), in which the dataset was published. Those numbers are correct, but some of the images are damaged and cannot be used for training.

Corpus name        # of images
Personal Ham       2550
Personal Spam      3239
SpamArchive Spam   9503

Princeton Spam Image Benchmark

The Princeton Spam Image Benchmark was published in 2007. It contains 1071 images in 178 groups. It is accessible online.

The first issue with the Dredze 2007 and Princeton datasets is that they contain only a few thousand samples. Neural networks need more samples for training. We do not train the CNN because it comes pre-trained on a huge dataset. Still, neural networks perform better with more samples.

TREC

The TREC dataset is commonly used for benchmarking. It was also created in 2007 and contains only emails (Cormack and Lynam, 2005). Images need to be extracted from the emails. Because this dataset contains around 7,000 images and the extracted version is not publicly available, we decided not to use it. It solves neither the issue of old data nor the issue of dataset size.

Notes on dataset availability:

SpamArchive: the site was available at http://www.spamarchive.org/ earlier. It may be found at http://untroubled.org/spam/ now.

Princeton Spam Image Benchmark: http://www.cs.princeton.edu/cass/spam/

TREC: accessible at http://trec.nist.gov/data/spam.html when this paper was written.

3.2 Statistical Models

The email image spam task has been approached many times by the community (Biggio et al., 2011). Between the years 2005 and 2015, the most common approach to this task was to use low-level image feature extraction and an SVM classifier. Lately, it has been more common to use neural networks for this task (Carpinteiro et al., 2017; Shang and Zhang, 2016).

Convolutional neural networks (CNN) and transfer learning have been used in computer vision and decision-making tasks recently. In this work, we use CNN for classifying image spam in emails based on (Shang and Zhang, 2016). They used a non-public image spam dataset. Their approach is to classify images extracted from emails into seven classes using CNN and SVM. They do not mention how they use the result of the classification in an anti-spam solution.

The closest work to ours is (Shang and Zhang, 2016) because of the use of CNN. The difference is that we use a pre-trained network, whereas they train the network from scratch. The other difference is that our dataset is publicly available and our classes may be used directly in an anti-spam system.

The second closest work to ours is (Carpinteiro et al., 2017), which compares results on 3 publicly available datasets; one of the models they used is a neural network.

4 OUR APPROACH

We proposed and implemented a model architecture for this task based on CNN. This kind of architecture has shown promising results in image classification lately. We tested it on the two above-mentioned publicly available datasets and also on the Email.cz image spam dataset v1 (described in Sec. 4.2). In the following sections, we describe the proposed architecture and the dataset we published as part of this work.

4.1 Model Architecture

We propose using CNN for the image spam classification task, namely a pre-trained ResNet v1 for semantic feature extraction (He et al., 2016). We use the extracted features as an input to a fully connected neural network with a single hidden layer. We tested this approach on publicly available datasets. Unsatisfied with their ability to benchmark the image spam classification task, we decided to use the CNN model for the creation of our own dataset.

Transfer learning (sometimes called fine-tuning) is the process of further training a model that was previously trained for another task. This technique is commonly used with CNN. Sometimes this process is performed on one model: the CNN part of the model is frozen (the learning rate for that part is very small or zero) and the fully connected layers are trained with new data. This method is mostly used when the dataset size or the computational power is insufficient for training the entire CNN. We split the network into two separate parts. This made it possible to store the results of the CNN evaluation for further evaluation and learning. Otherwise, the process stayed the same.
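The split into a frozen feature extractor and a separately trained classifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: `fake_cnn` is a hypothetical stand-in for the pre-trained CNN, and the store of CNN results is modeled as a plain dictionary.

```python
from typing import Callable, Dict, List

FeatureVector = List[float]


def make_cached_extractor(
    extract_features: Callable[[bytes], FeatureVector],
) -> Callable[[str, bytes], FeatureVector]:
    """Wrap a frozen feature extractor so each image is evaluated only once."""
    cache: Dict[str, FeatureVector] = {}

    def cached(image_id: str, image_bytes: bytes) -> FeatureVector:
        if image_id not in cache:
            # The expensive CNN forward pass runs only on a cache miss.
            cache[image_id] = extract_features(image_bytes)
        return cache[image_id]

    return cached


# Hypothetical stand-in for the frozen CNN: maps raw bytes to a tiny vector.
calls = []


def fake_cnn(image_bytes: bytes) -> FeatureVector:
    calls.append(image_bytes)
    return [float(len(image_bytes)), float(sum(image_bytes) % 7)]


extractor = make_cached_extractor(fake_cnn)
first = extractor("img-1", b"abc")
second = extractor("img-1", b"abc")  # served from the stored result
```

Because the stored vectors never change while the extractor is frozen, the classifier can be retrained many times without touching the CNN again.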

We used ResNet v1 for feature extraction (He et al., 2016). More precisely, we use a fully connected layer of the ResNet, which consists of 2048 output neurons, as the output of the feature extractor. Consequently, the feature vector contains 2048 float entries. The first reason to use the ResNet model was that it achieved state-of-the-art results on the ImageNet classification task (Russakovsky et al., 2015). The second reason is that it was tested in-house at Email.cz for a similar task and achieved the best results. That resulted in a prepared and tested implementation and in acceptance at Email.cz.

ResNet is a novel architecture designed for the ImageNet classification task. It consists of 152 layers. It overcomes the problem of vanishing gradients by using residual connections (skip connections over a group of layers). This architecture won the ILSVRC 2015 classification challenge (He et al., 2016).
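The idea of a residual connection can be illustrated with a few lines of plain Python. This is a simplified sketch: `tiny_layers` is a hypothetical stand-in for the skipped group of layers, and details of the real architecture (convolutions, normalization, the activation applied after the addition) are omitted.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, layers: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, where F is the group of layers being skipped."""
    fx = layers(x)
    if len(fx) != len(x):
        raise ValueError("skip connection needs matching dimensions")
    return [f + xi for f, xi in zip(fx, x)]


def tiny_layers(x: Vector) -> Vector:
    # Hypothetical F: a fixed scaling followed by ReLU.
    return [max(0.0, 0.5 * v) for v in x]


out = residual_block([2.0, -4.0], tiny_layers)
```

Even where `tiny_layers` outputs zero, the input still passes through the addition unchanged, which is what keeps gradients flowing in deep stacks.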

Our proposed model, which uses features extracted by ResNet v1, consists of one fully connected hidden layer with 2048 neurons and ReLU activations. For the output layer, we use the softmax activation. The implementation of this model is described in Sec. 5.2.
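The forward pass of such a classifier can be sketched in plain Python. This is an illustration only, not the Keras implementation referenced in Sec. 5.2: the weights are random, the layer sizes are reduced so the example runs instantly, and the number of output classes (three) is an assumption.

```python
import math
import random
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one weight row per output neuron."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Vector) -> Vector:
    shift = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def classify(features: Vector, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Vector:
    """Hidden layer with ReLU, then softmax over the classes."""
    return softmax(dense(relu(dense(features, w1, b1)), w2, b2))


# Small illustrative sizes; the described model has 2048 inputs and 2048 hidden neurons.
N_IN, N_HIDDEN, N_CLASSES = 8, 8, 3
rng = random.Random(0)
w1, b1 = init(N_HIDDEN, N_IN, rng), [0.0] * N_HIDDEN
w2, b2 = init(N_CLASSES, N_HIDDEN, rng), [0.0] * N_CLASSES
probs = classify([rng.random() for _ in range(N_IN)], w1, b1, w2, b2)
```

The softmax output is a probability distribution over the classes, so the predicted class is simply the index of the largest entry.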

4.2 Email.cz Image Spam Dataset v1

As stated in Sec. 2.1, we decided to create our own dataset and to publish it for the scientific community. We did it in cooperation with Email.cz, which is the largest freemail provider in the Czech Republic. We publish the data in an anonymized format with minimal information loss in mind. Our proposed format is to publish features extracted by the pre-trained model instead of the images themselves. We named the dataset Email.cz image spam dataset v1.

At Email.cz, each image attachment is sent for evaluation. Embedded HTML images are identified by their URL. Not all images may be downloaded because some images are used for tracking the open rate. The image is identified with a composite key containing, e.g., the email and image identifiers. Then the number of occurrences of this composite key is counted, and when it hits a defined threshold, the image is sent for analysis and the result is cached. The results of the image evaluations are stored for future processing. One part of the image evaluation is the extraction of a ResNet feature vector.
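The counting and caching logic described above can be sketched as follows. The key layout, the threshold value, and the `fake_analyze` function are hypothetical; the production system is not described in enough detail to reproduce it.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # hypothetical composite key parts


class ImageGate:
    """Send an image for analysis once its key was seen `threshold` times."""

    def __init__(self, threshold: int, analyze: Callable[[Key], str]) -> None:
        self.threshold = threshold
        self.analyze = analyze
        self.seen: Counter = Counter()
        self.cache: Dict[Key, str] = {}

    def observe(self, key: Key) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]  # already analyzed, reuse the result
        self.seen[key] += 1
        if self.seen[key] >= self.threshold:
            self.cache[key] = self.analyze(key)
            return self.cache[key]
        return None  # not frequent enough yet


analyzed = []


def fake_analyze(key: Key) -> str:
    analyzed.append(key)
    return "result-for-" + key[1]


gate = ImageGate(threshold=3, analyze=fake_analyze)
results = [gate.observe(("email-part", "img-1")) for _ in range(5)]
```

The expensive analysis runs once, on the third occurrence, and later occurrences are answered from the cache.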

Email.cz provides its users with the possibility to report misclassified emails, which is standard in this industry. User reactions are stored, which allows us to map the reactions to all images contained in the email. Apache Spark is used to connect email reactions to the images contained in those emails (Zaharia et al., 2016). The result is the feature vector of the analyzed image and all corresponding user reactions.

In our case, the anti-spam system classifies emails into three classes:

1. Ham emails - mostly personal communication, which should end up in the user’s inbox.

2. Newsletter emails - mostly messages containing advertisements.

3. Spam emails - unsolicited messages.

When the classification is incorrect, the user may change the label, which results in one of the following actions:

• USER_MARK_SPAM: From anywhere to spam.

• USER_UNMARK_SPAM: From spam to anywhere else.

The newsletters are treated analogously.

Email corpora are difficult to build and publish due to the private nature of email communications. The same applies to image datasets. We suggest a new approach to this task, namely publishing only extracted feature vectors, which are representative enough for the classification task (as shown in Sec. 7). It should not be possible to reconstruct the personal data from them (Listik, 2018).

We follow the approach of (Dredze et al., 2007) for attaching labels to images contained in emails. If an email is classified as spam, all images inside it are classified as spam too; the same applies to all the other labels.
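This labeling rule amounts to copying the email label to every image in that email. A minimal sketch, with hypothetical identifiers and dictionary-based inputs chosen only for illustration:

```python
from typing import Dict, List, Tuple


def propagate_labels(
    email_labels: Dict[str, str], email_images: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Give every image the label of the email that contained it."""
    pairs = []
    for email_id, label in email_labels.items():
        for image_id in email_images.get(email_id, []):
            pairs.append((image_id, label))
    return pairs


labels = {"e1": "spam", "e2": "ham"}
images = {"e1": ["a.png", "b.png"], "e2": ["c.jpg"], "e3": ["d.gif"]}
pairs = propagate_labels(labels, images)
```

Images from emails without a label (`e3` here) are not emitted, and the same image may receive different labels if it appears in differently labeled emails.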

5 IMPLEMENTATION

We implemented the proposed algorithm and tested it on the publicly available datasets and on our newly gathered dataset. In the following sections, we describe the process of data gathering and model training.

5.1 Data Gathering

For feature extraction from the public datasets, we use the open-source implementation of ResNet v1 without any modifications. In the following paragraphs, we focus on the details of the creation of our published dataset called Email.cz image spam dataset v1.

As described in Sec. 4.2, only some images are sent for analysis. When an image is chosen to be analyzed, it is put into a queue. A server (an instance of a ResNet model) takes it out of the queue and analyzes it. We use an open-source implementation of ResNet in TensorFlow for feature extraction (Abadi et al., 2015; He et al., 2016). The result is stored in two locations. First, it is stored in a key-value store as a cache for the analysis of emails containing the same image. Second, it is stored in the Hadoop Distributed File System (HDFS) for future use. We also store user reactions, including the information about which images were contained in each email, on HDFS. Hence we may connect user reactions to images with a Spark job. This cannot be done without storing the image analysis data because the user reaction comes after the email delivery (after the analysis).

The Spark job filters out all emails without images. First, it finds all reactions to each email. Second, it filters out emails without any reactions and groups the remaining ones by the users who reacted to them. Third, it finds all the images contained in the emails. Fourth and finally, it extracts the image vector from the image representation and separates those vectors into single records with the corresponding reactions (Zaharia et al., 2016). The output of this job is the dataset described and used in this paper.
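The logic of this join can be sketched in plain Python, without Spark or HDFS. The record shapes and the function name are assumptions made for illustration; only the overall flow (reactions per email, images per email, one output record per image) follows the description above.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

Reaction = Tuple[str, str, str]  # (email_id, user_id, reaction), hypothetical shape
Record = Tuple[List[List[str]], List[float]]


def join_reactions_to_images(
    email_images: Dict[str, List[str]],
    reactions: List[Reaction],
    image_vectors: Dict[str, List[float]],
) -> List[Record]:
    """Attach every user's reactions to each image of the emails they reacted to."""
    per_image: DefaultDict[str, DefaultDict[str, List[str]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for email_id, user_id, reaction in reactions:
        # Emails without images contribute nothing here, and emails
        # without reactions never enter this loop.
        for image_id in email_images.get(email_id, []):
            per_image[image_id][user_id].append(reaction)
    # One output record per image that has a stored feature vector.
    return [
        (list(by_user.values()), image_vectors[image_id])
        for image_id, by_user in per_image.items()
        if image_id in image_vectors
    ]


records = join_reactions_to_images(
    {"e1": ["img"], "e2": [], "e3": ["img"]},
    [
        ("e1", "u1", "MARK_SPAM"),
        ("e3", "u1", "UNMARK_SPAM"),
        ("e3", "u2", "MARK_SPAM"),
        ("e2", "u3", "MARK_AD"),
    ],
    {"img": [0.1, 0.2]},
)
```

Each record has the same nesting as the published format: a list of per-user reaction lists followed by the feature vector.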

Dataset Format

The data are stored as one JSON structure per line (JSON is described in RFC 7159, https://tools.ietf.org/html/rfc7159), which supports efficient loading in many programming languages.

Dataset Name

We decided to call this dataset Email.cz image spam dataset v1.

Data Time Span

The images were gathered in the period June 12-18, 2017.

Dataset Download

The URL of the dataset may be found on GitHub at https://github.com/tivvit/image-spam-cnn-classifier/blob/master/dataset.yaml. We chose this way because the repository may be updated but the URL in the paper cannot. The file also describes the actual process of downloading the dataset.

Our dataset is published via AcademicTorrents. We chose it because of the distributed fashion of the peer-to-peer network. The main reason was that we encountered a lot of problems with missing or moved datasets for this task on the Internet (as described in Sec. 1.1).

Note: the open-source ResNet implementation used for feature extraction is available at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py

Format Specification

[[[user reactions], [user reactions], ...], [resnet vector]], where

• Resnet vector is a vector of 2048 floats.

• User reactions are all reactions for one image from one user, each being a user reaction as defined below.

User reaction is one of:

• USER_MESSAGE_MARK_SPAM

• USER_MESSAGE_UNMARK_SPAM

• USER_MESSAGE_MARK_AD

• USER_MESSAGE_MARK_NONAD
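A line in this format can be read with a standard JSON parser. The sketch below is illustrative: the sample line is made up and shortened to three floats, and the reaction names are assumed to be spelled with underscores.

```python
import json
from typing import List, Tuple


def parse_line(line: str) -> Tuple[List[List[str]], List[float]]:
    """Parse one dataset line into (per-user reaction lists, ResNet vector)."""
    user_reactions, vector = json.loads(line)
    return user_reactions, [float(v) for v in vector]


# Shortened hypothetical line; real vectors have 2048 entries.
sample = (
    '[[["USER_MESSAGE_MARK_SPAM"], '
    '["USER_MESSAGE_MARK_AD", "USER_MESSAGE_MARK_AD"]], '
    "[0.5, 0.0, 1.25]]"
)
reactions, vector = parse_line(sample)
```

Since the dataset is distributed gzip-compressed, the lines can be streamed with `gzip.open(path, "rt")` from the Python standard library instead of extracting the whole file first.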

5.2 Statistical Model

We use Keras with the TensorFlow backend for the implementation of the model described in Sec. 4.1 (Chollet et al., 2015; Abadi et al., 2015). This implementation is open-source and may be found on GitHub. All the results in this paper may be reproduced with that repository, in favor of reproducible research.

6 TEST DATASET PROPERTIES

Our dataset consists of 778,768 lines, where each line represents one image with its reactions. The dataset is distributed in gzip format and its size is 2.5 GB (16 GB after extraction). For this number of images, we have collected 10,623,635 reactions. That is 13.64 reactions per image on average. The reaction distribution is shown in Table 3. The category distribution is shown in Table 4.

We have implemented filters that help the model to use only consistent reactions. Implementations of those filters are also included in the repository referenced in Sec. 5.2.

Notes: AcademicTorrents is available at http://academictorrents.com/ and the implementation repository at https://github.com/tivvit/image-spam-cnn-classifier

Table 3: The reaction distribution. Relationship between the minimal number of reactions and the number of sample images.

Reactions to one image   # image samples
>0                       778768
>1                       422541
>2                       318234
>3                       263478
>4                       228131
>5                       201003
>6                       182244
>7                       167866
>8                       155431
>9                       143242

Table 4: Distribution of the classes, based on the user reaction distribution.

Category         Samples
Spam             52.92%
Advertisements   13.94%
Non-spam         6.43%
Non-ad           26.71%

First, we filtered out inconsistent reactions of a single user to one image. Those reactions are probably misclassifications. They make up only 2.34% of the data, which is not significant. This simplifies the reaction structure to a flat array. We call the remaining reactions valid and use them in all the following tests and measurements.
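The single-user filter can be sketched as follows. The exact rule used in the repository is not spelled out here, so this sketch makes two assumptions: a user counts as inconsistent when they gave more than one distinct reaction to the same image, and a consistent user contributes exactly one reaction to the flat array. The shortened reaction names are placeholders.

```python
from typing import List


def flatten_valid(user_reactions: List[List[str]]) -> List[str]:
    """Keep one reaction per user, dropping users who contradict themselves."""
    valid = []
    for reactions in user_reactions:
        if reactions and len(set(reactions)) == 1:
            valid.append(reactions[0])
    return valid


flat = flatten_valid([
    ["MARK_SPAM", "MARK_SPAM"],    # consistent user, kept once
    ["MARK_SPAM", "UNMARK_SPAM"],  # inconsistent user, dropped
    ["MARK_AD"],
])
```

The nested per-user lists become a flat list of valid reactions, which is the structure the later consistency measure works on.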

We have also merged the unmark user reactions (USER_MESSAGE_UNMARK_SPAM and USER_MESSAGE_MARK_NONAD) into one class because we do not need the information about where the email was delivered before, but only the information about where it belongs. We may suppose it belongs to the inbox, which is the class name for this merged group. This also simplifies the usage of the classification result in the anti-spam filter.
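This merge is a plain lookup from reaction to class. In the sketch below, the underscore spelling of the reaction names and the class names `spam` and `ad` are assumptions; only `inbox` is named in the text.

```python
from typing import Dict, List

# Assumed spelling of the reaction names and hypothetical class names.
REACTION_TO_CLASS: Dict[str, str] = {
    "USER_MESSAGE_MARK_SPAM": "spam",
    "USER_MESSAGE_MARK_AD": "ad",
    "USER_MESSAGE_UNMARK_SPAM": "inbox",  # merged unmark reactions
    "USER_MESSAGE_MARK_NONAD": "inbox",
}


def to_classes(reactions: List[str]) -> List[str]:
    """Map raw user reactions to the three classes used for training."""
    return [REACTION_TO_CLASS[r] for r in reactions]


classes = to_classes(
    ["USER_MESSAGE_UNMARK_SPAM", "USER_MESSAGE_MARK_NONAD", "USER_MESSAGE_MARK_AD"]
)
```

After the merge, four reaction types collapse into three target classes.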

We merge all users’ reactions to one image into a single reaction because reactions are very noisy. When the reactions are inconsistent, the image is not used because the image itself probably does not contain any information that may be used for the classification (e.g. an emoji).

We have defined reaction consistency as

\[ \frac{\#cr}{\#r} \cdot 100, \tag{1} \]

where #cr is the number of reactions in the most common reaction group for a given image and #r is the number of all reactions for one image.

The average consistency of all user reactions for one image is 86.21% ± 20.96. When we consider only images with more than one reaction, it drops to 74.71% ± 22.69.
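The consistency measure is the share of the most common reaction among all reactions to an image, expressed in percent. A minimal sketch of that computation for a single image, with placeholder class names:

```python
from collections import Counter
from typing import List


def consistency(reactions: List[str]) -> float:
    """Share (in percent) of the most common reaction among all reactions."""
    if not reactions:
        raise ValueError("an image needs at least one reaction")
    most_common_count = Counter(reactions).most_common(1)[0][1]
    return most_common_count / len(reactions) * 100


value = consistency(["spam", "spam", "inbox", "spam"])
```

An image with a single reaction always scores 100, which explains why the average drops once only images with more than one reaction are considered.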

Table 5: Performance on publicly available datasets. Dr means Dredze and PR spam means Princeton spam.

Dataset           Precision   Recall   F1-score
Dr personal       99%         99%      99%
Dr public         95%         95%      95%
Dr combined       96%         96%      96%
Dr ham, PR spam   99%         99%      99%
All combined      96%         97%      96%

Table 6: The number of samples used for testing (15%) from the 100,000-sample subset for different minimal numbers of reactions (rows) and minimal consistencies (columns).

Reactions   0        0.5      0.6      0.75
>0          14,897   13,840   12,177   10,894
>1          8,276    7,219    5,556    4,273
>2          6,299    5,242    4,280    2,997
>3          5,204    4,216    3,275    2,424
>4          4,504    3,534    2,720    1,888
>5          3,953    3,039    2,255    1,523
>7          3,382    2,513    1,862    1,246
>10         2,663    1,877    1,387    922
>20         1,779    1,192    839      544

7 EXPERIMENTAL RESULTS

All models were trained for 80 epochs with weighted classes. We use a 75:25 train/test split for the public datasets and an 85:15 train/test split for our dataset. For all the models, we use the Adam optimizer (Kingma and Ba, 2014). For other details, please consult Sec. 5.
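The split and the class weighting can be sketched as follows. The text does not state how the split was drawn or how the weights were derived, so both are assumptions here: a seeded shuffle for the split and the common inverse-frequency heuristic n_samples / (n_classes * class_count) for the weights.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def split(samples: Sequence, test_share: float, seed: int = 0) -> Tuple[List, List]:
    """Shuffle with a fixed seed and cut off the last `test_share` as test set."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_share)
    return items[: len(items) - n_test], items[len(items) - n_test:]


def class_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * n) for c, n in counts.items()}


labels = ["spam"] * 6 + ["inbox"] * 2
train, test = split(labels, test_share=0.25)
weights = class_weights(labels)
```

With such weights, the rarer class contributes more to the loss per sample, which counteracts the class imbalance visible in Table 4.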

7.1 Public Datasets

In Table 5, we present the performance of our solution

on public datasets.
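For reference, the reported metrics follow the standard definitions of precision, recall, and F1-score. The sketch below computes them for one class treated as positive; how the per-class values were averaged for the tables is not stated, so no averaging is shown.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Precision, recall and F1-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


p, r, f1 = precision_recall_f1(
    ["spam", "spam", "ham", "ham"], ["spam", "ham", "ham", "spam"], "spam"
)
```

The F1-score is the harmonic mean of precision and recall, so it is high only when both are high.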

7.2 Our Dataset

In Table 6, we present how minimal consistency and

the minimal number of reactions correspond to the

number of samples.
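The sample selection behind these counts can be sketched as a predicate over the valid reactions of one image. The row labels (">0", ">1", ...) suggest a strict lower bound on the number of reactions; whether the consistency threshold is inclusive is an assumption, as are the placeholder class names.

```python
from collections import Counter
from typing import List


def keep_image(reactions: List[str], min_reactions: int, min_consistency: float) -> bool:
    """Keep an image with more than `min_reactions` reactions whose most
    common reaction reaches the given share (0..1) of all reactions."""
    if not reactions or len(reactions) <= min_reactions:
        return False
    top = Counter(reactions).most_common(1)[0][1]
    return top / len(reactions) >= min_consistency


samples = [
    ["spam"],
    ["spam", "spam", "inbox"],
    ["spam", "inbox", "ad", "inbox"],
]
kept = [keep_image(s, min_reactions=1, min_consistency=0.6) for s in samples]
```

Raising either threshold shrinks the set of usable images, which is the trade-off between label quality and sample count that the table quantifies.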

Table 7 shows the results of our architecture for various consistencies and minimal reaction counts. Classifiers were trained on a subset of our dataset (100,000 samples). We can see that we get better results when we keep only images whose reactions are supported by more users and are consistent.

Table 8 shows the performance on our dataset for some chosen consistencies and sample counts.


Table 7: Average F1-scores for the test set (15%) of the 100,000-sample subset for different minimal numbers of reactions (rows) and minimal consistencies (columns). Bold records were tested further.

Reactions   0      0.5    0.6    0.75
>0          0.88   0.89   0.88   0.89
>1          0.9    0.89   0.93   0.94
>2          0.91   0.91   0.91   0.94
>3          0.91   0.92   0.92   0.93
>4          0.9    0.91   0.93   0.94
>5          0.92   0.92   0.93   0.94
>7          0.92   0.93   0.93   0.95
>10         0.92   0.93   0.94   0.95
>20         0.93   0.94   0.96   0.97

Table 8: Model F1-scores for chosen consistencies and minimal reaction counts.

Min reactions   Consistency   Samples   F1-score
0               0             649815    87%
3               0.6           140522    93%
5               0.6           98551     93%
10              0.6           59828     95%
20              0.6           34933     96%
3               0.75          102837    93%
5               0.75          66220     95%
10              0.75          39690     95%
20              0.75          22245     96%

8 CONCLUSIONS AND FUTURE WORK

The proposed CNN architecture for the email image spam classification task can achieve state-of-the-art performance on publicly available datasets: a 99% F1-score on the (Dredze et al., 2007) and Princeton datasets and a 96% F1-score on the combination of the datasets. It also achieves up to a 96% F1-score on the presented Email.cz image spam v1 dataset.

The Email.cz image spam v1 dataset is published as part of this work. This dataset focuses on being recent and is based on real email traffic. Because of this, the data have to be anonymized, which is done by publishing only features extracted from the images. Those features are extracted by a CNN (ResNet v1). The dataset is published via the AcademicTorrents platform, which is distributed by nature; that should ensure that the data will be available to others in the future. We also considered the sufficiency of the anonymization and concluded that it may be possible to partially reconstruct the image data. However, it would be computationally very expensive, and the level of detail needed for recognizing personal information is already lost in the feature vector (Listik, 2018).

For future work, we want to gather a dataset over a longer time range, which will also contain images correctly classified by the current anti-spam solution. This will lead to a much bigger dataset. Our other suggestion is to use a more complex model architecture or a more sophisticated reaction filtering technique for higher performance.

ACKNOWLEDGEMENTS

We want to thank the Seznam.cz company (the owner of Email.cz) for providing us with the data for the dataset creation, the computational power, and the time of the Email.cz team.

REFERENCES

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.

Aradhye, H. B., Myers, G. K., and Herson, J. A. (2005). Image analysis for efficient categorization of image-based spam e-mail. In Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on, pages 914–918. IEEE.

Biggio, B., Fumera, G., Pillai, I., and Roli, F. (2011). A survey and experimental evaluation of image spam filtering techniques. Pattern Recognition Letters, 32(10):1436–1446.

Carpinteiro, O. A., Sanches, B. C., and Moreira, E. M. (2017). Detecting image spam with an artificial neural model. International Journal of Computer Science and Information Security, 15(1):296.

Chollet, F. et al. (2015). Keras. https://github.com/fchollet/keras.

Cormack, G. V. and Lynam, T. R. (2005). Trec 2005 spam track overview. In TREC, pages 500–274.

Dredze, M., Gevaryahu, R., and Elias-Bachrach, A. (2007). Learning fast classifiers for image spam. In CEAS.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.

Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Listik (2018). Image reconstruction from resnet semantic feature vector. In Poster.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252.

Shang, E.-X. and Zhang, H.-G. (2016). Image spam classification based on convolutional neural network. In Machine Learning and Cybernetics (ICMLC), 2016 International Conference on, pages 398–403. IEEE.

Wu, C.-T., Cheng, K.-T., Zhu, Q., and Wu, Y.-L. (2005). Using visual features for anti-spam filtering. In Image Processing, 2005. ICIP 2005. IEEE International Conference on, volume 3, pages III–509. IEEE.

Zaharia, M., Xin, R. S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M. J., et al. (2016). Apache spark: A unified engine for big data processing. Communications of the ACM, 59(11):56–65.