Authors: Ylermi Cabrera-León (1), Patricio García Báez (2) and Carmen Paz Suárez-Araujo (3)

Affiliations: (1) Universidad de Las Palmas de Gran Canaria (ULPGC), Spain; (2) Universidad de La Laguna, Spain; (3) Universidad de Las Palmas de Gran Canaria, Spain

ISBN: 978-989-758-201-1

Keyword(s): Anti-spam, Spam, Ham, Artificial Neural Networks, Self-Organizing Maps (SOMs), Thematic Category, Term Frequency, Inverse Category or Class Frequency.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Biomedical Engineering ; Biomedical Signal Processing ; Computational Intelligence ; Computer-Supported Education ; Domain Applications and Case Studies ; Fuzzy Systems ; Health Engineering and Technology Applications ; Human-Computer Interaction ; Industrial, Financial and Medical Applications ; Methodologies and Methods ; Neural Based Data Mining and Complex Information Processing ; Neural Networks ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Sensor Networks ; Signal Processing ; Soft Computing ; Theory and Methods

Abstract: Spam, or unsolicited messages sent in bulk, is one of the threats affecting email and other media. Its high volume causes substantial losses of time and money. This paper presents a solution: a hybrid anti-spam filter based on unsupervised Artificial Neural Networks (ANNs). It consists of two stages, preprocessing and processing, each based on a different computation model: programmed and neural (using a Kohonen SOM), respectively. The system was optimized using, as a data corpus, ham from “Enron Email” and spam from two different sources: traditional (user’s inbox) and spamtrap-honeypot. The study shows that thematic categories can be found in both spam and ham words. A total of 1260 system configurations were analyzed, comparing their quality and performance with the most commonly used metrics. All of them achieved AUC > 0.90, and the best 204 achieved AUC > 0.95, despite using only 13 attributes for the input vectors of the SOM, one for each thematic category. Results were similar to those of other researchers on the same corpus, although they used different Machine Learning (ML) methods and a number of attributes several orders of magnitude greater. The system was further tested with datasets not used during design, obtaining 0.77 < AUC < 0.96 with normalized data.
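
The two-stage idea in the abstract (a programmed preprocessing step that produces one attribute per thematic category, followed by a Kohonen SOM) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three category lexicons, the plain term-frequency attribute (no inverse category frequency weighting), the 3x3 grid and the linear decay schedules are all assumptions made for illustration, and the names `CATEGORIES`, `category_vector`, `best_matching_unit` and `train_som` are hypothetical.

```python
import math
import random

# Hypothetical lexicons: the paper uses 13 thematic categories; these 3 are made up.
CATEGORIES = {
    "finance": {"loan", "credit", "invest"},
    "health": {"pills", "diet", "doctor"},
    "work": {"meeting", "report", "schedule"},
}


def category_vector(tokens):
    """Return one attribute per category: the fraction of tokens found in its lexicon."""
    n = max(len(tokens), 1)
    return [sum(t in words for t in tokens) / n for words in CATEGORIES.values()]


def best_matching_unit(weights, x):
    """Return the grid coordinates of the unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular Kohonen map with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs       # assumed linear decay of both schedules
        lr = lr0 * decay
        sigma = sigma0 * decay + 0.01      # keep the radius strictly positive
        for x in data:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-dist2 / (2.0 * sigma ** 2))
                # Pull each unit towards x, scaled by its grid distance to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


messages = [
    "cheap loan and credit offer",
    "diet pills from your doctor",
    "meeting schedule and report attached",
]
data = [category_vector(m.split()) for m in messages]
som = train_som(data)
for m, x in zip(messages, data):
    print(best_matching_unit(som, x), m)
```

Because the map is trained without labels, turning it into a filter needs an extra step, for example tagging each unit with the majority class of the labelled messages that map to it; that step is left out here since the abstract does not describe how it is done.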

License: CC BY-NC-ND 4.0

Paper citation in several formats:
Cabrera-León, Y.; García Báez, P. and Suárez-Araujo, C. (2016). Self-Organizing Maps in the Design of Anti-spam Filters - A Proposal based on Thematic Categories. In Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016), ISBN 978-989-758-201-1, pages 21-32. DOI: 10.5220/0006041400210032

@conference{ncta16,
author={Ylermi Cabrera{-}León and Patricio García Báez and Carmen Paz Suárez{-}Araujo},
title={Self-Organizing Maps in the Design of Anti-spam Filters - A Proposal based on Thematic Categories},
booktitle={Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016)},
year={2016},
pages={21-32},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006041400210032},
isbn={978-989-758-201-1},
}

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Computational Intelligence - Volume 2: NCTA, (IJCCI 2016)
TI - Self-Organizing Maps in the Design of Anti-spam Filters - A Proposal based on Thematic Categories
SN - 978-989-758-201-1
AU - Cabrera-León, Y.
AU - García Báez, P.
AU - Suárez-Araujo, C.
PY - 2016
SP - 21
EP - 32
DO - 10.5220/0006041400210032
