Authors: Ammar Al Abd Alazeez; Sabah Jassim and Hongbo Du

Affiliation: The University of Buckingham, United Kingdom

ISBN: 978-989-758-222-6

Keyword(s): Big Data, Data Stream Clustering, Outliers Detection, Prototype-based Approaches.

Related Ontology Subjects/Areas/Topics: Clustering; Incremental Learning; Pattern Recognition; Theory and Methods

Abstract: Data stream clustering is becoming an active research area in big data. It refers to grouping constantly arriving new data records in large chunks to enable dynamic analysis and updating of the information patterns conveyed by the existing clusters, the outliers, and the newly arriving data chunk. Prototype-based algorithms for this problem are promising because of their simplicity and efficiency. However, existing implementations are limited in the quality of their clusters, in their ability to discover outliers, and in their consideration of possible new patterns in different chunks. In this paper, a new incremental algorithm called Enhanced Incremental K-Means (EINCKM) is developed. The algorithm is designed to detect new clusters in an incoming data chunk, merge new clusters and existing outliers into the currently existing clusters, and generate modified clusters and outliers ready for the next round. The algorithm applies a heuristic-based method to estimate the number of clusters (K), a radius-based technique to determine and merge overlapped clusters, and a variance-based mechanism to discover the outliers. The algorithm was evaluated on synthetic and real-life datasets. The experimental results indicate improved clustering correctness with a time complexity comparable to that of existing methods dealing with the same kind of problems.
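
Illustrative sketch (not taken from the paper): the abstract names a radius-based merge of overlapped clusters and a variance-based outlier mechanism but does not give their exact rules. The Python below shows one plausible reading under stated assumptions: two prototypes "overlap" when their centroids are closer than the sum of their radii, a merged prototype uses the size-weighted mean centroid with a conservative covering radius, and a point is an outlier when its distance to the centroid exceeds the mean distance by more than `k_sigma` standard deviations. The names `Prototype`, `overlaps`, `merge`, `split_outliers` and the parameter `k_sigma` are hypothetical, and the heuristic for estimating K is omitted because the abstract gives no detail about it.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Sequence[float]


@dataclass
class Prototype:
    """Summary of one cluster (hypothetical structure, not from the paper)."""
    n: int                  # number of points summarised
    centroid: List[float]   # mean of those points
    radius: float           # assumed: distance that covers the members


def overlaps(p: Prototype, q: Prototype) -> bool:
    # Assumed overlap rule: the two spheres intersect.
    return math.dist(p.centroid, q.centroid) < p.radius + q.radius


def merge(p: Prototype, q: Prototype) -> Prototype:
    n = p.n + q.n
    # Size-weighted mean of the two centroids.
    centroid = [(p.n * a + q.n * b) / n for a, b in zip(p.centroid, q.centroid)]
    # Conservative radius: large enough to cover both original spheres.
    radius = max(
        math.dist(centroid, p.centroid) + p.radius,
        math.dist(centroid, q.centroid) + q.radius,
    )
    return Prototype(n, centroid, radius)


def split_outliers(
    points: Sequence[Point], centroid: Point, k_sigma: float = 2.0
) -> Tuple[List[Point], List[Point]]:
    # Assumed variance rule: flag points whose distance to the centroid
    # exceeds mean distance + k_sigma * standard deviation of distances.
    dists = [math.dist(x, centroid) for x in points]
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    cutoff = mean + k_sigma * math.sqrt(var)
    members = [x for x, d in zip(points, dists) if d <= cutoff]
    outliers = [x for x, d in zip(points, dists) if d > cutoff]
    return members, outliers
```

In a chunk-by-chunk setting, a loop along these lines would cluster each incoming chunk, call `merge` on any new prototype that `overlaps` an existing one, and carry the points returned as outliers forward to the next round. Whether EINCKM uses these exact criteria should be checked against the full text.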

CC BY-NC-ND 4.0

Paper citation in several formats:
Al Abd Alazeez, A.; Jassim, S. and Du, H. (2017). EINCKM: An Enhanced Prototype-based Method for Clustering Evolving Data Streams in Big Data. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-222-6, pages 173-183. DOI: 10.5220/0006196901730183

@conference{icpram17,
author={Ammar Al Abd Alazeez and Sabah Jassim and Hongbo Du},
title={EINCKM: An Enhanced Prototype-based Method for Clustering Evolving Data Streams in Big Data},
booktitle={Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2017},
pages={173-183},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006196901730183},
isbn={978-989-758-222-6},
}

TY - CONF
JO - Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - EINCKM: An Enhanced Prototype-based Method for Clustering Evolving Data Streams in Big Data
SN - 978-989-758-222-6
AU - Al Abd Alazeez, A.
AU - Jassim, S.
AU - Du, H.
PY - 2017
SP - 173
EP - 183
DO - 10.5220/0006196901730183