Paper

Authors: Michele Ianni 1 ; Elio Masciari 2 ; Giuseppe M. Mazzeo 3 and Carlo Zaniolo 4

Affiliations: 1 DIMES, University of Calabria, Rende (CS), Italy ; 2 ICAR-CNR, Rende (CS), Italy ; 3 Facebook, Menlo Park, U.S.A. ; 4 UCLA, Los Angeles, U.S.A.

ISBN: 978-989-758-318-6

Keyword(s): Clustering, Big Data, Spark.

Abstract: The need to support advanced analytics on Big Data is driving data scientists’ interest toward massively parallel distributed systems and software platforms, such as Map-Reduce and Spark, that make their scalable utilization possible. However, when complex data mining algorithms are required, their fully scalable deployment on such platforms faces a number of technical challenges that grow with the complexity of the algorithms involved. Thus, algorithms that were originally designed for sequential execution must often be redesigned in order to use the distributed computational resources effectively. In this paper, we explore these problems and then propose a solution that has proven to be very effective on the complex hierarchical clustering algorithm CLUBS+. By using four stages of successive refinements, CLUBS+ delivers high-quality clusters of data grouped around their centroids, working in a totally unsupervised fashion. Experimental results confirm the accuracy and scalability of CLUBS+ on Map-Reduce platforms.

CC BY-NC-ND 4.0

Paper citation in several formats:
Ianni, M.; Masciari, E.; Mazzeo, G. M. and Zaniolo, C. (2018). Clustering Big Data. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 276-282. DOI: 10.5220/0006858702760282

@conference{data18,
author={Michele Ianni and Elio Masciari and Giuseppe M. Mazzeo and Carlo Zaniolo},
title={Clustering Big Data},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={276-282},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006858702760282},
isbn={978-989-758-318-6},
}

TY - CONF
JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Clustering Big Data
SN - 978-989-758-318-6
AU - Ianni, M.
AU - Masciari, E.
AU - Mazzeo, G. M.
AU - Zaniolo, C.
PY - 2018
SP - 276
EP - 282
DO - 10.5220/0006858702760282
