K-Means Clustering Optimization using the Elbow Method and Early Centroid Determination Based-on Mean and Median

Edy Umargono, Jadmiko Endro Suseno, Vincensius Gunawan S. K.

2019

Abstract

K-Means is the most widely used algorithm among cluster partitioning methods. Historically, K-Means has remained the best grouping algorithm among its alternatives, able to group large amounts of data with relatively fast and efficient computation. It is widely implemented in industrial and scientific applications and is well suited to quantitative data with numeric attributes, but it still has weaknesses: the number of clusters is chosen by assumption, and the result depends heavily on the initial selection of centroids. To overcome these weaknesses, this study proposes using the elbow method to determine the best number of clusters and determining the initial centroids from the mean and median of the data. The results indicate that determining the initial centroids from the data mean reduces the number of iterations needed to reach stable clusters by 22.58% compared with random initial centroids, and that using the number of clusters selected by the elbow method requires 25% fewer iterations than using other numbers of clusters.
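
The two ideas in the abstract can be illustrated with a minimal sketch in Python. The abstract does not specify how the initial centroids are derived from the mean or median, nor how the elbow is located, so both choices here are assumptions made only for illustration: `init_by_chunks` sorts the points by the sum of their attributes, splits them into k contiguous chunks, and takes the mean (or median) of each chunk, while `elbow_k` picks the k whose SSE lies farthest below the straight line joining the first and last points of the SSE curve. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, median

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def aggregate(points, agg=mean):
    """Per-attribute mean (or median) of a non-empty list of points."""
    return tuple(agg(col) for col in zip(*points))

def init_by_chunks(points, k, agg=mean):
    """ASSUMED scheme (not from the paper): order points by attribute sum,
    cut into k contiguous chunks, aggregate each chunk. Requires k <= len(points)."""
    ordered = sorted(points, key=sum)
    size = len(ordered) / k
    return [aggregate(ordered[int(i * size):int((i + 1) * size)], agg)
            for i in range(k)]

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations. Returns (labels, centroids, sse, iterations)."""
    centroids = list(centroids)
    k = len(centroids)
    labels = None
    iterations = 0
    for iterations in range(1, max_iter + 1):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = aggregate(members)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse, iterations

def elbow_k(sse_by_k):
    """ASSUMED elbow heuristic: the k whose SSE is farthest below the chord
    joining the smallest and largest k on the SSE curve."""
    ks = sorted(sse_by_k)
    k0, k1 = ks[0], ks[-1]
    slope = (sse_by_k[k1] - sse_by_k[k0]) / (k1 - k0)
    return max(ks, key=lambda k: sse_by_k[k0] + slope * (k - k0) - sse_by_k[k])

# Toy data: three small, well-separated groups on a 3x3 grid around each center.
offsets = [(dx, dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 12.0)]
data = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Run K-Means for k = 1..5 with mean-based initial centroids and record each run.
runs = {k: kmeans(data, init_by_chunks(data, k, agg=mean)) for k in range(1, 6)}
best_k = elbow_k({k: run[2] for k, run in runs.items()})
print("elbow k:", best_k, "| iterations at that k:", runs[best_k][3])
```

Passing `agg=median` to `init_by_chunks` gives the median-based variant. The iteration count returned by `kmeans` is the quantity to compare against a randomly initialized run on real data; this toy dataset is far too small to reproduce the percentages reported in the abstract.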



Paper Citation


in Harvard Style

Umargono E., Suseno J. and S. K. V. (2019). K-Means Clustering Optimization using the Elbow Method and Early Centroid Determination Based-on Mean and Median. In Proceedings of the International Conferences on Information System and Technology - Volume 1: CONRIST, ISBN 978-989-758-453-4, pages 234-240. DOI: 10.5220/0009908402340240


in Bibtex Style

@conference{conrist19,
author={Edy Umargono and Jadmiko Endro Suseno and Vincensius Gunawan S. K.},
title={K-Means Clustering Optimization using the Elbow Method and Early Centroid Determination Based-on Mean and Median},
booktitle={Proceedings of the International Conferences on Information System and Technology - Volume 1: CONRIST},
year={2019},
pages={234-240},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009908402340240},
isbn={978-989-758-453-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the International Conferences on Information System and Technology - Volume 1: CONRIST
TI - K-Means Clustering Optimization using the Elbow Method and Early Centroid Determination Based-on Mean and Median
SN - 978-989-758-453-4
AU - Umargono E.
AU - Suseno J.
AU - S. K. V.
PY - 2019
SP - 234
EP - 240
DO - 10.5220/0009908402340240