# Performance of a K-Means Algorithm Driven by Careful Seeding

### Libero Nigro, Franco Cicirelli

#### 2023

#### Abstract

This paper proposes a variation of the K-Means clustering algorithm, named Population-Based K-Means (PB-K-MEANS), whose behaviour is based on careful seeding. The new algorithm rests on a greedy version of the K-Means++ seeding procedure (g_kmeans++), which proves effective in the search for an accurate clustering solution. PB-K-MEANS first builds a population of candidate solutions through independent runs of K-Means seeded with g_kmeans++. This reservoir of stored solutions is then recombined by Repeated K-Means to reach a final solution that minimizes the distortion index. PB-K-MEANS is currently implemented in Java using parallel streams and lambda expressions. The paper first recalls the basic concepts of clustering and of K-Means, together with the role of the seeding procedure, and then describes the main design and implementation issues of PB-K-MEANS. Finally, simulation experiments carried out on both synthetic and real-world datasets are reported, confirming good execution performance and accurate clustering.
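To make the seeding step concrete, the following is a minimal Java sketch of a greedy K-Means++ seeding, written with parallel streams and lambdas in the spirit of the abstract. It is not the authors' g_kmeans++ code. The class name `GreedySeedingSketch`, its method names, the trial count `2 + floor(ln K)` (a common default for greedy K-Means++) and the use of the sum of squared distances as the cost are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical illustration class; not taken from the paper.
final class GreedySeedingSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // Assumed default number of candidates tried per centroid: 2 + floor(ln k).
    static int defaultTrials(int k) {
        return 2 + (int) Math.floor(Math.log(k));
    }

    // Draws an index with probability proportional to weights[i].
    static int sample(double[] weights, double total, Random rnd) {
        double r = rnd.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) return i;
        }
        return weights.length - 1; // fallback for rounding or an all-zero total
    }

    // Sum over all points of the squared distance to the nearest centroid.
    static double cost(double[][] data, double[][] centroids) {
        return Arrays.stream(data).parallel()
                .mapToDouble(p -> Arrays.stream(centroids)
                        .mapToDouble(c -> sqDist(p, c)).min().orElse(0.0))
                .sum();
    }

    // Greedy K-Means++ seeding: for each new centroid, draw several candidates
    // by D^2 sampling and keep the one that yields the lowest cost.
    static double[][] seed(double[][] data, int k, int trials, Random rnd) {
        int n = data.length;
        int attempts = Math.max(1, trials);
        double[][] centroids = new double[k][];
        centroids[0] = data[rnd.nextInt(n)].clone(); // first centroid: uniform pick
        // minSq[i] = squared distance from point i to its nearest chosen centroid.
        double[] minSq = IntStream.range(0, n).parallel()
                .mapToDouble(i -> sqDist(data[i], centroids[0])).toArray();
        for (int c = 1; c < k; c++) {
            final double[] current = minSq;
            double total = Arrays.stream(current).sum();
            double bestCost = Double.POSITIVE_INFINITY;
            double[] bestMinSq = current;
            int bestIdx = 0;
            for (int t = 0; t < attempts; t++) {
                int cand = sample(current, total, rnd);
                // Distances if the candidate were added as a centroid.
                double[] trial = IntStream.range(0, n).parallel()
                        .mapToDouble(i -> Math.min(current[i], sqDist(data[i], data[cand])))
                        .toArray();
                double trialCost = Arrays.stream(trial).sum();
                if (trialCost < bestCost) {
                    bestCost = trialCost;
                    bestMinSq = trial;
                    bestIdx = cand;
                }
            }
            centroids[c] = data[bestIdx].clone();
            minSq = bestMinSq;
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}};
        double[][] seeds = seed(data, 2, defaultTrials(2), new Random(1));
        System.out.println("seeding cost = " + cost(data, seeds));
    }
}
```

In a population-based scheme like the one described above, a seeding of this kind would be run once per independent K-Means execution, and the resulting solutions stored for later recombination; that outer loop is omitted here.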

#### Paper Citation

#### in Harvard Style

Nigro L. and Cicirelli F. (2023). **Performance of a K-Means Algorithm Driven by Careful Seeding**. In *Proceedings of the 13th International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH*; ISBN 978-989-758-668-2, SciTePress, pages 27-36. DOI: 10.5220/0012045000003546

#### in Bibtex Style

@conference{simultech23,
author={Libero Nigro and Franco Cicirelli},
title={Performance of a K-Means Algorithm Driven by Careful Seeding},
booktitle={Proceedings of the 13th International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH},
year={2023},
pages={27-36},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012045000003546},
isbn={978-989-758-668-2},
}

#### in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH
TI - Performance of a K-Means Algorithm Driven by Careful Seeding
SN - 978-989-758-668-2
AU - Nigro L.
AU - Cicirelli F.
PY - 2023
SP - 27
EP - 36
DO - 10.5220/0012045000003546
PB - SciTePress