ECD Test: An Empirical Way based on the Cumulative Distributions to Evaluate the Number of Clusters for Unsupervised Clustering

Topics: Artificial Intelligence based Models for Smart Production; Artificial Intelligence based Systems for Smart Production; Concrete Realization of Artificial Intelligence based Systems for Smart Production; Ontology and Knowledge Extraction for Smart Production and Manufacturing

Authors: Dylan Molinié and Kurosh Madani

Affiliation: LISSI Laboratory EA 3956, Université Paris-Est Créteil, Sénart-FB Institute of Technology, Campus de Sénart, 36-37 Rue Georges Charpak, F-77567 Lieusaint, France

Keyword(s): Unsupervised Clustering, Parameter Estimation, Cumulative Distributions, Industry 4.0, Cognitive Systems.

Abstract: Unsupervised clustering consists in blindly gathering unknown data into compact and homogeneous groups; it is one of the very first steps of any Machine Learning approach, whether it is about Data Mining, Knowledge Extraction, Anomaly Detection or System Modeling. Unfortunately, unsupervised clustering suffers from the major drawback of requiring manually set parameters to perform accurately; one of them is the expected number of clusters. This parameter often determines whether the clusters will relevantly represent the system or not. According to the literature, there is no universal way to estimate this value; in this paper, we address this problem through a novel approach. To do so, we rely on a single, blind clustering, then we characterize the resulting clusters by their Empirical Cumulative Distributions, which we compare to one another using the Modified Hausdorff Distance, and we finally regroup the clusters by Region Growing, driven by these characteristics. This allows the feature space’s regions to be rebuilt: the expected number of clusters is the number of regions found. We apply this methodology to both academic and real industrial data, and show that it provides very good estimates of the number of clusters, regardless of the dataset’s complexity or the clustering method used.
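
The pipeline outlined in the abstract (empirical cumulative distributions, Modified Hausdorff Distance, region growing) can be illustrated with a minimal sketch. Most details below are assumptions made for illustration and are not taken from the paper: each cluster is summarized by the ECDF of its raw 1-D values, stored as a set of (x, F(x)) points; the function names `ecdf_points`, `modified_hausdorff` and `count_regions` and the merge threshold are hypothetical; and region growing is reduced to absorbing any cluster whose signature lies closer than the threshold to a cluster already in the region. The Modified Hausdorff Distance uses the usual Dubuisson and Jain definition, the larger of the two mean nearest-neighbour distances.

```python
import numpy as np


def ecdf_points(values):
    """Empirical cumulative distribution of a 1-D sample as (x, F(x)) points."""
    x = np.sort(np.asarray(values, dtype=float))
    f = np.arange(1, x.size + 1) / x.size
    return np.column_stack([x, f])


def modified_hausdorff(a, b):
    """Modified Hausdorff Distance between two point sets (one point per row)."""
    # Pairwise Euclidean distances between every point of a and every point of b.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Larger of the two directed mean nearest-neighbour distances.
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())


def count_regions(signatures, threshold):
    """Grow regions over clusters; two clusters join when their MHD < threshold.

    The threshold is an assumed, hand-picked value, not one given in the paper.
    """
    unvisited = set(range(len(signatures)))
    regions = 0
    while unvisited:
        frontier = [unvisited.pop()]  # seed of a new region
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if modified_hausdorff(signatures[i], signatures[j]) < threshold]
            for j in near:
                unvisited.remove(j)
            frontier.extend(near)
        regions += 1
    return regions


# Toy data: four over-segmented clusters that cover two well-separated regions.
clusters = [
    np.linspace(0.0, 1.0, 20),
    np.linspace(1.0, 2.0, 20),
    np.linspace(10.0, 11.0, 20),
    np.linspace(11.0, 12.0, 20),
]
clusters_sig = [ecdf_points(c) for c in clusters]
n_regions = count_regions(clusters_sig, threshold=2.0)
print(n_regions)  # 2
```

In this toy setting the two adjacent clusters on each side merge, so the estimate is two regions; with real data the choice of signature and threshold would need to follow the paper itself.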

CC BY-NC-ND 4.0

Paper citation in several formats:
Molinié, D. and Madani, K. (2022). ECD Test: An Empirical Way based on the Cumulative Distributions to Evaluate the Number of Clusters for Unsupervised Clustering. In Proceedings of the 3rd International Conference on Innovative Intelligent Industrial Production and Logistics - ETCIIM; ISBN 978-989-758-612-5; ISSN 2184-9285, SciTePress, pages 279-290. DOI: 10.5220/0011562500003329

@conference{etciim22,
author={Dylan Molinié and Kurosh Madani},
title={ECD Test: An Empirical Way based on the Cumulative Distributions to Evaluate the Number of Clusters for Unsupervised Clustering},
booktitle={Proceedings of the 3rd International Conference on Innovative Intelligent Industrial Production and Logistics - ETCIIM},
year={2022},
pages={279--290},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011562500003329},
isbn={978-989-758-612-5},
issn={2184-9285},
}

TY - CONF

JO - Proceedings of the 3rd International Conference on Innovative Intelligent Industrial Production and Logistics - ETCIIM
TI - ECD Test: An Empirical Way based on the Cumulative Distributions to Evaluate the Number of Clusters for Unsupervised Clustering
SN - 978-989-758-612-5
IS - 2184-9285
AU - Molinié, D.
AU - Madani, K.
PY - 2022
SP - 279
EP - 290
DO - 10.5220/0011562500003329
PB - SciTePress