Authors: Lim Choen Choi and Soon Cheol Park

Affiliation: Chonbuk National University, Republic of Korea

Keyword(s): Document clustering, Cluster validity indices, Greedy algorithm, Average similarity.

Related Ontology Subjects/Areas/Topics: Applications ; Clustering ; Data Engineering ; Information Retrieval ; Ontologies and the Semantic Web ; Pattern Recognition ; Software Engineering ; Theory and Methods ; Web Applications

Abstract: This paper proposes a greedy algorithm for document clustering (Greedy Clustering). Several cluster validity indices (DB, CH, SD, AS) are evaluated to find the most appropriate optimization function for Greedy Clustering. The clustering algorithms are tested and compared on the Reuters-21578 collection. Across the experiments, the AS index gives the best clustering performance and the shortest running time among the indices tested. In addition, Greedy Clustering with the AS index performs 15-20% better than traditional clustering algorithms (K-means, Group Average).
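The abstract does not spell out the greedy procedure or the exact definition of the AS index, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the AS (average similarity) index is the mean cosine similarity between each document and the centroid of its cluster, and it uses a simple greedy local search that moves one document at a time whenever the move raises that index. The function names `cosine`, `centroid`, `average_similarity`, and `greedy_cluster` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def average_similarity(docs, labels, k):
    """Assumed AS index: mean similarity of each document to its cluster centroid."""
    total = 0.0
    for cluster in range(k):
        members = [d for d, label in zip(docs, labels) if label == cluster]
        if not members:
            continue
        center = centroid(members)
        total += sum(cosine(d, center) for d in members)
    return total / len(docs)


def greedy_cluster(docs, k, max_passes=100):
    """Greedy local search: move a document only if the move raises the index."""
    labels = [i % k for i in range(len(docs))]  # arbitrary round-robin start
    best = average_similarity(docs, labels, k)
    for _ in range(max_passes):
        improved = False
        for i in range(len(docs)):
            original = labels[i]
            if labels.count(original) == 1:
                continue  # keep every cluster non-empty
            best_label = original
            for candidate in range(k):
                if candidate == original:
                    continue
                labels[i] = candidate
                score = average_similarity(docs, labels, k)
                if score > best + 1e-12:
                    best, best_label = score, candidate
            labels[i] = best_label
            if best_label != original:
                improved = True
        if not improved:
            break  # no single move improves the index: local optimum
    return labels, best


if __name__ == "__main__":
    toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(greedy_cluster(toy_docs, k=2))
```

Recomputing the index from scratch for every candidate move keeps the sketch short but is quadratic in practice; a real implementation would update cluster centroids incrementally. Swapping `average_similarity` for another validity index (for example DB or CH) would give the kind of comparison the abstract describes.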

CC BY-NC-ND 4.0

Paper citation in several formats:
Choi, L. C. and Park, S. C. (2012). GREEDY APPROACH FOR DOCUMENT CLUSTERING. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-8425-99-7; ISSN 2184-4313, SciTePress, pages 597-600. DOI: 10.5220/0003836605970600

@conference{icpram12,
author={Lim Choen Choi and Soon Cheol Park},
title={GREEDY APPROACH FOR DOCUMENT CLUSTERING},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={597-600},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003836605970600},
isbn={978-989-8425-99-7},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - GREEDY APPROACH FOR DOCUMENT CLUSTERING
SN - 978-989-8425-99-7
IS - 2184-4313
AU - Choi, L.
AU - Park, S.
PY - 2012
SP - 597
EP - 600
DO - 10.5220/0003836605970600
PB - SciTePress