Authors: Nguyen Chi Thanh ; Koichi Yamada and Muneyuki Unehara

Affiliation: Nagaoka University of Technology, Japan

Keyword(s): Document clustering, Document representation, Rough sets, Text mining.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Mining Text and Semi-Structured Data ; Symbolic Systems

Abstract: The similarity rough set model for document clustering (SRSM) uses a generalized rough set model based on a similarity relation and term co-occurrence to group the documents in a collection into clusters. The model extends the tolerance rough set model (TRSM) (Ho and Funakoshi, 1997). SRSM methods have been evaluated, and the results showed that they perform better than TRSM. However, in document collections where words overlap across different document classes, the effect of SRSM is rather small. In this paper, we propose a method to improve the performance of SRSM in such document collections.
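The abstract does not spell out how term co-occurrence enters the model. As a rough illustration only, the sketch below follows the commonly described tolerance-class construction of TRSM, the model that SRSM extends: two terms are related when they co-occur in at least `theta` documents, and a document's upper approximation adds every term whose class overlaps the document. It is not the authors' SRSM method (which replaces the symmetric tolerance relation with a similarity relation), and the function names, the threshold `theta`, and the toy documents are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to the terms that co-occur with it in >= theta documents.

    Hypothetical helper: a TRSM-style tolerance relation, not the SRSM relation.
    """
    cooccurrence = Counter()
    vocabulary = set()
    for doc in docs:
        terms = set(doc)
        vocabulary |= terms
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(terms), 2):
            cooccurrence[(a, b)] += 1

    # Every term belongs to its own class (reflexivity).
    classes = {term: {term} for term in vocabulary}
    for (a, b), count in cooccurrence.items():
        if count >= theta:
            # The relation is symmetric, so add the pair in both directions.
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Return all terms whose tolerance class shares at least one term with doc."""
    terms = set(doc)
    return {term for term, cls in classes.items() if cls & terms}


# Toy collection, invented for illustration.
docs = [
    ["rough", "set", "cluster"],
    ["rough", "set", "model"],
    ["text", "mining", "cluster"],
]
classes = tolerance_classes(docs, theta=2)
enriched = upper_approximation(["rough", "cluster"], classes)
```

Here "rough" and "set" co-occur in two documents, so a document containing only "rough" and "cluster" is enriched with "set" before any clustering step compares documents.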

CC BY-NC-ND 4.0

Paper citation in several formats:
Chi Thanh, N.; Yamada, K. and Unehara, M. (2010). CLUSTERING DOCUMENTS WITH LARGE OVERLAP OF TERMS INTO DIFFERENT CLUSTERS BASED ON SIMILARITY ROUGH SET MODEL. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2010) - KDIR; ISBN 978-989-8425-28-7; ISSN 2184-3228, SciTePress, pages 396-399. DOI: 10.5220/0003068803960399

@conference{kdir10,
author={Nguyen {Chi Thanh} and Koichi Yamada and Muneyuki Unehara},
title={CLUSTERING DOCUMENTS WITH LARGE OVERLAP OF TERMS INTO DIFFERENT CLUSTERS BASED ON SIMILARITY ROUGH SET MODEL},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2010) - KDIR},
year={2010},
pages={396-399},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003068803960399},
isbn={978-989-8425-28-7},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2010) - KDIR
TI - CLUSTERING DOCUMENTS WITH LARGE OVERLAP OF TERMS INTO DIFFERENT CLUSTERS BASED ON SIMILARITY ROUGH SET MODEL
SN - 978-989-8425-28-7
IS - 2184-3228
AU - Chi Thanh, N.
AU - Yamada, K.
AU - Unehara, M.
PY - 2010
SP - 396
EP - 399
DO - 10.5220/0003068803960399
PB - SciTePress