Authors: Stephen Bradshaw (1); Colm O'Riordan (1) and Daragh Bradshaw (2)

Affiliations: (1) National University Ireland Galway, Ireland; (2) National University Limerick, Ireland

ISBN: 978-989-758-271-4

Keyword(s): Document Clustering, Graph Theory, WordNet, Classification, Word Sense Disambiguation, Data Mining.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Context Discovery ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Symbolic Systems

Abstract: Clustering documents is a common task in a range of information retrieval systems and applications. Many approaches for improving the clustering process have been proposed. One approach is the use of an ontology to better inform the classifier of word context, by expanding the items to be clustered. WordNet is commonly cited as an appropriate source from which to draw the additional terms; however, it may not be sufficient to achieve strong performance. We have two aims in this paper. First, we show that the use of WordNet may lead to suboptimal performance. This problem may be accentuated when a document set has been drawn from comments made in social forums, due to the unstructured nature of online conversations compared to standard document sets. Second, we propose a novel method which involves constructing a bespoke ontology that facilitates better clustering. We present a study of clustering applied to a sample of threads from a social forum and investigate the effectiveness of the application of these methods.
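
The expansion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the related-terms map `TOY_ONTOLOGY` is a hypothetical, hand-written stand-in for an ontology (it is neither WordNet nor the bespoke ontology proposed in the paper), and the helper names `expand` and `cosine` are assumptions made for illustration. The sketch only shows why adding related terms can raise the similarity of two short documents that share no surface words.

```python
from collections import Counter
from math import sqrt

# Hypothetical, hand-written related-terms map standing in for an ontology.
# It is NOT WordNet and NOT the ontology constructed in the paper.
TOY_ONTOLOGY = {
    "car": ["vehicle", "automobile"],
    "truck": ["vehicle", "lorry"],
    "apple": ["fruit"],
    "banana": ["fruit"],
}


def expand(tokens, ontology):
    """Return the tokens followed by any related terms the ontology lists."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(ontology.get(tok, []))
    return expanded


def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = sqrt(sum(v * v for v in va.values()))
    norm_b = sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc_a = "my car broke down".split()
doc_b = "the truck needs repair".split()

# Without expansion the two documents share no tokens.
plain = cosine(doc_a, doc_b)

# After expansion both documents contain the related term "vehicle".
augmented = cosine(expand(doc_a, TOY_ONTOLOGY), expand(doc_b, TOY_ONTOLOGY))

print(f"plain={plain:.3f} augmented={augmented:.3f}")
```

In this toy case the unexpanded similarity is zero, while the expanded documents overlap on one shared term. A clustering algorithm that relies on such similarities would therefore be more likely to group the two documents; whether that helps in practice depends on the quality of the added terms, which is the question the paper investigates.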

CC BY-NC-ND 4.0

Paper citation in several formats:
Bradshaw, S.; O'Riordan, C. and Bradshaw, D. (2017). Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations. In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, ISBN 978-989-758-271-4, pages 215-223. DOI: 10.5220/0006500202150223

@conference{kdir17,
author={Stephen Bradshaw and Colm O'Riordan and Daragh Bradshaw},
title={Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations},
booktitle={Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2017},
pages={215-223},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006500202150223},
isbn={978-989-758-271-4},
}

TY - CONF
JO - Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations
SN - 978-989-758-271-4
AU - Bradshaw, S.
AU - O'Riordan, C.
AU - Bradshaw, D.
PY - 2017
SP - 215
EP - 223
DO - 10.5220/0006500202150223
