Authors: Daniel Osuna-Ontiveros; Ivan Lopez-Arevalo and Victor Sosa-Sosa

Affiliation: CINVESTAV - IPN, Mexico

ISBN: 978-989-8425-79-9

Keyword(s): Indexing models, Information retrieval, Semantic clustering, Semantic search.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Process Mining; Symbolic Systems

Abstract: Information retrieval (IR) models process documents to prepare them for search by humans or computers. Early models applied lexico-syntactic processing to documents, ranking the documents retrieved by a query according to the frequency of the query terms in each document. Another approach is to return predefined documents based on the type of query the user makes. Recently, some researchers have incorporated text mining techniques to enhance document retrieval. This paper proposes a semantic clustering approach that improves traditional information retrieval models by representing the topics associated with documents. The proposal combines text mining algorithms and natural language processing. The approach does not use a priori queries; instead, it clusters terms, where each cluster is a set of related words according to the content of the documents. As a result, a document-topic matrix representation is obtained that denotes the importance of topics within documents. For query processing, each query is represented as a set of clusters based on its terms. Thus, a similarity measure (e.g. cosine similarity) can be applied between this array and the document matrix to retrieve the most relevant documents.
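
The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term clusters below are hand-made and hypothetical (the paper derives them from the document contents), the per-topic weight is a plain term count, and the names CLUSTERS, topic_vector, cosine and rank are assumptions introduced only for this example. It shows documents and a query mapped into the same topic space and ranked by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical term clusters (topic id -> related words), hand-made for
# illustration; the paper obtains its clusters from the document contents.
CLUSTERS = {
    0: {"cluster", "clustering", "topic", "semantic"},
    1: {"query", "search", "retrieval", "index"},
    2: {"football", "goal", "match"},
}
TERM_TO_TOPIC = {term: topic for topic, terms in CLUSTERS.items() for term in terms}


def topic_vector(text):
    """Map a text to per-topic term counts (an assumed, simplistic weighting)."""
    counts = Counter(
        TERM_TO_TOPIC[t] for t in text.lower().split() if t in TERM_TO_TOPIC
    )
    return [float(counts.get(topic, 0)) for topic in sorted(CLUSTERS)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query, docs):
    """Return (document index, score) pairs, most similar first."""
    matrix = [topic_vector(d) for d in docs]  # document-topic matrix
    q = topic_vector(query)                   # query as a topic array
    scores = [(i, cosine(q, row)) for i, row in enumerate(matrix)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


DOCS = [
    "semantic clustering groups each topic",
    "the match ended with a late goal",
    "search index supports query retrieval",
]

if __name__ == "__main__":
    for index, score in rank("semantic topic search", DOCS):
        print(index, round(score, 3))
```

Because the query and the documents share one topic space, a document can score well without containing the literal query terms, as long as its words fall in the same clusters.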

Paper citation in several formats:
Osuna-Ontiveros, D.; Lopez-Arevalo, I. and Sosa-Sosa, V. (2011). A SEMANTIC CLUSTERING APPROACH FOR INDEXING DOCUMENTS. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 280-285. DOI: 10.5220/0003663802880293

@conference{kdir11,
author={Daniel Osuna{-}Ontiveros and Ivan Lopez{-}Arevalo and Victor Sosa{-}Sosa},
title={A SEMANTIC CLUSTERING APPROACH FOR INDEXING DOCUMENTS},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={280-285},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003663802880293},
isbn={978-989-8425-79-9},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - A SEMANTIC CLUSTERING APPROACH FOR INDEXING DOCUMENTS
SN - 978-989-8425-79-9
AU - Osuna-Ontiveros, D.
AU - Lopez-Arevalo, I.
AU - Sosa-Sosa, V.
PY - 2011
SP - 280
EP - 285
DO - 10.5220/0003663802880293
