Authors: Angel L. Garrido ; Maria G. Buey ; Sandra Escudero ; Alvaro Peiro ; Sergio Ilarri and Eduardo Mena

Affiliation: University of Zaragoza, Spain

ISBN: 978-989-758-024-6

Keyword(s): Knowledge Management, Text Mining, Ontologies, Linguistics.

Related Ontology Subjects/Areas/Topics: Biomedical Engineering ; Context Aware Media Tagging ; Data Engineering ; Databases and Datawarehouses ; Enterprise Information Systems ; Health Information Systems ; Information Systems Analysis and Specification ; Internet Technology ; Knowledge Management ; Mobile Information Systems ; Ontologies and the Semantic Web ; Ontology and the Semantic Web ; Society, e-Business and e-Government ; Web Information Systems and Technologies ; Web Interfaces and Applications

Abstract: Automatic text categorisation systems are a type of software that is attracting more interest every day, due not only to their use in documentary environments but also to their possible application to properly tagging documents on the Web. Many options have been proposed to address this problem, using statistical approaches, natural language processing tools, ontologies and lexical databases. Nevertheless, there have not been many empirical evaluations comparing the influence of the different tools used to solve these problems, particularly in a multilingual environment. In this paper we propose GENIE, a multi-language rule-based pipeline system for automatic document categorisation, and we empirically compare, on several corpora of documents, the results of applying techniques that rely on statistics and supervised learning with the results of applying the same techniques supported by smarter tools based on language semantics and ontologies. GENIE is being applied in real environments, which shows the potential of the proposal.

CC BY-NC-ND 4.0

Paper citation in several formats:
Garrido, A.; Buey, M.; Escudero, S.; Peiro, A.; Ilarri, S. and Mena, E. (2014). The GENIE Project - A Semantic Pipeline for Automatic Document Categorisation. In Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-024-6, pages 161-171. DOI: 10.5220/0004750601610171

@conference{webist14,
author={Angel L. Garrido and Maria G. Buey and Sandra Escudero and Alvaro Peiro and Sergio Ilarri and Eduardo Mena},
title={The GENIE Project - A Semantic Pipeline for Automatic Document Categorisation},
booktitle={Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2014},
pages={161-171},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004750601610171},
isbn={978-989-758-024-6},
}

TY  - CONF
JO  - Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI  - The GENIE Project - A Semantic Pipeline for Automatic Document Categorisation
SN  - 978-989-758-024-6
AU  - Garrido, A.
AU  - Buey, M.
AU  - Escudero, S.
AU  - Peiro, A.
AU  - Ilarri, S.
AU  - Mena, E.
PY  - 2014
SP  - 161
EP  - 171
DO  - 10.5220/0004750601610171
ER  - 
