Authors: Elias Oliveira (1); Howard Roatti (1); Matheus de Araujo Nogueira (2); Henrique Gomes Basoni (1) and Patrick Marques Ciarelli (1)

Affiliations: (1) Universidade Federal do Espírito Santo, Brazil; (2) Fundação de Assistência e Educação FAESA, Brazil

ISBN: 978-989-758-158-8

Keyword(s): Text Classification, Social Network, Text Mining.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Computational Intelligence ; Concept Mining ; Evolutionary Computing ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Mining Text and Semi-Structured Data ; Soft Computing ; Symbolic Systems

Abstract: The usual practice in the classification problem is to create a set of labeled data for training and then use it to tune a classifier for predicting the classes of the remaining items in the dataset. However, labeled data demand great human effort, and classification by specialists is normally expensive and consumes a large amount of time. In this paper, we discuss how we can benefit from a cluster-based tree kNN structure to quickly build a training dataset from scratch. We evaluated the proposed method on some classification datasets, and the results are promising because we reduced the amount of labeling work by the specialists to 4% of the number of documents in the evaluated datasets. Furthermore, we achieved an average accuracy of 72.19% on tested datasets, versus 77.12% when using 90% of the dataset for training.
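The idea in the abstract, asking a specialist to label only a few cluster representatives and letting nearest-neighbor search label the rest, can be illustrated with a minimal sketch. This is not the authors' cluster-based tree structure: it assumes a flat k-means clustering as a stand-in, a 1-nearest-neighbor rule, and synthetic 2-D points instead of documents. All names (`farthest_first`, `kmeans`, `build_training_set`, `oracle`) are hypothetical and chosen only for illustration.

```python
import math
import random

def farthest_first(points, k):
    # Deterministic seeding: start at the first point, then repeatedly
    # add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, iters=10):
    # Plain Lloyd iterations (assumption: a flat stand-in for the paper's tree).
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def build_training_set(points, oracle, k):
    # Ask the specialist (oracle) to label only the point closest to each centroid.
    labeled = {}
    for c in kmeans(points, k):
        rep = min(points, key=lambda p: math.dist(p, c))
        labeled[rep] = oracle(rep)
    return labeled

def predict(labeled, p):
    # 1-nearest-neighbor over the small labeled set.
    nearest = min(labeled, key=lambda q: math.dist(p, q))
    return labeled[nearest]

# Synthetic data: two well-separated blobs of 25 points each (hypothetical).
rng = random.Random(0)
truth = {}
for label, (cx, cy) in (("A", (0.0, 0.0)), ("B", (10.0, 10.0))):
    for _ in range(25):
        truth[(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))] = label
points = list(truth)

labeled = build_training_set(points, oracle=truth.get, k=2)
fraction = len(labeled) / len(points)  # share of items the specialist labels
accuracy = sum(predict(labeled, p) == truth[p] for p in points) / len(points)
print(f"labeled fraction: {fraction:.2%}, accuracy: {accuracy:.2%}")
```

On real text collections the clusters overlap far more than these blobs do, so the accuracy of such a sketch says nothing about the figures reported in the abstract; it only shows how a small labeled fraction can seed a nearest-neighbor classifier.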

CC BY-NC-ND 4.0

Paper citation in several formats:
Oliveira, E.; Roatti, H.; Nogueira, M.; Basoni, H. and Ciarelli, P. (2015). Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015) ISBN 978-989-758-158-8, pages 567-576. DOI: 10.5220/0005615305670576

@conference{sstm15,
author={Elias Oliveira and Howard Roatti and Matheus de Araujo Nogueira and Henrique Gomes Basoni and Patrick Marques Ciarelli},
title={Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015)},
year={2015},
pages={567-576},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005615305670576},
isbn={978-989-758-158-8},
}

TY - CONF

JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015)
TI - Using the Cluster-based Tree Structure of k-Nearest Neighbor to Reduce the Effort Required to Classify Unlabeled Large Datasets
SN - 978-989-758-158-8
AU - Oliveira, E.
AU - Roatti, H.
AU - Nogueira, M.
AU - Basoni, H.
AU - Ciarelli, P.
PY - 2015
SP - 567
EP - 576
DO - 10.5220/0005615305670576
