Authors: Andrea Addis; Giuliano Armano and Eloisa Vargiu

Affiliation: University of Cagliari, Italy

ISBN: 978-989-8425-28-7

Keyword(s): Hierarchical text categorization.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Symbolic Systems

Abstract: The larger the amount of available data (e.g., in digital libraries), the greater the need for high-performance text categorization algorithms. So far, work on text categorization has mostly focused on “flat” approaches, i.e., algorithms that operate on non-hierarchical classification schemes. Hierarchical approaches are expected to perform better in the presence of a subsumption ordering among categories. In fact, following the “divide et impera” strategy, they partition the problem into smaller subproblems, each expected to be simpler to solve. In this paper, we illustrate and discuss the results obtained by assessing the “Progressive Filtering” (PF) technique, used to perform text categorization. Experiments on the Reuters Corpus (RCV1-v2) and on DMOZ datasets focus on the ability of PF to deal with input imbalance. In particular, the baseline is: (i) comparing the results to those obtained with the corresponding flat approach; (ii) calculating the improvement in performance as the pipeline depth increases; and (iii) measuring performance in terms of generalization error, specialization error, misclassification error, and unknown ratio. Experimental results show that, for the adopted datasets, PF is able to counteract great imbalances between negative and positive examples.
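
The "divide et impera" idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that a pipeline is a root-to-category path of binary classifiers and that a document is assigned to the final category only if every classifier along the path accepts it. The keyword-based classifiers, the function names, and the example path (Sports, then Football) are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, List

# A classifier is any function mapping a document to accept (True) or reject (False).
Classifier = Callable[[str], bool]


def keyword_classifier(keywords: List[str]) -> Classifier:
    """Hypothetical stand-in for a trained binary classifier."""
    lowered = [k.lower() for k in keywords]

    def accept(doc: str) -> bool:
        text = doc.lower()
        return any(k in text for k in lowered)

    return accept


def depth_reached(doc: str, pipeline: List[Classifier]) -> int:
    """Count how many consecutive classifiers, from the root down, accept the document."""
    depth = 0
    for clf in pipeline:
        if not clf(doc):
            break  # the document is filtered out at this level
        depth += 1
    return depth


def pipeline_accepts(doc: str, pipeline: List[Classifier]) -> bool:
    """Assumed rule: a document is accepted only if every level accepts it."""
    return depth_reached(doc, pipeline) == len(pipeline)


# Hypothetical two-level path in a taxonomy: Sports -> Football.
pipeline = [
    keyword_classifier(["match", "team", "league"]),  # Sports
    keyword_classifier(["goal", "striker"]),          # Football
]
```

Under these assumptions, each deeper classifier only sees documents that the shallower ones let through, which is one way a pipeline could reduce the number of negative examples reaching the more specific categories.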

CC BY-NC-ND 4.0

Paper citation in several formats:
Addis, A.; Armano, G. and Vargiu, E. (2010). ASSESSING PROGRESSIVE FILTERING TO PERFORM HIERARCHICAL TEXT CATEGORIZATION IN PRESENCE OF INPUT IMBALANCE. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 14-23. DOI: 10.5220/0003066300140023

@conference{kdir10,
author={Andrea Addis and Giuliano Armano and Eloisa Vargiu},
title={ASSESSING PROGRESSIVE FILTERING TO PERFORM HIERARCHICAL TEXT CATEGORIZATION IN PRESENCE OF INPUT IMBALANCE},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={14-23},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003066300140023},
isbn={978-989-8425-28-7},
}

TY  - CONF
JO  - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI  - ASSESSING PROGRESSIVE FILTERING TO PERFORM HIERARCHICAL TEXT CATEGORIZATION IN PRESENCE OF INPUT IMBALANCE
SN  - 978-989-8425-28-7
AU  - Addis, A.
AU  - Armano, G.
AU  - Vargiu, E.
PY  - 2010
SP  - 14
EP  - 23
DO  - 10.5220/0003066300140023
ER  - 
