Authors: Giacomo Domeniconi, Gianluca Moro, Roberto Pasolini and Claudio Sartori

Affiliation: Università degli Studi di Bologna, Italy

ISBN: 978-989-758-048-2

Keyword(s): Text Mining, Text Classification, Transfer Learning, Cross-domain Classification.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Computational Intelligence ; Evolutionary Computing ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Mining Text and Semi-Structured Data ; Soft Computing ; Symbolic Systems

Abstract: Cross-domain text classification deals with predicting topic labels for documents in a target domain by leveraging knowledge from pre-labeled documents in a source domain that uses different terms or has a different distribution of them. Existing methods address this problem either by re-weighting source-domain documents to transfer them to the target domain or by finding a common feature space for documents of both domains; they often require a combination of complex techniques, leading to a number of parameters that must be tuned for each dataset to yield optimal performance. We present a simpler method based on creating explicit representations of topic categories, which can be compared for similarity with those of documents. Category representations are initially built from relevant source documents and are then iteratively refined by considering the most similar target documents, with relatedness measured by a simple regression model based on cosine similarity, built once at the beginning. This is expected to yield accurate representations of the categories in the target domain, which are then used to classify the documents therein. Experiments on common benchmark text collections show that this approach, run with fixed empirical values for its few parameters, obtains results better than or comparable to those of other methods.
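
The iterative refinement outlined in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`cosine`, `refine_categories`), the toy term-count vectors, and the parameters `min_sim` and `n_iter` are all hypothetical. In particular, a fixed cosine-similarity threshold stands in for the regression model mentioned in the abstract, and replacing each category profile with the mean of its selected target documents is an assumption about how the refinement step might look.

```python
import numpy as np

def cosine(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / np.maximum(np.linalg.norm(A, axis=1, keepdims=True), 1e-12)
    B = B / np.maximum(np.linalg.norm(B, axis=1, keepdims=True), 1e-12)
    return A @ B.T

def refine_categories(X_src, y_src, X_tgt, n_categories, min_sim=0.5, n_iter=5):
    """Hypothetical sketch: build category profiles from labeled source
    documents, then refine them using similar target documents."""
    # Initial profiles: mean vector of the source documents of each category.
    profiles = np.vstack(
        [X_src[y_src == c].mean(axis=0) for c in range(n_categories)]
    )
    for _ in range(n_iter):
        sims = cosine(X_tgt, profiles)      # shape: (n_target, n_categories)
        best = sims.argmax(axis=1)          # most similar category per document
        for c in range(n_categories):
            # ASSUMPTION: a fixed threshold replaces the paper's regression model.
            selected = (best == c) & (sims[:, c] >= min_sim)
            if selected.any():
                # Rebuild the profile from the selected target documents;
                # otherwise the previous profile is kept unchanged.
                profiles[c] = X_tgt[selected].mean(axis=0)
    labels = cosine(X_tgt, profiles).argmax(axis=1)
    return profiles, labels

# Toy vocabulary: [shared_A, shared_B, src_only_A, src_only_B, tgt_only_A, tgt_only_B]
X_src = np.array([[3, 0, 1, 0, 0, 0],
                  [2, 0, 1, 0, 0, 0],
                  [0, 3, 0, 1, 0, 0],
                  [0, 2, 0, 1, 0, 0]], dtype=float)
y_src = np.array([0, 0, 1, 1])
X_tgt = np.array([[3, 0, 0, 0, 2, 0],   # category 0, shares a term with the source
                  [0, 0, 0, 0, 2, 0],   # category 0, target-only vocabulary
                  [0, 3, 0, 0, 0, 2],   # category 1, shares a term with the source
                  [0, 0, 0, 0, 0, 2]],  # category 1, target-only vocabulary
                 dtype=float)

_, labels_initial = refine_categories(X_src, y_src, X_tgt, 2, n_iter=0)
_, labels_refined = refine_categories(X_src, y_src, X_tgt, 2, n_iter=5)
print(labels_initial, labels_refined)
```

In this toy setting, the source-only profiles cannot place the last target document, because it contains only target-specific vocabulary. Once the profiles absorb target documents that share terms with the source, the target-specific terms enter the profiles and that document is assigned to its category.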

Paper citation in several formats:
Domeniconi G., Moro G., Pasolini R. and Sartori C. (2014). Cross-domain Text Classification through Iterative Refining of Target Categories Representations. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014) ISBN 978-989-758-048-2, pages 31-42. DOI: 10.5220/0005069400310042

@conference{kdir14,
author={Giacomo Domeniconi and Gianluca Moro and Roberto Pasolini and Claudio Sartori},
title={Cross-domain Text Classification through Iterative Refining of Target Categories Representations},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)},
year={2014},
pages={31-42},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005069400310042},
isbn={978-989-758-048-2},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)
TI - Cross-domain Text Classification through Iterative Refining of Target Categories Representations
SN - 978-989-758-048-2
AU - Domeniconi G.
AU - Moro G.
AU - Pasolini R.
AU - Sartori C.
PY - 2014
SP - 31
EP - 42
DO - 10.5220/0005069400310042
