Authors: Shumeet Baluja, Deepak Ravichandran and D. Sivakumar

Affiliation: Google, Inc., United States

ISBN: 978-989-674-011-5

Keyword(s): Text Analysis, Text Classification, Machine Learning, Graph Algorithms, Preference Propagation, Semi-supervised Learning, Natural Language Processing, Adsorption.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Mining Text and Semi-Structured Data ; Symbolic Systems

Abstract: One of the fundamental assumptions for machine-learning-based text classification systems is that the underlying distribution from which the set of labeled text is drawn is identical to the distribution from which the text to be labeled is drawn. However, in live news aggregation sites, this assumption is rarely correct. Instead, the events and topics discussed in news stories change dramatically over time. Rather than ignoring this phenomenon, we attempt to explicitly model the transitions of news stories and classifications over time to label stories that may be acquired months after the initial examples are labeled. We test our system, based on efficiently propagating labels in time-based graphs, with recently published news stories collected over an eighty-day period. Experiments presented in this paper range from using training labels from every story gathered within the first several days to using a single story as a label.
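
To make the idea of propagating labels through a time-based graph concrete, here is a minimal sketch of generic iterative label propagation with clamped seed nodes. It is not the paper's Adsorption-based method, only an illustration of the general technique. The function name `propagate_labels`, the story identifiers (`d1_a`, `d2_a`, ...), the edge weights, and the two labels are all hypothetical and chosen for the example.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Generic label propagation with clamped seeds (illustrative only).

    edges: list of (node_a, node_b, weight) undirected similarity links,
           e.g. between stories on the same or adjacent days.
    seeds: dict mapping node -> label for the initially labeled stories.
    Returns a dict mapping every node -> {label: normalized score}.
    """
    neighbors = defaultdict(list)
    for a, b, w in edges:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    nodes = set(neighbors) | set(seeds)
    # Unlabeled nodes start empty; seeds put all mass on their known label.
    scores = {node: {} for node in nodes}
    for node, label in seeds.items():
        scores[node] = {label: 1.0}

    for _ in range(iterations):
        updated = {}
        for node in nodes:
            if node in seeds:
                # Seed labels are clamped and never overwritten.
                updated[node] = {seeds[node]: 1.0}
                continue
            total = defaultdict(float)
            for other, w in neighbors[node]:
                for label, s in scores[other].items():
                    total[label] += w * s
            norm = sum(total.values())
            updated[node] = (
                {label: s / norm for label, s in total.items()} if norm else {}
            )
        scores = updated
    return scores


# Hypothetical three-day graph: only day-1 stories carry labels.
edges = [
    ("d1_a", "d2_a", 0.9), ("d2_a", "d3_a", 0.8),  # one evolving topic
    ("d1_b", "d2_b", 0.9), ("d2_b", "d3_b", 0.8),  # a second topic
    ("d2_a", "d2_b", 0.1),                         # weak cross-topic link
]
seeds = {"d1_a": "sports", "d1_b": "politics"}
scores = propagate_labels(edges, seeds)
best = {n: max(s, key=s.get) for n, s in scores.items() if s}
```

In this toy graph, the day-3 stories have no direct link to any labeled story; they receive a label only through the day-2 stories, which is the sense in which labels move forward through time.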

Paper citation in several formats:
Baluja S.; Ravichandran D. and Sivakumar D. (2009). TEXT CLASSIFICATION THROUGH TIME - Efficient Label Propagation in Time-Based Graphs. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 174-182. DOI: 10.5220/0002303001740182

@conference{kdir09,
author={Shumeet Baluja and Deepak Ravichandran and D. Sivakumar},
title={TEXT CLASSIFICATION THROUGH TIME - Efficient Label Propagation in Time-Based Graphs},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={174-182},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002303001740182},
isbn={978-989-674-011-5},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - TEXT CLASSIFICATION THROUGH TIME - Efficient Label Propagation in Time-Based Graphs
SN - 978-989-674-011-5
AU - Baluja, S.
AU - Ravichandran, D.
AU - Sivakumar, D.
PY - 2009
SP - 174
EP - 182
DO - 10.5220/0002303001740182
