Authors: Andrew I. Schein ; Johnnie F. Caver ; Randale J. Honaker and Craig H. Martell

Affiliation: The Naval Postgraduate School, United States

ISBN: 978-989-8425-28-7

Keyword(s): Author attribution, Novel topic, Cross validation, Genre shift.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Computational Intelligence ; Evolutionary Computing ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Machine Learning ; Mining Text and Semi-Structured Data ; Soft Computing ; Symbolic Systems

Abstract: The practice of using statistical models in predicting authorship (so-called author attribution models) is long established. Several recent authorship attribution studies have indicated that topic-specific cues impact author attribution machine learning models. The arrival of new topics should be anticipated rather than ignored in an author attribution evaluation methodology; a model that relies heavily on topic cues will be problematic in deployment settings where novel topics are common. We develop a protocol and test bed for measuring sensitivity to topic cues using a methodology called novel topic cross-validation. Our methodology performs a cross-validation where only topics unseen in training data are used in the test portion. Analysis of the testing framework suggests that corpora with large numbers of topics lead to more powerful hypothesis testing in novel topic evaluation studies. In order to implement the evaluation metric, we developed two subsets of the New York Times Annotated Corpus including one with 15 authors and 23 topics. We evaluated a maximum entropy classifier in standard and novel topic cross validation in order to compare the mechanics of the two procedures. Our novel topic evaluation framework supports automatic learning of stylometric cues that are topic neutral, and our test bed is reproducible using document identifiers available from the authors.
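The splitting rule described in the abstract (test folds contain only topics unseen in training) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `novel_topic_folds`, the round-robin assignment of topics to folds, and the toy author/topic labels are all assumptions made for illustration.

```python
def novel_topic_folds(topics, n_folds):
    """Yield (train_idx, test_idx) pairs in which no test topic occurs in training.

    `topics` holds one topic label per document. Whole topics are assigned
    to folds round-robin; this assignment scheme is an illustrative
    assumption, not something taken from the paper.
    """
    unique_topics = sorted(set(topics))
    if n_folds > len(unique_topics):
        raise ValueError("n_folds cannot exceed the number of distinct topics")
    fold_of_topic = {t: i % n_folds for i, t in enumerate(unique_topics)}
    for fold in range(n_folds):
        # Every document of a held-out topic goes to the test side.
        test_idx = [i for i, t in enumerate(topics) if fold_of_topic[t] == fold]
        train_idx = [i for i, t in enumerate(topics) if fold_of_topic[t] != fold]
        yield train_idx, test_idx


# Made-up (author, topic) pairs, one per document.
docs = [
    ("alice", "sports"), ("bob", "sports"),
    ("alice", "politics"), ("bob", "politics"),
    ("alice", "arts"), ("bob", "arts"),
]
topics = [topic for _, topic in docs]
folds = list(novel_topic_folds(topics, n_folds=3))

for train_idx, test_idx in folds:
    train_topics = {topics[i] for i in train_idx}
    test_topics = {topics[i] for i in test_idx}
    # The defining property: held-out topics never appear in training.
    assert train_topics.isdisjoint(test_topics)
```

In standard cross-validation, documents would be assigned to folds individually, so the same topic could appear on both sides of a split; here the unit of assignment is the topic, which is why a corpus needs many topics to support many folds.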

Paper citation in several formats:
Schein, A. I.; Caver, J. F.; Honaker, R. J. and Martell, C. H. (2010). AUTHOR ATTRIBUTION EVALUATION WITH NOVEL TOPIC CROSS-VALIDATION. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 206-215. DOI: 10.5220/0003088402060215

@conference{kdir10,
author={Andrew I. Schein and Johnnie F. Caver and Randale J. Honaker and Craig H. Martell},
title={AUTHOR ATTRIBUTION EVALUATION WITH NOVEL TOPIC CROSS-VALIDATION},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={206-215},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003088402060215},
isbn={978-989-8425-28-7},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - AUTHOR ATTRIBUTION EVALUATION WITH NOVEL TOPIC CROSS-VALIDATION
SN - 978-989-8425-28-7
AU - Schein, A. I.
AU - Caver, J. F.
AU - Honaker, R. J.
AU - Martell, C. H.
PY - 2010
SP - 206
EP - 215
DO - 10.5220/0003088402060215
