A Weighted Maximum Entropy Language Model for Text Classification

Kostas Fragos; Yannis Maistros; Christos Skourlas

doi:10.5220/0002571800550067

2005

Abstract

The maximum entropy (ME) approach has been used extensively in Natural Language Processing tasks such as language modeling, part-of-speech tagging, text classification, and text segmentation. Previous work on text classification applied maximum entropy modeling with binary-valued features or counts of feature words. In this work, we present a different way of applying maximum entropy modeling to text classification: weights are used both to select the features of the model and to estimate the contribution of each extracted feature to the classification task. We use the chi-square (χ²) test to assess the importance of each candidate feature, rank the candidates, and keep the most highly ranked ones as the features of the model. Hence, instead of applying maximum entropy modeling in the classical way, we use the χ² values to assign weights to the features of the model. Our method was evaluated on the Reuters-21578 dataset for text classification tasks, giving promising results and performing comparably with some state-of-the-art classification schemes.
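The χ² ranking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes the common 2x2 term/category contingency statistic over document frequencies. The function names `chi_square` and `rank_features` and the toy documents are hypothetical, and how the resulting scores enter the maximum entropy model as weights is not shown here.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def rank_features(docs, labels, category, top_k):
    """Return the top_k (term, score) pairs for one category.

    docs is a list of token lists; labels holds one category per document.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    n_docs = len(docs)
    n_in = sum(1 for y in labels if y == category)
    df_all = Counter()  # document frequency over the whole collection
    df_in = Counter()   # document frequency inside the category
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if y == category:
                df_in[term] += 1
    scores = {}
    for term, df in df_all.items():
        a = df_in[term]
        b = df - a
        c = n_in - a
        d = n_docs - n_in - b
        scores[term] = chi_square(a, b, c, d)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy collection (hypothetical data, not taken from Reuters-21578).
docs = [
    ["wheat", "price"],
    ["wheat", "export"],
    ["oil", "price"],
    ["oil", "barrel"],
]
labels = ["grain", "grain", "crude", "crude"]
top = rank_features(docs, labels, "grain", top_k=2)
```

Note that the statistic is symmetric: in this toy collection a term that never occurs in the category ("oil") scores as high as one that always does ("wheat"), so a sketch like this ranks features by strength of association rather than by direction.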

Paper Citation

in Harvard Style

Fragos K., Maistros Y. and Skourlas C. (2005). A Weighted Maximum Entropy Language Model for Text Classification. In Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2005) ISBN 972-8865-23-6X, pages 55-67. DOI: 10.5220/0002571800550067

in Bibtex Style

@conference{nlucs05,
author={Kostas Fragos and Yannis Maistros and Christos Skourlas},
title={A Weighted Maximum Entropy Language Model for Text Classification},
booktitle={Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2005)},
year={2005},
pages={55-67},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002571800550067},
isbn={972-8865-23-6X},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2005)
TI - A Weighted Maximum Entropy Language Model for Text Classification
SN - 972-8865-23-6X
AU - Fragos K.
AU - Maistros Y.
AU - Skourlas C.
PY - 2005
SP - 55
EP - 67
DO - 10.5220/0002571800550067