FEATURE INDUCTION OF LINEAR-CHAIN CONDITIONAL RANDOM FIELDS - A Study based on a Simulation

Dapeng Zhang, Bernhard Nebel

Abstract

Conditional Random Fields (CRFs) are a probabilistic framework for labeling sequential data. Several approaches have been developed to induce features for CRFs automatically, and they have been applied successfully in real-world applications, e.g. in natural language processing. The work described in this paper was originally motivated by processing the sequence data of table soccer games. As labeling such data is very time consuming, we developed a sequence generator (simulation), which lets us explore several basic issues in the feature induction of linear-chain CRFs on generated data. First, we generated data sets with different configurations of overlapped and conjunct atomic features, and discuss how these factors affect the induction. Then, we integrated a reduction step into the induction, which maintained the prediction accuracy while reducing the computational cost. Finally, we developed an approach that consists of a queue of CRFs. The experiments show that the CRF queue achieves better results on the data sets in all configurations.
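As background for the terms used in the abstract, the following is a minimal sketch of how a linear-chain CRF turns weighted features into a conditional probability over label sequences. It is not the paper's implementation: the label set, the two feature families (observation and transition features), the weight dictionaries, and all function names are hypothetical choices made for illustration only.

```python
import itertools
import math

# Hypothetical label set for the toy example.
LABELS = ["A", "B"]


def logsumexp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def score(x, y, w_emit, w_trans):
    """Unnormalized log-score: sum of the weights of all active features."""
    s = 0.0
    for t in range(len(x)):
        # Observation feature: (label at t, observation at t).
        s += w_emit.get((y[t], x[t]), 0.0)
        if t > 0:
            # Transition feature: (label at t-1, label at t).
            s += w_trans.get((y[t - 1], y[t]), 0.0)
    return s


def log_partition(x, w_emit, w_trans):
    """log Z(x), computed with the forward recursion over positions."""
    alpha = {y: w_emit.get((y, x[0]), 0.0) for y in LABELS}
    for t in range(1, len(x)):
        alpha = {
            y: w_emit.get((y, x[t]), 0.0)
            + logsumexp([alpha[p] + w_trans.get((p, y), 0.0) for p in LABELS])
            for y in LABELS
        }
    return logsumexp(list(alpha.values()))


def prob(x, y, w_emit, w_trans):
    """Conditional probability p(y | x) of one label sequence."""
    return math.exp(score(x, y, w_emit, w_trans) - log_partition(x, w_emit, w_trans))


# Toy weights (assumed values, not taken from the paper).
w_emit = {("A", "a"): 1.0, ("B", "b"): 0.5}
w_trans = {("A", "A"): 0.3, ("A", "B"): -0.2}
x = ["a", "b", "a"]

# Probabilities over all possible label sequences sum to one.
total = sum(
    prob(x, list(y), w_emit, w_trans)
    for y in itertools.product(LABELS, repeat=len(x))
)
```

In this picture, feature induction amounts to deciding which keys are allowed to appear in the weight dictionaries; a feature that is absent simply contributes a weight of zero.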



Paper Citation


in Harvard Style

Zhang D. and Nebel B. (2011). FEATURE INDUCTION OF LINEAR-CHAIN CONDITIONAL RANDOM FIELDS - A Study based on a Simulation. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 230-235. DOI: 10.5220/0003144102300235


in Bibtex Style

@conference{icaart11,
author={Dapeng Zhang and Bernhard Nebel},
title={FEATURE INDUCTION OF LINEAR-CHAIN CONDITIONAL RANDOM FIELDS - A Study based on a Simulation},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={230-235},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003144102300235},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - FEATURE INDUCTION OF LINEAR-CHAIN CONDITIONAL RANDOM FIELDS - A Study based on a Simulation
SN - 978-989-8425-40-9
AU - Zhang D.
AU - Nebel B.
PY - 2011
SP - 230
EP - 235
DO - 10.5220/0003144102300235