IN YOUR INTEREST - Objective Interestingness Measures for a Generative Classifier

Dominik Fisch, Edgar Kalkowski, Bernhard Sick, Seppo J. Ovaska

2011

Abstract

In a widely used definition, data mining is described as the “non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data”. In real applications, however, usually only the validity of data mining results is assessed numerically. An important reason is that the other properties are highly subjective, i.e., they depend on the specific knowledge and requirements of the user. In this article we define several objective interestingness measures for a specific kind of classifier, a probabilistic classifier based on a mixture model. These measures assess the informativeness, uniqueness, importance, discrimination, comprehensibility, and representativity of the rules contained in this classifier, to support a user in evaluating data mining results. Simulation experiments demonstrate how these measures can be applied.

References

  1. Atzmueller, M., Baumeister, J., and Puppe, F. (2004). Rough-fuzzy MLP: modular evolution, rule generation, and evaluation. In 15th International Conference of Declarative Programming and Knowledge Management (INAP-2004), pages 203-213, Potsdam, Germany.
  2. Basu, S., Mooney, R. J., Pasupuleti, K. V., and Ghosh, J. (2001). Evaluating the novelty of text-mined rules using lexical knowledge. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2001), pages 233-238, San Francisco, CA.
  3. Bishop, C. (1994). Novelty detection and neural network validation. IEE Proceedings - Vision, Image, and Signal Processing, 141(4):217-222.
  4. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer, New York, NY.
  5. Di Fiore, F. (2002). Visualizing interestingness. In Zanasi, A., Brebbia, C., Ebecken, N., and Melli, P., editors, Data Mining III. WIT Press, Southampton, U.K.
  6. Duda, R. O., Hart, P. E., and Stork, D. G. (2001). Pattern Classification. John Wiley & Sons, Chichester, New York, NY.
  7. Fayyad, U. M., Piatetsky-Shapiro, G., and Smyth, P. (1996). Knowledge discovery and data mining: Towards a unifying framework. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD 1996), pages 82-88, Portland, OR.
  8. Fisch, D., Kühbeck, B., Ovaska, S. J., and Sick, B. (2010). So near and yet so far: New insight into properties of some well-known classifier paradigms. Information Sciences, 180:3381-3401.
  9. Fisch, D. and Sick, B. (2009). Training of radial basis function classifiers with resilient propagation and variational Bayesian inference. In Proceedings of the International Joint Conference on Neural Networks (IJCNN '09), pages 838-847, Atlanta, GA.
  10. Frank, A. and Asuncion, A. (2010). UCI machine learning repository.
  11. Hebert, C. and Cremilleux, B. (2007). A unified view of objective interestingness measures. In Perner, P., editor, Machine Learning and Data Mining in Pattern Recognition, number 4571 in LNAI, pages 533-547. Springer, Berlin, Heidelberg, Germany.
  12. Hilderman, R. J. and Hamilton, H. J. (2001). Knowledge Discovery and Measures of Interest. Kluwer Academic Publishers, Norwell, MA.
  13. Liu, B., Hsu, W., Chen, S., and Ma, Y. (2000). Analyzing the subjective interestingness of association rules. IEEE Intelligent Systems, 15(5):47-55.
  14. McGarry, K. (2005). A survey of interestingness measures for knowledge discovery. The Knowledge Engineering Review, 20(1):39-61.
  15. Nauck, D. D. (2003). Measuring interpretability in rule-based classification systems. In Proceedings of the 12th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE'03), volume 1, pages 196-201, St. Louis, MO.
  16. Padmanabhan, B. and Tuzhilin, A. (1999). Unexpectedness as a measure of interestingness in knowledge discovery. Decision Support Systems, 27(3):303-318.
  17. Piatetsky-Shapiro, G. and Matheus, C. (1994). The interestingness of deviations. In Proceedings of the AAAI-94 Workshop on Knowledge Discovery in Databases (KDD 1994), pages 25-36, Seattle, WA.
  18. Silberschatz, A. and Tuzhilin, A. (1996). What makes patterns interesting in knowledge discovery systems. IEEE Transactions on Knowledge and Data Engineering, 8:970-974.
  19. Taha, I. and Ghosh, J. (1997). Evaluation and ordering of rules extracted from feedforward networks. In International Conference on Neural Networks, volume 1, pages 408-413, Houston, TX.
  20. Tan, P.-N., Kumar, V., and Srivastava, J. (2004). Selecting the right objective measure for association analysis. Information Systems, 29(4):293-313.
  21. UCL (2007). Elena Database. http://www.ucl.ac.be/mlg/index.php?page=Elena.

Paper Citation


in Harvard Style

Fisch D., Kalkowski E., Sick B. and Ovaska S. J. (2011). IN YOUR INTEREST - Objective Interestingness Measures for a Generative Classifier. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 414-423. DOI: 10.5220/0003186404140423


in Bibtex Style

@conference{icaart11,
author={Dominik Fisch and Edgar Kalkowski and Bernhard Sick and Seppo J. Ovaska},
title={IN YOUR INTEREST - Objective Interestingness Measures for a Generative Classifier},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={414-423},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003186404140423},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - IN YOUR INTEREST - Objective Interestingness Measures for a Generative Classifier
SN - 978-989-8425-40-9
AU - Fisch D.
AU - Kalkowski E.
AU - Sick B.
AU - Ovaska S. J.
PY - 2011
SP - 414
EP - 423
DO - 10.5220/0003186404140423