CLUSTERING INTERESTINGNESS MEASURES WITH POSITIVE CORRELATION

Xuan-Hiep Huynh, Fabrice Guillet, Henri Briand

2005

Abstract

Selecting interestingness measures is an important problem in knowledge discovery in databases. Many measures have been proposed for extracting knowledge from large databases, and many authors have introduced interestingness properties for selecting a suitable measure for a given application. Some measures are adequate for some applications while others are not, and it is difficult to determine which measures are best for a given data set. In this paper, we present a new approach, implemented in a tool, for identifying the groups or clusters of objective interestingness measures that are highly correlated in an application. The final goal is to help the user select the subset of measures best suited to discovering the best rules according to his or her preferences.



Paper Citation


in Harvard Style

Huynh X., Guillet F. and Briand H. (2005). CLUSTERING INTERESTINGNESS MEASURES WITH POSITIVE CORRELATION. In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 972-8865-19-8, pages 248-253. DOI: 10.5220/0002508502480253


in Bibtex Style

@conference{iceis05,
author={Xuan-Hiep Huynh and Fabrice Guillet and Henri Briand},
title={CLUSTERING INTERESTINGNESS MEASURES WITH POSITIVE CORRELATION},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2005},
pages={248-253},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002508502480253},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - CLUSTERING INTERESTINGNESS MEASURES WITH POSITIVE CORRELATION
SN - 972-8865-19-8
AU - Huynh X.
AU - Guillet F.
AU - Briand H.
PY - 2005
SP - 248
EP - 253
DO - 10.5220/0002508502480253