SIMILARITY ASSESSMENT IN A CBR APPLICATION FOR CLICKSTREAM DATA MINING PLANS SELECTION

Cristina Wanzeller, Orlando Belo

2007

Abstract

We implemented a mining plan selection system founded on the Case-Based Reasoning (CBR) paradigm to assist the development of Web usage mining processes. The system's main goal is to suggest the most suitable methods to apply to a data analysis problem. Our approach reuses the experience gained from prior successful mining processes to solve current and future similar problems. The knowledge acquired after successfully solving such problems is organized and stored in a relational case base, giving rise to a (multi-)relational case representation. In this paper we describe the similarity assessment devised for the retrieval of similar cases, which must cope with the adopted representation. Structured representation and similarity assessment over complex data are relevant to a growing variety of application domains and are addressed by several active lines of research. We explore a number of similarity measures proposed in the literature and extend one of them to better fit our purposes.
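As a rough illustration of what similarity-based retrieval over a relational case representation can look like, the following Python sketch combines local similarities on flat attributes with a set-level similarity on a one-to-many part of a case. It is not the measure described in the paper: the `Case` fields (`goal`, `sessions`, `page_types`), the weights, the value range and the best-match set similarity are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """Hypothetical case: flat attributes plus a set-valued related part."""
    name: str
    goal: str                                     # symbolic attribute
    sessions: float                               # numeric attribute
    page_types: set = field(default_factory=set)  # rows of a related table


def sim_symbolic(a, b):
    # Exact-match similarity for symbolic values.
    return 1.0 if a == b else 0.0


def sim_numeric(a, b, value_range):
    # Linear similarity clipped to [0, 1]; value_range is an assumed scale.
    return max(0.0, 1.0 - abs(a - b) / value_range)


def sim_set(xs, ys, elem_sim=sim_symbolic):
    # Symmetric average of each element's best match in the other set.
    if not xs and not ys:
        return 1.0
    if not xs or not ys:
        return 0.0
    best_x = sum(max(elem_sim(x, y) for y in ys) for x in xs)
    best_y = sum(max(elem_sim(y, x) for x in xs) for y in ys)
    return (best_x + best_y) / (len(xs) + len(ys))


# Assumed attribute weights for the global (aggregated) similarity.
WEIGHTS = {"goal": 0.5, "sessions": 0.2, "page_types": 0.3}


def similarity(query, case, session_range=100_000.0):
    # Global similarity: weighted average of the local similarities.
    local = {
        "goal": sim_symbolic(query.goal, case.goal),
        "sessions": sim_numeric(query.sessions, case.sessions, session_range),
        "page_types": sim_set(query.page_types, case.page_types),
    }
    return sum(WEIGHTS[k] * local[k] for k in WEIGHTS) / sum(WEIGHTS.values())


def retrieve(query, case_base, k=1):
    # Return the k stored cases most similar to the query.
    ranked = sorted(case_base, key=lambda c: similarity(query, c), reverse=True)
    return ranked[:k]


query = Case("q", "clustering", 50_000, {"product", "cart"})
c1 = Case("c1", "clustering", 40_000, {"product", "cart", "home"})
c2 = Case("c2", "association", 50_000, {"search"})
best = retrieve(query, [c2, c1], k=1)[0]
print(best.name, round(similarity(query, best), 2))
```

In this sketch the set-valued attribute stands in for rows of a related table; a measure suited to a real multi-relational case base would likely need typed element similarities and a more careful treatment of set matching.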



Paper Citation


in Harvard Style

Wanzeller C. and Belo O. (2007). SIMILARITY ASSESSMENT IN A CBR APPLICATION FOR CLICKSTREAM DATA MINING PLANS SELECTION. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-972-8865-89-4, pages 137-144. DOI: 10.5220/0002396201370144


in Bibtex Style

@conference{iceis07,
author={Cristina Wanzeller and Orlando Belo},
title={SIMILARITY ASSESSMENT IN A CBR APPLICATION FOR CLICKSTREAM DATA MINING PLANS SELECTION},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2007},
pages={137-144},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002396201370144},
isbn={978-972-8865-89-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - SIMILARITY ASSESSMENT IN A CBR APPLICATION FOR CLICKSTREAM DATA MINING PLANS SELECTION
SN - 978-972-8865-89-4
AU - Wanzeller C.
AU - Belo O.
PY - 2007
SP - 137
EP - 144
DO - 10.5220/0002396201370144