
 
8 CONCLUSIONS 
The work proposed and developed here aims to make 
the exploration of WUM potentialities simpler, more 
productive and more effective. Practice shows that it 
is often more efficient to solve a problem by starting 
from a tested, successful solution to a similar previous 
situation than to generate the entire solution from 
scratch. This is particularly true in the DM and WUM 
domains, where recurrent problems are quite common. 
To achieve this aim, we implemented a system, 
essentially founded on the CBR paradigm, which 
suggests the most suitable mining plans for a 
clickstream data analysis problem, given its 
high-level description. 
In this paper we described the similarity 
assessment approach followed within the retrieval 
process in order to cope with the multi-relational 
case representation. Structured representation and 
similarity assessment over complex data are 
important issues in a growing variety of application 
domains. It is well known that there is a trade-off 
between the expressiveness of the representation 
language and the efficiency (complexity) of the 
learning method. The strategy of extending 
distance-based propositional methods with structured 
and typed representations, which simplify problem 
modelling, and of taking the features and their 
properties into account in the similarity measures is 
advantageous. It is simple and makes it possible to 
benefit from the research on these methods and from 
their efficiency, while exploiting the greater 
expressiveness of such representations. Since this 
strategy suits our current demands, it was adopted 
to handle the issues we faced. 
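The strategy summarised above can be illustrated with a minimal sketch. It assumes a flat case made of typed features, a type-specific local similarity for each feature, and a weighted average as the global similarity. The names `numeric_similarity`, `symbolic_similarity`, `global_similarity` and the `schema` layout are hypothetical and are not taken from the implemented system.

```python
from typing import Dict, Union

Value = Union[float, str]


def numeric_similarity(a: float, b: float, value_range: float) -> float:
    """Local similarity for numeric features: 1 minus the normalised distance."""
    if value_range <= 0:
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def symbolic_similarity(a: str, b: str) -> float:
    """Local similarity for symbolic features: exact match only."""
    return 1.0 if a == b else 0.0


def global_similarity(query: Dict[str, Value],
                      case: Dict[str, Value],
                      schema: Dict[str, dict]) -> float:
    """Weighted average of local similarities, chosen by feature type.

    `schema` (an assumed layout) maps each feature name to a dict with
    a "type" ("numeric" or "symbolic"), a "weight" and, for numeric
    features, a "range" used for normalisation.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for name, spec in schema.items():
        if name not in query or name not in case:
            continue  # features missing on either side are skipped
        if spec["type"] == "numeric":
            local = numeric_similarity(query[name], case[name], spec["range"])
        else:
            local = symbolic_similarity(query[name], case[name])
        weighted_sum += spec["weight"] * local
        total_weight += spec["weight"]
    return weighted_sum / total_weight if total_weight else 0.0
```

In this sketch the type information only selects the local measure. A structured representation would additionally let a feature hold an object or a set, whose local similarity is computed recursively.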
We considered specifically the issue of 
measuring the similarity between sets of elements. 
There are multiple proposals in the literature to deal 
with this issue, but an ideal and general approach, 
appropriate to different purposes with respect to the 
intended semantics and properties, does not exist. 
Consequently, we explored a number of previously 
defined similarity measures and extended one of 
them to better fit our purposes. This extension 
gave rise to two measures suited to the similarity 
assessment of features with different properties. 
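As an illustration of what a similarity measure between sets looks like, the sketch below averages best-match similarities in both directions. This is one generic scheme of the kind found in the literature, shown under our own assumptions. It is not one of the two extended measures referred to above, and the names `directed_best_match` and `set_similarity` are hypothetical.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def directed_best_match(a: Iterable[T], b: Iterable[T],
                        sim: Callable[[T, T], float]) -> float:
    """Average, over the elements of `a`, of the best similarity found in `b`."""
    a_items, b_items = list(a), list(b)
    if not a_items:
        # Two empty sets are treated as identical (an assumption).
        return 1.0 if not b_items else 0.0
    if not b_items:
        return 0.0
    return sum(max(sim(x, y) for y in b_items) for x in a_items) / len(a_items)


def set_similarity(a: Iterable[T], b: Iterable[T],
                   sim: Callable[[T, T], float]) -> float:
    """Symmetric set similarity: mean of the two directed best-match scores."""
    a_items, b_items = list(a), list(b)
    return 0.5 * (directed_best_match(a_items, b_items, sim)
                  + directed_best_match(b_items, a_items, sim))


def exact(x: str, y: str) -> float:
    """Element-level similarity used in the example: exact match only."""
    return 1.0 if x == y else 0.0


score = set_similarity({"clustering", "association"}, {"clustering"}, exact)
```

With exact matching, the first direction scores 0.5 (one of two elements finds a match) and the second scores 1.0, so `score` is 0.75. Other choices, such as taking the minimum of the two directions or requiring a one-to-one matching, lead to different semantics and properties, which is why no single measure fits every purpose.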
We are currently working on the construction of 
more cases, comprising WUM processes of higher 
complexity. Afterwards, a more detailed and 
systematic experimental evaluation of the system 
will be necessary. Moreover, one future direction of 
work concerns improving the assignment of weights, 
based on a comprehensive evaluation of the features' 
relevance and discriminating power. 
ACKNOWLEDGEMENTS 
The work of Cristina Wanzeller has been supported 
by a grant from PRODEP (Acção 5.3, concurso nº 
02/2003). 
REFERENCES 
Bergmann, R., 2001. Highlights of the European INRECA 
projects. In ICCBR’01, 4th International Conference 
on CBR, Springer-Verlag, 1-15. 
Bergmann, R., Stahl, A., 1998. Similarity Measures for 
Object-Oriented Case Representations. In EWCBR'98, 
4th European Workshop on Case-Based Reasoning. 
Springer-Verlag, Vol. 1488, 25-36. 
Bohnebeck, U., Horváth, T., Wrobel, S., 1998. Term 
Comparisons in First-Order Similarity Measures. In 
8th International Conference on Inductive Logic 
Programming, Vol. 1446, Springer-Verlag, 65-79. 
Duda, R., Hart, P., Stork, D., 2001. Pattern Classification 
and Scene Analysis, chapter Unsupervised Learning 
and Clustering. John Wiley and Sons. 
Eiter, T., Mannila, H., 1997. Distance Measures for Point 
Sets and their Computation. Acta Informatica, 34(2), 
109–133.  
Emde, W., Wettschereck, D., 1996. Relational Instance-
based Learning. In 13th International Conf. on 
Machine Learning, Morgan Kaufmann, 122-130. 
Gregori, V., Ramírez C., Orallo, J., Quintana, M., 2005. A 
survey of (pseudo-distance) Functions for Structured-
Data. In TAMIDA’05, III Taller Nacional de Minería 
de Datos y Aprendizaje, Editorial Thomson, 
CEDI’2005, 233-242.  
Flach, P., Giraud-Carrier, C., Lloyd, J., 1998. Strongly 
Typed Inductive Concept Learning. In 8th 
International Workshop on Inductive Logic 
Programming, Springer-Verlag, Vol. 1446, 185-194.  
Hilario, M., Kalousis, A., 2003. Representational Issues in 
Meta-Learning. In ICML'03, 20th International Conf. 
on Machine Learning, AAAI Press, 313-320. 
Kirsten, M., Wrobel, S., 1998. Relational Distance Based 
Clustering. In 8th Int. Conf. on Inductive Logic 
Programming, Vol. 1446, Springer-Verlag, 261-270.  
Kirsten, M., Wrobel, S., Horvath, T., 2001. Relational 
Data Mining. Distance Based Approaches to 
Relational Learning and Clustering, Springer-Verlag, 
212-232. 
Kolodner, J., 1993. Case Based Reasoning. Morgan 
Kaufmann, San Francisco, CA. 
Ramon, J., 2002. Clustering and Instance Based Learning 
in First Order logic. PhD thesis, K.U. Leuven, 
Belgium. 
Wanzeller, C., Belo, O., 2006. Selecting Clickstream Data 
Mining Plans Using a Case-Based Reasoning 
Application. In DMIE’06, 7th International 
Conference on Data, Text and Web Mining and their 
Business Applications and Management Information 
Engineering, 223-232. 
ICEIS 2007 - International Conference on Enterprise Information Systems