represents the sum of the F-measure values obtained for 
all metrics used, and $F_m$ represents the F-measure 
value calculated for metric ‘m’:

$$w_m = \frac{1}{\sum F} \cdot F_m$$
 
5) Finally, the weights are presented to the user, who can 
accept or modify them. If the user does not run the 
Sampler, the weights must either be defined manually 
or be set equal for all metrics (simple averaging), which 
is then the only mechanism that SASMINT can use, 
although it may not produce desirable results. 
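The weight calculation described above can be illustrated with a short Python sketch. It is not SASMINT's actual code: the function names, the metric names, and the F-measure values are hypothetical, and the fallback to equal weights when every F-measure is zero is an assumption made here for illustration.

```python
def weights_from_f_measures(f_measures):
    """Normalize per-metric F-measures into weights that sum to 1.

    Each weight is w_m = F_m / (sum of F over all metrics).
    """
    if not f_measures:
        raise ValueError("at least one metric is required")
    total = sum(f_measures.values())
    if total == 0:
        # Assumption: with no usable F-measures, fall back to equal
        # weights, i.e. plain averaging of the metrics.
        return {m: 1.0 / len(f_measures) for m in f_measures}
    return {m: f / total for m, f in f_measures.items()}


def weighted_similarity(scores, weights):
    """Combine per-metric similarity scores as a weighted sum."""
    return sum(weights[m] * scores[m] for m in weights)


# Hypothetical F-measures obtained for three metrics.
f_measures = {"levenshtein": 0.8, "jaccard": 0.6, "jaro": 0.6}
weights = weights_from_f_measures(f_measures)

# Hypothetical similarity scores for one pair of element names.
scores = {"levenshtein": 1.0, "jaccard": 0.5, "jaro": 0.5}
combined = weighted_similarity(scores, weights)
```

A metric with a higher F-measure contributes more to the combined score. In this sketch, the user's accept-or-modify step amounts to editing the `weights` dictionary before it is passed to `weighted_similarity`.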
5 CONCLUSION 
In this paper, we introduce a semi-automatic schema 
matching and integration tool called SASMINT and 
explain how it uses linguistic techniques to 
automatically resolve syntactical and semantic 
heterogeneities between database schemas. In order 
to identify the syntactic and semantic similarity 
between the elements’ names from two schemas, 
unlike other schema matching efforts, the SASMINT 
system utilizes a combination of different types of 
string similarity and semantic similarity metrics 
from NLP. The use of a weighted sum and a recursive 
weighted sum of these metrics is proposed, giving 
more accurate results. Furthermore, the Sampler 
component of SASMINT helps users to influence 
the weights for applying these metrics. A number of 
tests were carried out to measure the correctness of 
the metrics and the results are provided in this paper. 
Other tests are being planned to compare SASMINT 
with other similar systems. 
 
ICSOFT 2006 - INTERNATIONAL CONFERENCE ON SOFTWARE AND DATA TECHNOLOGIES