CONTEXT ANALYSIS FOR SEMANTIC MAPPING OF DATA SOURCES USING A MULTI-STRATEGY MACHINE LEARNING APPROACH

Youssef Bououlid Idrissi, Julie Vachon

2005

Abstract

Be it on a web-wide or inter-enterprise scale, data integration has become a major necessity, driven by the expansion of the Internet and its widespread use for communication between business actors. However, since data sources are often heterogeneous, their integration remains an expensive procedure. Indeed, this task requires prior semantic alignment of all the data sources' concepts. Doing this alignment manually is quite laborious, especially if there is a large number of concepts to be matched. Various solutions have been proposed that attempt to automate this step. This paper introduces a new framework for data source alignment which integrates context analysis into multi-strategy machine learning. Although their adaptability and extensibility are appreciated, current machine learning systems often suffer from the low quality and lack of diversity of training data sets. To overcome this limitation, we introduce a new notion called the “informational context” of data sources. We then briefly explain the architecture of a context analyser to be integrated into a learning system combining multiple strategies to achieve data source mapping.



Paper Citation


in Harvard Style

Bououlid Idrissi Y. and Vachon J. (2005). CONTEXT ANALYSIS FOR SEMANTIC MAPPING OF DATA SOURCES USING A MULTI-STRATEGY MACHINE LEARNING APPROACH. In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 972-8865-19-8, pages 445-448. DOI: 10.5220/0002539804450448


in Bibtex Style

@conference{iceis05,
author={Youssef Bououlid Idrissi and Julie Vachon},
title={CONTEXT ANALYSIS FOR SEMANTIC MAPPING OF DATA SOURCES USING A MULTI-STRATEGY MACHINE LEARNING APPROACH},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2005},
pages={445-448},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002539804450448},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - CONTEXT ANALYSIS FOR SEMANTIC MAPPING OF DATA SOURCES USING A MULTI-STRATEGY MACHINE LEARNING APPROACH
SN - 972-8865-19-8
AU - Bououlid Idrissi Y.
AU - Vachon J.
PY - 2005
SP - 445
EP - 448
DO - 10.5220/0002539804450448