Specifying Complex Correspondences Between Relational Schemas in a Data Integration Environment

Valéria Pequeno, Helena Galhardas

2014

Abstract

When dealing with the data integration problem, the designer usually encounters incompatible data models characterized by differences in structure and semantics, even within a single organization. In this work, we propose a declarative and formal approach to specify 1-to-1, 1-to-m, and m-to-n correspondences between relational schema components. Unlike usual correspondences, our Correspondence Assertions (CAs) have semantics and can deal with joins, outer joins, and data-metadata relationships. Finally, we demonstrate how to generate mapping expressions in the form of SQL queries from CAs.


Paper Citation


in Harvard Style

Pequeno V. and Galhardas H. (2014). Specifying Complex Correspondences Between Relational Schemas in a Data Integration Environment. In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-027-7, pages 18-29. DOI: 10.5220/0004870200180029


in Bibtex Style

@conference{iceis14,
author={Valéria Pequeno and Helena Galhardas},
title={Specifying Complex Correspondences Between Relational Schemas in a Data Integration Environment},
booktitle={Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2014},
pages={18-29},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004870200180029},
isbn={978-989-758-027-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Specifying Complex Correspondences Between Relational Schemas in a Data Integration Environment
SN - 978-989-758-027-7
AU - Pequeno V.
AU - Galhardas H.
PY - 2014
SP - 18
EP - 29
DO - 10.5220/0004870200180029