AUTOMATIC GENERATION OF DATA MERGING PROGRAM CODES

Hyeonsook Kim, Samia Oussena, Ying Zhang, Tony Clark

2010

Abstract

Data merging is an essential part of the ETL (Extract-Transform-Load) processes used to build a data warehouse system. To avoid reinventing merging techniques for each project, we propose a Data Merging Meta-model (DMM) and its transformation into executable program code, following a model-driven engineering approach. DMM allows the relationships between different model entities, and their merging types, to be defined at the conceptual level. Our formalized transformation, described in ATL (ATLAS Transformation Language), enables automatic generation of PL/SQL packages that execute data merging in commercial ETL tools. With this approach, data warehouse engineers are relieved of the burden of repetitive, complex script coding and of the effort of keeping design and implementation consistent.



Paper Citation


in Harvard Style

Kim H., Oussena S., Zhang Y. and Clark T. (2010). AUTOMATIC GENERATION OF DATA MERGING PROGRAM CODES. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-8425-23-2, pages 179-186. DOI: 10.5220/0003008301790186


in Bibtex Style

@conference{icsoft10,
author={Hyeonsook Kim and Samia Oussena and Ying Zhang and Tony Clark},
title={AUTOMATIC GENERATION OF DATA MERGING PROGRAM CODES},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2010},
pages={179-186},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003008301790186},
isbn={978-989-8425-23-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - AUTOMATIC GENERATION OF DATA MERGING PROGRAM CODES
SN - 978-989-8425-23-2
AU - Kim H.
AU - Oussena S.
AU - Zhang Y.
AU - Clark T.
PY - 2010
SP - 179
EP - 186
DO - 10.5220/0003008301790186