A CONTINUOUS DATA INTEGRATION METHODOLOGY FOR SUPPORTING REAL-TIME DATA WAREHOUSING

Ricardo Jorge Santos, Jorge Bernardino

2007

Abstract

A data warehouse provides information for analytical processing, decision making and data mining tools. As the concept of the real-time enterprise evolves, the synchronization between transactional data and statically implemented data warehouses has come under review. Traditional data warehouse systems have static schema structures and static relationships between data, and therefore cannot support any dynamics in their structure and content. Their data is updated only periodically because they are not prepared for continuous data integration. Real-time data warehouses appear very promising for these purposes. In this paper we present a methodology for adapting data warehouse schemas and user-end OLAP (On-Line Analytical Processing) queries to efficiently support real-time data integration. To accomplish this, we use techniques such as table structure replication and query predicate restrictions for selecting data, which enable continuous data integration in the data warehouse with minimal impact on query execution time. We demonstrate the functionality of the method by analyzing its impact on query performance, using the TPC-H benchmark to execute query workloads while simultaneously performing continuous data integration at various insertion rates.



Paper Citation


in Harvard Style

Santos R. J. and Bernardino J. (2007). A CONTINUOUS DATA INTEGRATION METHODOLOGY FOR SUPPORTING REAL-TIME DATA WAREHOUSING. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-972-8865-88-7, pages 589-595. DOI: 10.5220/0002377205890595


in Bibtex Style

@conference{iceis07,
author={Ricardo Jorge Santos and Jorge Bernardino},
title={A CONTINUOUS DATA INTEGRATION METHODOLOGY FOR SUPPORTING REAL-TIME DATA WAREHOUSING},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2007},
pages={589-595},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002377205890595},
isbn={978-972-8865-88-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A CONTINUOUS DATA INTEGRATION METHODOLOGY FOR SUPPORTING REAL-TIME DATA WAREHOUSING
SN - 978-972-8865-88-7
AU - Santos R. J.
AU - Bernardino J.
PY - 2007
SP - 589
EP - 595
DO - 10.5220/0002377205890595