Revisiting Arguments for a Three Layered Data Warehousing Architecture in the Context of the Hadoop Platform

Qishan Yang, Markus Helfert

Abstract

Data warehousing has been adopted by many enterprises to organize historical data, provide regular reports, assist decision making, analyze data, and mine potentially valuable information. Its architecture can be divided into several layers, from operational databases to presentation interfaces. Data is being created all around the world and is growing explosively, so storing it or building a data warehouse with conventional tools or platforms may be time-consuming and exorbitantly expensive. This paper discusses a three-layered data warehousing architecture on a big data platform, in which HDFS (Hadoop Distributed File System) and MapReduce are leveraged to store and manipulate data, respectively.

Paper Citation


in Harvard Style

Yang Q. and Helfert M. (2016). Revisiting Arguments for a Three Layered Data Warehousing Architecture in the Context of the Hadoop Platform. In Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER, ISBN 978-989-758-182-3, pages 329-334. DOI: 10.5220/0005912703290334


in Bibtex Style

@conference{closer16,
author={Qishan Yang and Markus Helfert},
title={Revisiting Arguments for a Three Layered Data Warehousing Architecture in the Context of the Hadoop Platform},
booktitle={Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER},
year={2016},
pages={329-334},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005912703290334},
isbn={978-989-758-182-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER
TI - Revisiting Arguments for a Three Layered Data Warehousing Architecture in the Context of the Hadoop Platform
SN - 978-989-758-182-3
AU - Yang Q.
AU - Helfert M.
PY - 2016
SP - 329
EP - 334
DO - 10.5220/0005912703290334