CHANGE DETECTION AND MAINTENANCE OF AN XML WEB WAREHOUSE

Ching-Ming Chao

2005

Abstract

The World Wide Web is a popular broadcast medium that contains a huge amount of information. A web warehouse is an efficient and effective means of making use of that information. XML has become the new standard for semi-structured data exchange over the Web. In this paper, we therefore study the XML web warehouse and propose an approach to change detection and warehouse maintenance in an XML web warehouse system. The paper makes three major contributions. First, we propose an object-oriented data model for XML web pages in the web warehouse, together with a system architecture for change detection and warehouse maintenance. Second, we propose a change detection method based on mobile agent technology that actively detects changes in the data sources of the web warehouse. Third, we propose an incremental, deferred maintenance method for the XML web pages in the web warehouse. We compared our approach experimentally with a rewriting approach to the storage and maintenance of the XML web warehouse. The performance evaluation shows that our approach is more efficient than the rewriting approach in terms of both the response time and the storage space of the web warehouse.



Paper Citation


in Harvard Style

Chao C. (2005). CHANGE DETECTION AND MAINTENANCE OF AN XML WEB WAREHOUSE. In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 972-8865-19-8, pages 52-59. DOI: 10.5220/0002555700520059


in Bibtex Style

@conference{iceis05,
author={Ching-Ming Chao},
title={CHANGE DETECTION AND MAINTENANCE OF AN XML WEB WAREHOUSE},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2005},
pages={52-59},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002555700520059},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - CHANGE DETECTION AND MAINTENANCE OF AN XML WEB WAREHOUSE
SN - 972-8865-19-8
AU - Chao C.
PY - 2005
SP - 52
EP - 59
DO - 10.5220/0002555700520059