THE INDEX UPDATE PROBLEM FOR XML DATA IN XDBMS

Beda Christoph Hammerschmidt, Martin Kempa, Volker Linnemann

2005

Abstract

Database Management Systems are a major component of almost every information system. In relational Database Management Systems (RDBMS) indexes are well known and essential for the performant execution of frequent queries. For XML Database Management Systems (XDBMS) no index standards are established yet; although they are required not less. An inevitable side effect of any index is that modifications of the indexed data have to be reflected by the index structure itself. This leads to two problems: first it has to be determined whether a modifying operation affects an index or not. Second, if an index is affected, the index has to be updated efficiently - best without rebuilding the whole index. In recent years a lot of approaches were introduced for indexing XML data in an XDBMS. All approaches lack more or less in the field of updates. In this paper we give an algorithm that is based on finite automaton theory and determines whether an XPath based database operation affects an index that is defined universally upon keys, qualifiers and a return value of an XPath expression. In addition, we give algorithms how we update our KeyX indexes efficiently if they are affected by a modification. The Index Update Problem is relevant for all applications that use a secondary XML data representation (e.g. indexes, caches, XML replication/synchronization services) where updates must be identified and realized.

References

  1. Abiteboul, S., Suciu, D., and Buneman, P. (1999). Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann, USA, 1st edition edition.
  2. Chen, Q., Lim, A., and Ong, K. W. (2003). D(k)-index: an adaptive structural summary for graph-structured data. In Proceedings of the 2003 ACM SIGMOD conference, San Diego, California, USA, pages 134 - 144, San Diego, California, USA.
  3. Chung, C., Min, J., and Shim, K. (2002). Apex: an adaptive path index for xml data. In Proceedings of the 2002 ACM SIGMOD Conference, Madison, Wisconsin, USA, pages 121-132. ACM Press.
  4. Cooper, B. F., Sample, N., Franklin, M. J., Hjaltason, G. R., and Shadmon, M. (2001). A fast index for semistructured data. In Proceedings of 27th International Conference on Very Large Data Bases, Roma, Italy. Morgan Kaufmann.
  5. Fiebig, T., Helmer, S., Kanne, C.-C., Moerkotte, G., Neumann, J., Schiele, R., and Westmann, T. (2002). Anatomy of a native xml base management system. VLDB Journal, 11(4).
  6. Goldman, R. and Widom, J. (1997). Dataguides: Enabling query formulation and optimization in semistructured databases. In VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, pages 436-445. Morgan Kaufmann.
  7. Grust, T. (2002). Accelerating xpath location steps. In Proceedings of the 2002 ACM SIGMOD Conference, Madison, Wisconsin, USA, pages 109-120.
  8. H. Jiang and H. Lu and W. Wang and B. Ooi (2003). XRTree: Indexing XML Data for Efficient Structural Join. In Proceedings of the 19th International Conference on Data Engineering (ICDE), pages 253-263, Bangalore, India.
  9. Halverson, A., Burger, J., Galanis, L., Kini, A., Krishnamurthy, R., Rao, A. N., Tian, F., Viglas, S. D., Wang, Y., Naughton, J. F., and DeWitt, D. J. (2003). Mixed mode xml query processing. In VLDB'03, Proceedings of 29rd International Conference on Very Large Data Bases, pages 225-236, Berlin, Germany.
  10. Hopcroft, J. E., Motwani, R., and Ullman, J. D. (2001). Introduction to Automata Theory, Languages, and Computation. Addison Wesley Publishing Company.
  11. Jagadish, H., Al-Khalifa, S., Lakshmanan, L., Nierman, A., Paparizos, S., Patel, J., Srivastava, D., and Wu, Y. (2002). Timber: A native xml database. Technical report, University of Michigan, USA.
  12. Kaushik, R., Bohannon, P., Naughton, J. F., and Korth, H. F. (2002). Covering indexes for branching path queries. In Proceedings of the ACM SIGMOD conference, Madison, Wisconsin, USA.
  13. Kaushik, R., Krishnamurthy, R., Naughton, J. F., and Ramakrishnan, R. (2004). On the integration of structure indexes and inverted lists. In Proceedings of the ACM SIGMOD conference, Paris, France, pages 779 - 790. ACM Press.
  14. Kha, D. D., Yoshikawa, M., and Uemura, S. (2001). An XML indexing structure with relative region coordinate. In 2001, I. C. S., editor, Proceedings of the 17th International Conference on Data Engineering (ICDE 2001), pages 313-320, Heidelberg, Germany.
  15. Ley, M. (2001). Digital Bibliography & Library Project. URL: http://dblp.uni-trier.de. Computer Science Bibliography.
  16. Miklau, G. and Suciu, D. (2004). Containment and equivalence for a fragment of xpath. J. ACM, 51(1):2-45.
  17. Milo, T. and Suciu, D. (1999). Index structures for path expressions. In Proceedings of Database Theory - ICDT 7899, 7th International Conference, volume 1540 of Lecture Notes in Computer Science, pages 277-295, Jerusalem, Israel. Springer.
  18. Schenkel, R., Theobald, A., and Weikum, G. (2004). Hopi: An efficient connection index for complex xml document collections. In Proceedings of the 9th International Conference on Extending Database Technology (EDBT), volume 2992 of Lecture Notes in Computer Science, pages 237-255.
  19. Schöning, H. (2001). Tamino - a dbms designed for xml. In Proceedings of the 17th International Conference on Data Engineering, pages 149-154, Heidelberg, Germany. IEEE Computer Society.
  20. Weigel, F., Meuss, H., Bry, F., and Schulz, K. U. (2003). Content-Aware DataGuides for Indexing Large Collections of XML Documents. Technical Report PMSFB-2003-14, Institute of Informatics, University of Munich.
Download


Paper Citation


in Harvard Style

Christoph Hammerschmidt B., Kempa M. and Linnemann V. (2005). THE INDEX UPDATE PROBLEM FOR XML DATA IN XDBMS . In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 972-8865-19-8, pages 27-34. DOI: 10.5220/0002524600270034


in Bibtex Style

@conference{iceis05,
author={Beda Christoph Hammerschmidt and Martin Kempa and Volker Linnemann},
title={THE INDEX UPDATE PROBLEM FOR XML DATA IN XDBMS},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS,},
year={2005},
pages={27-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002524600270034},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS,
TI - THE INDEX UPDATE PROBLEM FOR XML DATA IN XDBMS
SN - 972-8865-19-8
AU - Christoph Hammerschmidt B.
AU - Kempa M.
AU - Linnemann V.
PY - 2005
SP - 27
EP - 34
DO - 10.5220/0002524600270034