Yangjun Chen



With the growing importance of XML in data exchange, much research has been done on providing flexible query facilities to extract data from structured XML documents. In this paper, we discuss an efficient algorithm for the tree mapping problem in XML databases based on unordered tree matching. Given a target tree T and a pattern tree Q, the algorithm finds all the embeddings of Q in T in O(|D||Q|) time, where D is a largest data stream associated with a node of Q. More importantly, the algorithm is index-oriented: with XB-trees constructed over data streams, disk access can be dramatically decreased.
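To make the notion of an embedding of Q in T concrete, the following is a minimal sketch of a naive unordered matcher. It is not the algorithm of the paper and does not achieve the O(|D||Q|) bound or use XB-trees. It assumes that every pattern edge is read as an ancestor-descendant constraint, that labels must match exactly, and that two pattern nodes may map to the same target node; the paper's precise definition may differ. The names `Node`, `descendants`, `matches_at` and `find_matches` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Node:
    """A labelled tree node; children are stored as a tuple so nodes are hashable."""
    label: str
    children: tuple = ()


def descendants(t):
    """Yield every proper descendant of t in document (pre-)order."""
    for child in t.children:
        yield child
        yield from descendants(child)


@lru_cache(maxsize=None)
def matches_at(q, t):
    """Return True if the pattern subtree rooted at q embeds with q mapped to t.

    Assumption: each pattern edge is an ancestor-descendant constraint.
    """
    if q.label != t.label:
        return False
    # Unordered matching: each pattern child only needs *some* descendant
    # of t that matches it; the left-to-right order of siblings is ignored.
    return all(
        any(matches_at(qc, d) for d in descendants(t))
        for qc in q.children
    )


def find_matches(q, t):
    """Return every node of t (including t itself) to which the root of q can be mapped."""
    return [n for n in (t, *descendants(t)) if matches_at(q, n)]


# Target tree T: a( x( c, b ), a( b ) )
target = Node("a", (Node("x", (Node("c"), Node("b"))), Node("a", (Node("b"),))))
# Pattern tree Q: a( b, c ); the order of b and c does not matter.
pattern = Node("a", (Node("b"), Node("c")))
roots = find_matches(pattern, target)
```

In this example only the outer `a` node qualifies, since the inner `a` has no `c` descendant. The sketch reports only where the root of Q can be mapped rather than enumerating full embeddings, and it visits descendants repeatedly, which is exactly the cost that stream-based and index-oriented methods are designed to avoid.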



Paper Citation

in Harvard Style

Chen Y. (2009). UNORDERED TREE MATCHING AND TREE PATTERN QUERIES IN XML DATABASES. In Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-674-010-8, pages 191-198. DOI: 10.5220/0002238801910198

in Bibtex Style

author={Yangjun Chen},
booktitle={Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT},

in EndNote Style

JO - Proceedings of the 4th International Conference on Software and Data Technologies - Volume 2: ICSOFT
SN - 978-989-674-010-8
AU - Chen Y.
PY - 2009
SP - 191
EP - 198
DO - 10.5220/0002238801910198