A HYBRID CLUSTERING CRITERION FOR R*-TREE ON BUSINESS DATA

Yaokai Feng, Zhibin Wang, Akifumi Makinouchi

Abstract

It is well known that multidimensional indices can efficiently improve query performance on relational data. The R*-tree, a well-known member of the R-tree family, is one successful and very popular multidimensional index structure. The clustering pattern of the objects (i.e., tuples in relational tables) among R*-tree leaf nodes is one of the decisive factors in the performance of range queries, a popular kind of query on business data. How, then, is the clustering pattern formed? In this paper, we point out that the insert algorithm of the R*-tree, and especially its clustering criterion for choosing subtrees for newly arriving objects, determines the clustering pattern of the tuples among the leaf nodes. Our discussion and observations make clear that the present clustering criterion of the R*-tree cannot lead to a good clustering pattern of tuples when the R*-tree is applied to business data, which greatly degrades query performance. We then introduce a hybrid clustering criterion for the insert algorithm of the R*-tree. Our discussion and experiments indicate that the hybrid criterion clearly improves the query performance of the R*-tree on business data.
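
For readers unfamiliar with the subtree-choice step the abstract refers to, the following is a minimal sketch of the area-enlargement rule used by the standard R*-tree when descending through internal nodes. It is not the hybrid criterion proposed in this paper, which the abstract does not specify. The names `area`, `enlarge`, and `choose_subtree` are hypothetical, rectangles are assumed to be `(lows, highs)` tuples, and the overlap-enlargement test that the R*-tree applies at the level just above the leaves is omitted.

```python
from typing import List, Sequence, Tuple

# A minimum bounding rectangle as (lows, highs); this layout is an assumption.
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]


def area(rect: Rect) -> float:
    """Product of the side lengths of the rectangle."""
    lows, highs = rect
    result = 1.0
    for lo, hi in zip(lows, highs):
        result *= hi - lo
    return result


def enlarge(rect: Rect, point: Sequence[float]) -> Rect:
    """Smallest rectangle that covers both rect and the new point."""
    lows, highs = rect
    return (
        tuple(min(lo, p) for lo, p in zip(lows, point)),
        tuple(max(hi, p) for hi, p in zip(highs, point)),
    )


def choose_subtree(entries: List[Rect], point: Sequence[float]) -> int:
    """Index of the entry needing the least area enlargement to cover point.

    Ties are broken by the smaller current area. Simplified: the
    overlap-enlargement criterion for leaf-pointing nodes is not modeled.
    """
    def cost(i: int) -> Tuple[float, float]:
        rect = entries[i]
        return (area(enlarge(rect, point)) - area(rect), area(rect))

    return min(range(len(entries)), key=cost)
```

Because each new tuple is routed by repeated calls of this kind from the root down, the criterion used here decides which tuples end up sharing a leaf node, which is the clustering pattern discussed above.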



Paper Citation


in Harvard Style

Feng Y., Wang Z. and Makinouchi A. (2005). A HYBRID CLUSTERING CRITERION FOR R*-TREE ON BUSINESS DATA. In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 972-8865-19-8, pages 346-352. DOI: 10.5220/0002552703460352


in Bibtex Style

@conference{iceis05,
author={Yaokai Feng and Zhibin Wang and Akifumi Makinouchi},
title={A HYBRID CLUSTERING CRITERION FOR R*-TREE ON BUSINESS DATA},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2005},
pages={346-352},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002552703460352},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A HYBRID CLUSTERING CRITERION FOR R*-TREE ON BUSINESS DATA
SN - 972-8865-19-8
AU - Feng Y.
AU - Wang Z.
AU - Makinouchi A.
PY - 2005
SP - 346
EP - 352
DO - 10.5220/0002552703460352