Authors:
Yaokai Feng
;
Zhibin Wang
and
Akifumi Makinouchi
Affiliation:
The Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan
Keyword(s):
Multidimensional indices, R*-tree, clustering criterion, Multidimensional range query, TPC-H.
Related
Ontology
Subjects/Areas/Topics:
Data Warehouses and OLAP
;
Databases and Information Systems Integration
;
Enterprise Information Systems
Abstract:
It is well-known that multidimensional indices are efficient to improve the query performance on relational data. As one successful multi-dimensional index structure, R*-tree, a famous member of the R-tree family, is very popular. The clustering pattern of the objects (i.e., tuples in relational tables) among R*-tree leaf nodes is one of the deceive factors on performance of range queries, a popular kind of queries on business data. Then, how is the clustering pattern formed? In this paper, we point out that the insert algorithm of R*-tree, especially, its clustering criterion of choosing subtrees for new coming objects, determines the clustering pattern of the tuples among the leaf nodes. According to our discussion and observations, it becomes clear that the present clustering criterion of R*-tree can not lead to a good clustering pattern of tuples when R*-tree is applied to business data, which greatly degrades query performance. After that, a hybrid clustering criterion for the i
nsert algorithm of R*-tree is introduced. Our discussion and experiments indicate that query performance of R*-tree on business data is improved clearly by the hybrid criterion.
(More)