2003, Department of Computers, Czech Technical University, Prague Department of Software Engineering, Charles University, Prague Department of Computer Science, VŠB-Technical University of Ostrava CSKI, OS Computer Science and Society
Abstract. In the area of multidimensional databases, the UB-tree represents a promising indexing structure. A key feature of any multidimensional indexing structure is its ability to perform range queries efficiently. For UB-trees, we have proposed an advanced range query algorithm that makes it possible to operate on indices of high dimensionality. In this paper we present experimental results for this range query algorithm. Keywords: UB-tree, range query, benchmarks, DRU algorithm
1998
We investigate the usability and performance of the UB-Tree (universal B-Tree) for multidimensional data, as they arise in all relational databases and in particular in data-warehousing and data-mining applications. The UB-Tree is balanced and has all the guaranteed performance characteristics of B-Trees, i.e., it requires linear space for storage and logarithmic time for the basic operations of insertion, retrieval
Information Systems, 2006
In multi-dimensional databases the essential tool for accessing data is the range query (or window query). In this paper we introduce a new algorithm for processing range queries in the universal B-tree (UB-tree), which is an index structure for searching in multi-dimensional databases. The new range query algorithm (called the DRU algorithm) works efficiently, even for high-dimensional databases. In particular, with the DRU algorithm many of the UB-tree inner nodes need not be accessed. We explain the DRU algorithm using a simple geometric model, providing a clear insight into the problem. More specifically, the model exploits an interesting relation between the Z-curve and generalized quad-trees. We also present experimental results for the DRU algorithm implementation.
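As a rough illustration of the Z-curve that UB-trees build on, the following Python sketch computes a Z-address by bit interleaving and answers a range query by scanning the Z-address interval of the query box. It is not the DRU algorithm from the abstract; the function names (`z_address`, `in_query_box`, `naive_range_query`) and the fixed `bits` parameter are assumptions made for this example, and coordinates are assumed to be non-negative integers.

```python
def z_address(point, bits):
    """Interleave the bits of each coordinate into a single Z-curve address.

    Assumes every coordinate is a non-negative integer below 2**bits.
    """
    address = 0
    for bit in range(bits - 1, -1, -1):        # most significant bit first
        for coord in point:                    # one bit from each dimension
            address = (address << 1) | ((coord >> bit) & 1)
    return address


def in_query_box(point, low, high):
    """Return True if the point lies inside the hyper-box [low, high]."""
    return all(lo <= c <= hi for c, lo, hi in zip(point, low, high))


def naive_range_query(points, low, high, bits):
    """Naive range query: keep points whose Z-address falls between the
    addresses of the box corners, then filter by the box itself.

    The Z-address grows with each coordinate, so every point inside the box
    has an address in [z(low), z(high)]; the interval may also contain
    points outside the box, which the second test discards.
    """
    z_low, z_high = z_address(low, bits), z_address(high, bits)
    ordered = sorted(points, key=lambda p: z_address(p, bits))
    return [p for p in ordered
            if z_low <= z_address(p, bits) <= z_high
            and in_query_box(p, low, high)]


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (2, 3), (3, 3), (5, 1)]
    print(z_address((3, 5), 3))
    print(naive_range_query(pts, (1, 1), (3, 3), 3))
```

A real UB-tree stores Z-address intervals in B-tree pages and skips the parts of the interval that lie outside the query box; the linear scan here only shows the address mapping and the final filter.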
2009
Semantic query optimization consists in restricting the search space in order to reduce the set of objects of interest for a query. This paper presents an indexing method based on UB-trees and a static analysis of the constraints associated with the views of the database and of any constraint expressed on attributes. The result of the static analysis is a partitioning of the object space into disjoint blocks. Through Space Filling Curve (SFC) techniques, each fragment (block) of the partition is assigned a unique identifier, enabling the efficient indexing of fragments by UB-trees. The search space corresponding to a range query is restricted to a subset of the blocks of the partition. This approach has been developed in the context of a KB-DBMS, but it can be applied to any relational system.
2006 10th International Database Engineering and Applications Symposium (IDEAS'06), 2006
Multi-dimensional data structures are applied in many real-world indexing applications, e.g. data mining, indexing multimedia data, indexing non-structured text documents and so on. Many index structures and algorithms have been proposed. There are two major approaches to multi-dimensional indexing: data structures for indexing metric spaces and data structures for indexing vector spaces. The R-tree, R*-tree, and UB-tree are representatives of the vector data structures. These data structures provide efficient processing for many types of queries, e.g. point queries, range queries and so on. In vector data structures, the range query retrieves all points in a defined hyper-box in an n-dimensional space. The narrow range query is a significant type of range query, and its processing is inefficient in vector data structures. Moreover, the efficiency decreases as the dimension of the indexed space increases. We describe an application of the signature for more efficient processing of narrow range queries. The approach puts the signature into the R-tree while preserving native functionality, such as the range query algorithm for the general range query. The novel data structure is called the Signature R-tree. This data structure is more resistant to the curse of dimensionality.
2001
Feature-vector-based similarity search is increasingly common in database systems. Consequently, many multidimensional index techniques have been introduced to the database research community. These index techniques fall into two main classes: SP (space partitioning)/KD-tree-based and DP (data partitioning)/R-tree-based. Recently, a hybrid index structure has been proposed. It combines both SP/KD-tree-based and DP/R-tree-based techniques to form a new, more efficient index structure. However, weaknesses still exist in the techniques above. In this paper, we introduce a novel and flexible index structure for multidimensional data, the SH-tree (Super Hybrid tree). Theoretical analyses show that the SH-tree is a good combination of both techniques with respect to both presentation and search algorithms. It overcomes their shortcomings and makes use of their positive aspects to facilitate efficient similarity searches.
Distributed and Parallel Databases, 2005
Multidimensional indexing is concerned with the indexing of multi-attributed records, where queries can be applied on some or all of the attributes. Indexing multi-attributed records is referred to by the term multidimensional indexing because each record is viewed as a point in a multidimensional space with a number of dimensions that is equal to the number of attributes. The values of the point coordinates along each dimension are equivalent to the values of the corresponding attributes. In this paper, the PN-tree, a new index structure for multidimensional spaces, is presented. This index structure is an efficient structure for indexing multidimensional points and is parallel by nature. Moreover, the proposed index structure does not lose its efficiency if it is serially processed or if it is processed using a small number of processors. The PN-tree can take advantage of as many processors as the dimensionality of the space. The PN-tree makes use of B+-trees that have been developed and tested over years in many DBMSs. The PN-tree is compared to the Hybrid tree that is known for its superiority among various index structures. Experimental results show that parallel processing of the PN-tree significantly reduces the number of disk accesses involved in the search operation. Even in its serial case, the PN-tree outperforms the Hybrid tree for large database sizes.
Lecture Notes in Computer Science, 1998
To answer range queries efficiently in 2-d R-trees, we first sort the queries by means of a space filling curve, then group them together, and finally pass them on for processing. Initially, we consider grouping of pairs of requests only, and give two algorithms with exponential and linear complexity. Then, we generalize the linear method, grouping more than two requests per group. We evaluate these methods under different LRU buffer sizes, measuring the cache misses per query. We present experimental results based on real and synthetic data. The results show that careful query scheduling can substantially improve the overall performance of multiple range query processing.
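The sort-then-group idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the Z-curve (Morton order) stands in for whichever space filling curve the paper uses, each query is keyed by the centre of its rectangle, and groups are fixed-size chunks of consecutive queries rather than the paper's exponential or linear grouping algorithms. The names `morton_2d` and `schedule_queries` are hypothetical.

```python
def morton_2d(x, y, bits=16):
    """Z-curve (Morton) code of a 2-d point with non-negative integer
    coordinates below 2**bits."""
    code = 0
    for bit in range(bits - 1, -1, -1):
        code = (code << 1) | ((x >> bit) & 1)
        code = (code << 1) | ((y >> bit) & 1)
    return code


def schedule_queries(queries, group_size=2, bits=16):
    """Order range queries along the Z-curve and chunk them into groups.

    Each query is a rectangle (x_min, y_min, x_max, y_max) with integer
    corners. Queries whose centres are close on the curve end up in the
    same group, so they are likely to touch the same index pages.
    """
    def curve_key(query):
        x_min, y_min, x_max, y_max = query
        return morton_2d((x_min + x_max) // 2, (y_min + y_max) // 2, bits)

    ordered = sorted(queries, key=curve_key)
    return [ordered[i:i + group_size]
            for i in range(0, len(ordered), group_size)]


if __name__ == "__main__":
    pending = [(10, 10, 12, 12), (0, 0, 2, 2), (11, 9, 13, 11), (1, 1, 3, 3)]
    for group in schedule_queries(pending):
        print(group)
```

Each group would then be handed to the R-tree together, so that pages fetched for one query are still in the LRU buffer when the next query of the group needs them.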
2000
Multi-dimensional data structures are applied in many real-world indexing applications, e.g. data mining, indexing multimedia data, indexing non-structured text documents and so on. Many index structures and algorithms have been proposed. There are two major approaches to multi-dimensional indexing: data structures for indexing metric spaces and data structures for indexing vector spaces. The R-tree, R*-tree, and UB-tree are representatives of
2009
Mobile query processing is currently a very active research field. Range and nearest neighbor queries are commonly used in spatiotemporal databases and location based services (LBS). In this paper, we focus on finding nearest neighbors of a query point within a certain distance range. We propose a new indexing structure, the CN-tree (Compact N-tree), based on a recent indexing technique called the N-tree. The CN-tree combines the efficiency of the N-tree's data partitioning scheme with the approximation of pertinent objects by the minimal bounding rectangles of R-trees, which are reported to be the best performing for range search. We show how we use the approximation in constructing the CN-tree and then how this index can support range queries efficiently by minimizing the computation of distances and avoiding overlap of minimal bounding rectangles. The experimental results, through a comparison with the well-known R*-tree, show that the proposed CN-tree widely outperforms the R*-tree as an in-memory index and offers competitive performance when used as an on-disk index.
Because main memory is expensive and volatile, persistent data cannot be kept there, so most databases rely on secondary or tertiary storage. The main overhead of secondary storage is access time, which is usually high in multi-dimensional databases such as OLTP systems that are characterized by recurrent updates and queries. The use of secondary or tertiary storage inherently calls for indexing, which retrieves or stores the required data or data segments faster than iterating through the table and so speeds up querying. In multi-dimensional databases (such as OLAP databases) the essential tool for accessing data is the range query (or window query). In existing databases, B-trees and their variants are the conventional index structures. Their major drawback is that they are single-attribute index structures: the records of such a database are ordered by one particular attribute, usually but not necessarily the primary key. This limits range query restrictions to one particular attribute of the table. The conventional way to support range queries on other attributes is to build multiple B-tree indexes, one for each such attribute. Multi-indexing is additive and has several drawbacks (additional space is required and speed is hampered). With the exponential growth of data, there is a need for a better data structure with an efficient query algorithm that offers high storage capacity and is also considerably fast for multidimensional range queries. This paper discusses a multidimensional indexing structure that is fast and consumes virtually the same space as a single-attribute structure. Experiments show that the structure has multiplicative complexities and is immune to the curse of dimensionality.
2014
Range queries are a widely used type of similarity query that finds all objects within a given distance from the query object. In this paper, we propose an approximate range query algorithm for the ND-tree, a multi-dimensional index for vectors with non-ordered discrete components. By sacrificing a little accuracy, approximate algorithms generally can greatly improve search performance. Our proposed approximate algorithm maintains a priority queue of tree nodes whose bounding rectangles (BR) intersect the query sphere, but it only accesses a user-specified portion of the queue. We propose a novel volume-based weighting scheme for the priority queue. The idea is that tree nodes whose BR has a larger intersection with the query sphere contain more result objects and thus should be accessed earlier. A closed-form formula is derived to calculate the volume of an intersection. Our experimental study using both synthetic and real data shows that the proposed algorithm can significantly improve...
Proc. of the 8th Int'l Conf. on Database Systems …, 2003
Submitted to VLDB, 2004
2017
Efficient evaluation of selection predicates (e.g., range predicates) defined on multiple columns of the same table is a difficult but important task. With the enormous increase of data within the last decade, efficient multi-dimensional selection predicate evaluation has become more important. This is especially true for scientific data management tasks, where we often face data sets that need to be filtered based on several dimensions. So far, the state-of-the-art solution strategy is to apply highly optimized sequential scans. However, the intermediate results are often large, while the final query result often contains only a small fraction of the data set. This is due to the combined selectivity of all predicates. We propose Elf, a new tree-based approach to efficiently support such queries. Our structure indexes densely populated sub-spaces, allowing for efficient pruning. Keywords: data analytics, indexing, main-memory databases, storage structures.
Information Systems, 1982
A new method for multiple attribute indexing, the Multidimensional B-Tree (MDBT), is developed. This method is well suited for dynamic databases, since it handles several types of associative queries efficiently and requires low-cost maintenance. Algorithms and search strategies for exact match, partial match, and range queries are presented, and statistical procedures are given to estimate the average and worst case retrieval times. The applicability of our organization to practical databases is discussed, and analytical tradeoffs with regard to index organizations based on k-d trees are established.
OLYMPIADS IN INFORMATICS, 2015
We present new results on Binary Indexed Trees in order to efficiently solve Range Minimum Queries. We introduce a way of using Binary Indexed Trees so that we can answer different types of queries, e.g. the range minimum query, in O(log N) time complexity per operation, outperforming in speed similar data structures like Segment/Range Trees or the Sparse Table Algorithm.
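For orientation, a Binary Indexed Tree can be adapted to range minimum queries as in the Python sketch below. This is a simpler, commonly used single-tree variant for static data, not the construction from the paper, and it makes no claim to the paper's O(log N) bound. The class name `MinFenwick` and its methods are hypothetical; indices are 1-based and inclusive.

```python
class MinFenwick:
    """Binary Indexed Tree storing block minima of a static array.

    tree[i] holds the minimum of values[i - lowbit(i) + 1 .. i],
    where lowbit(i) = i & -i.
    """

    def __init__(self, values):
        self.n = len(values)
        self.values = [None] + list(values)          # shift to 1-based
        self.tree = [float("inf")] * (self.n + 1)
        for i in range(1, self.n + 1):
            # Children with smaller indices have already pushed their minima.
            self.tree[i] = min(self.tree[i], self.values[i])
            parent = i + (i & -i)
            if parent <= self.n:
                self.tree[parent] = min(self.tree[parent], self.tree[i])

    def range_min(self, left, right):
        """Minimum of values[left..right] (1-based, inclusive)."""
        result = float("inf")
        i = right
        while i >= left:
            block_start = i - (i & -i) + 1
            if block_start >= left:
                # The whole block lies inside the query range: use it.
                result = min(result, self.tree[i])
                i -= i & -i
            else:
                # The block sticks out to the left: take one element only.
                result = min(result, self.values[i])
                i -= 1
        return result


if __name__ == "__main__":
    bit = MinFenwick([5, 3, 8, 6, 2, 7, 4, 1])
    print(bit.range_min(2, 4), bit.range_min(6, 7), bit.range_min(1, 8))
```

The query walks from the right end of the range, consuming a whole stored block whenever the block fits inside the range and a single element otherwise.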
2000
Multidimensional access methods like the UB-Tree can be used to accelerate almost any query processing operation, if proper query processing algorithms are used: Relational queries or SQL queries consist of restrictions, projections, ordering, grouping and aggregation, and join operations. In the presence of multidimensional restrictions or sorting, multidimensional range query or Tetris algorithms efficiently process these operations. In addition, these algorithms also efficiently support queries that generate some hierarchical restrictions (for instance by following 1:n foreign key relationships). In this paper we investigate the impacts on query processing in RDBMS when using UB-Trees and multidimensional hierarchical clustering for physical data organization. We illustrate the benefits by performance measurements of queries for a star schema from a real world application of a SAP business information warehouse. The performance results reported in this paper were measured with our prototype implementation of UB-Trees on top of Oracle 8. We compare the performance of UB-Trees to native query processing techniques of Oracle, namely access via an index organized table, which essentially stores a relation in a clustered B*-Tree, and access via a full table scan of an entire relation. In addition we measure the performance of the intersection of multiple bitmap indexes to answer multidimensional range queries.
2001
Only a few multidimensional access methods have made their way into commercial relational DBMSs. Even if an RDBMS ships with a multidimensional index, that index is usually an add-on like Oracle SDO, which is not integrated into the SQL interpreter, query processor and query optimizer of the DBMS kernel. Our demonstration shows TransBase HyperCube, a commercial RDBMS whose kernel fully integrates the UB-Tree, a multidimensional extension of the B-Tree. This integration was performed in an ESPRIT project funded by the European Commission. We put the main emphasis of our demonstration on the application of UB-Tree indexes in real-world databases for OLAP. However, we also address general issues of UB-Trees like creation, space requirements, or comparison to other indexing methods.