1981, Bit
The Multidimensional B-tree (MDBT) presents a novel approach for multiple attribute indexing, enhancing efficiency in searching associative queries. It significantly improves upon traditional indexing structures by enabling dynamic storage management, accommodating frequent insertions and deletions with minimal maintenance costs. The paper details algorithms for maintaining the MDBT structure while allowing for effective storage reclamation, demonstrating its practical advantages in dynamic database environments.
In multimedia databases, tree-based spatial index structures (such as the R-tree and M-tree) have proved efficient and scalable for low-dimensional data retrieval. However, if the data dimensionality is too high, the hierarchy of nested regions (represented by the tree nodes) becomes spatially indistinct. Query processing then deteriorates into inefficient index traversal (in terms of random-access I/O costs), and in such cases tree-based indexes are less efficient than sequential search. This is mainly due to repeated access to many nodes at the top levels of the tree. In this paper we propose a modified storage layout for tree-based indexes in which nodes belonging to the same tree level are stored together. Such level-ordered storage allows several top levels of the tree to be prefetched into the buffer pool with only a few, or even a single, contiguous I/O operation (i.e., a one-seek read). The experimental results show that our approach can speed up tree-based search significantly.
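The level-ordered layout described above can be illustrated with a minimal sketch. It assumes the tree is given as a plain dictionary from node identifier to child identifiers; the names `level_order_layout` and `prefetch_range` are hypothetical and are not taken from the paper. Nodes receive page positions in breadth-first order, so the top levels occupy one contiguous prefix of the file.

```python
from typing import Dict, Hashable, List, Tuple


def level_order_layout(
    root: Hashable, children: Dict[Hashable, List[Hashable]]
) -> Tuple[List[Hashable], List[int]]:
    """Assign page positions level by level.

    Returns (order, level_starts): order[i] is the node stored at page i,
    and level_starts[d] is the first page that holds a node of depth d.
    """
    order: List[Hashable] = []
    level_starts: List[int] = []
    frontier = [root]
    while frontier:
        level_starts.append(len(order))
        order.extend(frontier)
        # The next level is every child of the current level, left to right.
        frontier = [c for node in frontier for c in children.get(node, [])]
    return order, level_starts


def prefetch_range(level_starts: List[int], total_pages: int, levels: int) -> Tuple[int, int]:
    """Half-open page range [start, end) covering the top `levels` levels.

    Because levels are stored back to back, this range is contiguous and
    could be fetched with a single sequential read.
    """
    end = level_starts[levels] if levels < len(level_starts) else total_pages
    return 0, end
```

With a root that has two children, prefetching the top two levels covers pages 0 to 2, which is one contiguous read rather than three random accesses.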
Information Systems, 1982
A new method for multiple attribute indexing, the Multidimensional B-Tree (MDBT), is developed. This method is well suited for dynamic databases, since it handles several types of associative queries efficiently and requires low-cost maintenance. Algorithms and search strategies for exact match, partial match, and range queries are presented, and statistical procedures are given to estimate the average and worst-case retrieval times. The applicability of our organization to practical databases is discussed, and analytical tradeoffs with regard to index organizations based on k-d trees are established.
Communications of the ACM, 1975
This paper develops the multidimensional binary search tree (or k-d tree, where k is the dimensionality of the search space) as a data structure for storage of information to be retrieved by associative searches. The k-d tree is defined and examples are given. It is shown to be quite efficient in its storage requirements. A significant advantage of this structure is that a single data structure can handle many types of queries very efficiently. Various utility algorithms are developed; their proven average running times in an n-record file are: insertion, O(log n); deletion of the root, O(n^((k-1)/k)); deletion of a random node, O(log n); and optimization (guarantees logarithmic performance of searches), O(n log n).
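As a rough illustration of how a single k-d tree can serve several query types, the sketch below implements insertion and an orthogonal range search in which the discriminating attribute cycles with depth. The names `KDNode`, `kd_insert`, and `kd_range` are illustrative choices, not the paper's notation, and the tree is not rebalanced.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    point: Point
    left: Optional["KDNode"] = None   # keys smaller on this node's axis
    right: Optional["KDNode"] = None  # keys greater or equal on this node's axis


def kd_insert(root: Optional[KDNode], point: Point, depth: int = 0) -> KDNode:
    """Insert a point; the comparison axis is depth mod k."""
    if root is None:
        return KDNode(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = kd_insert(root.left, point, depth + 1)
    else:
        root.right = kd_insert(root.right, point, depth + 1)
    return root


def kd_range(
    root: Optional[KDNode],
    lo: Sequence[float],
    hi: Sequence[float],
    depth: int = 0,
    out: Optional[List[Point]] = None,
) -> List[Point]:
    """Collect every point p with lo[i] <= p[i] <= hi[i] on all axes."""
    if out is None:
        out = []
    if root is None:
        return out
    if all(l <= c <= h for l, c, h in zip(lo, root.point, hi)):
        out.append(root.point)
    axis = depth % len(lo)
    # Descend only into subtrees whose half-space can intersect the query box.
    if lo[axis] < root.point[axis]:
        kd_range(root.left, lo, hi, depth + 1, out)
    if hi[axis] >= root.point[axis]:
        kd_range(root.right, lo, hi, depth + 1, out)
    return out
```

A partial-match query is then a range query whose unspecified attributes span the whole axis, for example `kd_range(root, (5, float("-inf")), (5, float("inf")))`, and an exact-match query sets `lo` equal to `hi`.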
1997
We propose a new multi-attribute index. Our approach combines the hB-tree, a multi-attribute index, and the Π-tree, an abstract index which offers efficient concurrency and recovery methods. We call the resulting method the hBΠ-tree. We describe several versions of the hBΠ-tree, each using a different node-splitting and index-term-posting algorithm. We also describe a new node deletion algorithm. We have implemented all the versions of the hBΠ-tree. Our performance results show that even the version that offers no performance guarantees actually performs very well in terms of storage utilization, index size (fan-out), and exact-match and range searching, under various data types and distributions. We have also shown that our index is fairly insensitive to increases in dimension. Thus, it is suitable for indexing high-dimensional applications. This property, and the fact that all our versions of the hBΠ-tree can use the Π-tree concurrency and recovery algorithms, make the hBΠ-tree a promising candidate for inclusion in a general-purpose DBMS.
Proc. of the 8th Int'l Conf. on Database Systems …, 2003
Proceedings of the Sixth Euromicro Workshop on Parallel and Distributed Processing - PDP '98 -, 1998
A wide class of multidimensional indexes employs a recursive partitioning of the data space, as the kd-tree does. In this paper we present the m-Q-tree as a multidimensional data structure that can achieve the maximum degree of 2^m children in every node (where m is the number of index attributes) and a maximum of only one underflow page per node. We describe the m-Q-tree and give searching and inserting algorithms. In order to develop a solution for building the m-Q-tree, we define and use a conceptual tool, called the prefix graph, which permits us to manage the regions associated with all sons of every node. The proposed algorithm is of order O(n). Finally, we present the results of a series of tests which indicate that the structure performs well. The m-Q-tree gives a general technique for declustering data in a parallel database. We propose the m-Q-tree as a new general access method which permits the exploitation of the potential parallelism of all relational operations, in addition to favouring the execution of complex queries that include different kinds of conditions over several attributes for one or more relations.
2005
We present the interpolation search tree (ISB-tree), a new cache-aware indexing scheme that supports update operations (insertions and deletions) in O(1) worst-case (w.c.) block transfers and search operations in O(log_B log n) expected block transfers, where B represents the disk block size and n denotes the number of stored elements. The expected search bound holds with high probability for a large class of (unknown) input distributions. The w.c. search bound of our indexing scheme is O(log_B n) block transfers. Our update and expected search bounds constitute a considerable improvement over the O(log_B n) w.c. block transfer bounds for search and update operations achieved by the B-tree and its numerous variants. This is also suggested by a set of preliminary experiments we have carried out. Our indexing scheme is based on an externalization of a main memory data structure based on interpolation search.
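For readers unfamiliar with the underlying idea, the following is a minimal sketch of classic in-memory interpolation search over a sorted list of integers. It is not the ISB-tree itself, which externalizes a more elaborate structure; the function name `interpolation_search` is an illustrative choice.

```python
from typing import List


def interpolation_search(keys: List[int], target: int) -> int:
    """Return an index of `target` in the sorted list `keys`, or -1 if absent.

    Instead of probing the midpoint as binary search does, the probe
    position is estimated from where `target` lies between the values at
    the two ends of the current interval.
    """
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            # All keys in the interval are equal, so target == keys[lo] here.
            return lo
        # Linear interpolation of the expected position of target.
        pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```

On keys that are close to uniformly distributed, the first probe usually lands near the target, which is the intuition behind doubly logarithmic expected search cost; on adversarial distributions the plain algorithm can degrade to a linear number of probes.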
We propose the PATRICIA-hypercube-tree, or PH-tree, a multi-dimensional data storage and indexing structure. It is based on binary PATRICIA-tries combined with hypercubes for efficient data access. Space efficiency is achieved by combining prefix sharing with a space optimised implementation. This leads to storage space requirements that are comparable or below storage of the same data in non-index structures such as arrays of objects. The storage structure also serves as a multi-dimensional index on all dimensions of the stored data. This enables efficient access to stored data via point and range queries. We explain the concept of the PH-tree and demonstrate the performance of a sample implementation on various datasets and compare it to other spatial indices such as the kD-tree. The experiments show that for larger datasets beyond 10^7 entries, the PH-tree increasingly and consistently outperforms other structures in terms of space efficiency, query performance and update performance. For some highly skewed datasets, it even shows super-constant performance, becoming faster for larger datasets.
ACM Transactions on Database Systems, 1980
A modified version of the multiple attribute tree (MAT) database organization, which uses a compact directory, is discussed. An efficient algorithm to process the directory for carrying out the node searches is presented. Statistical procedures are developed to estimate the number of nodes searched and the number of data blocks retrieved for most general and complex queries. The performance of inverted file and modified MAT organizations are compared using six real-life databases and four types of query complexities. Careful tradeoffs are established in terms of storage and access times for directory and data, query complexities, and database characteristics.
Sovremennye Informacionnye Tehnologii i IT-obrazovanie, 2018
We present a new dynamic index structure for multidimensional data. The index structure is based on an extended grid file concept. We analyzed the strengths and weaknesses of grid files and, based on that analysis, propose to strengthen the concept by treating grid file stripes as linear hash tables, introducing the concept of a chunk, and representing the grid file structure as a graph. As a result, we significantly reduce the number of disk operations. Efficient algorithms for storing and accessing the index directory are proposed in order to minimize memory usage and the complexity of lookup operations. Complexity estimates for these algorithms are presented. Our approach to supporting an effective grid file structure is compared with other known approaches, and the comparison shows the effectiveness of the suggested metadata storage environment. An estimate of the directory size is presented. A prototype supporting our grid file concept has been created and...
As a member of the R-tree family, the R*-tree is widely used in multimedia databases and spatial databases, in which NN (Nearest Neighbor) search is very popular. According to our investigations, (1) the degree of object clustering in the leaf nodes is a very important factor in the performance of NN search; and (2) normally, the objects in an R*-tree are not well clustered in its leaf nodes. This paper proposes a new index structure for static databases, called the Clustering-Based R*-tree (denoted CBR*-tree), which introduces clustering technology to the R*-tree. Although some packing algorithms for R-trees have been proposed, all of them try to pack the same (or roughly the same) number of objects into each leaf node, so the distribution of objects in the leaf nodes often cannot reflect their actual distribution. The experimental results show that the CBR*-tree has better NN search performance than the R*-tree and packed R-trees.
Distributed and Parallel Databases, 2005
Multidimensional indexing is concerned with the indexing of multi-attributed records, where queries can be applied on some or all of the attributes. Indexing multi-attributed records is referred to by the term multidimensional indexing because each record is viewed as a point in a multidimensional space with a number of dimensions that is equal to the number of attributes. The values of the point coordinates along each dimension are equivalent to the values of the corresponding attributes. In this paper, the PN-tree, a new index structure for multidimensional spaces, is presented. This index structure is an efficient structure for indexing multidimensional points and is parallel by nature. Moreover, the proposed index structure does not lose its efficiency if it is serially processed or if it is processed using a small number of processors. The PN-tree can take advantage of as many processors as the dimensionality of the space. The PN-tree makes use of B+-trees that have been developed and tested over years in many DBMSs. The PN-tree is compared to the Hybrid tree that is known for its superiority among various index structures. Experimental results show that parallel processing of the PN-tree reduces significantly the number of disk accesses involved in the search operation. Even in its serial case, the PN-tree outperforms the Hybrid tree for large database sizes.
We consider two tree-based indexing schemes that are widely used in practical systems as the basis for both primary and secondary key indexing. We define the B-tree and describe its features, advantages, and disadvantages. The difference between the B+-tree and the B-tree is also discussed. We show the algorithm, examples, and figures in the context of the B+-tree.
EUROMICRO 97. Proceedings of the 23rd EUROMICRO Conference: New Frontiers of Information Technology (Cat. No.97TB100167), 2000
The paper focuses on indexing on non-primitive (complex) values of attributes in an object management system. A new index structure for indexing on set (multivalued) attributes is proposed. This structure is based on a partial order imposed on the values of the indexed attribute, which are subsets of a set of primitive values. It is shown that the proposed index allows the system to efficiently perform typical set operators that are postulated to be applied in object query languages (is-equal, is-subset, is-superset), without performing any costly operations on lists of object identifiers that would be necessary in traditional index structures. The new index structure, called the partial-order tree, is described and algorithms performing the set operators are outlined.
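To make the three set operators concrete, here is a small linear-scan baseline. It shows only the query semantics an index on set-valued attributes has to answer and is not the partial-order tree. The function name `query_sets` is hypothetical, and the direction of each predicate (the stored value compared against the probe set) is an assumption.

```python
from typing import Dict, Hashable, Iterable, List


def query_sets(
    objects: Dict[int, Iterable[Hashable]], predicate: str, probe: Iterable[Hashable]
) -> List[int]:
    """Return sorted object ids whose set-valued attribute satisfies the predicate.

    predicate is one of "is-equal", "is-subset", "is-superset", read as
    "stored value <predicate> probe" (assumed direction).
    """
    probe_set = frozenset(probe)
    tests = {
        "is-equal": lambda s: s == probe_set,
        "is-subset": lambda s: s <= probe_set,
        "is-superset": lambda s: s >= probe_set,
    }
    test = tests[predicate]
    # Every object is examined; an index exists to avoid exactly this scan.
    return sorted(oid for oid, value in objects.items() if test(frozenset(value)))
```

The subset relation used here is itself a partial order on the stored values, which is the property the abstract says the proposed structure is built on.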
Journal of Visual Communication and Image Representation, 1998
As in conventional DataBase Management Systems (DBMSs), to allow users to efficiently access and retrieve data objects, a MultiMedia DataBase Management System (MMDBMS) must employ an effective access method such as indexing and hashing. This paper provides a survey of tree-based multidimensional indexing techniques for MMDBMSs that maintain image data represented as feature vectors. These techniques support such data while maintaining desirable characteristics of a B-tree, an index structure most commonly used in traditional DBMSs. In this survey, we provide descriptions of each tree as well as give examples of the different data organization schemes. We also describe the advantages and disadvantages of using each technique. In addition, we provide classifications of the trees using several different properties. These classifications should assist researchers in identifying the strengths and weaknesses of any new indexing technique they develop as well as help users determine the most appropriate data structure for their applications.
The B-tree and the R-tree are two basic index structures, and many variants of them have since been proposed. Different variants are used in specific applications to optimize performance. In this paper, different variants of the B-tree and the R-tree are discussed and compared. Index structures differ in terms of structure, query support, data type support, and application. The structure of each index is discussed first: the B-tree and its variants, and then the R-tree and its variants. Some example structures are also shown to make the ideas clearer. A comparison is then made between all the structures with respect to complexity, query type support, data type support, and application.
IEEE Transactions on Software Engineering, 2000
It is shown how a highly compact representation of binary trees can be used as the basis of two access methods for dynamic files, called BDS-trees and S-trees, respectively. Both these methods preserve key-order and offer easy and efficient sequential access. They are different in the way the compact binary trees are used for searching. With a BDS-tree the search is a digital search using binary digits. Although the S-tree search is performed on a bit-by-bit basis as well, it will appear to be slightly different. Actually, with S-trees the compact binary trees are used to represent separators at low storage costs. As a result, the fan-out, and thus performance, of a B-tree can be improved by using within each index page an S-tree for representing separators efficiently.
Information Processing & Management, 1985
A variety of data structures such as inverted file, multi-lists, quad tree, k-d tree, range tree, polygon tree, quintary tree, multidimensional tries, segment tree, doubly chained tree, the grid file, d-fold tree, super B-tree, Multiple Attribute Tree (MAT), etc. have been studied for multidimensional searching and related problems. Physical data base organization, which is an important application of multidimensional searching, is traditionally and mostly handled by employing inverted file. This study proposes MAT data structure for bibliographic file systems, by illustrating the superiority of MAT data structure over inverted file. Both the methods are compared in terms of preprocessing, storage, and query costs. Worst-case complexity analysis of both the methods, for a partial match query, is carried out in two cases: (a) when directory resides in main memory, (b) when directory resides in secondary memory. In both cases, MAT data structure is shown to be more efficient than the inverted file method. Arguments are given to illustrate the superiority of MAT data structure in an average case also. An efficient adaptation of MAT data structure, that exploits the special features of MAT structure and bibliographic files, is proposed for bibliographic file systems. In this adaptation, suitable techniques for fixing and ranking of the attributes for MAT data structure are proposed. Conclusions and proposals for future research are presented.
Fast data insertion, deletion, and searching are central to the performance of today's storage systems, which manage data dynamically. These operations depend heavily on how the indexing algorithm handles its attributes, because that determines how much data the system can hold and the throughput it can sustain. The B+ tree-based indexing algorithm scales logarithmically with data size and is therefore widely used in distributed file systems. However, the system's scalability depends solely on the order and height of the tree. The proposed system modifies the traditional B+ tree into a power-of-2-based form for data expansion and is designed for an object-based file system.