Proceedings International Database Engineering and Applications Symposium
Advanced data warehouses and web databases have set the demand for processing large sets of time ranges, quality classes, fuzzy data, personalized data and extended objects. Since all of these data types can be mapped to intervals, interval indexing can dramatically speed up or even be an enabling technology for these new applications. We introduce a method for managing intervals by indexing the dual space with the UB-Tree. We show that our method is an effective and efficient solution, benefitting from all good characteristics of the UB-Tree, i.e., concurrency control, worst case guarantees for insertion, deletion and update as well as efficient query processing. Our technique can easily be integrated into an RDBMS engine providing the UB-Tree as access method. We also show that our technique is superior to and more flexible than previously suggested techniques.
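The dual-space idea in this abstract can be illustrated with a minimal sketch. It assumes closed integer intervals and a bounded domain; the function names and the `domain_min`/`domain_max` parameters are hypothetical, and a plain list scan stands in for the UB-Tree range query that the paper actually uses.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval (lower, upper), lower <= upper


def to_dual_point(iv: Interval) -> Tuple[int, int]:
    """Map an interval to the 2-D point (x = lower bound, y = upper bound)."""
    lower, upper = iv
    return (lower, upper)


def intersection_window(query: Interval, domain_min: int, domain_max: int):
    """Return the 2-D window ((x_lo, x_hi), (y_lo, y_hi)) holding the dual
    points of all intervals that intersect `query`.

    An interval [l, u] intersects [q_lo, q_hi] iff l <= q_hi and u >= q_lo.
    """
    q_lo, q_hi = query
    return ((domain_min, q_hi), (q_lo, domain_max))


def intersecting(intervals: List[Interval], query: Interval,
                 domain_min: int, domain_max: int) -> List[Interval]:
    """Answer an intersection query as a 2-D range query over dual points.

    Assumption: a linear scan replaces the multidimensional index here.
    """
    (x_lo, x_hi), (y_lo, y_hi) = intersection_window(query, domain_min, domain_max)
    result = []
    for iv in intervals:
        x, y = to_dual_point(iv)
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi:
            result.append(iv)
    return result
```

The point of the mapping is that a one-dimensional intersection predicate becomes an axis-aligned rectangle, which is the query shape a multidimensional index handles directly.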
Proc. 26th Int. Conf. on Very Large …, 2000
Modern database applications show a growing demand for efficient and dynamic management of intervals, particularly for temporal and spatial data or for constraint handling. Common approaches require the augmentation of index structures which, however, is not supported by existing relational database systems. By design, the new Relational Interval Tree (RI-tree) employs built-in indexes on an as-they-are basis and is easy to implement. Whereas the functionality and efficiency of the RI-tree are supported by any off-the-shelf relational DBMS, it is perfectly encapsulated by the object-relational data model. The RI-tree requires O(n/b) disk blocks of size b to store n intervals, O(log_b n) I/O operations for insertion or deletion, and O(h · log_b n + r/b) I/Os for an intersection query producing r results. The height h of the virtual backbone tree corresponds to the current expansion and granularity of the data space but does not depend on n. As demonstrated by our experimental evaluation on an Oracle8i server, competing dynamic interval access methods are outperformed by factors of up to 42 for disk accesses and 4.9 for query response time.
Lecture Notes in Computer Science, 2001
Intervals represent a fundamental data type for temporal, scientific, and spatial databases where time stamps and point data are extended to time spans and range data, respectively. For OLTP and OLAP applications on large amounts of data, not only intersection queries have to be processed efficiently but also general interval relationships including before, meets, overlaps, starts, finishes, contains, equals, during, startedBy, finishedBy, overlappedBy, metBy, and after. Our new algorithms use the Relational Interval Tree, a purely SQL-based and object-relationally wrapped index structure. The technique therefore preserves the industrial strength of the underlying RDBMS including stability, transactions, and performance. The efficiency of our approach is demonstrated by an experimental evaluation on a real weblog data set containing one million sessions.
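The thirteen interval relationships listed in this abstract can be made concrete with a small classifier. This is only an illustrative sketch, assuming proper intervals given as `(start, end)` pairs with `start < end`; the function name `allen_relation` is hypothetical and the code says nothing about how the paper evaluates these predicates through the RI-tree.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the relationship that holds from interval a to b."""
    a_s, a_e = a
    b_s, b_e = b
    # Disjoint or touching cases first.
    if a_e < b_s:
        return "before"
    if a_e == b_s:
        return "meets"
    if b_e < a_s:
        return "after"
    if b_e == a_s:
        return "metBy"
    # From here on the two intervals share interior points.
    if a_s == b_s and a_e == b_e:
        return "equals"
    if a_s == b_s:
        return "starts" if a_e < b_e else "startedBy"
    if a_e == b_e:
        return "finishes" if a_s > b_s else "finishedBy"
    if b_s < a_s and a_e < b_e:
        return "during"
    if a_s < b_s and b_e < a_e:
        return "contains"
    return "overlaps" if a_s < b_s else "overlappedBy"
```

Exactly one of the thirteen names is returned for any pair of proper intervals, which is what makes the relationships usable as mutually exclusive query predicates.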
2005
With the increasing occurrence of temporal and spatial data in present-day database applications, the interval data type is adopted by more and more database systems. For an efficient support of queries that contain selections on interval attributes as well as simple-valued attributes (e.g., numbers, strings) at the same time, special index structures are required supporting both types of predicates in combination. Based on the Relational Interval Tree, we present various indexing schemes that support such combined queries and can be integrated in relational database systems with minimum effort. Experiments on different query types show superior performance for the new techniques in comparison to competing access methods.
Lecture Notes in Computer Science, 1998
To support temporal operators and to increase the efficiency of temporal queries, indexing based on temporal attributes is required. We consider the problem of indexing the temporal dimension in valid time databases. We assume that the temporal information of data objects is represented as valid time intervals that have to be managed dynamically by an efficient index structure. Unlike the time intervals in transaction time databases, valid time intervals can be inserted, deleted, and modified at any point in time. Furthermore, their lifespans can go beyond the current time point and extend into the future. We propose an indexing scheme that uses augmented B+trees called Interval B+trees for indexing a dynamic set of valid time intervals. Interval B+trees (IB+trees) use beginning points of the intervals as key points and keep, for each internal node, the maximum end point of its subtrees. We introduce an algorithm to apply time-splits at the leaf level of the IB+tree that would partition long valid time intervals into disjoint subintervals and distribute them among several leaf nodes to increase the efficiency of the search operation, especially for timeslice queries. We compared IB+trees with time-splits to one-dimensional R-trees and observed that while their performances for timeslice queries are comparable, IB+trees are far superior for many temporal queries that are based on beginning points of time intervals. This is expected as the IB+trees use the beginning points of intervals as keys and therefore support such queries naturally. We also show the extensions to our indexing scheme for handling open-ended valid time intervals (valid time intervals whose lifespans extend into the future indefinitely), and valid time intervals whose end points move along the current timeline.
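The augmentation described in this abstract, keying on beginning points and storing a maximum end point above the leaves, can be sketched in a heavily simplified form. The class below is a hypothetical two-level, static stand-in: it is not a B+tree, it does not support updates, and it omits the time-split algorithm entirely. It only shows how the stored maximum end point lets a timeslice query skip whole leaves.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed valid-time interval (begin, end)


class TwoLevelIntervalIndex:
    """Leaves hold intervals sorted by begin point; each leaf entry also
    records its smallest begin point and its maximum end point."""

    def __init__(self, intervals: List[Interval], leaf_size: int = 4):
        ordered = sorted(intervals)  # begin point acts as the key
        self.leaves = []
        for i in range(0, len(ordered), leaf_size):
            leaf = ordered[i:i + leaf_size]
            min_begin = leaf[0][0]
            max_end = max(end for _, end in leaf)
            self.leaves.append((min_begin, max_end, leaf))

    def timeslice(self, t: int) -> List[Interval]:
        """Return all intervals that are valid at time point t."""
        result = []
        for min_begin, max_end, leaf in self.leaves:
            if min_begin > t:
                break  # this and all later leaves begin after t
            if max_end < t:
                continue  # every interval in this leaf ended before t
            result.extend(iv for iv in leaf if iv[0] <= t <= iv[1])
        return result
```

For example, `TwoLevelIntervalIndex([(1, 2), (2, 10), (3, 4), (5, 6), (7, 8), (9, 12)], leaf_size=2).timeslice(5)` inspects the first two leaves and never opens the third.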
IEEE Transactions on Knowledge and Data Engineering, 1997
IXSQL, an extension to SQL, is proposed for the management of interval data. IXSQL is syntactically and semantically upwards consistent with SQL2. Its specification has been based both on theoretical results and actual user requirements for the management of temporal data, a special case of interval data. Design decisions and implementation issues are also discussed.
2004
Intervals play an important role in various kinds of database applications in practice, for example in historical, spatial, and temporal databases. As a consequence, there is a practical need for a clear and proper treatment of various useful operations on intervals and interval sets in a database context. However, the semantics of some important operations on interval sets are not always treated or not treated very clearly in the literature; e.g., often they are defined in an algorithmic rather than a declarative manner. Moreover, implementation proposals are often not as straightforward as they could be. This paper presents a declarative treatment of various operations on interval sets, also introducing some new notions (such as ordered interval sets, their visible points, and their surface). Then the paper formally "links" such (mathematical) intervals to their database representations. Finally the paper provides straightforward translations from these formal database representations to standard SQL, without the need for SQL extensions.
2000
Multidimensional access methods like the UB-Tree can be used to accelerate almost any query processing operation, if proper query processing algorithms are used: Relational queries or SQL queries consist of restrictions, projections, ordering, grouping and aggregation, and join operations. In the presence of multidimensional restrictions or sorting, multidimensional range query or Tetris algorithms efficiently process these operations. In addition, these algorithms also efficiently support queries that generate some hierarchical restrictions (for instance by following 1:n foreign key relationships). In this paper we investigate the impacts on query processing in RDBMS when using UB-Trees and multidimensional hierarchical clustering for physical data organization. We illustrate the benefits by performance measurements of queries for a star schema from a real world application of a SAP business information warehouse. The performance results reported in this paper were measured with our prototype implementation of UB-Trees on top of Oracle 8. We compare the performance of UB-Trees to native query processing techniques of Oracle, namely access via an index organized table, which essentially stores a relation in a clustered B*-Tree, and access via a full table scan of an entire relation. In addition we measure the performance of the intersection of multiple bitmap indexes to answer multidimensional range queries.
Lecture Notes in Computer Science, 2004
The efficient management of interval sequences represents a core requirement for many temporal and spatial database applications. With the Relational Interval Tree (RI-tree), an efficient access method has been proposed to process intersection queries of spatial objects encoded by interval sequences on top of existing object-relational database systems. This paper complements that approach by effective and efficient models to estimate the selectivity and the I/O cost of interval sequence intersection queries in order to guide the cost-based optimizer whether and how to include the RI-tree into the execution plan. By design, the models immediately fit to common extensible indexing/optimization frameworks, and their implementations exploit the built-in statistics facilities of the database server. According to our experimental evaluation on an Oracle database, the average relative error of the estimated query results and costs lies in the range of 0% to 32%, depending on the size and the structural complexity of the query objects.
1992
We are given a large population database that contains information about population instances. The population is known to comprise m groups, but the population instances are not labeled with the group identification.
Reliable Computing, 1996
Computer Science and Information Systems, 2010
The need for efficient access and management of time dependent data in modern database applications is well recognised and researched. Existing access methods are mostly derived from the family of spatial R-tree indexing techniques. These techniques are particularly unsuitable for data involving open-ended intervals, which are common in temporal databases. This is due to overlapping between nodes and huge dead space found in the database. In this study, we describe a detailed investigation of a new approach called "Triangular Decomposition Tree" (TD-Tree). The underlying idea for the TD-Tree is to manage temporal intervals by virtual index structures relying on geometric interpretations of intervals, and a space partition method that results in an unbalanced binary tree. We demonstrate that the unbalanced binary tree can be efficiently manipulated using a virtual index. We also show that the single query algorithm can be applied uniformly to different query types without the need of dedicated query transformations. In addition to the advantages related to the usage of a single query algorithm for different query types and better space complexity, the empirical performance of the TD-Tree has been found to be superior to its best known competitors.
2009
Semantic query optimization consists in restricting the search space in order to reduce the set of objects of interest for a query. This paper presents an indexing method based on UB-trees and a static analysis of the constraints associated with the views of the database and of any constraint expressed on attributes. The result of the static analysis is a partitioning of the object space into disjoint blocks. Through Space Filling Curve (SFC) techniques, each fragment (block) of the partition is assigned a unique identifier, enabling the efficient indexing of fragments by UB-trees. The search space corresponding to a range query is restricted to a subset of the blocks of the partition. This approach has been developed in the context of a KB-DBMS but it can be applied to any relational system.
1998
We investigate the usability and performance of the UB-Tree (universal B-Tree) for multidimensional data, as they arise in all relational databases and in particular in data-warehousing and data-mining applications. The UB-Tree is balanced and has all the guaranteed performance characteristics of B-Trees, i.e., it requires linear space for storage and logarithmic time for the basic operations of insertion, retrieval
arXiv (Cornell University), 2022
Temporal information plays a crucial role in many database applications; however, support for queries on such data is limited. We present an index structure, termed RD-INDEX, to support range-duration queries over interval timestamped relations, which constrain both the range of the tuples' positions on the timeline and their duration. RD-INDEX is a grid structure in the two-dimensional space, representing the position on the timeline and the duration of timestamps, respectively. Instead of using a regular grid, we consider the data distribution for the construction of the grid in order to ensure that each grid cell contains approximately the same number of intervals. RD-INDEX features provable bounds on the running time of all the operations, allows for a simple implementation, and supports very predictable query performance. We benchmark our solution on a variety of datasets and query workloads, investigating both the query rate and the behavior of the individual queries. The results show that RD-INDEX performs better than the baselines on range-duration queries, for which it is explicitly designed. Furthermore, it outperforms specialized indexes also on workloads containing queries constraining either only the duration or the range.
The VLDB Journal
The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk, and so their focus is to minimize the I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed which greatly outperforms previous work. However, this approach relies on a complex data structure and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward scan (FS)-based plane sweep algorithm, for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive with or even faster than the state-of-the-art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in pa...
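A basic forward scan plane sweep join, without any of the four optimizations the abstract mentions, might look like the sketch below. It assumes closed intervals given as `(start, end)` pairs and two in-memory lists; the function name `forward_scan_join` is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval (start, end)


def forward_scan_join(r: List[Interval],
                      s: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Return every pair (x, y) with x in r, y in s whose intervals intersect."""
    r = sorted(r)  # both inputs ordered by start point
    s = sorted(s)
    out = []
    i = j = 0
    while i < len(r) and j < len(s):
        if r[i][0] <= s[j][0]:
            # r[i] starts first: scan forward in s while s-intervals
            # start no later than r[i] ends.
            k = j
            while k < len(s) and s[k][0] <= r[i][1]:
                out.append((r[i], s[k]))
                k += 1
            i += 1
        else:
            # s[j] starts first: scan forward in r symmetrically.
            k = i
            while k < len(r) and r[k][0] <= s[j][1]:
                out.append((r[k], s[j]))
                k += 1
            j += 1
    return out
```

Because each interval only scans partners that start at or after its own start point, every intersecting pair is reported exactly once and no auxiliary sweep structure is needed.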
Advanced Modeling and Optimization, 2009
Abstract. Interval graphs are a very important subclass of intersection graphs and perfect graphs, with many applications in different real-life situations. Problems on interval graphs are solved using different data structures, among which the interval tree is very useful. ...
Lecture Notes in Computer Science, 1994
We identify a number of problems concerning the management of interval data and propose efficient algorithms in the case of 2-dimensional interval relations. The approach is of practical importance and has many applications, one of which is spatiotemporal databases.
1997
In this paper we describe a special caching technique, called UB-Cache, which is tailored to work with data organized as a UB-Tree [6], [7], a novel multidimensional data structure. The UB-Cache makes it possible to read data from disk in arbitrary sort order according to those attributes that are used in the UB-Tree. This property can be used to speed up all operations of relational algebra substantially. We assume that the reader is familiar with the UB-Tree as described in [6] or [7].
Information Systems, 2006
In multi-dimensional databases the essential tool for accessing data is the range query (or window query). In this paper we introduce a new algorithm for processing range queries in the universal B-tree (UB-tree), which is an index structure for searching in multi-dimensional databases. The new range query algorithm (called the DRU algorithm) works efficiently, even for processing high-dimensional databases. In particular, using the DRU algorithm many of the UB-tree inner nodes need not be accessed. We explain the DRU algorithm using a simple geometric model, providing a clear insight into the problem. More specifically, the model exploits an interesting relation between the Z-curve and generalized quad-trees. We also present experimental results for the DRU algorithm implementation.
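The Z-curve mentioned in this abstract orders multidimensional points by interleaving the bits of their coordinates. A minimal sketch of that address computation follows; the function name `z_address`, the fixed `bits` width, and the choice that the first coordinate contributes the more significant bit of each group are assumptions for illustration, and the DRU algorithm itself is not shown.

```python
from typing import Sequence


def z_address(coords: Sequence[int], bits: int) -> int:
    """Interleave the lowest `bits` bits of each coordinate into one integer.

    Bits are taken from most to least significant; within each bit position
    the coordinates are visited in their given order.
    """
    addr = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            addr = (addr << 1) | ((c >> bit) & 1)
    return addr
```

With two bits per dimension, `z_address((1, 2), 2)` interleaves `01` and `10` into `0110`, giving 6. Sorting points by this address yields a one-dimensional key that an ordinary B-tree can store, which is the basis of the UB-tree.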