2003, String Processing and …
Dynamic spatial approximation trees (dsa-trees) are efficient data structures for searching metric spaces. However, using enough storage, pivoting schemes beat dsa-trees in any metric space. In this paper we combine both concepts in a data structure that enjoys the features of dsa-trees and that improves query time by making the best use of the available memory. We show experimentally that our data structure is competitive for searching metric spaces.
2003
Hybrid dynamic spatial approximation trees are recently proposed data structures for searching in metric spaces, based on combining the concepts of spatial approximation and pivot-based algorithms. These data structures are hybrid schemes that retain the full features of dynamic spatial approximation trees and are able to use the available memory to improve query time. It has been shown that they compare favorably against alternative data structures in spaces of medium difficulty. In this paper we complete and improve hybrid dynamic spatial approximation trees by presenting a new search alternative, an algorithm to remove objects from the tree, and an improved way of managing the available memory. The result is a fully dynamic and optimized data structure for similarity searching in metric spaces.
String Processing and Information Retrieval, 2002
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. Its main drawbacks are: costly construction time, poor performance in low dimensional spaces or queries with high selectivity, and the fact of being a static data structure, that is, once built, one cannot add or delete elements. These facts rule it out for many interesting applications. In this paper we overcome these weaknesses. We present a dynamic version of the sa-tree that handles insertions and deletions, showing experimentally that the price of adding dynamism is rather low. This is remarkable by itself since very few data structures for metric spaces are fully dynamic. In addition, we show how to obtain large improvements in construction and search time for low dimensional spaces or highly selective queries. The outcome is a much more practical data structure that can be useful in a wide range of applications.
2001
The spatial approximation tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown to compare favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The main drawback of the sa-tree is that it is a static data structure, that is, once built, it is difficult to add new elements to it. This rules it out for many interesting applications. In this paper we overcome this weakness. We propose and study several methods to handle insertions in the sa-tree. Some are classical solutions well known in the data structures community, while the most promising ones have been specifically developed considering the particular properties of the sa-tree, and involve new algorithmic insights into the behavior of this data structure. As a result, we show that it is viable to modify the sa-tree so as to permit fast insertions while keeping its good search efficiency.
Journal of Experimental Algorithmics, 2009
Proximity searching consists in retrieving from a database those elements that are similar to a query object. The usual model for proximity searching is a metric space where the distance, which models the proximity, is expensive to compute. An index uses precomputed distances to speed up query processing. Among all the known indices, the baseline for performance for about twenty years has been AESA. This index uses an iterative procedure, where at each iteration it first chooses the next promising element ("pivot") to compare to the query, and then it discards database elements that can be proved not relevant to the query using the pivot. The next pivot in AESA is chosen as the one minimizing the sum of lower bounds to the distance to the query proved by previous pivots. In this paper we introduce the new index iAESA, which establishes a new performance baseline for metric space searching. The difference with AESA is the method to select the next pivot. In iAESA, each candidate sorts previous pivots by closeness to it, and chooses the next pivot as the candidate whose order is most similar to that of the query. We also propose a modification to AESA-like algorithms to turn them into probabilistic algorithms.
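The AESA loop summarized in the abstract above can be illustrated with a minimal sketch. It assumes a range query (all elements within `radius` of the query) and a full precomputed distance matrix; the function name `aesa_range_search` and its parameters are hypothetical, chosen for illustration only. It shows the classic AESA pivot rule (smallest sum of lower bounds), not the iAESA rule.

```python
from typing import Any, Callable, List, Sequence, Tuple


def aesa_range_search(
    db: Sequence[Any],
    dist_matrix: Sequence[Sequence[float]],  # assumed precomputed: dist(db[i], db[j])
    dist: Callable[[Any, Any], float],       # the (expensive) metric
    query: Any,
    radius: float,
) -> Tuple[List[int], int]:
    """Return (indices within radius of query, number of distance evaluations)."""
    n = len(db)
    sum_lower = [0.0] * n   # sum of lower bounds proved by the pivots used so far
    alive = set(range(n))   # candidates not yet compared or discarded
    result: List[int] = []
    evaluations = 0

    while alive:
        # Next pivot: the candidate minimizing the sum of lower bounds
        # (ties broken by index so the run is deterministic).
        p = min(alive, key=lambda i: (sum_lower[i], i))
        alive.remove(p)
        d_qp = dist(query, db[p])
        evaluations += 1
        if d_qp <= radius:
            result.append(p)

        # Triangle inequality: |d(q,p) - d(p,u)| <= d(q,u).
        # If that lower bound exceeds the radius, u cannot be an answer.
        for u in list(alive):
            bound = abs(d_qp - dist_matrix[p][u])
            if bound > radius:
                alive.remove(u)
            else:
                sum_lower[u] += bound

    return sorted(result), evaluations
```

For example, with ten points on a line and the absolute difference as the metric, the search reports the same elements as a linear scan while evaluating the metric against the query fewer than ten times. The quadratic-size distance matrix is the price this kind of index pays for that saving.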
ACM Transactions on Database Systems, 2002
Metric access methods (MAMs), such as the M-tree, are powerful index structures for supporting similarity queries on metric spaces, which represent a common abstraction for those searching problems that arise in many modern application areas, such as multimedia, data mining, decision support, pattern recognition, and genomic databases. As compared to multi-dimensional (spatial) access methods (SAMs), MAMs are more general, yet they are reputed to lose in flexibility, since it is commonly deemed that they can only answer queries using the same distance function used to build the index. In this paper we show that this limitation is only apparent (thus MAMs are far more flexible than believed) and extend the M-tree so as to be able to support user-defined distance criteria, approximate distance functions to speed up query evaluation, as well as dissimilarity functions which are not metrics. The so-extended M-tree, also called QIC-M-tree, can deal with three distinct distances at a time: 1) a query (user-defined) distance, 2) an index distance (used to build the tree), and 3) a comparison (approximate) distance (used to quickly discard from the search uninteresting parts of the tree). We develop an analytical cost model that accurately characterizes the performance of QIC-M-tree and validate this model through extensive experimentation on real metric data sets. In particular, our analysis is able to predict the best evaluation strategy (i.e. which distances to use) under a variety of configurations, by properly taking into account relevant factors such as the distribution of distances, the cost of computing distances, and the actual index structure. We also prove that the overall saving in CPU search costs when using an approximate distance can be estimated by using information on the data set only (thus such a measure is independent of the underlying access method) and show that performance results are closely related to a novel "indexing" error measure. Finally, we show how our results apply to other MAMs and query types.
arXiv (Cornell University), 2015
Emerging location-based systems and data analysis frameworks require efficient management of spatial data for approximate and exact search. Exact similarity search can be done using space partitioning data structures, such as KD-tree, R*-tree, and ball-tree. In this paper, we focus on ball-tree, an efficient search tree that is specific to spatial queries which use Euclidean distance. Each node of a ball-tree defines a ball, i.e. a hypersphere that contains a subset of the points to be searched. In this paper, we propose ball*-tree, an improved ball-tree that is more efficient for spatial queries. Ball*-tree enjoys a modified space partitioning algorithm that considers the distribution of the data points in order to find an efficient splitting hyperplane. Also, we propose a new algorithm for KNN queries with restricted range using ball*-tree, which performs better than both KNN and range search for such queries. Results show that ball*-tree performs 39%-57% faster than the original ball-tree algorithm.
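As a rough illustration of the ball idea in the abstract above, the following sketch builds a plain ball-tree and answers a Euclidean range query, skipping every ball that cannot intersect the query region. The median split along the widest coordinate is an assumption made for brevity; it is not the ball*-tree partitioning proposed in the paper. The names `Ball`, `build`, and `range_query` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Ball:
    center: Point
    radius: float                 # every point under this node lies within radius of center
    points: List[Point]           # filled only at leaves
    left: Optional["Ball"] = None
    right: Optional["Ball"] = None


def build(points: Sequence[Point], leaf_size: int = 4) -> Ball:
    """Build a simple ball-tree (assumes points is non-empty and leaf_size >= 1)."""
    dim = len(points[0])
    center = tuple(sum(p[k] for p in points) / len(points) for k in range(dim))
    radius = max(math.dist(center, p) for p in points)
    if len(points) <= leaf_size:
        return Ball(center, radius, list(points))
    # Assumed split rule: median along the coordinate with the largest spread.
    axis = max(range(dim),
               key=lambda k: max(p[k] for p in points) - min(p[k] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Ball(center, radius, [],
                build(ordered[:mid], leaf_size), build(ordered[mid:], leaf_size))


def range_query(node: Ball, q: Point, r: float) -> List[Point]:
    """Return all stored points within Euclidean distance r of q."""
    # Any point p in the ball satisfies d(q, p) >= d(q, center) - radius,
    # so the whole ball can be skipped when that lower bound exceeds r.
    if math.dist(q, node.center) - node.radius > r:
        return []
    if node.left is None or node.right is None:
        return [p for p in node.points if math.dist(q, p) <= r]
    return range_query(node.left, q, r) + range_query(node.right, q, r)
```

A typical use is `tree = build(points)` followed by `range_query(tree, q, r)`. The answer matches a linear scan, and the pruning test decides how many balls are actually visited.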
Anais do XXXVI Simpósio Brasileiro de Banco de Dados (SBBD 2021), 2021
Spatial approximations simplify the geometric shape of complex spatial objects. Hence, they have been employed to alleviate the evaluation of costly computational geometric algorithms when processing spatial queries. For instance, spatial index structures employ them to organize spatial objects in tree structures (e.g., the R-tree). We report experiments considering two real datasets composed of ∼1.5 million regions and ∼2.7 million lines. The experiments confirm the performance benefits of spatial approximations and spatial index structures. However, we also identify that a second processing step is needed to deliver the final answer and often requires higher processing time than the step that uses index structures only. It leads to the interest in studying how spatial approximations can be efficiently used to improve both steps. This paper presents a systematic review on this topic. As a result, we provide an overview and comparison of existing approaches that propose, evaluate, o...
We present an approximate distance oracle for a point set $S$ with $n$ points and doubling dimension $\lambda$. For every $\varepsilon > 0$, the oracle supports $(1+\varepsilon)$-approximate distance queries in (universal) constant time, occupies space $[\varepsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)}]\,n$, and can be constructed in $[2^{O(\lambda)} \log^{3} n + \varepsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)}]\,n$ expected time. This improves upon the best previously known constructions, presented by . Furthermore, the oracle can be made fully dynamic with expected $O(1)$ query time and only $2^{O(\lambda)} \log n + \varepsilon^{-O(\lambda)} + 2^{O(\lambda \log \lambda)}$ update time. This is the first fully dynamic $(1+\varepsilon)$-distance oracle.
2002
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The main drawback of the ...
2010
The metric space model allows abstracting many similarity search problems. Similarity search has multiple applications, especially in the area of multimedia databases. The idea is to index the database so as to accelerate similarity queries. Although there are several promising indices, few of them are dynamic, i.e., once created, very few allow insertions and deletions of elements at a reasonable cost.
Similarity Search and Applications, 2009 …, 2009
Metric space searching is an emerging technique to address the problem of efficient similarity searching in many applications, including multimedia databases and other repositories handling complex objects. Although promising, the metric space approach is still immature in several aspects that are well established in traditional databases. In particular, most indexing schemes are not dynamic, that is, few of them tolerate insertion of elements at reasonable cost over an existing index and only a few work efficiently in secondary memory.
Journal of Computer Science Technology, 2014
Metric space searching is an emerging technique to address the problem of similarity searching in many applications. In order to efficiently answer similarity queries, the database must be indexed. In some interesting real applications dynamism is an indispensable property of the index. There are very few truly dynamic indexes that support not only searches, but also insertions and deletions of elements. The dynamic spatial approximation tree (DSAT) is a data structure specially designed for searching in metric spaces, which compares favorably against other data structures in high dimensional spaces or queries with low selectivity. Insertions are efficient and easily supported in DSAT, but deletions degrade the structure over time. Several methods have been proposed to handle deletions over the DSAT. One of them has been shown to be superior to the others, in the sense that it permits controlling the expected deletion cost as a proportion of the insertion cost, and searches do not overly degrade after several deletions. In this paper we propose and study a new alternative deletion method, based on the best existing strategy. The outcome is a fully dynamic data structure that can be managed through insertions and deletions over arbitrarily long periods of time without any significant reorganization.
Chilean Computer Science Society, …, 2003
The Dynamic Spatial Approximation Tree (dsa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The dsa-tree supports insertions and deletions of elements. However, it has been noted that deletions degrade the structure over time, so the structure cannot be regarded as fully dynamic, in the sense that deletions are not sustainable for long periods of time.
2010
Metric space searching is an emerging technique to address the problem of efficient similarity searching in many applications, including multimedia databases and other repositories handling complex objects. Although promising, the metric space approach is still immature in several aspects that are well established in traditional databases. In particular, most indexing schemes are not dynamic. Of the few dynamic indexes, even fewer work well in secondary memory. That is, most of them need the index in main memory in order to operate efficiently. In this paper we introduce a secondary-memory variant of the Dynamic Spatial Approximation Tree with Clusters (DSACL-tree), which has been shown to be competitive in main memory. The resulting index handles the secondary memory scenario well and is competitive with the state of the art. It is a much more practical data structure that can be useful in a wide range of database applications.
2008
Many computational applications need to look for information in a database. Nowadays, the predominance of non-conventional databases makes similarity search (i.e., searching for elements of the database that are "similar" to a given query) a preponderant concept. The Spatial Approximation Tree has been shown to compare favorably against alternative data structures for similarity searching in metric spaces of medium to high dimensionality ("difficult" spaces) or queries with low selectivity. However, during construction the tree root is selected at random, and the shape and performance of the tree are completely determined by this selection. Therefore, we are mainly interested in improving searches in this data structure by selecting the tree root so that it reflects some of the characteristics of the metric space to be indexed. We believe that selecting the root in this way allows a better adaptation of the data structure ...
2013
The Dynamic Spatial Approximation Tree (DSAT) is a data structure specially designed for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The DSAT supports insertions and deletions of elements. However, it has been noted that deletions degrade the structure over time. A method to handle deletions over the DSAT was proposed in [8] and shown to be superior to the former approach, in the sense that it permits controlling the expected deletion cost as a proportion of the insertion cost. In this paper we propose and study a new deletion method, based on the deletion strategies presented in [8], which has proved to be better. The outcome is a fully dynamic data structure that can be managed through insertions and deletions over arbitrarily long periods of time without any reorganization.
Journal of Information and Data Management
Many applications rely on spatial information retrieval, which involves costly computational geometric algorithms to process spatial queries. Spatial approximations simplify the geometric shape of complex spatial objects, allowing faster spatial queries at the expense of result accuracy. In this sense, spatial approximations have been employed to efficiently reduce the number of objects under consideration, followed by a refinement step to restore accuracy. For instance, spatial index structures employ spatial approximations to organize spatial objects in hierarchical structures (e.g., the R-tree). This motivates studying how spatial approximations can be efficiently employed to improve spatial query processing. This article presents a systematic review on this topic. We gather relevant studies by running a search string on several digital libraries. We further expand the studies under consideration by employing a single iteration of the snowballing approach, where w...
2009
Mobile query processing is currently a very active research field. Range and nearest neighbor queries are commonly used in spatiotemporal databases and location based services (LBS). In this paper, we focus on finding nearest neighbors of a query point within a certain distance range. We propose a new indexing structure, the CN-tree (Compact N-tree), based on a recent indexing technique called the N-tree. The CN-tree combines the efficiency of the N-tree's data partitioning scheme with the approximation of relevant objects by minimal bounding rectangles, as used in R-trees, which are reported to be the best performing for range search. We show how we use the approximation in constructing the CN-tree and then how this index can support range queries efficiently by minimizing the computation of distances and avoiding overlap between minimal bounding rectangles. The experimental results, obtained by comparison with the well-known R*-tree, show that the proposed CN-tree outperforms the R*-tree by a wide margin as an in-memory index and performs competitively when used as an on-disk index.