Academia.edu no longer supports Internet Explorer.
To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to upgrade your browser.
2008
…
7 pages
1 file
Many computational applications need to look for information in a database. Nowadays, the predominance of nonconventional databases makes the similarity search (i.e., searching elements of the database that are "similar" to a given query) becomes a preponderant concept. The Spatial Approximation Tree has been shown that it compares favorably against alternative data structures for similarity searching in metric spaces of medium to high dimensionality ("difficult" spaces) or queries with low selectivity. However, for the construction process the tree root has been randomly selected and the tree ,in its shape and performance, is completely determined by this selection. Therefore, we are interested in improve mainly the searches in this data structure trying to select the tree root so to reflect some of the own characteristics of the metric space to be indexed. We regard that selecting the root in this way it allows a better adaption of the data structure to the intrinsic dimensionality of the metric space considered, so also it achieves more efficient similarity searches.
2010
The metric space model allows abstracting many similarity search problems. Similarity search has multiple applications especially in the multimedia databases area. The idea is to index the database so as to accelerate similarity queries. Although there are several promising indices, few of them are dynamic, i.e., once created very few allow to perform insertions and deletions of elements at a reasonable cost.
Similarity Search and Applications, 2009 …, 2009
Metric space searching is an emerging technique to address the problem of efficient similarity searching in many applications, including multimedia databases and other repositories handling complex objects. Although promising, the metric space approach is still immature in several aspects that are well established in traditional databases. In particular, most indexing schemes are not dynamic, that is, few of them tolerate insertion of elements at reasonable cost over an existing index and only a few work efficiently in secondary memory.
2010
Similarity search in high-dimensional metric spaces is a key operation in many applications, such as multimedia databases, image retrieval, object recognition, and others. The high dimensionality of the data requires special index structures to facilitate the search. A problem regarding the creation of suitable index structures for highdimensional data is the relationship between the geometry of the data and the organization of an index structure. In this paper, we study the performance of a new index structure, called Divisive-Agglomerative Hierarchical Clustering tree (DAHC-tree), which reduces the effects imposed by the above liability. DAHC-tree is constructed by dividing and grouping the data set into compact clusters. We perform a rigorous experimental design and analyze the trade-offs involved in building such an index structure. Additionally, we present extensive experiments comparing our method against state-of-the-art of exact and approximate solutions. The conducted analysis and the reported comparative test results demonstrate that our technique significantly improves the performance of similarity queries.
2002
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The main drawback of the ...
2003
Hybrid dynamic spatial approximation trees are recently proposed data structures for searching in metric spaces, based on combining the concepts of spatial approximation and pivot based algorithms. These data structures are hybrid schemes, with the full features of dynamic spatial approximation trees and able of using the available memory to improve the query time. It has been shown that they compare favorably against alternative data structures in spaces of medium difficulty. In this paper we complete and improve hybrid dynamic spatial approximation trees, by presenting a new search alternative, an algorithm to remove objects from the tree, and an improved way of managing the available memory. The result is a fully dynamic and optimized data structure for similarity searching in metric spaces.
ACM Journal of Experimental Algorithmics, 2009
Proximity searching consists of retrieving from a database those elements that are similar to a query object. The usual model for proximity searching is a metric space where the distance, which models the proximity, is expensive to compute. An index uses precomputed distances to speedup query processing. Among all the known indices, the baseline for performance for about 20 years has been AESA. This index uses an iterative procedure, where at each iteration it first chooses the next promising element (“pivot”) to compare to the query, and then it discards database elements that can be proved not relevant to the query using the pivot. The next pivot in AESA is chosen as the one minimizing the sum of lower bounds to the distance to the query proved by previous pivots. In this article, we introduce the new index iAESA , which establishes a new performance baseline for metric space searching. The difference with AESA is the method to select the next pivot. In iAESA, each candidate sorts...
2001
The spatial approximation tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown to compare favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The main drawback of the sa-tree is that it is a static data structure, that is, once built, it is difficult to add new elements to it. This rules it out for many interesting applications. In this paper we overcome this weakness. We propose and study several methods to handle insertions in the sa-tree. Some are classical solutions well known in the data structures community, while the most promising ones have been specifically developed considering the particular properties of the sa-tree, and involve new algorithmic insights in the behavior of this data structure. As a result, we show that it is viable to modify the sa-tree so as to permit fast insertions while keeping its good search efficiency
Hybrid dynamic spatial approximation trees are recently proposed data structures for searching in metric spaces, based on combining the concepts of spatial approximation and pivot based algorithms. These data structures are hybrid schemes, with the full features of dynamic spatial approximation trees and able of using the available memory to improve the query time. It has been shown that they compare favorably against alternative data structures in spaces of medium difficulty.
Journal of Computer Science Technology, 2014
Metric space searching is an emerging technique to address the problem of similarity searching in many applications. In order to efficiently answer similarity queries, the database must be indexed. In some interesting real applications dynamism is an indispensable property of the index. There are very few actually dynamic indexes that support not only searches, but also insertions and deletions of elements. The dynamic spatial approximation tree (DSAT) is a data structure specially designed for searching in metric spaces, which compares favorably against other data structures in high dimensional spaces or queries with low selectivity. Insertions are efficient and easily supported in DSAT, but deletions degrade the structure over time. Several methods are proposed to handle deletions over the DSAT. One of them has shown to be superior to the others, in the sense that it permits controlling the expected deletion cost as a proportion of the insertion cost and searches does not overly degrade after several deletions. In this paper we propose and study a new alternative deletion method, based on the better existing strategy. The outcome is a fully dynamic data structure that can be managed through insertions and deletions over arbitrarily long periods of time without any significant reorganization.
1997
A new access method, called M-tree, is proposed to organize and search large data sets from a generic "metric space", i.e. where object proximity is only defined by a distance function satisfying the positivity, symmetry, and triangle inequality postulates. We detail algorithms for insertion of objects and split management, which keep the M-tree always balanced -several heuristic split alternatives are considered and experimentally evaluated. Algorithms for similarity (range and k-nearest neighbors) queries are also described.
Loading Preview
Sorry, preview is currently unavailable. You can download the paper by clicking the button above.
String Processing and Information Retrieval, 2002
ACM Transactions on Database Systems, 2002
1999
Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '99, 1999
International Journal of Computer …, 2007
ACM SIGMOD Record, 1997
arXiv (Cornell University), 2015
Lecture Notes in Computer Science, 2005
First DELOS Network of …, 2000
Journal of Information and Data Management
Lecture Notes in Computer Science, 2003