2000
We propose DSL, a new Scalable Distributed Data Structure for the dictionary problem, based on a version of Skip Lists, as an alternative to both random trees and deterministic height balanced trees. Our scheme exhibits, with high probability, logarithmic search time, constant reconstruction time, and linear space overhead. Additionally, at the expense of two additional pointers per internal node, the search operation could cost O(log d) expected messages, where d is the distance between guessed and actual position.
Data management in a peer-to-peer system is a challenging task because data is distributed randomly among the participating peers. Efficient data structures such as distributed hash tables (DHT) and their variants are designed and implemented to reduce the complexity of data management in such environments. However, DHTs are limited in their support for range queries, and variants such as distributed segment trees often perform poorly when the number of peers is high. Further, distributed lists and distributed balanced trees require a significant amount of time to stabilize after a new peer joins or a peer leaves. In this paper, a new distributed data structure called the deterministic 1–2 skip list is introduced as an alternative solution for data management in peer-to-peer systems. A deterministic skip list can be viewed as an alternative to a balanced tree in which the semantic locality of each key is preserved. Thus it can support range queries as well as single-shot queries. This paper proposes three main operations on this data structure: searching data based on keys, insertion when a new peer joins, and deletion when a peer leaves. The correctness of the proposed operations is analyzed using theoretical arguments and mathematical proofs. The proposed scheme is simulated using the NS-2.34 network simulator, and the efficiency of the scheme has been compared with DHT, DST, distributed list and distributed tree based data management. (A preliminary version of this work appears in the proceedings of …) Keywords: Deterministic skip list · 1–2 skip list · Data structure · Range queries
1999
Abstract: In this paper we consider the dictionary problem in a message passing distributed environment. We introduce a new version of an order-preserving distributed search tree, called BDST for Balanced and Distributed Search Tree, capable of both growing and shrinking as keys are inserted and deleted. This is the first distributed data structure to explicitly support both insertion and deletion with logarithmic costs, i.e., a key can be searched, inserted and deleted in O(log n) messages, where n is the number of servers.
2002
In this paper we consider the dictionary problem in a message-passing distributed environment. We introduce a new version, based on AVL-trees, of distributed search trees, the first to be fully scalable, that is, able to both grow and shrink as keys are inserted and deleted. We prove that in the worst case a key can be inserted, searched, or deleted with O(lg² N) messages. We show that for the introduced distributed search tree this bound is tight.
Journal of Computer and System Sciences, 2004
An implicit data structure for the dictionary problem maintains n data values in the first n locations of an array in such a way that it efficiently supports the operations insert, delete and search. No information other than that in O(1) memory cells and in the input data is to be retained; and the only operations performed on the data values (other than reads and writes) are comparisons. This paper describes the implicit B-tree, a new data structure supporting these operations in O(log_B n) block transfers like in regular B-trees, under the realistic assumption that a block stores B = O(log n) keys, so that reporting r consecutive keys in sorted order has a cost of O(log_B n + r/B) block transfers. En route a number of space efficient techniques for handling segments of a large array in a memory hierarchy are developed. Being implicit, the proposed data structure occupies exactly ⌈n/B⌉ blocks of memory after each update, where n is the number of keys after each update and B is the number of keys contained in a memory block. In main memory, the time complexity of the operations is O(log² n / log log n), disproving a conjecture of the mid 1980s.
Skip graphs are a kind of distributed data structure based on skip lists. They have the full functionality of a balanced tree in a distributed system. Skip graphs are mostly used for searching peer-to-peer (P2P) networks. Because they provide the ability to query by key ordering, they improve on other search tools that offer hash table functionality only. Compared to skip lists and other tree data structures, they are very resilient and can tolerate a large fraction of node failures. Simple and straightforward algorithms can be used to construct a skip graph, insert new nodes into it, search it, and detect and repair errors introduced into a skip graph by node failures.
Journal of Discrete Algorithms, 2012
We present the Skip lifts, a randomized dictionary data structure inspired by the skip list [Pugh '90, Comm. of the ACM]. Similarly to the skip list, the skip lifts has the finger search property: given a pointer to an arbitrary element f, searching for an element x takes expected O(log δ) time, where δ is the rank distance between the elements x and f. The skip lifts uses nodes of O(1) worst-case size and it is one of the few efficient dictionary data structures that performs an O(1) worst-case number of structural changes during an update operation. Given a pointer to the element to be removed from the skip lifts, the deletion operation takes O(1) worst-case time.
Lecture Notes in Computer Science, 2001
Scalable Distributed Data Structures (SDDS) are access methods specifically designed to satisfy the high performance requirements of a distributed computing environment made up of a collection of computers connected through a high speed network. In this paper we propose an order preserving SDDS with a worst-case constant cost for exact-search queries and a worst-case logarithmic cost for update queries. Since our technique preserves the ordering between keys, it is also able to answer range search queries with an optimal worst-case cost of O(k) messages, where k is the number of servers covering the query range. Moreover, our structure has an amortized almost constant cost for any single-key query. Hence, our proposal is the first solution combining the advantages of the constant worst-case access cost featured by hashing techniques (e.g., LH*) and of the optimal worst-case cost for range queries featured by order preserving techniques (e.g., RP* and DRT). Furthermore, recent proposals for ensuring high availability of an SDDS can be easily combined with our basic technique. Therefore our solution is a theoretical achievement potentially attractive for network servers requiring both a fast response time and a high reliability. Finally, our scheme can be easily generalized to manage k-dimensional points, while maintaining the same costs as in the 1-dimensional case.
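To make the O(k) range-query cost above concrete, here is a minimal single-process sketch of order-preserving partitioning. It is not the structure proposed in the paper: the class `RangePartitionedFile`, its fixed `split_points`, and the method names are all hypothetical, and real SDDS servers split dynamically and communicate by messages. The sketch only shows why a range query needs to contact just the k servers whose key intervals overlap the query range.

```python
from bisect import bisect_right


class RangePartitionedFile:
    """Toy order-preserving file: server i holds one contiguous key interval.

    Assumption: split points are fixed up front; a real SDDS would split
    overloaded servers as the file grows.
    """

    def __init__(self, split_points):
        self.split_points = sorted(split_points)
        # One more server than split points; each server is a plain dict here.
        self.servers = [dict() for _ in range(len(self.split_points) + 1)]

    def server_for(self, key):
        # Index of the server whose interval contains `key`.
        return bisect_right(self.split_points, key)

    def insert(self, key, value):
        self.servers[self.server_for(key)][key] = value

    def range_query(self, low, high):
        # Because partitioning preserves key order, only the servers between
        # the one owning `low` and the one owning `high` can hold matches.
        first, last = self.server_for(low), self.server_for(high)
        contacted = list(range(first, last + 1))
        hits = sorted(
            (key, value)
            for s in contacted
            for key, value in self.servers[s].items()
            if low <= key <= high
        )
        return hits, contacted


f = RangePartitionedFile([10, 20, 30])
for key in (3, 12, 18, 25, 40):
    f.insert(key, f"record-{key}")
hits, contacted = f.range_query(12, 25)
```

In this run the query range [12, 25] overlaps two of the four servers, so `contacted` lists only those two, regardless of how many servers the file has in total.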
WSEAS Transactions on Computers, 2008
Abstract: Trees are frequently used data structures for fast access to the stored data. Data structures like arrays, vectors and linked lists are limited by the trade-off between the ability to perform a fast search and the ability to resize easily. Binary Search Trees are an ...
2001
SDDSs (Scalable Distributed Data Structures) are access methods specifically designed to satisfy the high performance requirements of a distributed computing environment made up by a collection of computers connected through a high speed network.
ACM Transactions on Database Systems, 2005
LH* RS is a high-availability scalable distributed data structure (SDDS). An LH* RS file is hash partitioned over the distributed RAM of a multicomputer, e.g., a network of PCs, and supports the unavailability of any k ≥ 1 of its server nodes. The value of k transparently grows with the file to offset the reliability decline. Only the number of the storage nodes potentially limits the file growth. The high-availability management uses a novel parity calculus that we have developed, based on Reed-Solomon erasure correcting coding. The resulting parity storage overhead is about the lowest possible. The parity encoding and decoding are faster than for any other candidate coding we are aware of. We present our scheme and its performance analysis, including experiments with a prototype implementation on Wintel PCs. The capabilities of LH* RS offer new perspectives to data intensive applications, including the emerging ones of grids and of P2P computing.
Motto: Here is Edward Bear, coming downstairs now, bump, bump, bump, on the back of his head, behind Christopher Robin. It is, as far as he knows, the only way of coming downstairs, but sometimes he feels that there really is another way, if only he could stop bumping for a moment and think of it. And then he feels that perhaps there isn't. Winnie-the-Pooh. By A. A. Milne, with decorations by E. H. Shepard. Methuen & Co, London (publ.)
1 INTRODUCTION
Shared-nothing configurations of computers connected by a high-speed link, often called multicomputers, allow for high aggregate performance. These systems gained in popularity with the emergence of grid computing and P2P applications. They need new data structures that scale well with the number of components [CACM97]. Scalable Distributed Data Structures (SDDS) aim to fulfill this need [LNS93], [SDDS].
An SDDS file is stored at multiple nodes provided by SDDS servers. As the file grows, so does the number of servers on which it resides. The SDDS addressing scheme has no centralized components. This allows for operation speeds independent of the file size. SDDSs provide for hash, range or m-d partitioned files of records identified by a primary or by multiple keys. See [SDDS] for a partial list of references. A prototype system, SDDS 2000, for Wintel PCs, is freely available for non-commercial use [CERIA]. Among the best-known SDDS schemes is the LH* scheme [LNS93, LNS96, KLR96, BVW96, B99a, K98v3, R98]. LH* creates scalable, distributed, hash-partitioned files. Each server stores the records in a bucket. The buckets split when the file grows. The splits follow the linear hashing (LH) principles [L80a, L80b]. Buckets are stored for fast access in distributed RAM; otherwise they can be on disks. Only the maximum possible number of server nodes limits the file size. A search or an insert of a record in an LH* file can be hundreds of times faster than a disk access [BDNL00, B02]. At times, an LH* server can become unavailable. It may fail as the result of a software or hardware failure. It may also stop the service for good or for an unacceptably long time, a frequent case in P2P applications. Either way, access to data becomes impossible. The situation may not be acceptable for an application, limiting the utility of the LH* scheme. Data unavailability can be very costly [CRP06]. An unavailable financial database may easily cost the owner $10K-$27K per minute [B99]. A file might suffer from the unavailability of several of its servers. We say that it is k-available if all data remain available despite the unavailability of any k servers. The information-theoretical minimum storage overhead for k-availability of m data servers is k/m [H&al94]. It requires k additional, so-called parity symbols (records, buckets…) per m data symbols (records, buckets…).
Decoding k unavailable symbols requires access to m available symbols of the total of m + k. Large values for m seem impractical. A reasonable approach to limit m is to partition a data file into groups consisting of at most m nodes (buckets) per group, with independent parity calculus.
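The k/m overhead and the "read m of the m + k symbols" decoding cost can be illustrated for the simplest case, k = 1. The sketch below uses plain XOR parity rather than the Reed-Solomon based calculus of LH* RS, which is what handles k > 1; the function names `xor_parity` and `recover` and the sample records are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_parity(buckets):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*buckets))


def recover(survivors, parity):
    """Rebuild the single missing data bucket.

    XOR-ing the m - 1 surviving data buckets with the parity bucket cancels
    every surviving bucket out of the parity and leaves the lost one.
    """
    return xor_parity(survivors + [parity])


m, k = 4, 1                              # one parity bucket per group of m = 4
group = [b"rec0", b"rec1", b"rec2", b"rec3"]
parity = xor_parity(group)

lost = 2                                 # pretend the server holding bucket 2 failed
survivors = group[:lost] + group[lost + 1:]
rebuilt = recover(survivors, parity)     # reads m symbols out of m + k

overhead = k / m                         # storage overhead of the parity data
```

Here the recovery step reads three data buckets plus the parity bucket, that is, m = 4 of the m + k = 5 symbols in the group, and the parity adds 25% to the storage. A larger m lowers the overhead but makes each recovery read more buckets, which is the trade-off behind limiting the group size.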
Workshop on Distributed Data and Structures, 2000
This paper reviews literature on scalable data structures for searching in a distributed computing environment. Starting with a system where one server manages a file of a given size that is accessed by a specific number of clients at a specific rate, a scalable distributed data structure (SDDS) can efficiently manage a file that is n times bigger and accessed
Algorithmica, 2014
We present a new overlay, called the Deterministic Decentralized tree (D²-tree). The D²-tree compares favourably to other overlays for the following reasons: (a) it provides matching and better complexities, which are deterministic for the supported operations; (b) the management of nodes (peers) and elements are completely decoupled from each other; and (c) an efficient deterministic load-balancing mechanism is presented for the uniform distribution of elements into nodes, while at the same time probabilistic optimal bounds are provided for the congestion of operations at the nodes.
Skip lists are a data structure that can be used in place of balanced trees. Skip lists use probabilistic balancing rather than strictly enforced balancing and as a result the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.
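The probabilistic balancing described above can be sketched in a few dozen lines. This is a minimal illustration rather than a reference implementation: the class names and the parameters `max_level`, `p` and `seed` are assumptions introduced here. Each inserted key gets a random tower height drawn by repeated coin flips, and no rebalancing step is ever needed.

```python
import random


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level    # forward[i] = next node at level i


class SkipList:
    def __init__(self, max_level=16, p=0.5, seed=None):
        self.max_level = max_level
        self.p = p
        self.level = 1                   # number of levels currently in use
        self.head = Node(None, max_level)
        self.rng = random.Random(seed)

    def _random_level(self):
        # Keep flipping a biased coin: height h is chosen with probability ~p^(h-1).
        level = 1
        while level < self.max_level and self.rng.random() < self.p:
            level += 1
        return level

    def _predecessors(self, key):
        # update[i] is the last node at level i whose key is smaller than `key`.
        update = [self.head] * self.max_level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def search(self, key):
        node = self._predecessors(key)[0].forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = self._predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return False                 # key already present
        level = self._random_level()
        self.level = max(self.level, level)
        node = Node(key, level)
        for i in range(level):           # splice the new tower in at each level
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        return True

    def delete(self, key):
        update = self._predecessors(key)
        node = update[0].forward[0]
        if node is None or node.key != key:
            return False
        for i in range(len(node.forward)):   # unlink the tower at each level
            update[i].forward[i] = node.forward[i]
        while self.level > 1 and self.head.forward[self.level - 1] is None:
            self.level -= 1
        return True

    def keys(self):
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


sl = SkipList(seed=1)
for key in (5, 1, 9, 3):
    sl.insert(key)
```

Insertion and deletion only splice pointers along the search path, which is the simplicity the abstract contrasts with the rotations required by balanced trees.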
2010
This paper presents a new balanced, distributed data structure for storing data with multidimensional keys in a peer-to-peer network. It supports range queries as well as single point queries which are routed in O(log n) hops. Our structure, called SkipTree, is fully decentralized with each node being connected to O(log n) other nodes.
2002
Abstract In this paper we analyze the amortized cost of inserts and exact searches in a DRT*, an order preserving scalable distributed data structure able to manage both mono-dimensional and multi-dimensional data.
2013
As other fundamental programming abstractions in energy-efficient computing, search trees are expected to support both high parallelism and data locality. However, existing highly-concurrent search trees such as red-black trees and AVL trees do not consider data locality, while existing locality-aware search trees such as those based on the van Emde Boas layout (vEB-based trees) poorly support concurrent (update) operations. This paper presents DeltaTree, a practical locality-aware concurrent search tree that combines both locality-optimisation techniques from vEB-based trees and concurrency-optimisation techniques from non-blocking highly-concurrent search trees. DeltaTree is a k-ary leaf-oriented tree of DeltaNodes in which each DeltaNode is a size-fixed tree-container with the van Emde Boas layout. The expected memory transfer costs of DeltaTree's Search, Insert and Delete operations are O(log_B N), where N, B are the tree size and the unknown memory block size in the ideal cache model, respectively. DeltaTree's Search operation is wait-free, providing prioritised lanes for Search operations, the dominant operation in search trees. Its Insert and Delete operations are non-blocking to other Search, Insert and Delete operations, but they may be occasionally blocked by maintenance operations that are sometimes triggered to keep DeltaTree in good shape. Our experimental evaluation using the latest implementation of AVL, red-black, and speculation friendly trees from the Synchrobench benchmark has shown that DeltaTree is up to 5 times faster than all of the three concurrent search trees for searching operations and up to 1.6 times faster for update operations when the update contention is not too high.
2007
Skip Tree Graph is a novel distributed data structure for peer-to-peer systems that efficiently supports exact-match and order-based queries such as range queries. It is based on skip trees, which are randomised balanced search trees equivalent to skip lists and designed to provide improved concurrency. Skip tree graphs constitute an extension of skip graphs, enhancing their performance in both exact-match and range queries. Moreover, a skip tree graph maintains the underlying balanced tree structures using randomization and local operations, which provides a greater degree of concurrency and scalability.
2005
This paper presents the SkipTree, a new balanced, distributed data structure for storing data with multidimensional keys in a peer-to-peer network. The SkipTree supports range queries as well as single point queries which are routed in O(log n) hops. SkipTree is fully decentralized with each node being connected to O(log n) other nodes. The memory usage for maintaining the links at each node is O(log n log log n) on average and O(log² n) in the worst case. Load balance is also guaranteed to be within a constant factor.
Proposed in 1993, Scalable Distributed Data Structures (SDDSs) have become a basis for data management on multicomputers. In this paper we propose an organization of an LH* bucket based on trie hashing in order to improve the response times of the different access requests.
Computer Communications, 2008
The support for complex queries, such as range, prefix and aggregation queries, over structured peer-to-peer systems is currently an active and significant topic of research. This paper demonstrates how Skip Tree Graph, as a novel structure, presents an efficient solution to that problem area by providing distributed search tree functionality in decentralised and dynamic environments. Since Skip Tree Graph is based on skip trees, a concurrent approach to skip lists, it constitutes an augmentation of skip graphs that extends their functionality and allows for important performance improvements. This work presents a thorough comparison between these two related peer-to-peer overlay networks, their construction, search algorithms and properties. Being based on tree structures, skip tree graphs support aggregation queries and multicast/broadcast operations, which cannot be directly implemented in their predecessor. The repair mechanism for healing the structure in case of failures is more efficient and harnesses the parallelism inherent in P2P networks. Particular consideration is given to the performance of different range-query schemes over the two related structures. Theoretical and experimental results show that Skip Tree Graphs outperform skip graphs on both exact-match and range searches.