1988, New Generation Computing
In recent investigations of reducing the complexity of the relational join operation, several hash-based partitioned-join strategies have been introduced. All of these strategies depend on the costly operation of partitioning the data space before the join can be carried out. We had previously introduced a partitioned-join based on a dynamic and order-preserving multidimensional data organization called DYOP. The present study extends the earlier research on DYOP and constructs a simulation model. The simulation studies on DYOP, and subsequent comparisons of all the partitioned-join methodologies including DYOP, show that the space utilization of DYOP improves with an increasing number of attributes. Furthermore, the DYOP-based join outperforms all the hash-based methodologies by greatly reducing the total I/O bandwidth required for the entire partitioned-join operation. The comparison model is independent of architectural issues such as multiprocessing, multiple-disk usage, and large memory availability, all of which help to further increase the efficiency of the operation.
Proceedings of the 20th International Conference on Data Engineering, 2004
This paper introduces the hash-merge join algorithm (HMJ, for short), a new non-blocking join algorithm that deals with data items arriving from remote sources over unpredictable, slow, or bursty network connections. The HMJ algorithm is designed with two goals in mind: (1) minimize the time to produce the first few results, and (2) produce join results even if the two sources of the join operator occasionally get blocked. The HMJ algorithm has two phases: the hashing phase and the merging phase. The hashing phase employs an in-memory hash-based join algorithm that produces join results as quickly as data arrives. The merging phase is responsible for producing join results when both sources are blocked. The two phases are connected via a flushing policy that flushes in-memory partitions to disk storage once memory is exhausted. Experimental results show that HMJ combines the advantages of two state-of-the-art non-blocking join algorithms (XJoin and Progressive Merge Join) while avoiding their shortcomings.
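The core of the hashing phase is a symmetric (pipelined) hash join: each arriving tuple probes the other source's hash table before being inserted into its own, so results stream out as soon as matches exist. The sketch below is a minimal single-process illustration of that idea; the class name, the tuple-count memory budget, and the drop-the-larger-table flush are placeholders, not the paper's flushing policy, which spills partition pairs to disk for the merging phase rather than discarding them.

```python
from collections import defaultdict

class SymmetricHashJoin:
    """Hashing-phase sketch: one in-memory hash table per source."""

    def __init__(self, max_tuples=10_000):
        self.tables = (defaultdict(list), defaultdict(list))  # 0 = R, 1 = S
        self.size = 0
        self.max_tuples = max_tuples  # crude stand-in for a memory budget

    def insert(self, side, key, tup):
        """Insert a tuple arriving from source `side`; yield any new results."""
        # Probe the opposite table first so matches stream out immediately.
        for other in self.tables[1 - side].get(key, ()):
            yield (tup, other) if side == 0 else (other, tup)
        self.tables[side][key].append(tup)
        self.size += 1
        if self.size > self.max_tuples:
            self._flush()

    def _flush(self):
        # Placeholder policy: drop the larger table. In HMJ the flushed
        # partitions are written to disk and re-joined in the merging
        # phase, so no results are lost.
        victim = max(self.tables, key=lambda t: sum(map(len, t.values())))
        self.size -= sum(map(len, victim.values()))
        victim.clear()

join = SymmetricHashJoin()
list(join.insert(0, 42, ("r-row", 42)))          # no match yet
print(list(join.insert(1, 42, ("s-row", 42))))   # [(('r-row', 42), ('s-row', 42))]
```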
Optimally organizing multidimensional data is NP-hard. The little work that has been done on optimizing multidimensional data organization was limited to uniform data distributions and rarely considered the probability of use of each query; the studies that did consider query probabilities were limited to either partial-match or range queries. This work shows that by combining heuristics and combinatorial algorithms, near-optimal solutions can be found that organize multidimensional data (uniform or skewed) so that join queries are performed efficiently on it. Experimental results for the proposed algorithms show performance gains of up to 716% compared with standard schemes. Moreover, the proposed algorithms are not very sensitive to changes in the query distribution: the results show that if the query probabilities change by up to 80% of their original values, the original storage organizations remain near-optimal.
Parallel and Distributed …, 1993
In this paper, we describe four parallel pointer-based join algorithms for set-valued attributes. Pointer-based joins will be common in next-generation object-oriented database systems, so efficiently supporting them is crucial to the performance of such systems. Using analysis, we show that while algorithms based on Hybrid-hash provide good performance, algorithms that require less replication will often perform as well or better, especially if each set-valued attribute references a small number of nodes.
Proceedings of the …, 2009
Join is an important database operation. As computer architectures evolve, the best join algorithm may change hands. This paper reexamines two popular join algorithms, hash join and sort-merge join, to determine if the latest computer architecture trends shift the tide that has favored hash join for many years. For a fair comparison, we implemented the most optimized parallel version of both algorithms on the latest Intel Core i7 platform. Both implementations scale well with the number of cores in the system and take advantage of the latest processor features for performance. Our hash-based implementation achieves more than 100M tuples per second, which is 17X faster than the best reported performance on CPUs and 8X faster than that reported for GPUs. Moreover, the performance of our hash join implementation is consistent over a wide range of input data sizes, from 64K to 128M tuples, and is not affected by data skew. We compare this implementation to our highly optimized sort-based implementation, which achieves 47M to 80M tuples per second. We developed analytical models to study how both algorithms would scale with upcoming processor architecture trends. Our analysis projects that the current architectural trends of wider SIMD, more cores, and lower memory bandwidth per core imply better scalability potential for sort-merge join. Consequently, sort-merge join is likely to outperform hash join on upcoming chip multiprocessors. In summary, we offer multicore implementations of hash join and sort-merge join that consistently outperform all previously reported results. We further conclude that the tide that favors the hash join algorithm has not changed yet, but the change is just around the corner.
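For reference, the two algorithms being compared reduce to the following single-threaded skeletons (a sketch, not the paper's SIMD/multicore implementations): hash join builds a table on one input and probes with the other in linear time, while sort-merge pays O(n log n) for the sorts and then performs the purely sequential scans whose bandwidth-friendliness drives the paper's projection.

```python
from collections import defaultdict

def hash_join(R, S, key=lambda t: t[0]):
    buckets = defaultdict(list)
    for r in R:                                    # build on one input
        buckets[key(r)].append(r)
    return [(r, s) for s in S                      # probe with the other
            for r in buckets.get(key(s), ())]

def sort_merge_join(R, S, key=lambda t: t[0]):
    R, S = sorted(R, key=key), sorted(S, key=key)  # O(n log n) sorts
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if key(R[i]) < key(S[j]):
            i += 1
        elif key(R[i]) > key(S[j]):
            j += 1
        else:                                      # equal keys: emit the run
            k = j
            while k < len(S) and key(S[k]) == key(R[i]):
                out.append((R[i], S[k]))
                k += 1
            i += 1
    return out

R, S = [(1, "a"), (2, "b"), (2, "c")], [(2, "x"), (3, "y")]
assert sorted(hash_join(R, S)) == sorted(sort_merge_join(R, S))
```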
Proceedings of the 3rd International Conference on High Performance Computing (HiPC), 1996
Recent advances in parallel and distributed processing and their application to database operations such as the join have led to the investigation of parallel join algorithms. Hash-based join algorithms involve a costly data partitioning phase prior to the join operation. This paper presents new parallel join algorithms for relations based on grid files, where no costly partitioning phase is involved, so performance can improve.
We propose an object-oriented framework for one of the most frequent and costly operations in parallel database systems: the parallel join. The framework independently captures a great variety of parameters, such as different load balancing procedures and different synchronization disciplines. The framework addresses DBMS flexibility, configuration and extensibility issues, via the instantiation of known algorithms and facilities for the introduction of new ones. The framework can also be used to compare algorithms and to determine the execution scenario an algorithm is best suited for. Related algorithms are grouped in families, suggesting a taxonomy.
Proceedings of the Seventh International Conference on Data Engineering, 1991
Parallel processing of relational queries has received considerable attention of late. However, in the presence of data skew, the speedup from conventional parallel join algorithms can be very limited, due to load imbalances among the various processors. Even a single large skew element can cause a processor to become overloaded.
2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013
The performance of parallel distributed data management systems becomes increasingly important with the rise of Big Data. Parallel joins have been widely studied in both the parallel processing and the database communities. Nevertheless, most of the algorithms developed so far do not consider data skew, which naturally exists in various applications. State-of-the-art methods designed to handle this problem are based on extensions of the two prevalent conventional approaches to parallel joins: the hash-based and duplication-based frameworks. In this paper, we introduce a novel parallel join framework, query-based distributed join (QbDJ), for handling data skew on distributed architectures. Further, we present an efficient implementation of the method based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate the performance of our approach on a cluster of 192 cores (16 nodes) and datasets of 1 billion tuples with different skews. The results show that the method is scalable, and also runs faster with less network communication than the state-of-the-art PRPD approach of [1] under high data skew.
Proceedings of the International …, 1998
Time of creation is one of the predominant (often implicit) clustering strategies, found not only in data warehouse systems: line items are created together with their corresponding order, objects are created together with their subparts, and so on. The newly created data is then appended to the existing data. We present a new join algorithm, called Diag-Join, which exploits time-of-creation clustering. When time-of-creation clustering can be exploited, our performance evaluation reveals the superiority of Diag-Join over standard join algorithms such as the block-wise nested-loop join, the GRACE hash join, and the index nested-loop join. We also present an analytical cost model for Diag-Join.
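The intuition can be captured in a few lines: if matching tuples were appended at roughly the same time, a bounded window over one input suffices to find most matches in a single synchronized scan, with no hashing or sorting. The sketch below is a toy version under that assumption; it slides the window by one order per line item (so it presumes a roughly one-to-one arrival ratio), and tuples that fall outside the window are collected for a fallback join, as the paper does with a standard algorithm.

```python
from collections import deque

def diag_join(orders, items, window=1024, key=lambda t: t[0]):
    """orders and items arrive in (approximate) time-of-creation order."""
    buf, index, unmatched = deque(), {}, []
    it = iter(orders)
    for item in items:
        while len(buf) < window:                  # keep the window full
            o = next(it, None)
            if o is None:
                break
            buf.append(o)
            index[key(o)] = o                     # order keys assumed unique
        hit = index.get(key(item))
        if hit is not None:
            yield hit, item
        else:
            unmatched.append(item)                # handed to a fallback join
        if len(buf) == window:                    # slide the window forward
            index.pop(key(buf.popleft()), None)

orders = [(i, f"order-{i}") for i in range(100)]
items = [(i, f"item-{i}") for i in range(100)]
assert len(list(diag_join(orders, items, window=8))) == 100
```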
Journal of King Saud University - Computer and Information Sciences, 2010
Enhancing the performance of large database systems depends heavily on the cost of performing join operations. Optimizing the join of two very large tables is an interesting research topic for many researchers, especially when both tables are too large to fit in main memory. In that case, the join is usually performed by some method other than a hash-join algorithm. In this paper, a novel join algorithm based on quadtrees is introduced. Applying the proposed algorithm to two tables that are too large to fit in main memory is shown to be fast and efficient. In the proposed algorithm, each table is represented by a storage-efficient quadtree designed to handle one-dimensional arrays (1-D arrays), and the algorithm works on the two 1-D arrays to perform join operations. Time and space complexities of the new algorithm are studied, and experimental studies show its efficiency and superiority. The proposed join algorithm requires a minimal number of I/O operations and operates in main memory with O(n log(n/k)) time complexity, where k is the number of key groups sharing the same first letter, and n/k is much smaller than n.
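The quadtree structure itself is not reproduced here, but the complexity claim is easy to illustrate: if the keys fall into k groups by first letter, each group of roughly n/k keys can be sorted and merge-joined independently, giving O(n log(n/k)) overall instead of O(n log n). A hedged sketch of that grouping argument (not the paper's data structure):

```python
from collections import defaultdict

def merge_run(rs, ss, key):
    i = j = 0
    while i < len(rs) and j < len(ss):
        kr, ks = key(rs[i]), key(ss[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:                                     # equal keys: emit the run
            k = j
            while k < len(ss) and key(ss[k]) == kr:
                yield rs[i], ss[k]
                k += 1
            i += 1

def grouped_join(R, S, key=lambda t: t[0]):
    groups_r, groups_s = defaultdict(list), defaultdict(list)
    for r in R:
        groups_r[key(r)[0]].append(r)             # group by first letter
    for s in S:
        groups_s[key(s)[0]].append(s)
    for letter, rs in groups_r.items():
        ss = groups_s.get(letter)
        if ss:                                    # prune non-joinable groups
            yield from merge_run(sorted(rs, key=key), sorted(ss, key=key), key)

pairs = list(grouped_join([("ann", 1), ("bob", 2)], [("ann", 9), ("amy", 8)]))
assert pairs == [(("ann", 1), ("ann", 9))]
```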
1992
In this paper we introduce the concept of declustering-aware joins (DA-Joins), a family of join algorithms with each member determined by the underlying declustering method and the join technique used. We present and analyze a member of the DA-Join family that is based on the parallel hybrid-hash join and the CMD declustering method, which we call DA-Join_HH,CMD. We show that DA-Join_HH,CMD has very good overall performance. Its main advantages over non-declustering-aware joins are (i) its ability to partition the problem into a set of smaller independent sub-problems, each of which can utilize memory more efficiently, and (ii) pruning the problem size by not considering portions of the relations that cannot join. It shares with hash joins the desirable property of scalability with respect to the degree of parallelism, yet it can avoid the data skew problems faced by parallel hash joins. The CMD declustering method has been shown to be optimal for multi-attribute range queries on parallel I/O architectures, and our analysis and experimental evaluation of DA-Join_HH,CMD show that joins can also be performed efficiently on CMD-stored relations, thus providing evidence for the desirability of CMD as a basic technique for multi-attribute declustering of relations in a parallel database.
Proceedings of the 39th ACM SIGMOD International Conference on Management of Data
There exists a need for high-performance, read-only main-memory database systems for OLAP-style application scenarios. Most of the existing work in this area is centered around the domain of column-store databases, which are particularly well suited to OLAP-style scenarios and have been shown to overcome the memory bottleneck issues that hinder the more traditional row-store database systems. One of the main database operations these systems focus on optimizing is the JOIN operation. However, all these existing systems use join algorithms designed with the unrealistic assumption that there is unlimited temporary memory available to perform the join. In contrast, we propose a Memory Constrained Join algorithm (MCJoin) which is both high-performing and performs all of its operations within a tight given memory constraint. Extensive experimental results show that MCJoin outperforms a naive memory-constrained version of the state-of-the-art Radix-Clustered Hash Join algorithm in all of the situations tested, with margins of up to almost 500%.
IEEE Transactions on Knowledge and Data Engineering, 2002
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memory latency. A second trend is that CPU performance advances are brought not only by increased clock rates but also by increasing parallelism inside the CPU. Current database systems have not yet adapted to these trends and show poor utilization of both CPU and memory resources on current hardware. In this paper, we show how these resources can be optimized for large joins and translate these insights into guidelines for future database architectures, encompassing data structures, algorithms, cost modeling, and implementation. In particular, we discuss how vertically fragmented data structures optimize cache performance on sequential data access. On the algorithmic side, we refine the partitioned hash-join with a new partitioning algorithm called radix-cluster, which is specifically designed to optimize memory access. The performance of this algorithm is quantified using a detailed analytical model that incorporates memory access costs in terms of a limited number of parameters, such as cache sizes and miss penalties. We also present a calibration tool that extracts such parameters automatically from any computer hardware. The accuracy of our models is proven by exhaustive experiments conducted with the Monet database system on three different hardware platforms. Finally, we investigate the effect of implementation techniques that optimize CPU resource usage. Our experiments show that large joins can be accelerated almost an order of magnitude on modern RISC hardware when both memory and CPU resources are optimized.
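The radix-cluster idea is to partition on a few bits of the hash value per pass, keeping the fan-out of each pass small enough that every active output bucket stays cache- and TLB-resident. A minimal sketch of that multi-pass partitioning (the bit counts are arbitrary example values, not the paper's tuned parameters):

```python
def radix_partition(tuples, key=lambda t: t[0], bits_per_pass=(4, 4)):
    parts, shift = [list(tuples)], 0
    for bits in bits_per_pass:                    # limited fan-out per pass
        fanout, nxt = 1 << bits, []
        for part in parts:
            buckets = [[] for _ in range(fanout)]
            for t in part:
                buckets[(hash(key(t)) >> shift) & (fanout - 1)].append(t)
            nxt.extend(buckets)
        parts, shift = nxt, shift + bits
    return parts  # 2**sum(bits_per_pass) cache-sized clusters

# R and S are partitioned with the same function; corresponding clusters
# are then joined pairwise with an in-cache hash join.
clusters = radix_partition([(i, "r") for i in range(1000)])
assert sum(len(c) for c in clusters) == 1000
```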
Proceedings of the 2009 ACM symposium on Applied Computing, 2009
Hash joins combine massive relations in data warehouses, decision support systems, and scientific data stores. Faster hash join performance significantly improves query throughput, response time, and overall system performance. In this work, we demonstrate how using join cardinality improves hash join performance. The key contribution is the development of an algorithm to determine join cardinality in an arbitrary query plan. We implemented early hash join and the join cardinality algorithm in PostgreSQL. Experimental results demonstrate that early hash join has an immediate response time that is an order of magnitude faster than the existing hybrid hash join implementation. One-to-one joins execute up to 50% faster and perform significantly fewer I/Os, and one-to-many joins have similar or better performance over all memory sizes.
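One concrete way join cardinality helps, in the spirit of the paper's argument for one-to-one joins: each build tuple can match at most once, so a matched hash-table entry can be discarded immediately, shrinking the table and skipping futile probes. A toy in-memory sketch of that effect (not the PostgreSQL implementation described in the paper):

```python
def one_to_one_hash_join(R, S, key=lambda t: t[0]):
    table = {key(r): r for r in R}    # keys unique under the 1:1 assumption
    for s in S:
        r = table.pop(key(s), None)   # pop: each build tuple joins at most once
        if r is not None:
            yield r, s

matches = list(one_to_one_hash_join([(1, "a"), (2, "b")], [(2, "x"), (2, "y")]))
assert matches == [((2, "b"), (2, "x"))]  # second probe of key 2 finds nothing
```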
IAEME
The huge amount of available data requires that data be stored at different locations with minimal memory requirements and easy retrieval; this gave birth to databases and DBMSs. Retrieval is simple and quick when the data is stored at a single location (logical or physical); it becomes complex when the data is not in one place. The technique of bringing this data from different locations (here, tables) together for use is called joining. Joins have been used since the development of databases; many techniques have since been introduced, some modifying existing ones and some taking a different approach altogether. In a real-time query execution environment, when the number of tuples is large, it is the join that takes the most time and CPU usage. In this paper we explain and compare non-blocking join techniques and their approaches. The join techniques are compared based on their execution time, flushing policy, memory requirements, I/O complexity, and other factors that make one algorithm preferable to another in the appropriate environment.
Proceedings of the Twelfth International Conference on Data Engineering, 1996
Three pointer-based parallel join algorithms are presented and analyzed for environments in which secondary storage is made transparent to the programmer through memory mapping. Buhr, Goel, and Wai [11] have shown that data structures such as B-Trees, R-Trees and graph data structures can be implemented as efficiently and effectively in this environment as in a traditional environment using explicit I/O. Here we show how higher-order algorithms, in particular parallel join algorithms, behave in a memory mapped environment. A quantitative analytical model has been developed to conduct performance analysis of the parallel join algorithms. The model has been validated by experiments.
2002
A join-index is a data structure used for processing join queries in databases. Join-indices use precomputation techniques to speed up online query processing and are useful for data sets which are updated infrequently. The I/O cost of join computation using a join-index with limited buffer space depends primarily on the page-access sequence used to fetch the pages of the base relations. Given a join-index, we introduce a suite of methods based on clustering to compute the joins. We derive upper bounds on the length of the page-access sequences. Experimental results with Sequoia 2000 data sets show that the clustering method outperforms existing methods based on sorting and online-clustering heuristics.
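The setting is easy to sketch: a join index is a precomputed list of (R-rid, S-rid) pairs, and with a limited buffer the I/O cost is governed by the order in which the referenced pages are fetched. The snippet below groups index entries by the page pair they touch and orders the fetches with a simple sort; this is only an illustrative heuristic, with an assumed page size, standing in for the paper's clustering methods:

```python
from collections import defaultdict

PAGE_SIZE = 100  # rids per page; an assumed layout, not from the paper

def page_access_sequence(join_index):
    """Group join-index entries (R-rid, S-rid) by the page pair they touch."""
    by_pages = defaultdict(list)
    for rid_r, rid_s in join_index:
        by_pages[(rid_r // PAGE_SIZE, rid_s // PAGE_SIZE)].append((rid_r, rid_s))
    # Sorting by page pair is a simple ordering; the paper's clustering
    # methods compute better sequences for a given buffer size.
    return sorted(by_pages.items())

for (p_r, p_s), rids in page_access_sequence([(5, 910), (7, 903), (150, 2)]):
    print(f"fetch R page {p_r} and S page {p_s}, join rids {rids}")
```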
2002
Although communication cost is still a major cost for distributed databases, local cost in distributed query processing cannot be neglected. Observing that almost all commercial database products employ Plan Enumeration with Dynamic Programming (PEDP) techniques, we find that reducing the cost of both communication and local processing in two-way joins has potential benefits. Although many methods for reducing communication cost have been proposed, most of them employ a cost model that neglects local processing cost. This paper proposes a join execution method (called virtual join) that considers both. Virtual join has two desirable features: 1) it adapts to different values of selectivity, and 2) it gives the accurate cardinality of the join result before the result is materialized. Experimental results show that virtual join is both adaptive and efficient.
Proceedings of the Twelfth International Conference on Data Engineering
The widening performance gap between CPU and disk is significant for hash join performance. Most current hash join methods try to reduce the volume of data transferred between memory and disk. In this paper, we try to reduce hash-join times by reducing random I/O. We study how current algorithms incur random I/O, and propose a new hash join method, Seq+, that converts much of the random I/O to sequential I/O. Seq+ uses a new organization for hash buckets on disk, and larger input and output buffer sizes. We introduce the technique of batch writes to reduce the bucket-write cost, and the concepts of write-groups and read-groups of hash buckets to reduce the bucket-read cost. We derive a cost model for our method, and present formulas for choosing various algorithm parameters, including input and output buffer sizes. Our performance study shows that the new hash join method performs many times better than current algorithms under various environments. Since our cost functions underestimate the cost of current algorithms and overestimate the cost of Seq+, the actual performance gain of Seq+ is likely to be even greater.
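The batch-write idea can be illustrated with a toy write-group: several buckets share one buffered region, and when the budget is reached all of them are flushed with a single contiguous write, with a small directory recording where each bucket's run landed so a later read-group pass can fetch it sequentially. The layout, sizes, and names below are invented for the example, not taken from Seq+:

```python
import io

class WriteGroup:
    """Buffers several hash buckets and flushes them with one sequential write."""

    def __init__(self, f, n_buckets=8, flush_bytes=1 << 16):
        self.f = f
        self.buffers = [bytearray() for _ in range(n_buckets)]
        self.flush_bytes = flush_bytes
        self.directory = []                        # (bucket, offset, length) runs

    def add(self, bucket, record: bytes):
        self.buffers[bucket] += record
        if sum(map(len, self.buffers)) >= self.flush_bytes:
            self.flush()

    def flush(self):
        # One contiguous write covering every bucket in the group, replacing
        # per-bucket random writes; the directory lets a later read-group
        # pass gather each bucket's runs with sequential reads.
        for bucket, buf in enumerate(self.buffers):
            if buf:
                self.directory.append((bucket, self.f.tell(), len(buf)))
                self.f.write(bytes(buf))
                buf.clear()

group = WriteGroup(io.BytesIO())
group.add(3, b"tuple-a;")
group.add(5, b"tuple-b;")
group.flush()
print(group.directory)   # [(3, 0, 8), (5, 8, 8)]
```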