1991, Springer eBooks
…
18 pages
2010
This thesis reviews selected topics from the theory of parallel computation. The research begins with a survey of the proposed models of parallel computation, examining the characteristics of each model and discussing its use for theoretical studies or for practical applications. It then employs common simulation techniques to evaluate the computational power of these models. The simulations establish certain relations between the models before advancing to a detailed study of parallel complexity theory, which is the subject of the second part of the thesis. The second part examines classes of feasible highly parallel problems and investigates the limits of parallelization. It is concerned with the benefits of parallel solutions and the extent to which they can be applied to all problems. It analyzes the parallel complexity of various well-known tractable problems and discusses the automatic parallelization of efficient sequential algorithms. Moreover, it ...
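For illustration only (this sketch is not from the thesis): one standard simulation technique for relating PRAM variants is to emulate a single Priority-CRCW write step on a model that forbids concurrent writes by sorting the write requests, which is where the usual O(log p) slowdown comes from. The function name and data layout below are ours.

```python
# Illustrative sketch: one Priority-CRCW write step resolved by sorting, the
# kind of step-by-step simulation used to compare PRAM variants.

def simulate_crcw_write_step(memory, requests):
    """Apply one Priority-CRCW step. `requests` is a list of
    (processor_id, address, value); the lowest processor id wins on conflicts."""
    # Sort by (address, processor_id) so the winning request for each address
    # comes first; on a weaker model this sort is what costs the slowdown.
    requests = sorted(requests, key=lambda r: (r[1], r[0]))
    last_addr = None
    for pid, addr, value in requests:
        if addr != last_addr:  # first (highest-priority) writer to this cell
            memory[addr] = value
            last_addr = addr
    return memory

if __name__ == "__main__":
    mem = [0] * 8
    # Processors 2 and 5 both write to cell 3; processor 2 wins under priority.
    print(simulate_crcw_write_step(mem, [(5, 3, 99), (2, 3, 7), (1, 6, 4)]))
```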
Journal of Parallel and Distributed Computing, 1991
The many revolutionary changes brought about by the integrated chip (in the form of significant improvements in processing, storage, and communications) have also brought about a host of related problems for designers and users of parallel and distributed systems. These systems develop and proliferate at an amazing momentum, motivating research in the understanding and testing of complex distributed systems. Unfortunately, these relatively expensive systems are being designed, built, used, refined, and rebuilt (at perhaps an avoidable expense) even before we have developed methodology for understanding the underlying principles of their behavior. Though it is not realistic to expect that the current rate of manufacturing can be slowed down to accommodate research in design principles, it behooves us to bring attention to the importance of design methodology and performance understanding of such systems and, in this way, to attempt to influence parallel system design in a positive manner. At the present time, there is considerable debate among various schools of thought on parallel machine architectures, with different schools proposing different architectures and design philosophies. Consider, for example, one such debate involving tightly coupled systems. Early on, Minsky [1] conjectured a somewhat pessimistic bound of log n for typical speedup on n processors. Since then, researchers [2] have shown that certain characteristics of programs, such as the DO loops in Fortran, can often be exploited to yield more optimistic levels of speedup. Other researchers [3] counter this kind of optimism by pointing out that parallel and vector processing has limitations in potential speedup (i.e., Amdahl's law) to the extent that speedup is bounded from above by n/(s·n + 1 − s), where s is the fraction of a computation that must be done serially. This suggests that it makes more sense to first concentrate on achieving the maximum speedup possible with a single powerful processor. In this view, the distributed approach is not as attractive an option. More recently, work on hypercubes [4] appears to indicate that
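For concreteness, a minimal sketch of the speedup bound quoted above (Amdahl's law); the function name and the sample numbers below are ours, chosen only to show how the bound saturates:

```python
# Amdahl's law: with serial fraction s, n processors give speedup
# at most n / (s*n + 1 - s).

def amdahl_speedup(n, s):
    """Upper bound on speedup with n processors and serial fraction s."""
    return n / (s * n + 1 - s)

if __name__ == "__main__":
    for n in (4, 16, 64, 1024):
        # Even a 5% serial fraction caps speedup near 1/s = 20 as n grows.
        print(n, round(amdahl_speedup(n, 0.05), 2))
```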
2011
For more than thirty years, the parallel programming community has used the dependence graph as the main abstraction for reasoning about and exploiting parallelism in "regular" algorithms that use dense arrays, such as finite-differences and FFTs. In this paper, we argue that the dependence graph is not a suitable abstraction for algorithms in new application areas like machine learning and network analysis in which the key data structures are "irregular" data structures like graphs, trees, and sets.
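As a hypothetical illustration of that contrast (the names and data below are ours, not the paper's): for a regular stencil the dependences follow from the loop bounds alone, whereas for a worklist-style graph algorithm the "dependences" are the edges actually touched, which are known only once the input graph is available.

```python
# Regular vs. irregular: static dependences for a 1-D stencil, run-time
# dependences for breadth-first search on an arbitrary graph.
from collections import deque

def stencil_dependences(n):
    """Dependences of a[i] = f(a[i-1], a[i+1]) are known from n alone."""
    return [(i, i - 1) for i in range(1, n)] + [(i, i + 1) for i in range(n - 1)]

def bfs_touched_edges(adj, source):
    """For BFS, the work and its ordering depend on the input graph itself."""
    seen, touched, queue = {source}, [], deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            touched.append((u, v))  # only discoverable at run time
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return touched

if __name__ == "__main__":
    print(stencil_dependences(4))
    print(bfs_touched_edges({0: [1, 2], 1: [2], 2: []}, 0))
```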
Texts in Computational Science and Engineering, 2010
1991
In recent years, powerful theoretical techniques have been developed for supporting communication, synchronization and fault tolerance in general purpose parallel computing. The proposition of this thesis is that different techniques should be used to support different algorithms. The determining factor is granularity, or the extent to which an algorithm uses long blocks for communication between processors. We consider the Block PRAM model of Aggarwal, Chandra and Snir, a synchronous model of parallel computation in which the processors communicate by accessing a shared memory. In the Block PRAM model, there is a time cost for each access by a processor to a block of locations in the shared memory. This feature of the model encourages the use of long blocks for communication. In the thesis we present Block PRAM algorithms and lower bounds for specific problems on arrays, lists, expression trees, graphs, strings, binary trees and butterflies. These results introduce useful basic techniques for parallel computation in practice, and provide a classification of problems and algorithms according to their granularity. Also presented are optimal algorithms for universal hashing and skewing, which are techniques for supporting conflict-free memory access in general- and special-purpose parallel computations, respectively. We explore the Block PRAM model as a theoretical basis for the design of scalable general purpose parallel computers. Several simulation results are presented which show the Block PRAM model to be comparable to, and competitive with, other models that have been proposed for this role. Two major advantages of machines based on the Block PRAM model are that they preserve the granularity properties of individual algorithms and can efficiently incorporate a significant degree of fault tolerance. The thesis also discusses methods for the design of algorithms that do not use synchronization. We apply these methods to define fast circuits for several fundamental Boolean functions.
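A minimal sketch of the access cost the abstract describes, assuming the usual formulation of the Block PRAM in which touching a block of b consecutive shared-memory cells costs roughly l + b time for a latency parameter l (the parameter name and the numbers below are ours):

```python
# Assumed Block PRAM cost model: one access to a block of length b costs l + b.

def access_cost(block_lengths, latency):
    """Total cost of a sequence of block accesses of the given lengths."""
    return sum(latency + b for b in block_lengths)

if __name__ == "__main__":
    l = 32
    # Moving 1024 words as one long block vs. 1024 single-word accesses:
    print(access_cost([1024], l))      # 32 + 1024 = 1056
    print(access_cost([1] * 1024, l))  # 1024 * (32 + 1) = 33792
    # The gap is why the model rewards coarse-grained (long-block) algorithms.
```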
2004
The design of parallel programs requires solutions that do not arise in sequential programming, so a designer of parallel applications must ensure the correct behavior of all the processes that the program comprises. There are different solutions to each problem, but the question is to find one that is general. One possibility is to allow the use of asynchronous groups of processors. We present a general methodology for deriving efficient parallel divide-and-conquer algorithms. Algorithms in this class allow arbitrary division of the processor subsets, making it easier for the underlying software to divide the network into independent subnetworks and minimizing the impact of traffic in the rest of the network on the predicted cost. This methodology is defined by the OTMP model, and its expressiveness is exemplified through three divide-and-conquer programs.
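As an illustrative skeleton only (this is not the OTMP model itself, and all names are ours): a divide-and-conquer driver in which each recursive call is handed a disjoint slice of the processor set, the property the abstract credits with letting the runtime split the network into independent subnetworks.

```python
# Illustrative divide-and-conquer skeleton over an explicit processor subset.

def dc(problem, procs, divide, conquer, combine, threshold=1):
    """Solve `problem` using the processor subset `procs`; each recursive call
    receives a disjoint slice of `procs`, so the halves are independent."""
    if len(procs) <= threshold or len(problem) <= 1:
        return conquer(problem)
    left, right = divide(problem)
    mid = len(procs) // 2  # arbitrary split of the processor subset
    a = dc(left,  procs[:mid], divide, conquer, combine, threshold)
    b = dc(right, procs[mid:], divide, conquer, combine, threshold)
    return combine(a, b)

if __name__ == "__main__":
    # Mergesort as the classic instance, run here sequentially for clarity.
    merge = lambda a, b: sorted(a + b)
    halve = lambda xs: (xs[:len(xs) // 2], xs[len(xs) // 2:])
    print(dc([5, 2, 9, 1, 7, 3], list(range(4)), halve, sorted, merge))
```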
IEEE Transactions on Software Engineering, 1981
Lafayette, IN, a position he has held since 1976. His research interests include algorithms for data storage and retrieval, programming languages, and software engineering. Dr. Comer is a member of the Association for Computing Machinery and Sigma Xi.
… , University of California at Berkeley, Technical Report No. …
Theoretical Computer Science, 1990
Symposium on the Theory of Computing, 1982
Parallel Computing - Fundamentals and Applications - Proceedings of the International Conference ParCo99, 2000
Theory of Computing Systems / Mathematical Systems Theory, 1999
Computers in Physics, 1989
Journal of Parallel and Distributed Computing, 1992
Lecture Notes in Computer Science, 2002
21st Annual Symposium on Foundations of Computer Science (sfcs 1980), 1980