A revised version, which will appear in a volume published by the IEEE Computer Society Press, appears as Technical Report Number C.Sc. 93-17, available via anonymous ftp from cs.umr.edu.
This book grew out of lecture notes for a course on parallel algorithms that I gave at Drexel University over a period of several years. I was frustrated by the lack of texts with the focus I wanted. Although the book also addresses some architectural issues, the main focus is on the development of parallel algorithms on "massively parallel" computers. This book could be used in several versions of a course on Parallel Algorithms. We tend to focus on SIMD parallel algorithms in several general areas of application.
Journal of Communications and Networks, 2014
The high intensity of research and modeling in the fields of mathematics, physics, biology, and chemistry requires new computing resources. Because of the high computational complexity of such tasks, computing time is long and costly. The most efficient way to increase performance is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing, with emphasis on the analysis of parallel systems and on the impact of communication delays on their efficiency and overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation (Gaussian elimination method, GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), distributed memory (message passing interface, MPI), and for their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.
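As a rough illustration of the shared-memory variant, the forward-elimination loop of GEM parallelizes naturally over rows. The sketch below is my own minimal rendering (no pivoting, not the authors' code): row updates below the pivot are independent and are distributed with an OpenMP parallel for; in a distributed-memory version, the implicit barrier would become an MPI_Bcast of the pivot row.

```c
/* Minimal sketch of shared-memory Gaussian elimination (GEM), assuming
 * a dense n x n matrix stored row-major in a[] with right-hand side b[].
 * No pivoting, for brevity; illustrative only. Compile with -fopenmp. */
#include <omp.h>

void gem_forward(double *a, double *b, int n)
{
    for (int k = 0; k < n - 1; k++) {              /* pivot column */
        /* rows below the pivot are independent: update them in parallel */
        #pragma omp parallel for schedule(static)
        for (int i = k + 1; i < n; i++) {
            double m = a[i * n + k] / a[k * n + k]; /* elimination multiplier */
            for (int j = k; j < n; j++)
                a[i * n + j] -= m * a[k * n + j];
            b[i] -= m * b[k];
        }
        /* the implicit barrier here is the synchronization cost analyzed
         * in the paper; the MPI variant instead broadcasts the pivot row */
    }
}
```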
Theoretical Computer Science, 1990
Abstract. This paper outlines a theory of parallel algorithms that emphasizes two crucial aspects of parallel computation: speedup, the improvement in running time due to parallelism, and efficiency, the ratio of work done by a parallel algorithm to the work done by a sequential algorithm. We define six classes of algorithms in these terms; of particular interest is the class EP, of algorithms that achieve a polynomial speedup with constant efficiency. The relations between these classes are examined. We investigate the robustness of these classes across various models of parallel computation. To do so, we examine simulations across models where the simulating machine may be smaller than the simulated machine. These simulations are analyzed with respect to their efficiency and to the reduction in the number of processors. We show that a large number of parallel computation models are related via efficient simulations, if a polynomial reduction of the number of processors is allowed. This implies that the class EP is invariant across all these models. Many open problems motivated by our approach are listed.

1. Introduction. As parallel computers become increasingly available, a theory of parallel algorithms is needed to guide the design of algorithms for such machines. To be useful, such a theory must address two major concerns in parallel computation, namely speedup and efficiency. It should classify algorithms and problems into a few, meaningful classes that are, to the largest extent possible, model independent. This paper outlines an approach to the analysis of parallel algorithms that we feel answers these concerns without sacrificing too much generality or abstractness. We propose a classification of parallel algorithms in terms of parallel running time and inefficiency, which is the extra amount of work done by a parallel algorithm as compared to a sequential algorithm. Both running time and inefficiency are measured as a function of the sequential running time, which is used as a yardstick. * A preliminary version of this paper was presented at the 15th International Colloquium on Automata,
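In the notation the abstract suggests (the definitions below are my reconstruction, with T_1(n) the sequential running time and T_p(n) the running time on p processors):

```latex
S_p(n) \;=\; \frac{T_1(n)}{T_p(n)} \quad\text{(speedup)},
\qquad
E_p(n) \;=\; \frac{T_1(n)}{p\,T_p(n)} \quad\text{(efficiency)},
\qquad
\mathrm{EP} \;=\; \bigl\{\,\text{algorithms with } S_p(n) \ge T_1(n)^{\varepsilon}
\text{ for some } \varepsilon > 0 \text{ and } E_p(n) = \Theta(1)\,\bigr\}.
```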
2018
Traditional parallel data-processing algorithms partition the information to be processed into parts (according to some principle: the number of cores, processing speed, etc.) and then process each part separately on a different core (or processor). Partitioning into parts takes quite a long time and is not always an optimal solution: it is impossible to reduce the idle time of cores to a minimum, and it is not always possible to find the optimal partition (or to change the algorithm during execution). The algorithms we propose follow the main principle that processing by the cores should be performed in parallel while keeping core idle time to a minimum. The article reviews two algorithms that work according to this principle: "smart-delay" and a development of the transposed ribbon-like matrix multiplication algorithm.
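The idle-core problem the authors describe is commonly attacked by handing out work dynamically rather than pre-partitioning it. The sketch below is my illustration of that general principle, not the paper's "smart-delay" algorithm: with OpenMP's dynamic schedule, a core that finishes early immediately claims the next chunk instead of standing idle.

```c
/* Dynamic work distribution sketch (my illustration, not the paper's
 * algorithm): items are claimed on demand in chunks of 16, so uneven
 * per-item costs do not leave cores idle waiting for a fixed partition. */
#include <omp.h>

void process_items(double *items, int n, void (*work)(double *))
{
    #pragma omp parallel for schedule(dynamic, 16)
    for (int i = 0; i < n; i++)
        work(&items[i]);
}
```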
Journal of Parallel and Distributed Computing, 1993
There are several metrics that characterize the performance of a parallel system, such as parallel execution time, speedup, and efficiency. A number of properties of these metrics have been studied. For example, it is a well-known fact that, given a parallel architecture and a problem of a fixed size, the speedup of a parallel algorithm does not continue to increase with an increasing number of processors. It usually tends to saturate or peak at a certain limit. Thus it may not be useful to employ more than an optimal number of processors for solving a problem on a parallel computer. This optimal number of processors depends on the problem size, the parallel algorithm, and the parallel architecture. In this paper we study the impact of parallel processing overheads and the degree of concurrency of a parallel algorithm on the optimal number of processors to be used when the criterion for optimality is minimizing the parallel execution time. We then study a more general criterion of optimality and show how operating at the optimal point is equivalent to operating at a unique value of efficiency which is characteristic of the criterion of optimality and the properties of the parallel system under study. We put the technical results derived in this paper in perspective with similar results that have appeared in the literature before and show how this paper generalizes and/or extends these earlier results.
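A standard way to make the saturation effect concrete (my formulation, not taken from the paper): write the parallel time for work W with an explicit overhead term T_o(p) and minimize over p. For linear overhead T_o(p) = cp, the time-optimal processor count corresponds to a fixed efficiency of 1/2, illustrating the paper's point that an optimality criterion pins down a characteristic efficiency value.

```latex
T_p = \frac{W}{p} + T_o(p), \qquad
\frac{dT_p}{dp} = 0 \;\Longrightarrow\; \frac{W}{p^2} = T_o'(p);
\quad\text{for } T_o(p) = c\,p:\;\;
p^\ast = \sqrt{W/c},\;\;
T_{p^\ast} = 2\sqrt{Wc},\;\;
E(p^\ast) = \frac{W}{p^\ast \, T_{p^\ast}} = \frac{1}{2}.
```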
2016
In this paper, we provide a qualitative and quantitative analysis of the performance of parallel algorithms on modern multi-core hardware. We present a comparative study of the performance of algorithms (traditionally perceived as sequential in nature) in a parallel environment, using the Message Passing Interface (MPI), based on Amdahl's Law. First, we study sorting algorithms. Sorting is a fundamental problem in computer science, and one where there is a limit on the efficiency of existing algorithms. In theory it contains a large amount of parallelism, and it should not be difficult to accelerate the sorting of very large datasets on modern architectures. Unfortunately, most serial sorting algorithms do not lend themselves to easy parallelization, especially in a distributed memory system such as we might use with MPI. While initial results show a promising speedup for sorting algorithms, owing to inter-process communication latency we see a slower overall run-time with incr...
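The Amdahl bound the study builds on: if a fraction f of the work is parallelizable and p processes are used, the speedup is capped regardless of p, which is why communication latency can erase the expected gains for distributed sorting:

```latex
S(p) \;=\; \frac{1}{(1 - f) + f/p} \;\le\; \frac{1}{1 - f}.
```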
Undergraduate Topics in Computer Science, 2018
Undergraduate Topics in Computer Science (UTiCS) delivers high-quality instructional content for undergraduates studying in all areas of computing and information science. From core foundational and theoretical material to final-year topics and applications, UTiCS books take a fresh, concise, and modern approach and are ideal for self-study or for a one- or two-semester course. The texts are all authored by established experts in their fields, reviewed by an international advisory board, and contain numerous examples and problems. Many include fully worked solutions.
2006
We present a new parallel computation model called the Parallel Resource-Optimal (PRO) computation model. PRO is a framework being proposed to enable the design of efficient and scalable parallel algorithms in an architecture-independent manner, and to simplify the analysis of such algorithms. A focus on three key features distinguishes PRO from existing parallel computation models. First, the design and analysis of a parallel algorithm in the PRO model is performed relative to the time and space complexity of a sequential reference algorithm. Second, a PRO algorithm is required to be both time- and space-optimal relative to the reference sequential algorithm. Third, the quality of a PRO algorithm is measured by the maximum number of processors that can be employed while optimality is maintained. Inspired by the Bulk Synchronous Parallel model, an algorithm in the PRO model is organized as a sequence of supersteps. Each superstep consists of distinct computation and communication phases, but the supersteps are not required to be separated by synchronization barriers. Both computation and communication costs are accounted for in the runtime analysis of a PRO algorithm. Experimental results on parallel algorithms designed using the PRO model, and implemented using its accompanying programming environment SSCRAP, demonstrate that the model indeed delivers efficient and scalable implementations on a wide range of platforms.
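The superstep structure described here can be sketched as a loop alternating a local computation phase and a communication phase. The skeleton below is a hypothetical MPI rendering of that structure (not SSCRAP code): each superstep computes locally, then exchanges a boundary value on a ring, with no explicit global barrier between supersteps.

```c
/* Hypothetical BSP/PRO-style superstep skeleton (my illustration, not
 * SSCRAP code): local computation phase, then a communication phase
 * exchanging one value with ring neighbors. */
#include <mpi.h>

void run_supersteps(MPI_Comm comm, int nsteps)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    double local = (double)rank, incoming = 0.0;

    for (int s = 0; s < nsteps; s++) {
        local = local * 0.5 + 1.0;                    /* computation phase (stand-in) */

        int right = (rank + 1) % size;
        int left  = (rank + size - 1) % size;
        MPI_Sendrecv(&local, 1, MPI_DOUBLE, right, 0, /* communication phase */
                     &incoming, 1, MPI_DOUBLE, left, 0,
                     comm, MPI_STATUS_IGNORE);
        local += incoming;  /* both phases are charged to the superstep's
                               runtime; no barrier separates supersteps */
    }
}
```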
Evaluating how well a whole system or a set of subsystems performs is one of the primary objectives of performance testing. Performance assessment tells us whether the architecture implementation meets the design objectives. Performance evaluations of several parallel algorithms are compared in this study. Both theoretical and experimental methods are used in performance assessment as a subdiscipline of computer science. The parallel method outperforms its sequential counterpart in terms of throughput. The parallel algorithm's performance (speedup) is examined, as shown in the results.
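Speedup in such experimental studies is typically measured as the ratio of sequential to parallel wall-clock time. A minimal timing harness might look like the following (my sketch, using a dummy workload in place of the study's algorithms):

```c
/* Minimal speedup-measurement sketch (my illustration): time the same
 * dummy workload serially and with OpenMP, then report S = T_seq/T_par. */
#include <stdio.h>
#include <omp.h>

#define N 100000000L

static double work_serial(void)
{
    double s = 0.0;
    for (long i = 0; i < N; i++) s += 1.0 / (double)(i + 1);
    return s;
}

static double work_parallel(void)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+ : s)
    for (long i = 0; i < N; i++) s += 1.0 / (double)(i + 1);
    return s;
}

int main(void)
{
    double t0 = omp_get_wtime();
    double r1 = work_serial();
    double t_seq = omp_get_wtime() - t0;

    t0 = omp_get_wtime();
    double r2 = work_parallel();
    double t_par = omp_get_wtime() - t0;

    printf("results %.6f %.6f, speedup = %.2f\n", r1, r2, t_seq / t_par);
    return 0;
}
```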
Siam Journal on Computing, 1989
Abstract. Techniques for parallel divide-and-conquer are presented, resulting in improved parallel algorithms for a number of problems. The problems for which improved algorithms are given include segment intersection detection, trapezoidal decomposition, and planar point location. Efficient parallel algorithms are also given for fractional cascading, three-dimensional maxima, two-set dominance counting, and visibility from a point. All of the algorithms presented run in O(log n) time with either a linear or a sublinear number of processors in the CREW PRAM model. Key words. parallel algorithms, parallel data structures, divide-and-conquer, computational geometry, fractional cascading, visibility, planar point location, trapezoidal decomposition, dominance, intersection detection. AMS(MOS) subject classifications. 68E05, 68C05, 68C15. 1. Introduction. This paper presents a number of general techniques for parallel divide-and-conquer. These techniques are based on nontrivial generalizations of Cole's recent parallel merge sort result [13] and enable us to achieve improved complexity bounds for a large number of problems. In particular, our techniques can be applied
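As a small shared-memory illustration of the parallel divide-and-conquer pattern (my sketch in OpenMP tasks; the paper's PRAM techniques, built on Cole-style parallel merging, are far more refined), a task-parallel mergesort recurses on independent halves in parallel and merges sequentially:

```c
/* Task-parallel mergesort: the two recursive halves are independent and
 * run as OpenMP tasks; the merge here is sequential (parallel merging is
 * the hard part the paper's techniques address). Compile with -fopenmp. */
#include <stdlib.h>
#include <string.h>

static void merge(int *a, int *tmp, int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof(int));
}

static void msort(int *a, int *tmp, int lo, int hi)
{
    if (hi - lo < 2048) {                  /* small cutoff: sort serially */
        for (int i = lo + 1; i < hi; i++)
            for (int j = i; j > lo && a[j - 1] > a[j]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
        return;
    }
    int mid = lo + (hi - lo) / 2;
    #pragma omp task shared(a, tmp)        /* conquer halves in parallel */
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    #pragma omp taskwait
    merge(a, tmp, lo, mid, hi);
}

void parallel_mergesort(int *a, int n)
{
    int *tmp = malloc((size_t)n * sizeof(int));
    #pragma omp parallel
    #pragma omp single                     /* one task tree, many workers */
    msort(a, tmp, 0, n);
    free(tmp);
}
```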
Lecture Notes in Computer Science, 2012
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
IEEE Transactions on Software Engineering, 1981
Lafayette, IN, a position he has held since 1976. His research interests include algorithms for data storage and retrieval, programming languages, and software engineering. Dr. Comer is a member of the Association for Computing Machinery and Sigma Xi.
1997
This book is a must for anyone interested in entering the fascinating new world of parallel optimization using parallel processors, computers capable of doing an enormous number of complex operations in a nanosecond. The authors are among the pioneers of this fascinating new world, and they tell us what new applications they explored, what algorithms appear to work best, how parallel processors differ in their design, and what the comparative results were using different types of algorithms on different types of parallel processors to solve them. According to an old adage, the whole can sometimes be much more than the sum of its parts. I am thoroughly in agreement with the authors' belief in the added value of bringing together Applications, Mathematical Algorithms and Parallel Computing techniques. This is exactly what they found true in their own research and report on in the book. Many years ago, I, too, experienced the thrill of combining three diverse disciplines: the Application (in my case Linear Programs), the Solution Algorithm (the Simplex Method), and the then New Tool (the Serial Computer). The union of the three made possible the optimization of many real-world problems. Parallel processors are the new generation, and they have the power to tackle applications which require solution in real time, or have model parameters which are not known with certainty, or have a vast number of variables and constraints. Image restoration, tomography, radiation therapy, finance, industrial planning, transportation and economics are the sources for many of the interesting practical problems used by the authors to test the methodology.
1995
Description/Abstract The best enterprises have both a compelling need pulling them forward and an innovative technological solution pushing them on. In high-performance computing, we have the need for increased computational power in many applications and the inevitable long-term solution is massive parallelism. In the short term, the relation between pull and push may seem unclear as novel algorithms and software are needed to support parallel computing.
Position Papers of the 2016 Federated Conference on Computer Science and Information Systems, 2016
Verifying the correctness of parallel algorithms is not trivial, and it is usually omitted in works from the parallel computation field. In this paper, we discuss in detail how to show that a certain parallel algorithm is correct. This process involves proving its safety and liveness. We perform an in-depth analysis of our parallel guided ejection search (P-GES) for the pickup and delivery problem with time windows, which serves as an excellent case study. P-GES was implemented as a distributed algorithm using the Message Passing Interface library with asynchronous communications, and was validated using the well-known Li and Lim benchmark containing demanding test instances. We have already proved the efficacy of this algorithm and shown that it can retrieve very high-quality routing schedules (quite often better than the world's best at that time).
Lecture Notes in Computer Science, 2002
The emerging discipline of algorithm engineering has primarily focused on transforming pencil-and-paper sequential algorithms into robust, efficient, well tested, and easily used implementations. As parallel computing becomes ubiquitous, we need to extend algorithm engineering techniques to parallel computation. Such an extension adds significant complications. After a short review of algorithm engineering achievements for sequential computing, we review the various complications caused by parallel computing, present some examples of successful efforts, and give a personal view of possible future research.
Distributed and Parallel Databases - DPD, 1997
A parallel scheme using the divide-and-conquer method is developed. This partitions the input set of a problem into subsets, computes a partial result from each subset, and finally employs a merging function to obtain the final answer. Based on a linear recursive program as a tool for formalism, a precise characterization for problems to be parallelized by the divide-and-conquer method is obtained. The performance of the parallel scheme is analyzed, and a necessary and sufficient condition to achieve linear speedup is obtained. The parallel scheme is generalized to include parameters, and a real application, the fuzzy join problem, is discussed in detail using the generalized scheme.
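The scheme can be phrased generically: partition the input, compute a partial result per subset independently, then merge. The sketch below is my own minimal shared-memory rendering, specialized to summation for concreteness (the paper formalizes the scheme with linear recursive programs and applies it to the fuzzy join problem); linear speedup hinges on the merge step being cheap relative to the partial computations, which holds here since the merge is a single associative addition per subset.

```c
/* Generic divide-and-conquer scheme from the abstract, specialized to
 * summation: partition the input into nparts subsets, compute a partial
 * result per subset in parallel, then merge the partial results. */
#include <omp.h>

double dc_scheme(const double *x, int n, int nparts)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (int p = 0; p < nparts; p++) {
        int lo = (int)((long long)n * p / nparts);        /* subset bounds */
        int hi = (int)((long long)n * (p + 1) / nparts);
        double partial = 0.0;               /* partial result from subset p */
        for (int i = lo; i < hi; i++)
            partial += x[i];
        total += partial;                   /* merge step (associative +) */
    }
    return total;
}
```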