Matrix Multiplication
Algorithms
U.A.Nuli
quential Matrix Multiplication Algorithm
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
2D Mesh with Wraparound Connections
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
Matrix Multiplication on 2D SIMD Mesh
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
n the Hypercube SIMD model with n3 = 23q Processors, Two nxn matrix multiplica
be carried out in (log n) time
processing elements can be thought of as filling an nxnxn lattice
essor Pm where 0 m 23q -1, has local memory locations a,b,c,s,t
ix Elements a(i, j) and b(i,j) are stored I variable a,b of processor P(2 qi+j)
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
MATRIX MULTIPLICATION (HYPERCUBE
SIMD)
Parameter: q {Matrix size is 2 q 2 q}
Glogal: l Local: a, b, c, s,t
begin
{ Phase 1: Broadcast matrices A and
B}
for l 3q 1 downto 2q do
for all Pm, where BIT(m, l) = 1 do
t BIT COMPLEMENT(m, l)
a [t]a
b [t]b
endfor
endfor
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
for l q 1 downto 0 do
for all Pm, where BIT(m, l) != BIT(m, 2q +
l) do
t BIT COMPLEMENT(m, l)
a [t]a
endfor
endfor
I =0 for q=1
Proce
ssor
ID
BIT(ID, BIT(ID, t
0)
2)
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
P6
P4
P7
P5
P2
P3
P0
P1
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
for l 2q 1 downto q do
for all Pm, where BIT(m, l) != BIT(m, q +
l) do
t BIT COMPLEMENT(m, l)
b [t]b
endfor
endfor
I =1 for q=1
q+l=2
Proce
ssor
ID
BIT(ID, BIT(ID, t
1)
2)
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
P6
P4
P7
P5
P2
P3
P0
P1
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
{ Phase 2: Do the multiplications in parallel
}
for all Pm do
cab
Endfor
{ Phase 3: Sum the products }
for l 2q to 3q 1 do
for all Pm do
t BIT COMPLEMENT(m, l)
s [t]c
cc+s
endfor
endfor
end
MATRIX MULTIPLICATION (HYPERCUBE SIMD)
trix Multiplication Algorithm for UMA Multiprocessor
trix Multiplication Algorithm for UMA Multiprocessor
Total Processors = P
Matrix Size = N*N
Processor P0 (m=0) works on Row no = 0,0+P, 0+2P, ... <=N
Processor P1 (m=1) works on Row no = 1,1+P, 1+2P, <=N
Processor P2 (m=2) works on Row no = 2,2+P, 0+2P, <=N
trix Multiplication Algorithm for UMA Multiprocessor
Matrix Multiplication Algorithm for Multicomputers
Row-Column-Oriented Algorithm
Block-Oriented Algorithm
Row-Column-Oriented Algorithm
Row-Column-Oriented Algorithm
Row-Column-Oriented Algorithm
Row-Column-Oriented Algorithm
Row-Column-Oriented Algorithm
Row-Column-Oriented Algorithm