Aca UNIT-5

The document discusses multi-vector and SIMD computers, detailing their architectures and processing principles. It explains SIMD array processors, their organization, and the types of vector processing principles, including gather and scatter instructions. Additionally, it covers the architectures of the CM-2 and CM-5 Connection Machines, highlighting their processing nodes, networks, and operational paradigms.

Uploaded by

sarah .s

UNIT-5

1Q. Discuss the Multivector and SIMD Computers.

Ans: MULTIVECTOR COMPUTERS
Vector supercomputer (same as the Unit-1 answer)
SIMD COMPUTERS
1. SIMD array processors:
o A synchronous array of parallel processors is called an array processor. It
consists of N identical processing elements (PEs) under the supervision of a
single control unit (CU). The control unit is itself a computer with high-speed
registers, local memory, and an arithmetic logic unit.
o An array processor is basically a single-instruction, multiple-data (SIMD)
computer. There are N data streams, one per processor, so each processor can
operate on different data. The figure below shows a typical SIMD or array
processor.

o The memory consists of a number of memory modules, which can be either
global or dedicated to each processor. The main memory is thus the aggregate of
the memory modules. The processing elements and memory modules
communicate with each other through an interconnection network. SIMD
processors are designed specifically for vector computations.
2. SIMD COMPUTER ORGANIZATIONS:
Vector processing can also be carried out by SIMD computers.
Most SIMD computers use a single control unit and distributed memories, except for a
few that use associative memories.
Based on memory distribution and addressing schemes SIMD computer models are
divided into:
a. Distributed-Memory SIMD model
b. Shared-Memory SIMD model
a. Distributed-Memory Model:
o Spatial parallelism is exploited among the PEs in a SIMD computer. A
distributed-memory SIMD computer consists of an array of PEs, all
controlled by the same array control unit, as shown in Fig. a. Programs and data
are loaded into the control memory through the host computer.
o An instruction is sent to the control unit for decoding. If it is a scalar or program
control operation, it is executed directly by a scalar processor attached to
the control unit. If the decoded instruction is a vector operation, it is
broadcast to all the PEs for parallel execution.
o Partitioned data sets are distributed to the local memories attached to the
PEs through a vector data bus.
o The PEs are interconnected by a data-routing network, which performs inter-PE
data communications such as shifting, permutation, and other routing
operations. The data-routing network is under program control through the
control unit.
o The PEs are synchronized in hardware by the control unit: all PEs execute the
same instruction in the same cycle.
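
The dispatch rule described above (scalar operations run at the control unit, vector operations are broadcast to every PE) can be illustrated with a minimal Python sketch. The class and method names (`PE`, `ControlUnit`, `load`, `scalar`, `vector`) are hypothetical and chosen only for illustration; the loop over PEs stands in for what the hardware does in a single lockstep cycle.

```python
from typing import Callable, Dict, List


class PE:
    """One processing element with its own local memory (name -> value)."""

    def __init__(self) -> None:
        self.mem: Dict[str, float] = {}


class ControlUnit:
    """Hypothetical array control unit, for illustration only."""

    def __init__(self, n_pes: int) -> None:
        self.pes: List[PE] = [PE() for _ in range(n_pes)]
        self.scalar_regs: Dict[str, float] = {}

    def load(self, name: str, data: List[float]) -> None:
        # Distribute a partitioned data set: one element per local memory.
        assert len(data) == len(self.pes)
        for pe, x in zip(self.pes, data):
            pe.mem[name] = x

    def scalar(self, dst: str, value: float) -> None:
        # Scalar instruction: executed at the control unit, not by the PEs.
        self.scalar_regs[dst] = value

    def vector(self, dst: str, op: Callable[..., float], *srcs: str) -> None:
        # Vector instruction: broadcast, so every PE applies the same
        # operation to the operands held in its own local memory.
        for pe in self.pes:
            pe.mem[dst] = op(*(pe.mem[s] for s in srcs))

    def read(self, name: str) -> List[float]:
        return [pe.mem[name] for pe in self.pes]


cu = ControlUnit(4)
cu.load("a", [1, 2, 3, 4])
cu.load("b", [10, 20, 30, 40])
cu.scalar("k", 2)
k = cu.scalar_regs["k"]
cu.vector("c", lambda a, b: a + k * b, "a", "b")
result = cu.read("c")
```

Here `result` holds one element per PE, each computed from that PE's own `a` and `b`.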

b. Shared-Memory Model:
o In the figure, we show a variation of the SIMD computer that uses shared memory
among the PEs.
o An alignment network is used as the inter-PE memory communication
network. Again, this network is controlled by the control unit.
o The architecture shown has n = 16 PEs updating m = 17 shared-memory modules
through a 16 x 17 alignment network.
o The alignment network must be properly set to avoid access conflicts.
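
One way to see why 17 modules suit 16 PEs is sketched below. The sketch assumes a simple interleaved mapping, module = address mod m, which the notes do not state. Under that assumption, 16 PEs reading a vector with any stride that is not a multiple of 17 touch 16 different modules, because 17 is prime. The function names are hypothetical.

```python
N_PES = 16
M_MODULES = 17


def modules_touched(base: int, stride: int, m: int = M_MODULES) -> list:
    """Module hit by each PE when PE i accesses address base + i * stride.

    Assumes address-to-module mapping 'address mod m' (an illustration only).
    """
    return [(base + pe * stride) % m for pe in range(N_PES)]


def conflict_free(base: int, stride: int, m: int = M_MODULES) -> bool:
    """True when no two PEs address the same memory module."""
    mods = modules_touched(base, stride, m)
    return len(set(mods)) == len(mods)


# With 17 modules, strides 1..16 are all conflict-free.
prime_ok = all(conflict_free(0, s) for s in range(1, 17))

# With 16 modules instead, an even stride already collides.
sixteen_stride2_ok = conflict_free(0, 2, m=16)
```

The comparison with `m=16` shows the design point: a module count that shares a factor with the stride forces several PEs onto the same module.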
SIMD Instruction: SIMD computers execute vector instructions for arithmetic, logic,
data-routing, and masking operations over vector quantities. In bit-slice SIMD machines, the
vectors are simply binary vectors. In word-parallel SIMD machines, the vector
components are 4- or 8-byte numerical values.
All SIMD instructions must use vector operands of equal length n, where n is the number of
PEs. SIMD instructions are similar to those used in pipelined vector processors, except that
temporal parallelism in pipelines is replaced by spatial parallelism in multiple PEs.
The data-routing instructions include permutations, broadcasts, multicasts, and various
rotate and shift operations. Masking operations are used to enable or disable a subset of PEs
in any instruction cycle.
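
A minimal Python sketch of masking and two data-routing operations follows. Each list position stands for one PE, and the helper names (`masked_add`, `rotate`, `broadcast`) are hypothetical, not instructions of any particular machine.

```python
def masked_add(a, b, mask):
    """Enabled PEs (mask bit 1) compute a[i] + b[i]; disabled PEs keep a[i]."""
    return [x + y if m else x for x, y, m in zip(a, b, mask)]


def rotate(v, k):
    """Data routing: PE i receives the value held by PE (i - k) mod n."""
    n = len(v)
    return [v[(i - k) % n] for i in range(n)]


def broadcast(v, src):
    """Data routing: every PE receives the value held by PE 'src'."""
    return [v[src]] * len(v)


a = [1, 2, 3, 4]
b = [10, 10, 10, 10]
summed = masked_add(a, b, mask=[1, 0, 1, 0])
rotated = rotate(a, 1)
copied = broadcast(a, 2)
```

In `summed`, only PEs 0 and 2 are updated, which is how a mask restricts one instruction to a subset of the array.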
Host and I/O: All I/O activities are handled by the host computer in the above SIMD
organizations. A special control memory sits between the host and the array control unit.
It is a staging memory for holding programs and data.
Partitioned data sets are distributed to the local memories or to the shared-memory modules
before program execution starts. The host manages the mass storage and the graphics
display of computational results. The scalar processor operates concurrently with the PE
array under the coordination of the control unit.

2Q. Discuss the Vector processing principles.

Ans: VECTOR PROCESSING PRINCIPLES:
1. Vector processor:
o A vector processor is a central processing unit that can
operate on an entire vector operand with a single instruction.
o It is a complete unit of hardware resources that processes a sequence of
similar data items in memory using a single instruction. Such instructions
are called single-instruction, multiple-data (SIMD) or vector instructions.

o The functional units of a vector computer are as follows:

• IPU or instruction processing unit
• Vector register
• Scalar register
• Scalar processor
• Vector instruction controller
• Vector access controller
• Vector processor
o A vector is an ordered, one-dimensional array of data items. A
vector V of length n can be represented as a row vector V = [V1 V2 V3 · · · Vn].
o Usually, the vector elements are stored with a fixed addressing increment
between successive elements, called the stride.
o A processor with multiple ALUs can operate on multiple data
elements in parallel using a single instruction. Such instructions are called
single-instruction, multiple-data (SIMD) instructions, or vector
instructions.
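
As a small illustration of stride, the Python sketch below reads a row and a column from a matrix stored row by row in a flat memory. The function name `load_vector` and the 3 x 4 layout are assumptions made for the example.

```python
def load_vector(memory, base, stride, length):
    """Read 'length' elements starting at 'base', 'stride' addresses apart."""
    return [memory[base + i * stride] for i in range(length)]


# A 3 x 4 matrix stored row by row: element (r, c) lives at address r * 4 + c.
memory = list(range(12))

row1 = load_vector(memory, base=4, stride=1, length=4)  # row 1: unit stride
col2 = load_vector(memory, base=2, stride=4, length=3)  # column 2: stride 4
```

The row is a unit-stride vector, while the column is the same kind of vector with a stride equal to the row length.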

2. Vector Instruction Types: (Spectrum pg4.14)

Gather and scatter instruction diagram:
Gather is an operation that fetches from memory the nonzero elements of a sparse
vector using indices that themselves are indexed. Scatter does the opposite, storing
a vector into memory as a sparse vector whose nonzero entries are indexed. The vector
register V1 contains the data, and the vector register V0 is used as an index to gather
or scatter data from or to random memory locations, as illustrated in Figs. a and b,
respectively.
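
The two operations can be sketched in Python as follows. V0 and V1 follow the roles given above. The base address (called `a0` here) and the sample memory contents are assumptions introduced for the example.

```python
def gather(memory, a0, v0):
    """Return V1: for each index in V0, fetch memory[a0 + index]."""
    return [memory[a0 + idx] for idx in v0]


def scatter(memory, a0, v0, v1):
    """Store each V1 element at memory[a0 + corresponding V0 index]."""
    for idx, val in zip(v0, v1):
        memory[a0 + idx] = val


# Sparse vector starting at address 2; nonzeros sit at offsets 0, 2, 5, 7.
sparse = [0, 0, 600, 0, 400, 0, 0, 250, 0, 200]
v0 = [0, 2, 5, 7]

v1 = gather(sparse, 2, v0)      # dense register holding only the nonzeros

rebuilt = [0] * len(sparse)
scatter(rebuilt, 2, v0, v1)     # write them back to their sparse positions
```

Gathering and then scattering with the same index vector reproduces the original sparse layout.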
Masking Instruction diagram:

3. Vector-access memory schemes or organizations (spectrum pg 4.15 Q16)

3Q. Explain in detail CM-2 Architecture.

Ans : CM-2:
The Connection Machine CM-2 is a fine-grain massively parallel processing (MPP) supercomputer built from
thousands of parallel bit-slice PEs to achieve a high peak processing speed.

ARCHITECTURE OF CM-2:
Program Execution Paradigm: All programs started execution on a front end, which issued
microinstructions to the back-end processing array when data-parallel operations were
desired. The sequencer broke these microinstructions down into nanoinstructions and broadcast them to all data
processors in the array.
Data sets and results could be exchanged between the front end and the processing array in
one of three ways: broadcasting, global combining, and the scalar memory bus, as depicted in Fig.
Broadcasting was carried out through the broadcast bus to all data processors at once.
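
Broadcasting and global combining can be pictured with the short Python sketch below. It is an illustration only: the notes do not list which combining operations the CM-2 provided, so sum, maximum, and logical OR are assumed here as typical reductions, and the function names are hypothetical.

```python
from functools import reduce
import operator


def broadcast(value, n_processors):
    """Front end to array: every data processor receives the same value."""
    return [value] * n_processors


def global_combine(per_processor_values, op):
    """Array to front end: fold one value per processor into one result."""
    return reduce(op, per_processor_values)


threshold = broadcast(5, 8)
local = [3, 9, 5, 1, 7, 2, 8, 6]                 # one value per processor
flags = [int(x > t) for x, t in zip(local, threshold)]

count_above = global_combine(flags, operator.add)
largest = global_combine(local, max)
any_above = global_combine(flags, operator.or_)
```

The front end sends one value out to all processors and gets back a single combined value, without reading each processor's memory individually.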

The CM-2 Architecture consists of the following:

a. Processing Array
b. Processing Nodes
c. Hypercube Routers

a. Processing Array:
The processing array contained from 4K to 64K bit-slice data processors (PEs), all
controlled by a sequencer. The sequencer decoded microinstructions from the front
end and broadcast nanoinstructions to the processors in the array. All processors
could access their memories simultaneously, and all processors executed the broadcast
instructions in lockstep.
b. Processing Nodes:
• Each data processing node contained 32 bit-slice data processors, an optional
floating-point accelerator, and interfaces for inter-processor communication.
• Each data processor was implemented with a 3-input, 2-output bit-slice
ALU with associated latches and a memory interface.
• This ALU could perform bit-serial full-adder and Boolean logic operations.
• Each processor chip contained 16 processors, so the 32 processors of a node
occupied two chips. An 18-bit memory address (2^18 = 256K) let the 32
processors share 256K memory words.
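
To make bit-serial addition concrete, here is a minimal Python sketch of a 3-input, 2-output full adder applied one bit position per step. Treating the three inputs as two operand bits plus a carry, and the two outputs as sum and carry, is an assumption for illustration. The helper names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """Three input bits in, two output bits out: (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def bit_serial_add(x_bits, y_bits):
    """Add two equal-length little-endian bit lists, one bit per step."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)  # final carry becomes the top bit of the result
    return out


def to_bits(n, width):
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


total = from_bits(bit_serial_add(to_bits(13, 8), to_bits(250, 8)))
words_addressable = 2 ** 18  # 18 address bits
```

A w-bit addition therefore takes w steps on one processor, and the array gets its throughput from running thousands of such processors at once.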

c. Hypercube routers:
• Special hardware was built on each processor chip for data routing among
the processors. The router nodes on all processor chips were wired together to form a Boolean n-cube.
• Each router node was connected to 12 other router nodes, including its paired
node.
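
The figure of 12 neighbours per router is consistent with the numbers given earlier: 64K processors at 16 per chip gives 4096 = 2^12 chips, which is the node count of a 12-dimensional Boolean cube. The Python sketch below lists a node's neighbours and walks a dimension-order path between two nodes. The routing rule shown is a generic hypercube scheme assumed for illustration, not a description of the actual CM-2 router logic.

```python
DIM = 12  # 64K processors / 16 per chip = 4096 = 2**12 router nodes


def neighbors(node, dim=DIM):
    """Nodes whose binary address differs from 'node' in exactly one bit."""
    return [node ^ (1 << k) for k in range(dim)]


def route(src, dst, dim=DIM):
    """Dimension-order path: fix differing address bits from bit 0 upward."""
    path = [src]
    cur = src
    for k in range(dim):
        if (cur ^ dst) & (1 << k):
            cur ^= 1 << k
            path.append(cur)
    return path


chips = (64 * 1024) // 16
example_path = route(0b0101, 0b0110)
hops = len(example_path) - 1
```

The number of hops equals the number of bit positions in which the two addresses differ, so no route in a 12-cube needs more than 12 hops.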

4Q. Discuss the Connection Machine CM-5.
Ans: CM-5:
CM-5 stands for Connection Machine 5. It is a distributed-memory parallel
computer that uses a large number of processing nodes, each with its own memory. It is
known for its scalability, its flexibility, and its ability to handle complex scientific simulations.

Key architectural features:

1. A Synchronized MIMD Machine:
• Unlike the SIMD architecture of the CM-1 and CM-2, the CM-5 was designed with a synchronized
MIMD structure. The machine was designed to contain from 32 to 16,384 processing
nodes.
• Instead of a single sequencer (as in the CM-2), the system used a number of
control processors. Each control processor was configured with memory and disk
according to its needs.
• Input and output were provided via high-bandwidth I/O interfaces to graphics devices,
mass secondary storage such as a data vault, and high-performance networks.

2. The Network Functions:

• The building blocks were interconnected by three networks:
➢ data network
➢ control network
➢ diagnostic network.

a. Data Network:
▪ The data network provided high-performance, point-to-point data communications
between the processing nodes.
▪ The data network was based on the fat-tree concept.
▪ To route a message from one processor node to another, the message was sent up
the tree to the least common ancestor of the two processors and then down to the
destination.
▪ Fat Trees: A fat tree is more like a real tree in that it becomes thicker as it acquires
more leaves. Processing nodes, control processors, and I/O channels are located at
the leaves of a fat tree. The internal nodes are switches.
▪ The CM-5 data network was implemented with a 4-ary fat tree, as shown in Fig. Each
of the internal switch nodes was made up of several router chips. Each router chip
was connected to four child chips and either two or four parent chips.
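
The up-then-down rule can be sketched in Python by counting how many levels a message climbs before the two leaves fall under the same switch. The sketch assumes leaves are numbered left to right so that integer division by 4 gives a leaf's parent. That numbering and the function names are illustrative assumptions. The sketch also ignores the choice among several parent links that a real fat tree offers.

```python
ARITY = 4


def levels_to_common_ancestor(src, dst, arity=ARITY):
    """Levels a message must climb before src and dst share an ancestor."""
    level = 0
    while src != dst:
        src //= arity   # move both leaves to their parents
        dst //= arity
        level += 1
    return level


def route_summary(src, dst):
    """Hop counts for the up phase, the down phase, and the whole route."""
    up = levels_to_common_ancestor(src, dst)
    return {"up": up, "down": up, "hops": 2 * up}


same_switch = route_summary(0, 3)    # siblings under one first-level switch
next_switch = route_summary(0, 4)    # adjacent first-level groups
far_apart = route_summary(0, 63)     # opposite ends of a 64-leaf tree
```

Messages between nearby leaves never reach the upper levels, which is why the extra link capacity near the root is reserved for long-distance traffic.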

b. Control Network:
▪ The control network provided cooperative operations, including broadcast,
synchronization, and scans, as well as system management functions.
▪ The architecture of the control network is that of a complete binary tree with all
system components at the leaves. Each user partition was assigned to a subtree of
the network.
▪ Processing nodes were located at the leaves of the subtree, and a control processor
was mapped into the partition at an additional leaf. The control processor executed
the scalar part of the code, while the processing nodes executed the data-parallel part.
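
A scan (parallel prefix) over values held at the leaves of a binary tree can be sketched as an up-sweep followed by a down-sweep. The Python code below is a generic textbook tree scan, offered as an assumption about what such an operation computes. It is not a description of the CM-5 control-network hardware, and the function name is hypothetical.

```python
def exclusive_scan(values, op=lambda a, b: a + b, identity=0):
    """Exclusive prefix scan; len(values) must be a power of two.

    Returns (prefixes, total), where prefixes[i] combines values[0..i-1].
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    tree = list(values)

    # Up-sweep: each internal node accumulates the result of its subtree.
    step = 1
    while step < n:
        for i in range(2 * step - 1, n, 2 * step):
            tree[i] = op(tree[i - step], tree[i])
        step *= 2

    total = tree[n - 1]
    tree[n - 1] = identity

    # Down-sweep: push each node's prefix to its children.
    step = n // 2
    while step >= 1:
        for i in range(2 * step - 1, n, 2 * step):
            left_sum = tree[i - step]
            tree[i - step] = tree[i]            # left child: parent's prefix
            tree[i] = op(tree[i], left_sum)     # right child: prefix + left
        step //= 2

    return tree, total


prefixes, total = exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3])
```

Each sweep visits one tree level at a time, so on n leaves the scan needs on the order of log n steps rather than n.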

CONTROL PROCESSOR:
▪ The basic control processor consisted of a RISC microprocessor (CPU), a memory
subsystem, I/O with local disks and Ethernet connections, and a CM-5 network
interface.
▪ The network interface connected the control processor to the rest of the
system through the control network and the data network.
▪ Control processors specialized in managerial functions rather than computational
functions.
c. Diagnostic network:
▪ This network was needed to improve system availability. Built-in testability was
achieved with scan-based diagnostics.
▪ The diagnostic network allowed groups of pods to be addressed according to a
"hypercube-address" scheme. A special diagnostic interface was designed to perform
an in-system check of the integrity of all CM-5 chips.
