0% found this document useful (0 votes)
136 views33 pages

RISC Pipeline Challenges

This document discusses pipelining in computer processors. It introduces pipelining and how it can increase instruction throughput by overlapping the execution of multiple instructions. A classic five-stage RISC processor pipeline including fetch, decode, execute, memory, and writeback stages is described. The document discusses pipeline performance metrics including speedup, efficiency, and throughput. It also discusses challenges to pipelining like data dependencies that prevent dependent instructions from executing simultaneously.

Uploaded by

Rachu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
136 views33 pages

RISC Pipeline Challenges

This document discusses pipelining in computer processors. It introduces pipelining and how it can increase instruction throughput by overlapping the execution of multiple instructions. A classic five-stage RISC processor pipeline including fetch, decode, execute, memory, and writeback stages is described. The document discusses pipeline performance metrics including speedup, efficiency, and throughput. It also discusses challenges to pipelining like data dependencies that prevent dependent instructions from executing simultaneously.

Uploaded by

Rachu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 33

Unit-3

Pipelining
By
Dr. Sarvesh Vishwakarma
Professor – CSE
TCS 704 - Advanced Computer Architecture

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction of pipelining
• Pipelining can efficiently increase the performance of processor by
overlapping execution of instructions. But the efficiency of the pipelining
depends upon, how the problem encounter during the implementation of
pipelining is handled. These problems are known as Hazards.

• Pipelining is an implementation techniques where by multiple instructions


are overlapped in execution. It takes advantage of parallelism that exists
among the actions needed to execute an instruction. A pipeline is like an
assembly line, In an automobile assembly line, there are many steps, each
constituting something to the construction of the car. Each steps operate
in parallel with the other steps.

• Pipelining increases the CPU instruction throughput the number of


instruction completed per unit of time-but it does not reduce the
execution time of an individual instruction. In fact, it usually slightly
increase the execution time of each instruction due to overhead in the
control of the pipeline.

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor

Input
Common Clock
L
S1

L
S2

L
Sk

Output

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor
The Logic circuitry in each stage 𝑆𝑖 has a time delay denoted by τ𝑖
Let τ𝑙 be time delay of each interface latch
Clock period of a linear pipeline is defined by speed of the slowest stage plus overhead
τ = max τ𝑖 1𝑘 + τ𝑙
τ = 𝜏𝑚 + 𝜏𝑙
1
The reciprocal of the clock period is called the frequency ‘𝑓’ of a pipeline processor 𝑓 =
τ
𝐼𝑗𝑖 ∶ 𝑡ℎ𝑒 𝑗𝑡ℎ 𝑖𝑛𝑠𝑡𝑟𝑢𝑐𝑡𝑖𝑜𝑛 𝑜𝑛 𝑖𝑡ℎ 𝑠𝑡𝑎𝑔𝑒

1 2 3 4 5 6 7 8 9 10 11 12 13
S 𝐼11 𝐼21 𝐼31 𝐼41 𝐼51 𝐼61 𝐼71
Space

S1 𝐼12 𝐼22 𝐼32 𝐼42 𝐼52 𝐼62 𝐼72


S2 𝐼13 𝐼23 𝐼33 𝐼43 𝐼53 𝐼63 𝐼73
S3 𝐼14 𝐼24 𝐼34 𝐼44 𝐼54 𝐼64 𝐼74
S4 𝐼15 𝐼25 𝐼35 𝐼45 𝐼55 𝐼65 𝐼75
5

Time

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor
Ideally, a linear pipeline processor with k-stage can process n instructions in
𝑇𝑘 = 𝑘 + (𝑛 − 1) clock periods
𝑘 cycles are used to filled up the pipeline or to complete execution of the first instruction I 1
𝑛 − 1 cycles are needed to complete the remaining 𝑛 − 1 instructions
a non-pipeline processor with k-stage can process n instructions in
𝑇1 = 𝑘(𝑛) clock periods

𝐼𝑗𝑖 ∶ 𝑡ℎ𝑒 𝑗𝑡ℎ 𝑖𝑛𝑠𝑡𝑟𝑢𝑐𝑡𝑖𝑜𝑛 𝑜𝑛 𝑖𝑡ℎ 𝑠𝑡𝑎𝑔𝑒

1 2 3 4 5 6 7 8 9 10 11 12 13
S 𝐼11 𝐼21 𝐼31
Space

S1 𝐼12 𝐼22 𝐼32


S2 𝐼13 𝐼23 𝐼33
S3 𝐼14 𝐼24
S4 𝐼15 𝐼25
5

Time

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor
Speedup (𝑆𝑘 ) of a k-stage linear pipeline processor over an equivalent non-pipeline processor as
𝑇1 𝑛×𝑘
Speedup = =
𝑇𝑘 𝑘+(𝑛−1)
For n ≫ 𝑘
𝑛×𝑘
Speedup 𝑆𝑘 ≈ =𝑘
𝑛
The maximum speedup that a linear pipeline can provide is k, where k is the number of stages in the pipeline . This
max. speedup is never fully achievable because of data dependencies between instructions, interrupts, program
branches, and other factors.

Pipeline efficiency (η): Large the number of instructions flowing through the pipeline, the better is its efficiency.
Let n : Nos. of instructions

𝑘 : No. of pipeline stages


τ ∶ 𝑐𝑙𝑜𝑐𝑘 𝑝𝑒𝑟𝑖𝑜𝑑 𝑜𝑓 𝑎 𝑙𝑖𝑛𝑒𝑎𝑟 𝑝𝑖𝑝𝑒𝑙𝑖𝑛𝑒

𝑛×𝑘×τ 𝑛
Pipeline efficiency η = {𝑘+ 𝑛−1 }×τ
= 𝑘+𝑛−1 (if 𝑛 ≫ 𝑘 )

η should approach to 1

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor
Throughput (W): Number of instructions that can be executed by a pipeline processor per unit time is called its
throughput rate. This reflect the computing power of a pipeline processor.
𝑛 𝑛 1
W= = ×
{𝑘+ 𝑛−1 }×τ 𝑘+𝑛−1 τ
𝑛
Since η =
𝑘+𝑛−1
1
Therefore, Throughput W = η ×
τ
when 𝑛 ≫ 𝑘 then, η → 1
1
So, Throughput W ≈
τ
Throughput become equivalent to frequency of pipeline processor.

Dr. Sarvesh Vishwakarma TCS-704 ACA


Classic five-stage pipeline for a RISC processor
1 2 3 4 5 6 7 8 9
S 𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
Space

S1 𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
2
S 𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
3 𝐼𝑗 ∶ 𝑡ℎ𝑒 𝑗𝑡ℎ 𝑖𝑛𝑠𝑡𝑟𝑢𝑐𝑡𝑖𝑜𝑛
S 𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
4
S
𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
5 𝐵𝑢𝑠𝑦 𝑆𝑡𝑎𝑡𝑒
Performance =
Time 𝐵𝑢𝑠𝑦 𝑆𝑡𝑎𝑡𝑒+𝑖𝑑𝑒𝑎𝑙 𝑆𝑡𝑎𝑡𝑒
Can be shifted here 𝑘×𝑛×τ 𝑛
= =
𝑘×{𝑘+ 𝑛−1 }×τ 𝑘+ 𝑛−1
1 2 3 4 5 6 7 8 9 If 𝑛 → ∞,
𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚
Space

S 𝑛
Then Performance ≈ = 1
S1 𝑛
𝐼𝑚 𝐼𝑖 𝐼𝑗 𝐼𝑘 𝐼𝑙
2
S 𝐼𝑙 𝐼𝑚 𝐼𝑖 𝐼𝑗 𝐼𝑘
3
S 𝐼𝑘 𝐼𝑙 𝐼𝑚 𝐼𝑖 𝐼𝑗
4
S
𝐼𝑗 𝐼𝑘 𝐼𝑙 𝐼𝑚 𝐼𝑖
5 Time
Busy state Ideal state

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction
Ii……………
Pipelining Ij……………
Ik……………
Il……………
• Challenges: Im……………

– Data dependency
• Instruction Ij is data dependent on instruction Ii if
– Instruction Ii produces a result that may be used by instruction Ij

• Dependent instructions cannot be executed


simultaneously

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction
Ii……………
Pipelining Ik……………

Ij……………

• Challenges: Il……………

– Data dependency Im……………


– Instruction Ij is data dependent on instruction Ik and

• Dependent instructions cannot be executed


simultaneously

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction
Pipelining Ii……………
Ik……………

Ij……………
• Challenges: Il……………
– Data dependency
Im……………

– Instruction Ij is data dependent on instruction Ik and


instruction Ik is data dependent on instruction Ii.

• Dependent instructions cannot be executed


simultaneously

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction
Pipelining Ii……………
Ik……………

• Challenges: Ij……………

Il……………
– Data dependency Im……………

• Instruction Ij is data dependent on instruction Ii if


– Instruction Ij is data dependent on instruction Ik and
instruction Ik is data dependent on instruction Ii.

• Dependent instructions cannot be executed


simultaneously
Dr. Sarvesh Vishwakarma TCS-704 ACA
Introduction
Pipelining

• Challenges:
– Data dependency
• Instruction Ij is data dependent on instruction Ii if
– Instruction Ii produces a result that may be used by instruction Ij
– Instruction Ij is data dependent on instruction Ik and instruction
Ik is data dependent on instruction Ii.

• Dependent instructions cannot be executed


simultaneously

Dr. Sarvesh Vishwakarma TCS-704 ACA


Range of Instruction
ADD.D F4, F0, F2 F4 = F0 + F2
SUB.D F6, F5, F4 F6 = F5 – F4
SUB.D instruction depends on
Ii ADD.D F4, F0, F2
ADD.D due to transfer of data Ij SUB.D F6, F5, F4
through resource object F4.
Ik……………
Il……………
Im……………

Dr. Sarvesh Vishwakarma TCS-704 ACA


Non-pipeline processor
ADD.D F4, F0, F2 F4 = F0 + F2
SUB.D F6, F5, F4 F6 = F5 – F4
SUB.D instruction depends on
ADD.D due to transfer of data
through resource object F4.

stage 1 2 3 4 5 6 7 8 9 10
IF ADD SUB
ID ADD SUB
Rop ADD SUB
EXE ADD SUB
WB ADD SUB

Dr. Sarvesh Vishwakarma TCS-704 ACA


Pipeline processor
Data Hazards

ADD.D F4, F0, F2 F4 = F0 + F2


SUB.D F6, F5, F4 F6 = F5 – F4
SUB.D instruction depends on
ADD.D due to transfer of data Reading a value from F4
through resource object F4.

stage 1 2 3 4 5 6 7 8 9 10
IF ADD SUB
ID ADD SUB
Rop ADD SUB
EXE ADD SUB
WB ADD SUB

Value written
in F4 TCS-704 ACA
Dr. Sarvesh Vishwakarma
Pipeline processor
Avoid Data Hazards

ADD.D F4, F0, F2 F4 = F0 + F2


SUB.D F6, F5, F4 F6 = F5 – F4
SUB.D instruction depends on Reading a value from F4
ADD.D due to transfer of data
through resource object F4.

stage 1 2 3 4 5 6 7 8 9 10
IF ADD SUB
ID ADD SUB
Rop ADD STALL STALL SUB
EXE ADD STALL SUB
WB ADD SUB

Value written
in F4 TCS-704 ACA
Dr. Sarvesh Vishwakarma
Introduction
Data Dependence
• Dependencies are a property of programs
• Pipeline organization determines if dependence is
detected and if it causes a stall

• Data dependence conveys:


– Possibility of a hazard
– Order in which results must be calculated
– Upper bound on exploitable instruction level
parallelism

• Dependencies that flow through memory


locations are difficult to detect
Dr. Sarvesh Vishwakarma TCS-704 ACA
Name Dependence continue…….

Anti dependence

I0…………… Instruction I1 precedes Instruction I3 in


program order.
I1…………… READ There is no flow of
I2…………… Resource data between
I3…………… WRITE Object Instructions
I4…………… associated with
resource object.

The Original ordering must be preserved to ensure


No data flow from
that Instruction I1 read the correct value.
Instruction I1 to
Instruction I3 in
this program.

Dr. Sarvesh Vishwakarma TCS-704 ACA


Name Dependence continue…….

Output dependence

I0…………… Instruction I1 precedes Instruction I3 in


program order.
I1…………… WRITE There is no flow of
I2…………… Resource data between
I3…………… WRITE Object Instructions
I4…………… associated with
resource object.

The ordering between the instructions must be


No data flow from
preserved to ensure that the value finally written Instruction I1 to
corresponds to Instruction I3 . Instruction I3 in
this program.

Dr. Sarvesh Vishwakarma TCS-704 ACA


Introduction
Name Dependence
• Two instructions use the same name but no flow
of information
– Not a true data dependence, but is a problem when
reordering instructions
– Antidependence: instruction j writes a register or
memory location that instruction i reads
• Initial ordering (i before j) must be preserved
– Output dependence: instruction i and instruction j
write the same register or memory location
• Ordering must be preserved

• To resolve, use renaming techniques


Dr. Sarvesh Vishwakarma TCS-704 ACA
Introduction
Other Factors
• Data Hazards
– Read after write (RAW)
– Write after write (WAW)
– Write after read (WAR)

• Control Dependence
– Ordering of instruction i with respect to a branch
instruction
• Instruction control dependent on a branch cannot be moved
before the branch so that its execution is no longer
controller by the branch
• An instruction not control dependent on a branch cannot be
moved after the branch so that its execution is controlled by
the branch

Dr. Sarvesh Vishwakarma TCS-704 ACA


Pipeline Hazards
Pipelining can efficiently increase the performance of
processor by overlapping execution of instructions. But
the efficiency of the pipelining depends upon, how the
problem encountered during the implementation of
pipelining is handled. These problems are known as
Hazards.

1) Data Hazards:- due to true data & name dependence


2) Control Hazards:- due to presence of branch instruction
3) Resource/structure Hazards:- due to lack of resources

Dr. Sarvesh Vishwakarma TCS-704 ACA


• Structural hazards
– If some combination of instructions can not be
accommodated because of a resource conflict, the
machine is said to have a structural hazards.

– This type of hazard occurs, when two activities require


the same resource simultaneously. Under this
situation hardware can not support all possible
combination of instruction simultaneously.

– Also, named as Resource Hazard.

Dr. Sarvesh Vishwakarma TCS-704 ACA


Implementation of Pipeline
• Need to take stalls at run time during
execution of instructions if there exist a data
dependence.

• Pipeline CPI = (Ideal pipeline CPI + Structural


stalls + Data hazard stalls + Control stalls)

Dr. Sarvesh Vishwakarma TCS-704 ACA


What makes pipelining hard to
implement?
• Pipeline hazards
– True data dependence
– Name dependence
• Anti dependence
• Output dependence
– Control dependence
– Resource hazards

Dr. Sarvesh Vishwakarma TCS-704 ACA


Domain of Instruction
• Set of resource objects whose data objects may
affect the execution of instruction `I’
• Denoted by D(I)

L.D F0, 0(R1) D(L.D) = {R1}


ADD.D F4, F0, F2 D(ADD.D) = {F0, F2}
S.D F4, 0(R1) D(S.D) = {R1,F4}

Dr. Sarvesh Vishwakarma TCS-704 ACA


Range of Instruction
• Set of resource objects whose data objects may
be modified by the execution of instruction `I’
• Denoted by R(I)

L.D F0, 0(R1) R(L.D) = {F0}


ADD.D F4, F0, F2 R(ADD.D) = {F4}
S.D F4, 0(R1) R(S.D) = {Nil}

Dr. Sarvesh Vishwakarma TCS-704 ACA


Pipeline Hazard Detection and
Resolution

R (I )  D(J )   for RAW

D(I )  R (J )   for WAR

R (I )  R (J )   for WAW

Dr. Sarvesh Vishwakarma TCS-704 ACA


RAW hazard condition
(Pure data dependence)

Instruction I
D(I) R(I)
(Write)
Instruction J
D(J) R(J)
(Read)

Dr. Sarvesh Vishwakarma TCS-704 ACA


WAR hazard condition
(Anti dependence)

Instruction I
D(I) R(I)
(Read)
Instruction J
D(J) R(J)
(Write)

Dr. Sarvesh Vishwakarma TCS-704 ACA


WAW hazard condition
(Output dependence)

Instruction I
D(I) R(I)
(Write)

R(J)
Instruction J
(Write)
D(J)

Dr. Sarvesh Vishwakarma TCS-704 ACA


Find Data Dependence
True Data depenence (RAW)
D(ADD.D)  R(LD) = {F0} L.D F0, 0(R1) D(L.D) = {R1} R(L.D) ={F0}
ADD.D F4, F0, F2 D(ADD.D) = {F0, F2} R(ADD.D) ={F4}
D(SD)  R(LD) =  S.D F4, 0(R1) D(S.D) = {R1,F4} R(S.D) ={Nil}
D(SD)  R(ADD.D) = {F4}
Two cases of true data dependence.

Output dependence (WAW)


R(LD)  R(ADD.D) = 
R(LD)  R(SD) = 
R(ADD.D)  R(SD) = 
No case of output dependence.

Antidependence (WAR)
D(LD)  R(ADD.D) = 
D(LD)  R(SD) = 
D(ADD.D)  R(SD) = 
No case of anti dependence.
Dr. Sarvesh Vishwakarma TCS-704 ACA

You might also like