Course 3 Module 5

Computer Architecture

ELE 475 / COS 475


Slide Deck 4: Superscalar 1
David Wentzlaff
Department of Electrical Engineering
Princeton University

Types of Data Hazards
Consider executing a sequence of instructions of the type
rk ← ri op rj

Data-dependence: Read-after-Write (RAW) hazard
r3 ← r1 op r2
r5 ← r3 op r4

Anti-dependence: Write-after-Read (WAR) hazard
r3 ← r1 op r2
r1 ← r4 op r5

Output-dependence: Write-after-Write (WAW) hazard
r3 ← r1 op r2
r3 ← r6 op r7

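The three hazard types can be checked mechanically by comparing the destination and source registers of two instructions in program order. The following is a minimal sketch, not part of the original slides; the `Instr` type and the `hazards` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical model of 'dest <- src op src' (not from the slides)."""
    dest: str
    srcs: Tuple[str, ...]


def hazards(first: Instr, second: Instr) -> Set[str]:
    """Return the hazard types between two instructions in program order."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes
    if second.dest in first.srcs:
        found.add("WAR")  # second overwrites what first reads
    if first.dest == second.dest:
        found.add("WAW")  # both write the same register
    return found


# The three pairs from the slide.
print(hazards(Instr("r3", ("r1", "r2")), Instr("r5", ("r3", "r4"))))
print(hazards(Instr("r3", ("r1", "r2")), Instr("r1", ("r4", "r5"))))
print(hazards(Instr("r3", ("r1", "r2")), Instr("r3", ("r6", "r7"))))
```

A pair of instructions can carry more than one hazard at once, which is why the function returns a set rather than a single label.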
Introduction to Superscalar Processors
• Processors studied so far are fundamentally limited to CPI >= 1
• Superscalar processors enable CPI < 1 (IPC > 1) by executing multiple instructions in parallel
• Superscalar processors can be either in-order or out-of-order. We will start with in-order.

Baseline 2-Way In-Order Superscalar Processor

[Figure: 2-way datapath. Labels: PC, Instr. Cache (addr, rdata), IR0, IR1, RF Read, ALU in pipe A with Branch Cond., ALU in pipe B with Data Cache (addr, rdata), RF Write.]

Pipe A: Integer Ops., Branches
Pipe B: Integer Ops., Memory

Baseline 2-Way In-Order Superscalar Processor

[Figure: the same 2-way datapath, with two annotations.]
• Fetch 2 instructions at the same time
• Register file: 4 read ports, 2 write ports

Pipe A: Integer Ops., Branches
Pipe B: Integer Ops., Memory

Baseline 2-Way In-Order Superscalar Processor

[Figure: the same 2-way datapath, with the annotation "Issue Logic / Instruction Steering".]

Pipe A: Integer Ops., Branches
Pipe B: Integer Ops., Memory

Baseline 2-Way In-Order Superscalar Processor

[Figure: the same 2-way datapath, with the annotation "Duplicate Control": IR0 feeds Decode A, IR1 feeds Decode B.]

Pipe A: Integer Ops., Branches
Pipe B: Integer Ops., Memory

Issue Logic Pipeline Diagrams

Double Issue Pipeline: can have two instructions in the same stage at the same time.
CPI = 0.5 (IPC = 2)

OpA    F  D  A0 A1 W
OpB    F  D  B0 B1 W
OpC       F  D  A0 A1 W
OpD       F  D  B0 B1 W
OpE          F  D  A0 A1 W
OpF          F  D  B0 B1 W

ADDIU  F  D  A0 A1 W
LW     F  D  B0 B1 W
LW        F  D  B0 B1 W
ADDIU     F  D  A0 A1 W
LW           F  D  B0 B1 W
LW           F  D  D  B0 B1 W

Second pair (LW, ADDIU): the instruction issue logic swaps the instructions from their natural position.
Third pair (LW, LW): structural hazard. The second LW stalls in D for one cycle.

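The steering decision shown in the diagrams can be sketched as a small function. This is an illustrative model only, assuming the pipe capabilities stated on the baseline slides (pipe A: integer ops and branches, pipe B: integer ops and memory) and ignoring data hazards; the instruction classes "int", "branch", "mem" and the function name `steer` are hypothetical.

```python
# Capabilities as listed on the baseline slides.
PIPE_A = {"int", "branch"}  # Pipe A: integer ops, branches
PIPE_B = {"int", "mem"}     # Pipe B: integer ops, memory


def steer(older, younger):
    """Pick pipes for a fetched pair (data hazards are ignored here).

    Returns (pipe_for_older, pipe_for_younger). The second entry is None
    when the younger instruction must stall in D (structural hazard).
    """
    if older in PIPE_A and younger in PIPE_B:
        return ("A", "B")  # natural position
    if older in PIPE_B and younger in PIPE_A:
        return ("B", "A")  # swapped from natural position
    # Both instructions need the same pipe: issue the older one only.
    pipe = "A" if older in PIPE_A else "B"
    return (pipe, None)


print(steer("int", "mem"))  # ADDIU, LW
print(steer("mem", "int"))  # LW, ADDIU
print(steer("mem", "mem"))  # LW, LW
```

The three calls correspond to the three pairs in the second diagram: natural position, swapped, and a stall on the second LW.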
Dual Issue Data Hazards

No Bypassing:
ADDIU R1,R1,1  F  D  A0 A1 W
ADDIU R3,R4,1  F  D  B0 B1 W
ADDIU R5,R6,1     F  D  A0 A1 W
ADDIU R7,R5,1     F  D  D  D  D  A0 A1 W

Full Bypassing:
ADDIU R1,R1,1  F  D  A0 A1 W
ADDIU R3,R4,1  F  D  B0 B1 W
ADDIU R5,R6,1     F  D  A0 A1 W
ADDIU R7,R5,1     F  D  D  A0 A1 W

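The stall counts in the two diagrams can be reproduced with a simplified in-order issue model. This is a sketch under stated assumptions, not the slides' own method: every instruction is a single-cycle ALU op, a consumer may leave D one cycle after its producer with full bypassing and three cycles after it without (so that its last D cycle coincides with the producer's W), and pipe types are ignored. The function name `issue_cycles` is hypothetical.

```python
def issue_cycles(program, bypass, width=2):
    """Cycle (1-based) in which each instruction spends its last cycle in D.

    program: list of (dest, srcs) tuples in program order.
    bypass:  True for full bypassing, False for no bypassing.
    """
    gap = 1 if bypass else 3  # assumed producer-to-consumer issue distance
    issue = []
    last_writer = {}  # register name -> index of most recent writer
    for i, (dest, srcs) in enumerate(program):
        # Fetched in cycle i // width + 1, so D is the cycle after that.
        cycle = i // width + 2
        if issue:
            cycle = max(cycle, issue[-1])  # in-order issue
        for src in srcs:
            if src in last_writer:
                cycle = max(cycle, issue[last_writer[src]] + gap)  # RAW
        if len(issue) >= width and issue[-width] == cycle:
            cycle += 1  # at most `width` instructions issue per cycle
        issue.append(cycle)
        last_writer[dest] = i
    return issue


program = [("R1", ("R1",)), ("R3", ("R4",)), ("R5", ("R6",)), ("R7", ("R5",))]
print(issue_cycles(program, bypass=False))
print(issue_cycles(program, bypass=True))
```

For the four ADDIUs on the slide, the last instruction leaves D in cycle 6 without bypassing (four cycles in D) and in cycle 4 with full bypassing (two cycles in D), matching the diagrams.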
Dual Issue Data Hazards

Order Matters:
ADDIU R1,R1,1  F  D  A0 A1 W
ADDIU R3,R4,1  F  D  B0 B1 W
ADDIU R7,R5,1     F  D  A0 A1 W
ADDIU R5,R6,1     F  D  B0 B1 W

WAR Hazard Possible?

Fetch Logic and Alignment

Cyc  Addr   Instr
0    0x000  OpA
0    0x004  OpB
1    0x008  OpC
1    0x00C  J 0x100
…
2    0x100  OpD
2    0x104  J 0x204
…
3    0x204  OpE
3    0x208  J 0x30C
…
4    0x30C  OpF
4    0x310  OpG
5    0x314  OpH

Cycle in which each word of a cache line is fetched:

Line   +0x0  +0x4  +0x8  +0xC
0x000  0     0     1     1
0x100  2     2     -     -
0x200  -     3     3     -
0x300  -     -     -     4
0x310  4     5     -     -

Fetching across cache lines is very hard. May need extra ports.

Fetch Logic and Alignment

Ideal, No Alignment Constraints

Cyc  Addr   Instr
0    0x000  OpA       F  D  A0 A1 W
0    0x004  OpB       F  D  B0 B1 W
1    0x008  OpC          F  D  B0 B1 W
1    0x00C  J 0x100      F  D  A0 A1 W
2    0x100  OpD             F  D  B0 B1 W
2    0x104  J 0x204         F  D  A0 A1 W
3    0x204  OpE                F  D  B0 B1 W
3    0x208  J 0x30C            F  D  A0 A1 W
4    0x30C  OpF                   F  D  A0 A1 W
4    0x310  OpG                   F  D  B0 B1 W
5    0x314  OpH                      F  D  A0 A1 W

With Alignment Constraints

Cyc  Addr   Instr
?    0x000  OpA
?    0x004  OpB
?    0x008  OpC
?    0x00C  J 0x100
…
?    0x100  OpD
?    0x104  J 0x204
…
?    0x204  OpE
?    0x208  J 0x30C
…
?    0x30C  OpF
?    0x310  OpG
?    0x314  OpH

Cycle in which each word of a cache line is fetched:

Line   +0x0  +0x4  +0x8  +0xC
0x000  0     0     1     1
0x100  2     2     -     -
0x200  3     3     4     4
0x300  -     -     5     5
0x310  6     6     -     -

With Alignment Constraints

Cyc  Addr   Instr
1    0x000  OpA       F  D  A0 A1 W
1    0x004  OpB       F  D  B0 B1 W
2    0x008  OpC          F  D  B0 B1 W
2    0x00C  J 0x100      F  D  A0 A1 W
3    0x100  OpD             F  D  B0 B1 W
3    0x104  J 0x204         F  D  A0 A1 W
4    0x200  ?                  F  -  -  -  -
4    0x204  OpE                F  D  A0 A1 W
5    0x208  J 0x30C               F  D  A0 A1 W
5    0x20C  ?                     F  -  -  -  -
6    0x308  ?                        F  -  -  -  -
6    0x30C  OpF                      F  D  A0 A1 W
7    0x310  OpG                         F  D  A0 A1 W
7    0x314  OpH                         F  D  B0 B1 W

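The effect of the alignment constraint on fetch cycles can be sketched by grouping the dynamic instruction addresses into fetch groups. This is an illustrative model, assuming 4-byte instructions, a fetch width of 2, and (in the constrained case) that a fetch group must lie within one aligned 8-byte block; cycles are numbered from 0 as in the cache-line figures. The function name `fetch_cycles` is hypothetical.

```python
def fetch_cycles(addrs, aligned, width=2, ibytes=4):
    """Fetch cycle (0-based) for each address in the executed stream.

    addrs:   instruction addresses in execution order.
    aligned: if True, a fetch group may not cross an aligned
             (width * ibytes)-byte block boundary.
    """
    block = width * ibytes
    cycles = []
    cycle = -1
    count = 0    # instructions already in the current fetch group
    prev = None  # previous address in the stream
    for addr in addrs:
        same_group = (
            prev is not None
            and addr == prev + ibytes  # sequential with the previous one
            and count < width          # room left in the group
        )
        if same_group and aligned:
            same_group = addr // block == prev // block
        if not same_group:
            cycle += 1  # start a new fetch group
            count = 0
        cycles.append(cycle)
        count += 1
        prev = addr
    return cycles


stream = [0x000, 0x004, 0x008, 0x00C, 0x100, 0x104,
          0x204, 0x208, 0x30C, 0x310, 0x314]
print(fetch_cycles(stream, aligned=False))
print(fetch_cycles(stream, aligned=True))
```

On the slide's address stream the unconstrained model needs 6 fetch cycles and the aligned model needs 7, because the jump targets 0x204 and 0x30C each land in the second slot of an aligned pair.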
Precise Exceptions and Superscalars
• Similar to tracking program order for data dependencies, we need to track order for exceptions

LW       F  D  B0 B1 W
SYSCALL  F  D  A0 A1 W

LW is in the B pipeline, but commits first in logical order!

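One way to picture the ordering requirement is to tag every instruction with its program-order sequence number and process the instructions that reach W together in that order, regardless of pipe. The sketch below is an assumption-laden illustration, not the mechanism from the slides; the tuple layout and the function name `commit` are hypothetical.

```python
def commit(group):
    """Commit instructions that reach W in the same cycle, in program order.

    group: list of (seq, name, pipe, raises) tuples, where seq is the
           program-order sequence number and raises marks an exception.
    Returns (committed, squashed, faulting): names that commit, names that
    are discarded, and the name of the oldest excepting instruction (or None).
    """
    committed, squashed, faulting = [], [], None
    for seq, name, pipe, raises in sorted(group):  # oldest first
        if faulting is not None:
            squashed.append(name)   # younger than the exception
        elif raises:
            faulting = name         # oldest exception wins
        else:
            committed.append(name)  # older than any exception
    return committed, squashed, faulting


# The pair from the slide: LW is older but sits in pipe B.
print(commit([(1, "SYSCALL", "A", True), (0, "LW", "B", False)]))
```

Sorting by sequence number rather than by pipe is the point: the LW commits before the SYSCALL takes its exception even though it travels down pipe B.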
Bypassing in Superscalar Pipelines

[Figure, built up over four slides: the baseline 2-way datapath, followed by a zoomed view of the ALUs, Branch Cond., Data Cache (addr, rdata), and RF Write in pipes A and B. The final build numbers six points: 1, 3, 5 along pipe A and 2, 4, 6 along pipe B.]

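The selection that a bypass network performs for each source operand can be sketched as "take the youngest in-flight result for this register from either pipe, otherwise read the register file". This is a minimal illustration under that assumption; the data layout and the function name `read_operand` are hypothetical and do not describe the exact network in the figure.

```python
def read_operand(reg, regfile, in_flight):
    """Return the value an issuing instruction should see for `reg`.

    regfile:   dict mapping register name to its committed value.
    in_flight: list of (seq, dest, value) for results computed in either
               pipe but not yet written back; larger seq means younger.
    """
    matches = [(seq, value) for seq, dest, value in in_flight if dest == reg]
    if matches:
        # Several stages may hold a result for the same register;
        # the youngest producer in program order must win.
        return max(matches)[1]
    return regfile[reg]


regfile = {"R1": 0, "R5": 10, "R6": 3}
in_flight = [(3, "R5", 11), (4, "R5", 12), (2, "R1", 7)]
print(read_operand("R5", regfile, in_flight))
print(read_operand("R6", regfile, in_flight))
```

With two pipes, every operand read has to compare against results from both pipes at every stage that holds an unwritten value, which is why the network grows quickly with issue width.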
Breaking Decode and Issue Stage
• Bypass network can become very complex
• Can motivate breaking Decode and Issue into separate stages

D = Decode, possibly resolve structural hazards
I = Register file read, bypassing, issue/steer instructions to proper unit

OpA  F  D  I  A0 A1 W
OpB  F  D  I  B0 B1 W
OpC     F  D  I  A0 A1 W
OpD     F  D  I  B0 B1 W

Superscalars Multiply Branch Cost

BEQZ  F  D  I  A0 A1 W
OpA   F  D  I  B0 -  -
OpB      F  D  I  -  -  -
OpC      F  D  I  -  -  -
OpD         F  D  -  -  -  -
OpE         F  D  -  -  -  -
OpF            F  -  -  -  -  -
OpG            F  -  -  -  -  -
OpH               F  D  I  A0 A1 W
OpI               F  D  I  B0 B1 W

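The number of squashed instructions in the diagram follows from the issue width and the depth at which the branch resolves. The sketch below assumes a taken branch with no prediction that resolves in A0, with fetch redirected in the following cycle, as the diagram shows; the function name `killed_slots` and its parameters are hypothetical.

```python
def killed_slots(width, resolve_depth, branch_slot=0):
    """Instructions squashed by one taken branch.

    width:         instructions fetched per cycle.
    resolve_depth: stages from F through the resolving stage, inclusive
                   (F, D, I, A0 gives 4).
    branch_slot:   position of the branch in its fetch group (0 = first).
    """
    if not 0 <= branch_slot < width:
        raise ValueError("branch_slot must be within the fetch group")
    # Every group fetched up to the resolving cycle is on the wrong path,
    # except the branch itself and anything older in its own group.
    return width * resolve_depth - (branch_slot + 1)


print(killed_slots(width=2, resolve_depth=4))  # the slide's 2-way pipeline
print(killed_slots(width=1, resolve_depth=4))  # same depth, single issue
```

For the slide's pipeline this gives 7 squashed instructions (OpA through OpG), against 3 for a single-issue pipeline of the same depth, which is the sense in which superscalar issue multiplies the branch cost.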
Acknowledgements
• These slides contain material developed and copyrighted by:
– Arvind (MIT)
– Krste Asanovic (MIT/UCB)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)
– Christopher Batten (Cornell)

• MIT material derived from course 6.823
• UCB material derived from courses CS252 and CS152
• Cornell material derived from course ECE 4750

Copyright © 2013 David Wentzlaff
