0% found this document useful (0 votes)
142 views33 pages

Computer Architecture ECE 361 Lecture 6: ALU Design

This document discusses the design of an arithmetic logic unit (ALU) for a computer architecture course. It reviews ALU design concepts like bit-slicing and overflow handling. It then outlines the lecture topics which will be on deriving the ALU from instruction set requirements and the design of a multiplier. The document provides details on common arithmetic and logical instructions in MIPS, as well as requirements for the ALU based on supporting these instructions. It discusses adding an XOR function to the ALU and designs for different types of shifters. Finally, it introduces the topic of unsigned multiplication through examples and different hardware approaches.

Uploaded by

Rahul Singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
142 views33 pages

Computer Architecture ECE 361 Lecture 6: ALU Design

This document discusses the design of an arithmetic logic unit (ALU) for a computer architecture course. It reviews ALU design concepts like bit-slicing and overflow handling. It then outlines the lecture topics which will be on deriving the ALU from instruction set requirements and the design of a multiplier. The document provides details on common arithmetic and logical instructions in MIPS, as well as requirements for the ALU based on supporting these instructions. It discusses adding an XOR function to the ALU and designs for different types of shifters. Finally, it introduces the topic of unsigned multiplication through examples and different hardware approaches.

Uploaded by

Rahul Singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 33

Computer Architecture

ECE 361
Lecture 6: ALU Design

361 ALU.1
Review: ALU Design

° Bit-slice plus extra on the two ends


° Overflow means number too large for the representation
° Carry-look ahead and other adder tricks

A 32 B 32

signed-arith
and cin xor co

a31 b31 a0 b0 4
ALU31 ALU0
M
co cin co cin
s31 s0
C/L to
produce
select,
comp,
Ovflw 32
S c-in
361 ALU.2
Review: Elements of the Design
Process
° Divide and Conquer (e.g., ALU)
• Formulate a solution in terms of simpler components.
• Design each of the components (subproblems)

° Generate and Test (e.g., ALU)


• Given a collection of building blocks, look for ways of putting
them together that meets requirement

° Successive Refinement (e.g., multiplier, divider)


• Solve "most" of the problem (i.e., ignore some constraints or
special cases), examine and correct shortcomings.

° Formulate High-Level Alternatives (e.g., shifter)


• Articulate many strategies to "keep in mind" while pursuing any
one approach.

° Work on the Things you Know How to Do


• The unknown will become “obvious” as you make progress.

361 ALU.3
Outline of Today’s
Lecture

° Deriving the ALU from the Instruction Set

° Multiply

361 ALU.4
MIPS arithmetic
° instructions
Instruction Example Meaning Comments
° add add $1,$2,$3 $1 = $2 + $3 3 operands; exception possible
° subtract sub $1,$2,$3 $1 = $2 – $3 3 operands; exception possible
° add immediate addi $1,$2,100 $1 = $2 + 100 + constant; exception possible
° add unsigned addu $1,$2,$3 $1 = $2 + $3 3 operands; no exceptions
° subtract unsigned subu $1,$2,$3 $1 = $2 – $3 3 operands; no exceptions
° add imm. unsign. addiu $1,$2,100 $1 = $2 + 100 + constant;
no exceptions
° multiply mult $2,$3 Hi, Lo = $2 x $3 64-bit signed product
° multiply unsigned multu$2,$3 Hi, Lo = $2 x $3 64-bit unsigned product
° divide div $2,$3 Lo = $2 ÷ $3, Lo = quotient, Hi = remainder
° Hi = $2 mod $3
° divide unsigned divu $2,$3 Lo = $2 ÷ $3, Unsigned quotient & remainder
° Hi = $2 mod $3
° Move from Hi mfhi $1 $1 = Hi Used to get copy of Hi
° Move from Lo mflo $1 $1 = Lo Used to get copy of Lo

361 ALU.5
MIPS logical
instructions
Instruction Example Meaning Comment
and and $1,$2,$3 $1 = $2 & $3 3 reg. operands; Logical AND
or or $1,$2,$3 $1 = $2 | $3 3 reg. operands; Logical OR
xor xor $1,$2,$3 $1 = $2 $3 3 reg. operands; Logical XOR
nor nor $1,$2,$3 $1 = ~($2 |$3) 3 reg. operands; Logical NOR
and immediate andi $1,$2,10 $1 = $2 & 10 Logical AND reg, constant
or immediate ori $1,$2,10 $1 = $2 | 10 Logical OR reg, constant
xor immediate xori $1, $2,10 $1 = ~$2 &~10 Logical XOR reg, constant
shift left logical sll $1,$2,10 $1 = $2 << 10 Shift left by constant
shift right logical srl $1,$2,10 $1 = $2 >> 10 Shift right by constant
shift right arithm. sra $1,$2,10 $1 = $2 >> 10 Shift right (sign extend)
shift left logical sllv $1,$2,$3 $1 = $2 << $3 Shift left by variable
shift right logical srlv $1,$2, $3 $1 = $2 >> $3 Shift right by variable
shift right arithm. srav $1,$2, $3 $1 = $2 >> $3 Shift right arith. by variable

361 ALU.6
Additional MIPS ALU
requirements

° Xor, Nor, XorI


=> Logical XOR, logical NOR or use 2 steps: (A OR B) XOR 1111....1111
° Sll, Srl, Sra
=> Need left shift, right shift, right shift arithmetic by 0 to 31 bits
° Mult, MultU, Div, DivU
=> Need 32-bit multiply and divide, signed and unsigned

361 ALU.7
Add XOR to
ALU

° Expand Multiplexor CarryIn

Mux
Result

1-bit
Full
B Adder

CarryOut

361 ALU.8
Shifters
Three different kinds:

logical-- value shifted in is always "0"

"0" msb lsb "0"

arithmetic-- on right shifts, sign extend

msb lsb "0"

rotating-- shifted out bits are wrapped around (not in MIPS)

left right
msb lsb msb lsb

Note: these are single bit shifts. A given instruction might request
0 to 32 bits to be shifted!
361 ALU.9
Administrative
Matters

361 ALU.10
MULTIPLY
(unsigned)
° Paper and pencil example (unsigned):

Multiplicand 1000
Multiplier 1001
1000
0000
0000
1000
Product 01001000

° m bits x n bits = m+n bit product

° Binary makes it easy:


•0 => place 0 ( 0 x multiplicand)
•1 => place a copy ( 1 x multiplicand)

° 4 versions of multiply hardware & algorithm:


•successive refinement

361 ALU.11
Unsigned Combinational
Multiplier 0 0 0 0
A3 A2 A1 A0
B0

A3 A2 A1 A0
B1

A3 A2 A1 A0
B2

A3 A2 A1 A0
B3

P7 P6 P5 P4 P3 P2 P1 P0

° Stage i accumulates A * 2 i if Bi == 1

° Q: How much hardware for 32 bit multiplier? Critical path?

361 ALU.12
How does it
work? 0 0 0 0 0 0 0
A3 A2 A1 A0
B0

A3 A2 A1 A0
B1

A3 A2 A1 A0
B2

A3 A2 A1 A0
B3

P7 P6 P5 P4 P3 P2 P1 P0

° at each stage shift A left ( x 2)

° use next bit of B to determine whether to add in shifted multiplicand

° accumulate 2n bit partial product at each stage

361 ALU.13
Unisigned shift-add multiplier (version
1)

° 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg,


32-bit multiplier reg

Shift Left
Multiplicand
64 bits

Multiplier Shift Right


64-bit ALU
32 bits

Write
Product Control
64 bits

Multiplier = datapath + control

361 ALU.14
Multiply Algorithm Version Start
1
Multiplier0 = 1 1. Test Multiplier0 = 0
Multiplier0

1a. Add multiplicand to product &


place the result in Product register

° Product Multiplier Multiplicand


0000 0000 0011 0000 0010 2. Shift the Multiplicand register left 1 bit.
° 0000 0010 0001 0000 0100
3. Shift the Multiplier register right 1 bit.
° 0000 0110 0000 0000 1000

° 0000 0110 32nd No: < 32 repetitions


repetition?
Yes: 32 repetitions
Done
361 ALU.15
Observations on Multiply Version
1

° 1 clock per cycle => ­100 clocks per multiply


• Ratio of multiply to add 5:1 to 100:1

° 1/2 bits in multiplicand always 0


=> 64-bit adder is wasted

° 0’s inserted in left of multiplicand as shifted


=> least significant bits of product never changed
once formed

° Instead of shifting multiplicand to left, shift product to


right?

361 ALU.16
MULTIPLY HARDWARE Version 2

° 32-bit Multiplicand reg, 32 -bit ALU, 64-bit Product reg,


32-bit Multiplier reg

Multiplicand
32 bits
Multiplier Shift Right
32-bit ALU 32 bits
Shift Right
Product Control
64 bits Write

361 ALU.17
Multiply Algorithm Version Start
2
Multiplier Multiplicand Product
0011 0010 0000 0000
Multiplier0 = 1 1. Test Multiplier0 = 0
Multiplier0

1a. Add multiplicand to the left half of product &


place the result in the left half of Product register

° Product Multiplier Multiplicand


0000 0000 0011 0010
2. Shift the Product register right 1 bit.

3. Shift the Multiplier register right 1 bit.

32nd No: < 32 repetitions


repetition?
Yes: 32 repetitions
Done
361 ALU.18
What’s going
on? 0 0 0 0

A3 A2 A1 A0
B0

A3 A2 A1 A0
B1

A3 A2 A1 A0
B2

A3 A2 A1 A0
B3

P7 P6 P5 P4 P3 P2 P1 P0
° Multiplicand stay’s still and product moves right

361 ALU.19
Observations on Multiply Version
2

° Product register wastes space that exactly matches


size of multiplier
=> combine Multiplier register and Product register

361 ALU.21
MULTIPLY HARDWARE Version 3

° 32-bit Multiplicand reg, 32 -bit ALU, 64-bit Product


reg, (0-bit Multiplier reg)

Multiplicand
32 bits

32-bit ALU

Shift Right
Product (Multiplier) Control
64 bits Write

361 ALU.22
Multiply Algorithm Version Start
3
Multiplicand Product
0010 0000 0011
Product0 = 1 1. Test Product0 = 0
Product0

1a. Add multiplicand to the left half of product &


place the result in the left half of Product register

2. Shift the Product register right 1 bit.

32nd No: < 32 repetitions


repetition?
Yes: 32 repetitions
Done
361 ALU.23
Observations on Multiply Version
3

° 2 steps per bit because Multiplier & Product combined

° MIPS registers Hi and Lo are left and right half of Product

° Gives us MIPS instruction MultU

° How can you make it faster?

° What about signed multiplication?


• easiest solution is to make both positive & remember whether to
complement product when done (leave out the sign bit, run for 31
steps)
• apply definition of 2’s complement
- need to sign-extend partial products and subtract at the end
• Booth’s Algorithm is elegant way to multiply signed numbers using
same hardware as before and save cycles
- can handle multiple bits at a time

361 ALU.24
Motivation for Booth’s
° Algorithm
Example 2 x 6 = 0010 x 0110:

0010
x 0110
+ 0000 shift (0 in multiplier)
+ 0010 add (1 in multiplier)
+ 0100 add (1 in multiplier)
+ 0000 shift (0 in multiplier)
00001100

° ALU with add or subtract gets same result in more than one way:
6 = – 2 + 8 , or
0110 = – 0010+ 1000

° Replace a string of 1s in multiplier with an initial subtract when we first


see a one and then later add for the bit after the last one. For example

0010
x 0110
+ 0000 shift (0 in multiplier)
– 0010 sub (first 1 in multiplier)
+ 0000 shift (middle of string of 1s)
+ 0010 add (prior step had last 1)
00001100
361 ALU.25
Booth’s Algorithm
Insight
middle of run
end of run beginning of run
0 1 1 1 1 0
Current Bit Bit to the Right Explanation Example

1 0 Beginning of a run of 1s 0001111000

1 1 Middle of a run of 1s 0001111000

0 1 End of a run of 1s 0001111000

0 0 Middle of a run of 0s 0001111000

Originally for Speed since shift faster than add for his machine

Replace a string of 1s in multiplier with an initial subtract when


we first see a one and then later add for the bit after the last one
361 ALU.26
Booths Example (2 x
7)
Operation Multiplicand Product next?

0. initial value 0010 0000 0111 0 10 -> sub


1a. P = P - m 1110 + 1110
1110 0111 0 shift P (sign ext)

1b. 0010 1111 0011 1 11 -> nop, shift

2. 0010 1111 1001 1 11 -> nop, shift

3. 0010 1111 1100 1 01 -> add

4a. 0010 + 0010


0001 1100 1 shift
4b. 0010 0000 1110 0 done

361 ALU.27
Booths Example (2 x
-3)
Operation Multiplicand Product next?

0. initial value 0010 0000 1101 0 10 -> sub


1a. P = P - m 1110 + 1110
1110 1101 0 shift P (sign ext)

1b. 0010 1111 0110 1 01 -> add


+ 0010

2a. 0001 0110 1 shift P

2b. 0010 0000 1011 0 10 -> sub


+ 1110

3a. 0010 1110 1011 0 shift

3b. 0010 1111 0101 1 11 -> nop


4a 1111 0101 1 shift
4b. 0010 1111 1010 1 done

361 ALU.28
Booth’s
Algorithm

1. Depending on the current and previous bits, do one of the following:


00: a. Middle of a string of 0s, so no arithmetic operations.
01: b. End of a string of 1s, so add the multiplicand to the left
half of the product.
10: c. Beginning of a string of 1s, so subtract the multiplicand
from the left half of the product.
11: d. Middle of a string of 1s, so no arithmetic operation.

2.As in the previous algorithm, shift the Product register right (arith) 1 bit.

361 ALU.29
MIPS logical
instructions
° Instruction Example Meaning Comment
° and and $1,$2,$3 $1 = $2 & $3 3 reg. operands; Logical AND
° or or $1,$2,$3 $1 = $2 | $3 3 reg. operands; Logical OR
° xor xor $1,$2,$3 $1 = $2 $3 3 reg. operands; Logical XOR
° nor nor $1,$2,$3 $1 = ~($2 |$3) 3 reg. operands; Logical NOR
° and immediate andi $1,$2,10 $1 = $2 & 10 Logical AND reg, constant
° or immediate ori $1,$2,10 $1 = $2 | 10 Logical OR reg, constant
° xor immediate xori $1, $2,10 $1 = ~$2 &~10 Logical XOR reg, constant
° shift left logical sll $1,$2,10 $1 = $2 << 10 Shift left by constant
° shift right logical srl $1,$2,10 $1 = $2 >> 10 Shift right by constant
° shift right arithm. sra $1,$2,10 $1 = $2 >> 10 Shift right (sign extend)
° shift left logical sllv $1,$2,$3 $1 = $2 << $3 Shift left by variable
° shift right logical srlv $1,$2, $3 $1 = $2 >> $3 Shift right by variable
° shift right arithm. srav $1,$2, $3 $1 = $2 >> $3 Shift right arith. by variable

361 ALU.30
Shifters

Two kinds:

logical-- value shifted in is always "0"


"0" msb lsb "0"

arithmetic-- on right shifts, sign extend


msb lsb "0"

Note: these are single bit shifts. A given instruction might request
0 to 32 bits to be shifted!

361 ALU.31
Combinational Shifter from
MUXes
Basic Building Block A B

sel 1 0

D
8-bit right shifter
A7 A6 A5 A4 A3 A2 A1 A0 S2 S1 S0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

R7 R6 R5 R4 R3 R2 R1 R0

° What comes in the MSBs?


° How many levels for 32-bit shifter?
° What if we use 4-1 Muxes ?
361 ALU.32
General Shift Right Scheme using 16 bit
example

S0
(0,1)

S1
(0, 2)

S2
(0, 4)

S3
(0, 8)

If added Right-to-left connections could


support Rotate (not in MIPS but found in ISAs)

361 ALU.33
Summary

° Instruction Set drives the ALU design

° Multiply: successive refinement to see final design


• 32-bit Adder, 64-bit shift register, 32-bit Multiplicand Register
• Booth’s algorithm to handle signed multiplies

° There are algorithms that calculate many bits of multiply per cycle
(see exercises 4.36 to 4.39 )

° What’s Missing from MIPS is Divide & Floating Point Arithmetic

361 ALU.34

You might also like