CSE477
VLSI Digital Circuits
Fall 2002
Lecture 10: The Inverter, A Dynamic View
Mary Jane Irwin ( [Link]/~mji )
[Link]/~cg477
[Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
CSE477 L10 Inverter, Dynamic.1 Irwin&Vijay, PSU, 2002
Inverter Propagation Delay
Propagation delay is proportional to the time-constant of
the network formed by the pull-down resistor and the load
capacitance
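[Figure: inverter with Vin = VDD; the NMOS on-resistance Rn discharges CL from VDD toward Vout = 0]
$t_{pHL} = f(R_n, C_L)$
$t_{pHL} = \ln(2)\, R_{eqn} C_L = 0.69\, R_{eqn} C_L$
$t_{pLH} = \ln(2)\, R_{eqp} C_L = 0.69\, R_{eqp} C_L$
$t_p = (t_{pHL} + t_{pLH})/2 = 0.69\, C_L (R_{eqn} + R_{eqp})/2$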
To equalize rise and fall times make the on-resistance of
the NMOS and PMOS approximately equal.
Inverter Transient Response
[Figure: simulated transient response, Vin and Vout (V) versus t (sec, x 10^-10), with tf, tr, tpHL and tpLH marked]
VDD = 2.5 V, 0.25 µm process
W/Ln = 1.5
W/Lp = 4.5
Reqn = 13 kΩ (÷ 1.5)
Reqp = 31 kΩ (÷ 4.5)
tpHL = 36 psec
tpLH = 29 psec
so tp = 32.5 psec
From simulation: tpHL = 39.9 psec and tpLH = 31.7 psec
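The hand estimates on this slide can be reproduced from the first-order delay model. This is a minimal sketch, not the lecture's own calculation: the load capacitance is not stated on the slide, so C_LOAD = 6 fF is an assumption back-calculated from tpHL = 36 psec, and the function name is illustrative.

```python
import math

def propagation_delays(r_eqn, r_eqp, c_load):
    """Return (tpHL, tpLH, tp) in seconds for the first-order RC model."""
    t_phl = math.log(2) * r_eqn * c_load  # output pulled down through the NMOS
    t_plh = math.log(2) * r_eqp * c_load  # output pulled up through the PMOS
    return t_phl, t_plh, (t_phl + t_plh) / 2

# Equivalent resistances from the slide: 13 kOhm / 1.5 and 31 kOhm / 4.5
R_EQN = 13e3 / 1.5
R_EQP = 31e3 / 4.5
C_LOAD = 6e-15  # ASSUMPTION: not given on the slide, back-calculated from tpHL

t_phl, t_plh, t_p = propagation_delays(R_EQN, R_EQP, C_LOAD)
print(f"tpHL = {t_phl * 1e12:.1f} ps")
print(f"tpLH = {t_plh * 1e12:.1f} ps")
print(f"tp   = {t_p * 1e12:.1f} ps")
```

The simulated values on the slide are somewhat larger than these estimates, which is expected from a model that treats each transistor as a fixed resistor.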
Inverter Propagation Delay, Revisited
To see how a designer can optimize the delay of a gate, expand Req in the delay equation:
$t_{pHL} = 0.69\, R_{eqn} C_L = 0.69 \left( \frac{3}{4} \frac{C_L V_{DD}}{I_{DSATn}} \right) \approx \frac{0.52\, C_L}{(W/L)_n\, k'_n\, V_{DSATn}}$
The approximation holds for $V_{DD} \gg V_{Tn} + V_{DSATn}/2$.
[Figure: tp (normalized) versus VDD (V), for VDD from 0.8 V to 2.4 V]
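The supply-voltage dependence hidden in IDSATn can be sketched numerically. This is a rough illustration under stated assumptions: it uses the velocity-saturated current model $I_{DSATn} = k'_n (W/L)_n V_{DSATn} (V_{DD} - V_{Tn} - V_{DSATn}/2)$, the device values VTn = 0.43 V and VDSATn = 0.63 V are assumed (they are not given on the slide), and the delay is normalized to VDD = 2.5 V, which is also an assumption.

```python
def normalized_tphl(vdd, vt=0.43, vdsat=0.63, vdd_ref=2.5):
    """tpHL at vdd relative to tpHL at vdd_ref.

    Uses tpHL proportional to VDD / (VDD - VT - VDSAT/2); CL, W/L and k'
    cancel in the ratio. Only meaningful for vdd well above vt + vdsat/2.
    vt and vdsat are ASSUMED values, not taken from the slide.
    """
    def shape(v):
        return v / (v - vt - vdsat / 2)
    return shape(vdd) / shape(vdd_ref)

for vdd in (1.0, 1.5, 2.0, 2.5):
    print(f"VDD = {vdd:.1f} V -> tp (normalized) = {normalized_tphl(vdd):.2f}")
```

The ratio flattens as VDD grows, which is the behaviour the approximate expression captures: once VDD is well above VTn + VDSATn/2, raising it further buys little delay.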
Design for Performance
Reduce CL
internal diffusion capacitance of the gate itself
- keep the drain diffusion as small as possible
interconnect capacitance
fanout
Increase W/L ratio of the transistor
the most powerful and effective performance optimization
tool in the hands of the designer
watch out for self-loading! – when the intrinsic capacitance
dominates the extrinsic load
Increase VDD
can trade-off energy for performance
increasing VDD above a certain level yields only very minimal
improvements
reliability concerns enforce a firm upper bound on VDD
NMOS/PMOS Ratio
So far we have sized the PMOS and NMOS so that the Req’s
match (PMOS-to-NMOS width ratio of 3 to 3.5)
symmetrical VTC
equal high-to-low and low-to-high propagation delays
If speed is the only concern, reduce the width of the
PMOS device!
widening the PMOS degrades the tpHL due to larger parasitic
capacitance
$\beta = (W/L_p)/(W/L_n)$
$r = R_{eqp}/R_{eqn}$ (resistance ratio of identically-sized PMOS and NMOS)
$\beta_{opt} = \sqrt{r}$ when wiring capacitance is negligible
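As a quick numerical check of the two sizing targets, the sketch below evaluates both ratios for r = 31 kΩ / 13 kΩ, the equivalent resistances quoted earlier in the lecture. The function names are illustrative, and the wiring capacitance is taken as negligible.

```python
import math

def beta_symmetric(r):
    """PMOS/NMOS width ratio that equalizes Reqp and Reqn."""
    return r

def beta_optimal(r):
    """PMOS/NMOS width ratio that minimizes tp when wiring capacitance is negligible."""
    return math.sqrt(r)

r = 31e3 / 13e3  # resistance ratio of identically sized PMOS and NMOS
print(f"symmetric response: beta = {beta_symmetric(r):.2f}")
print(f"minimum delay:      beta = {beta_optimal(r):.2f}")
```

The minimum-delay ratio is clearly smaller than the symmetric one, which is the point of the slide: a narrower PMOS gives up symmetry but adds less parasitic capacitance.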
PMOS/NMOS Ratio Effects
[Figure: tpLH, tpHL and tp (sec, x 10^-11) versus $\beta = (W/L_p)/(W/L_n)$, for $\beta$ from 1 to 5]
$\beta$ of 2.4 (= 31 kΩ/13 kΩ) gives symmetrical response
$\beta$ of 1.6 to 1.9 gives optimal performance
Device Sizing for Performance
Divide the capacitive load, CL, into
Cint : intrinsic - diffusion and Miller effect
Cext : extrinsic - wiring and fanout
tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint)
where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the
gate
Widening both PMOS and NMOS by a factor S reduces
Req by an identical factor (Req = Rref/S), but raises the
intrinsic capacitance by the same factor (Cint = SCiref)
tp = 0.69 Rref Ciref (1 + Cext/(SCiref)) = tp0(1 + Cext/(SCiref))
tp0 is independent of the sizing of the gate; with no load the drive
of the gate is totally offset by the increased capacitance
any S sufficiently larger than (Cext/Cint) yields the best
performance gains with least area impact
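The diminishing return from upsizing can be seen by evaluating the last delay expression for a few sizing factors. This is only a sketch: TP0 and the ratio Cext/Ciref below are hypothetical values chosen for illustration, not data from the lecture.

```python
def sized_delay(s, tp0, cext_over_ciref):
    """tp = tp0 * (1 + Cext / (S * Ciref)) for sizing factor S."""
    return tp0 * (1 + cext_over_ciref / s)

TP0 = 20e-12   # HYPOTHETICAL intrinsic delay, in seconds
RATIO = 1.0    # HYPOTHETICAL Cext/Ciref of the reference (S = 1) gate

for s in (1, 2, 5, 10, 15):
    tp = sized_delay(s, TP0, RATIO)
    print(f"S = {s:2d} -> tp = {tp * 1e12:.1f} ps")

# Fraction of the maximum possible improvement (tp(1) - tp0) captured at S = 5
gain_at_5 = (sized_delay(1, TP0, RATIO) - sized_delay(5, TP0, RATIO)) / (
    sized_delay(1, TP0, RATIO) - TP0
)
```

Whatever the values chosen, the delay can never drop below tp0, so large S mostly costs area.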
Sizing Impacts on Delay
[Figure: tp (sec, x 10^-11) versus sizing factor S (1 to 15) for a fixed load; the flat region at large S is the self-loading effect (intrinsic capacitance dominates)]
The majority of the improvement is already obtained for S = 5. Sizing
factors larger than 10 barely yield any extra gain (and cost significantly
more area).
Impact of Fanout on Delay
Extrinsic capacitance, Cext, is a function of the fanout of
the gate - the larger the fanout, the larger the external
load.
First determine the input loading effect of the inverter.
Both Cg and Cint are proportional to the gate sizing, so $C_{int} = \gamma C_g$,
where $\gamma$ is independent of gate sizing, and
$t_p = t_{p0} (1 + C_{ext}/(\gamma C_g)) = t_{p0} (1 + f/\gamma)$
i.e., the delay of an inverter is a function of the ratio
between its external load capacitance and its input gate
capacitance: the effective fan-out f
f = Cext/Cg
Inverter Chain
Real goal is to minimize the delay through an inverter
chain
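[Figure: chain of N inverters (1, 2, ..., N) between In and Out; the first stage has input capacitance Cg,1 and the last stage drives CL]
the delay of the j-th inverter stage is
$t_{p,j} = t_{p0} (1 + C_{g,j+1}/(\gamma C_{g,j})) = t_{p0} (1 + f_j/\gamma)$
and $t_p = t_{p,1} + t_{p,2} + \dots + t_{p,N}$
so $t_p = \sum_{j=1}^{N} t_{p,j} = t_{p0} \sum_{j=1}^{N} (1 + C_{g,j+1}/(\gamma C_{g,j}))$, with $C_{g,N+1} = C_L$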
If CL is given
How should the inverters be sized?
How many stages are needed to minimize the delay?
Sizing the Inverters in the Chain
The optimum size of each inverter is the geometric mean
of its neighbors – meaning that if each inverter is sized up
by the same factor f wrt the preceding gate, it will have the
same effective fan-out and the same delay
$f = \sqrt[N]{C_L/C_{g,1}} = \sqrt[N]{F}$
where F represents the overall effective fan-out of the
circuit ($F = C_L/C_{g,1}$)
and the minimum delay through the inverter chain is
$t_p = N\, t_{p0} (1 + \sqrt[N]{F}/\gamma)$
The relationship between tp and F is linear for one inverter,
square root for two, etc.
Example of Inverter Chain Sizing
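[Figure: chain of three inverters between In and Out, sized 1, f = 2 and f^2 = 4, with input capacitance Cg,1 and load CL = 8 Cg,1]
CL/Cg,1 has to be evenly distributed over N = 3 inverters
$C_L/C_{g,1} = 8/1$
$f = \sqrt[3]{8} = 2$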
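The sizing rule and this example can be checked with a short script. It is a sketch only: the function names are illustrative, delays are expressed in units of tp0, and the self-loading factor is assumed to be 1 because the slide does not state it.

```python
def size_chain(c_g1, c_load, n_stages):
    """Input capacitance of each stage when every stage has the same fan-out f."""
    f = (c_load / c_g1) ** (1.0 / n_stages)
    return [c_g1 * f ** j for j in range(n_stages)]

def chain_delay(gate_caps, c_load, tp0=1.0, gamma=1.0):
    """Sum of per-stage delays tp0 * (1 + C_next / (gamma * C_this))."""
    loads = gate_caps[1:] + [c_load]  # each stage drives the next one; the last drives CL
    return sum(tp0 * (1 + nxt / (gamma * cur)) for cur, nxt in zip(gate_caps, loads))

caps = size_chain(c_g1=1.0, c_load=8.0, n_stages=3)
print("stage sizes:", [round(c, 3) for c in caps])
print("delay, geometric sizing:", round(chain_delay(caps, 8.0), 3))
print("delay, sizes 1, 1, 4:   ", round(chain_delay([1.0, 1.0, 4.0], 8.0), 3))
```

Comparing against an arbitrary non-geometric sizing such as 1, 1, 4 shows that unequal fan-outs cost delay for the same number of stages.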
Determining N: Optimal Number of Inverters
What is the optimal value for N given F ($= f^N$)?
if the number of stages is too large, the intrinsic delay of the
stages becomes dominant
if the number of stages is too small, the effective fan-out of each
stage becomes dominant
The optimum N is found by differentiating the minimum
delay expression with respect to the number of stages and
setting the result to 0, giving
$\gamma + \sqrt[N]{F} - (\sqrt[N]{F} \ln F)/N = 0$
For $\gamma = 0$ (ignoring self-loading) $N = \ln(F)$ and the
effective fan-out becomes $f = e = 2.71828$
For $\gamma = 1$ (the typical case) the optimum effective fan-out
(tapering factor) turns out to be close to 3.6
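Substituting $f = \sqrt[N]{F}$, so that $(\ln F)/N = \ln f$, turns the condition above into $\gamma + f - f \ln f = 0$, which has no closed-form solution for $\gamma > 0$. The sketch below solves it by bisection; the function name, the search bracket and the tolerance are illustrative choices, not part of the lecture.

```python
import math

def optimum_fanout(gamma, tol=1e-9):
    """Solve gamma + f - f*ln(f) = 0 for f by bisection on [e, 100]."""
    def h(f):
        return gamma + f - f * math.log(f)

    lo, hi = math.e, 100.0  # h(e) = gamma >= 0, and h decreases for f > 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for gamma in (0.0, 1.0, 2.0, 3.0):
    print(f"gamma = {gamma:.0f} -> f_opt = {optimum_fanout(gamma):.2f}")
```

For a given F, the matching number of stages is N = ln(F)/ln(f_opt), rounded to a practical integer.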
Optimum Effective Fan-Out
[Figure, left: optimum effective fan-out f_opt versus $\gamma$, for $\gamma$ from 0 to 3]
[Figure, right: normalized delay versus f, for f from 1 to 5]
Choosing f larger than optimum has little effect on delay
and reduces the number of stages (and area).
Common practice to use f = 4 (for $\gamma = 1$)
But too many stages have a substantial negative impact on delay
Example of Inverter (Buffer) Staging
Cg,1 = 1 and CL = 64 Cg,1 in every case; tp is in units of tp0

N    f      tp      Inverter sizes
1    64     65      1
2    8      18      1, 8
3    4      15      1, 4, 16
4    2.8    15.3    1, 2.8, 8, 22.6
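The table can be regenerated from the minimum-delay expression, which also makes it easy to pick the best whole number of stages. This is a sketch assuming a self-loading factor of 1 and delays in units of tp0; the function name is illustrative.

```python
def staged_delay(big_f, n_stages, gamma=1.0):
    """Minimum delay, in units of tp0, of N equally tapered stages driving fan-out F."""
    f = big_f ** (1.0 / n_stages)
    return n_stages * (1 + f / gamma)

F_TOTAL = 64
for n in range(1, 5):
    f = F_TOTAL ** (1.0 / n)
    print(f"N = {n}  f = {f:5.1f}  tp = {staged_delay(F_TOTAL, n):5.1f}")

# Search a small range of integer stage counts for the fastest chain
best_n = min(range(1, 9), key=lambda n: staged_delay(F_TOTAL, n))
print("fastest number of stages:", best_n)
```

The minimum is shallow: three and four stages differ by only a fraction of tp0, while one stage is several times slower.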
Impact of Buffer Staging for Large CL
Delay in units of tp0 ($\gamma = 1$)

F         Unbuffered    Two Stage Chain    Opt. Inverter Chain
10        11            8.3                8.3
100       101           22                 16.5
1,000     1001          65                 24.8
10,000    10,001        202                33.1
Impressive speed-ups with optimized cascaded
inverter chain for very large capacitive loads.
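The three columns can be approximated with the expressions from the preceding slides. This is a sketch under assumptions: delays are in units of tp0 with a self-loading factor of 1, and the optimized chain is modelled with a fixed fan-out of 3.6 per stage and a possibly non-integer stage count N = ln(F)/ln(3.6). That idealization is my reading of how the last column was obtained; the slide does not state it.

```python
import math

def unbuffered(big_f, gamma=1.0):
    """Single inverter driving the whole load."""
    return 1 + big_f / gamma

def two_stage(big_f, gamma=1.0):
    """Two equally tapered stages, each with fan-out sqrt(F)."""
    return 2 * (1 + math.sqrt(big_f) / gamma)

def tapered_chain(big_f, f=3.6, gamma=1.0):
    """Chain with fan-out f per stage; N is idealized and may be non-integer."""
    n_stages = math.log(big_f) / math.log(f)
    return n_stages * (1 + f / gamma)

for big_f in (10, 100, 1_000, 10_000):
    print(f"F = {big_f:6d}  unbuffered = {unbuffered(big_f):8.0f}  "
          f"two-stage = {two_stage(big_f):6.1f}  chain = {tapered_chain(big_f):5.1f}")
```

The unbuffered delay grows linearly with F and the two-stage delay with its square root, while the tapered chain grows only logarithmically.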
Input Signal Rise/Fall Time
In reality, the input signal changes gradually (and both
PMOS and NMOS conduct for a brief time). This affects the
current available for charging/discharging CL and
impacts propagation delay.
[Figure: tp (sec, x 10^-11) versus input slope ts (sec, x 10^-11), for a minimum-size inverter with a fan-out of a single gate]
tp increases linearly with increasing input slope, ts,
once ts > tp
ts is due to the limited driving capability of the preceding gate
Design Challenge
A gate is never designed in isolation: its performance is
affected by both the fan-out and the driving strength of the
gate(s) feeding its inputs.
$t_p^i = t_{step}^i + \eta\, t_{step}^{i-1}$ ($\eta \approx 0.25$)
Keep signal rise times smaller than or equal to the gate
propagation delays.
good for performance
good for power consumption
Keeping rise and fall times of the signals small and of
approximately equal values is one of the major challenges
in high-performance designs - slope engineering.
Delay with Long Interconnects
When gates are farther apart, wire capacitance and
resistance can no longer be ignored.
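[Figure: driver (Vin) with intrinsic capacitance cint, connected through a wire (rw, cw, L) to Vout, which is loaded by cfan]
$t_p = 0.69\, R_{dr} C_{int} + (0.69\, R_{dr} + 0.38\, R_w) C_w + 0.69 (R_{dr} + R_w) C_{fan}$
$\quad = 0.69\, R_{dr} (C_{int} + C_{fan}) + 0.69 (R_{dr} c_w + r_w C_{fan}) L + 0.38\, r_w c_w L^2$
where $R_{dr} = (R_{eqn} + R_{eqp})/2$, $R_w = r_w L$ and $C_w = c_w L$
Wire delay rapidly becomes the dominant factor (due to
the quadratic term) in the delay budget for longer wires.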
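The two forms of the delay expression can be checked against each other numerically. This is a sketch only: every component value below is hypothetical and chosen for illustration, with wire resistance and capacitance given per micrometre and the length L in micrometres.

```python
def wire_delay_lumped(r_dr, c_int, c_fan, r_w_total, c_w_total):
    """First form: total wire resistance Rw and capacitance Cw."""
    return (0.69 * r_dr * c_int
            + (0.69 * r_dr + 0.38 * r_w_total) * c_w_total
            + 0.69 * (r_dr + r_w_total) * c_fan)

def wire_delay_vs_length(r_dr, c_int, c_fan, r_w, c_w, length):
    """Second form: constant, linear and quadratic terms in the wire length."""
    constant = 0.69 * r_dr * (c_int + c_fan)
    linear = 0.69 * (r_dr * c_w + r_w * c_fan) * length
    quadratic = 0.38 * r_w * c_w * length ** 2
    return constant + linear + quadratic

# HYPOTHETICAL values, for illustration only
R_DR = 1e3      # driver resistance, ohm
C_INT = 3e-15   # intrinsic capacitance, F
C_FAN = 3e-15   # fan-out capacitance, F
R_W = 0.075     # wire resistance, ohm per um
C_W = 1e-16     # wire capacitance, F per um

for length_um in (100, 1_000, 10_000):
    a = wire_delay_lumped(R_DR, C_INT, C_FAN, R_W * length_um, C_W * length_um)
    b = wire_delay_vs_length(R_DR, C_INT, C_FAN, R_W, C_W, length_um)
    print(f"L = {length_um:6d} um  tp = {a * 1e12:8.2f} ps  (length form: {b * 1e12:8.2f} ps)")
```

How early the quadratic term takes over depends on the driver: the smaller Rdr is relative to the wire resistance, the shorter the wire at which it dominates.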
Next Lecture and Reminders
Next lecture
Designing fast logic
- Reading assignment – Rabaey, et al, 6.2.1
Reminders
Project specifications due today
HW3 due next Thursday, Oct 10th (hand in to TA)
Class cancelled on Oct 10th as make up for evening midterm
I will be out of town Oct 10th through Oct 15th and Oct 18th
through Oct 23rd, so office hours during those periods are
cancelled
We will have a guest lecturer on Oct 22nd
Evening midterm exam scheduled
- Wednesday, October 16th from 8:15 to 10:15pm in 260 Willard
- Only one midterm conflict filed for so far