0% found this document useful (0 votes)

33 views16 pages

Gate Sizing

The document discusses gate sizing, a critical process in integrated circuit design that optimizes transistor widths to achieve timing closure and minimize power consumption. It covers theoretical and practical aspects, including delay models, the impact of gate sizing on power and area, and various algorithms for discrete and continuous sizing. The introduction highlights the trade-offs involved in sizing, such as the balance between speed and increased capacitance, and the implications for circuit performance.

Uploaded by

Freddy González

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

33 views16 pages

Gate Sizing

Uploaded by

Freddy González

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 16

Gate Sizing 10

Stephan Held and Jiang Hu

CONTENTS
10.1 Introduction 246
10.1.1 Static Timing 247
10.1.2 Delay Models 247
10.1.2.1 The RC Delay Model 248
10.1.2.2 Model-Independent Delay Characteristics 249
10.1.3 Impact on Power and Area 249
10.1.3.1 Area 250
10.1.4 Problem Formulations 250
10.2 Theoretical Background 251
10.2.1 Discrete Sizing 251
10.2.2 Continuous Sizing 251
10.2.3 Lagrangian Relaxation 252
10.2.3.1 Solving the Lagrangian Subproblem 253
10.2.3.2 Solving the Lagrangian Dual Problem 253
10.3 Practical Algorithms 254
10.3.1 Sensitivity-Based Heuristics 254
10.3.2 Dedicated Worst-Slack Maximization 255
10.3.3 Continuous Optimization with Self-Snapping 255
10.3.4 Practical Gate Sizing by Lagrangian Relaxation 255

245

© 2016 by Taylor & Francis Group, LLC

246 10.1 Introduction

10.3.5 Heuristics Based on Timing Budgeting 257

10.3.6 Gate Sizing in Electronic Design Automation Tools 258
10.4 Conclusions and Outlook 258
References 259

10.1 INTRODUCTION

Gate sizing is a central step for achieving timing closure and minimizing the power consumption
of integrated circuits. It refers to determining the widths of the transistors inside a logic gate,
where wider transistors imply a faster charging and discharging of the output capacitance. As a
drawback, larger transistors increase the input capacitances and thus the load, delays, and transition
times at the upstream gates. An increased transition time slows down sibling gates driven by
common predecessors.
The delay characteristics of a gate are also affected by its threshold voltage Vt. By varying the
oxide thickness or materials in the gate, the threshold voltage can be lowered to speed it up. As a
drawback, the leakage current Ileak grows exponentially with a falling threshold voltage:

I leak = ae - bVt

for constants a, b Î + (see Section 3.1.3). As every alternative threshold voltage on a design
requires a separate production step with additional masks, usually no more than four threshold
voltages are available.
Further techniques for optimizing gate delays have a smaller impact on the end result compared
to sizing and Vt-assignment and are, therefore, seldom used. First, the ratio of the p-type and
n-type transistor sizes (β-ratio) is adjusted, trading off the rising and falling signal delay. Second,
the sizes of serially connected transistors inside a gate are modified to balance the delays from
different inputs to the output pin (tapering).
Most integrated circuits are designed using a circuit library B that contains a variety of
predesigned physical layouts implementing different logic functions and are of varying sizes and
threshold voltages. Instead of implementing each gate in the netlist individually, each gate is
mapped to one of the library circuits. Let G denote the set of all gates in the netlist. The implementation
of all gates is described by a vector x ÎB G , where x g ÎB specifies the physical implementation
of g ÎG . Of course, the logical function of a gate restricts the set of implementations to which
it can be assigned. By X Í B G we denote the set of assignments, where each gate is assigned to a
logically equivalent library circuit. In discrete gate sizing, B and thus X are finite sets. In continuous
gate sizing, logic gates can take any size between a lower- and upper-size bound, extending the
solution space compared to discrete sizing.
A key reason for using discrete gate libraries is that preprocessed timing models for each
library gate significantly speed up timing analysis compared to the numerical simulation
of continuously designed custom gates. Thus, they allow the processing of multimillion gate
instances by physical design tools. More recently, FinFET transistors introduced additional
discreteness in standard cell design. FinFETs are sized by adding multiple fins of unit size.
In microprocessor design, chips used to be partitioned into small custom circuits of a few
thousand gates, where the layout was essentially defined manually. Here, numerical timing
simulation is possible, and transistors can be sized continuously for optimum sizes, β-ratios, and
taperings. For algorithms designed for transistor sizing of custom circuits, we refer the reader
to the optimization methods in [7,42] that combine timing simulation with gradient descent
or interior point methods. The benefit of the flexibility of transistor sizing is hampered by the
restrictions imposed by partitioning the design into small blocks. Thus, transistor sizing on small
blocks has mostly been replaced by discrete gate sizing on larger units, even in microprocessor
design.
In this chapter, we first present theoretical results on discrete and continuous gate sizing
in Section 10.2. Then in Section 10.3, we focus on practical algorithms for discrete gate sizing.

© 2016 by Taylor & Francis Group, LLC

Chapter 10 – Gate Sizing 247

Readers that are mostly interested in practical algorithms can skip the theoretical part in
Section 10.2. We begin with the basic concepts of gate sizing.

10.1.1 STATIC TIMING

Physical design optimization is usually based on static timing constraints, represented by a

directed acyclic timing graph G (Chapter 6). G contains a vertex for each pin on the chip, and
edges are inserted between source and sinks of a net and the inputs and outputs of logic gates.
Starting at primary inputs and register outputs, the latest possible arrival time av is propagated
to each pin v ∈ V(G) along the topological order of G. Figure 10.1 shows a small netlist of a chip
and its timing graph G.
Static timing analysis implies the following (simplified) mathematical constraints:

av ³0 "v Î PI È RO,
(10.1) av + delay v ,w ( x) £ aw "(v , w) Î E(G ),
av £T "v Î PO È RI ,

where
E(G) is the edge set of the graph
PI, PO are the sets of primary inputs and outputs
RI, RO are register inputs and outputs
T is the design cycle time

The solution vector x ∈ X determines the gate sizes and threshold voltages. The edge delays
delayv,w(x) are functions of x, as we will describe in Section 10.1.2.
By introducing a super-vertex v⋆ representing the time zero (av = 0) and edges with constant
delay zero or −T, the three constraint types can be written in an uniform way:

(10.2) av + delay v ,w ( x) £ aw "(v , w) Î E(G ).

10.1.2 DELAY MODELS

Traditionally, the shape of a signal transition is approximated by (1) its arrival time, which is the
point where the voltage is half the supply voltage Vdd, and (2) its slew, which is the time of the
transition between 10% and 90% Vdd, as indicated in Figure 10.2. The 10/90 interval is chosen
arbitrarily. It should reflect the region where the voltage changes almost linearly, and sometimes,
the slew is defined by the 20/80 or 30/70 intervals.
The delay between an input and an output signal is the difference of their arrival times. For
higher accuracy, the signal transition can also be approximated by piecewise linear functions
with multiple pieces. The most accurate delay analysis is based on numerical circuit simulation
using the SPICE software [26]. Due to the high running times of simulation, industry standard
static timing sign-off tools, such as Synopsys PrimeTime or Cadence Tempus, use discrete gate

FIGURE 10.1 Example of a timing graph and its underlying netlist.

© 2016 by Taylor & Francis Group, LLC

248 10.1 Introduction

V V

Vdd Vdd
0.9Vdd 0.9Vdd

0.5Vdd 0.5Vdd

0.1Vdd 0.1Vdd
V0 V0
at t at t

(a) Slew (b) Slew

FIGURE 10.2 A (rising) signal transition (a) and its approximation (b).

libraries with preprocessed delay characteristics to analyze multi-million gate designs, providing
SPICE simulation as an option for the accurate analysis of the most critical paths.
Traditionally, the gate delays were approximated by two nondecreasing functions: one for the
delay and the other for the output slew. Both of these were taking the load capacitance and input
slew as arguments and were computed by interpolated table lookup. Wire delays were computed
as Elmore delays [10] or using asymptotic waveform emulation to account for resistive shield-
ing [32]. Due to the growing importance of capacitive coupling between wires, the current source
model has become the most popular delay model [8], usually analyzing stages from the input pins
of a gate to the inputs of the downstream gates.
The delay rules are characterized only within some upper bounds for the load capacitance
and input slews. Meaningful delays are only given if the capacitance and slew limits are met.
Thus, capacitance and slew constraints have a higher priority than the delay constraints in
Equation 10.2.

10.1.2.1 THE RC DELAY MODEL

A delay model that is often used in the literature is the RC delay model [4,11]. Here, the Elmore
delay formula is also used for modeling gate delays. The delay through a gate is proportional
to the product of its resistance and load capacitance. While the resistance of a gate is inversely
proportional to the gate size, the input pin capacitance of a gate is proportional to the size. Let
us assume that xg reflects the size of a gate g ÎG, that is, x g Î . Now, the RC delay through g
coincides for all timing arcs to the output pin. It is defined as

rg æ ö
delay g ( x) = ç C wire +
xg ç
è
å
g ¢Î fanout ( g )
c g ¢ x g ¢ + c intrinsic
g ÷,
÷
ø

where C wire , rg , c g ¢ , c intrinsic

g Î + are constants for the wire capacitance of the output net, the base
resistance, the base input pin capacitance, and the intrinsic capacitance, which depend on the
gate type. Wire delays are also modeled as RC delays using the Elmore formula [10]. Under the
RC delay model, efficient algorithms for gate sizing are possible, as we will discuss in Section 10.2.
A further simplification is the logical effort model [37], where the resistances and capacitances
are normalized to a reference inverter, yielding a technology-independent delay model, where
delays depend only on the topology of the pull-up and pull-down transistor networks in the gate.
In this model, a minimum delay solution for a single path can be computed by uniformly dis-
tributing the so-called stage delay. From the stage delay, an optimum size can be derived locally
at every gate. However, this approach does not easily extend to general netlists. Neither model
reflects the nonlinear nature of the delays. In the RC delay model, the gate delay is a linear

© 2016 by Taylor & Francis Group, LLC

Chapter 10 – Gate Sizing 249

EDA for . . .

160
100

Gate delay (ps)

140

120
50
100

80
0 10 20 0 200 400
(a) Load capacitance (fF) (b) Input slew (ps)

FIGURE 10.3 A typical gate delay as a (concave) function of load capacitance (a) and input slew (b),
given a fixed input slew (left) and load capacitance (right).

function in the load capacitance, while in reality it is rather a concave function, as shown in
Figure 10.3. The nonlinearity is further amplified by the slew dependency of the delay. Therefore,
these two models are used at most to guess initial sizes before placement but not during
subsequent optimization.

10.1.2.2 MODEL-INDEPENDENT DELAY CHARACTERISTICS

Delay and slew are increasing functions of load capacitance and input slew as in Figure 10.3.
Changing the size of a gate influences the delays and slews at the gate itself as well as all its
neighboring gates and all gates in their downstream cone. Assuming a constant input slew, an
increasing size reduces the delay through the gate and the output slews and thereby decreases
delays and slews at the downstream gates. On the other hand, it increases the load capacitance of
the upstream gates, their delays and output slews, and thereby the delays of the other sink gates
in the input nets as well as their successors. So, in fact, the input slew increases with the gate size.
If the upstream gate is very load sensitive, a strongly increasing input slew may degrade the delay
through the upsized gate instead of reducing it.

10.1.3 IMPACT ON POWER AND AREA

For details on power analysis, see Chapter 3 or [44]. Here, we emphasize the factors relevant for
gate sizing. The power consumption P(g) of a logic gate g ÎG is composed of the dynamic power
Pdynamic(g) that occurs when a gate changes its logic state and static power Pstatic(g) that occurs
while the gate is in a steady state:

P( g ) = Pdynamic ( g ) + Pstatic ( g ).

The dynamic power is, in turn, composed of the switching power and the short-circuit power:

Pdynamic ( g ) = Pswitching ( g ) + Pshort ( g ),

where the Pswitching measures the power for charging and discharging its input pin capacitances.
Wire capacitances are usually considered constant during gate sizing as interconnects can
stay in place after minor adjustments to ensure pin access. For the purpose of gate sizing, we
assume that the dynamic power of a gate is more or less proportional to its size. Other factors
that influence the dynamic power are the supply voltage that is not sensitive to gate sizing
and the switching behavior, which is hard to predict and usually estimated by a switching
frequency per net.
Modeling becomes more complicated for the short-circuit power, which is observed under
simulation. It is dissipated at low-Vt gates during an input signal transition. If the threshold volt-
age is below 50% of the supply voltage, there is a period where both p-type and n-type transistors

© 2016 by Taylor & Francis Group, LLC

250 10.1 Introduction

in the gate are turned on, forming a short circuit. The short-circuit current increases with a larger
input slew, a lower threshold voltage, and a larger gate size. If a gate has steep input slews or a high
threshold voltage, the short-circuit current is usually negligible. As low-Vt gates are used mostly
on critical parts, small delays and slews are targeted anyway. But a designer or a tool should be
aware that downsizing a gate to save power will increase the short-circuit power at downstream
gates with a low threshold voltage. Often, the short-circuit power is not considered directly but
mitigated by enforcing low slew limits for low-Vt gates.
While dynamic and short-circuit power dissipation occurs while a gate is switching, the leak-
age current is also present at a steady state. There is a permanent small current leaking through
the transistor gates. This current is higher for lower threshold voltages and wider gates, with an
exponential dependency on the threshold voltage. In particular, for mobile devices, it is critical
to keep the static power small. Among many physical factors, the leakage power depends on the
current state of the input signals. These state patterns are not affected by gate sizing. We assume
that the leakage power of a gate is defined by its physical realization and not by the realization of
other gates.
For simplicity, gate sizing algorithms estimate power by separable functions, where the
contribution of one gate does not affect the contributions of other gates:

(10.3) P total = Pstatic

total total
+ Pdynamic = å( P
gÎG
static ( g ) + Pdynamic (gg )) .

This is justifiable for the switching power and the leakage power. The short-circuit power can and
usually is added as a small slew-independent constant to the dynamic power after imposing low
upper bounds on the input slews of low-Vt gates.

10.1.3.1 AREA

The form factor of a gate is also an important aspect of gate sizing. If the chip gets locally over-
filled, it can become difficult or even infeasible to legalize the gates. This can, of course, also
be adjusted by changing the floorplan, but at some point in the design cycle, gate sizing should
match local placement density targets to avoid disturbance by legalization. At the very end, it
should avoid gate overlaps. In the literature, placement density is a rarely considered constraint
for gate sizing. One approach is proposed in [6]. Here, a single global density violation is punished by
a weighted quadratic penalty in the objective function during gate sizing. Gate sizing is repeated
with an increasing penalty weight until the density constraint is met. Production tools, such as
those used in [41], partition the chip into small placement bins and meet the placement density
target inside each bin.

10.1.4 PROBLEM FORMULATIONS

Timing, power, and area can be considered as objectives or constraints in various combinations.
A typical objective in gate sizing is to minimize the power consumption such that the timing
constraints in Equation 10.2 are met:

min P total ( x)
(10.4) s.t . av + delay v ,w ( x) £ aw "(v , w) Î E(G )
xÎX

In practice, area constraints must be fulfilled as well. Early in the design cycle, when global gate
sizing is performed, timing constraints are usually unsatisfiable. Therefore, minimizing the cycle
time T or the absolute total negative slack (TNS) is usually the prevalent goal during most parts
of the design cycle, where the TNS is the sum of negative slacks at the endpoints of signal paths.
This can be done in linear combination with power and area or imposing area constraints. But
one could also ask for the minimum area under timing and power constraints.

© 2016 by Taylor & Francis Group, LLC

Chapter 10 – Gate Sizing 251

10.2 THEORETICAL BACKGROUND

The discrete and nonlinear nature of the problem makes gate sizing practically and mathemati-
cally difficult. Assuming well-behaved delay functions and continuous sizability, gate sizing
becomes tractable to some extent. In this section, we give an overview on the most important
theoretical results for discrete and continuous gate sizing.

10.2.1 DISCRETE SIZING

Even under mild assumptions on the delay model, discrete gate sizing is a hard problem, as
the connection to the discrete time-cost trade-off problem shows. Here, we are given an acyclic
directed graph G = (V, E), and for each edge e ∈ E, there is a finite set T e = {te1 ,¼, tek } Ì R of edge
delays and a nonincreasing cost function ce : T e ® R + . In the deadline problem, we wish to choose
a delay for each edge such that the length of each path stays within a given deadline D minimiz-
ing the total cost. This problem cannot be approximated within a factor of 1.36 unless P = NP,
because it is as hard as the vertex cover problem [9,14]. Assuming the unique games conjecture,
which is a stronger assumption than P ≠ NP, there is even no constant-factor approximation
algorithm [39]. Theoretically, polynomial time algorithms or approximation algorithms with
better guarantees may exist for special delay functions. But in gate sizing, the delay functions
appear rather more involved due to their nonlinearity and nonseparability. The nonseparability
means that the delay of an edge in the timing graph depends on the sizes of multiple gates.
For netlists with a tree structure and certain delay models, pseudo-polynomial dynamic
programming (DP) algorithms have been used successfully. One of the first was presented in [3].
They can be turned into a fully polynomial time approximation scheme as a consequence of the
approximation scheme for buffering by [17]. For netlists with bounded width, that is, with a path-
like structure, an approximation scheme for delay minimization obeying an upper bound on the
power was developed in [22].
When nonlinear delay models with slew propagation come into play, the only known algo-
rithms for general discrete gate sizing that provide a size guarantee while meeting the timing
constraints are based on complete enumeration, potentially sped up by branch and bound [45].
Such techniques can be used to postoptimize small groups of gates [20].

10.2.2 CONTINUOUS SIZING

Mathematical tractability improves the plain gate sizing problem (without discrete Vt changes)
if one assumes continuously sizable gates and posynomial delay functions, which are defined as
follows. For each gate g ÎG, the realization xg can be chosen continuously within a lower and an
upper bound lg ≤ xg ≤ ug, where l g , u g Î . Now the set X of feasible assignments consists of gate
sizing solutions x Î X = { x¢ Î RG : l g £ x g ¢ £ u g for all g Î G }.
A posynomial function f : n ® is of the form
K

(10.5) f ( x) = åc x
k =1
b1 k
k 1 x2b2 k xnbnk ,

where ck > 0 and bik Î . Posynomial delay functions occur, for instance, if gates are modeled as
RC circuits with bij ∈ {−1, 0, 1} [4,11].
The posynomial formulation for the circuit/transistor sizing problem was introduced by
Fishburn and Dunlop [11]. A globally optimum solution to the transistor sizing problem based on
geometric programming was first given in [35]. The nice property of geometric programs is that
they can be turned into convex programs by the variable transformation yi = logxi. Therefore, a
local optimum is always a global optimum.
General-purpose geometric programming solvers mostly implement an interior point method
for convex programs with polynomial running time. The drawback of interior point methods is

© 2016 by Taylor & Francis Group, LLC

252 10.2 Theoretical Background

that their iterations require a significant memory and running time. Solvable instance sizes are
limited to approximately 100,000 gates according to [1,2], which also give a detailed tutorial on
geometric programming.

10.2.3 LAGRANGIAN RELAXATION

A prominent practical approach to gate sizing is based on Lagrangian relaxation (LR), which was
analyzed in detail in [4] and later in [43]. As this is also the motivation for many discrete gate
sizing heuristics, we will describe this approach in detail.
In the LR approach, the timing constraints are moved to the objective weighted by Lagrange
multipliers l Î E³(0G ) that penalize constraint violations. Applied to Equation 10.4, this leads to
the following Lagrange function (called Lagrangian):

(10.6) L(l, x, a) = P total ( x) + å

( v ,w )ÎE (G )
l v ,w ( av + delay v ,w ( x) - aw ) .

For given λ ≥ 0, the Lagrangian dual function

(10.7) {
L (l) : = inf L(l, x, a) : x Î X , a Î RV (G ) }
is a lower bound for the objective value in Equation 10.4. To get the best possible lower bound,
we have to solve

(10.8) sup L (l).

l ³0

If the gate sizing problem in Equation 10.4 has a strictly feasible solution, Equations 10.4 and 10.8
have the same value and the Lagrangian duality gap is zero.
If no strictly feasible solutions exists, Equation 10.8 may not have a finite solution. Note that
for a minimum cycle time T in Equation 10.1 the problem is not strictly feasible, because the
constraints on the critical path can only be satisfied with equality. However, Wang et al. [43]
specify regularity conditions for the objective function and the delay functions so that the duality
gap between the gate sizing problems (Equation 10.4) and the Lagrangian dual problem (Equation
10.8) is zero. But the supremum in Equation 10.8 is not attained by a finite value, and instead, the
multipliers of the tight constraints tend toward infinity.
The crucial observation in [4] is that the Lagrangian can be separated into two parts, one
depending only on the size vector x and the other only on the arrival time vector a:

(10.9) L(l, x, a) = L1(l, x) + L2 (l, a)

with

(10.10) L1(l, x) = P total ( x) + å

( v ,w )ÎE (G )
l v ,wdelay v ,w ( x)

and

L2 (l, a) = å l v , w (a v - a w )
( v ,w )ÎE (G )

æ ö
å å å
1
(10.11) = av ç l v ,w - l u ,v ÷ .
2 vÎV (G ) ç (v ,w )ÎE (G ) ÷
è ( u ,v )ÎE (G ) ø

© 2016 by Taylor & Francis Group, LLC

Chapter 10 – Gate Sizing 253

Now Equation 10.11 shows that the Lagrange multipliers must obey network flow conservation
constraints:

å
( v ,w )ÎE (G )
l v ,w = å
( u ,v )ÎE (G )
l u ,v .

Otherwise, as the arrival time variables av (v ∈ V(G)) are unrestricted,

min L2 (l, a) = -¥ and thus as X is compact L (l) = -¥.
a

Thus, to solve the Lagrangian dual problem in Equation 10.8, the Lagrange multipliers must be
restricted to the space N of nonnegative network flows, as can also be derived from the Karush–
Kuhn–Tucker optimality conditions [4]. In this case, L2(l, a) = 0 for all a and L(l, x, a) = L1(l, x).
The Lagrangian dual problem (Equation 10.8) is usually solved by a subgradient method and
involves solving the Lagrangian subproblem (Equation 10.7) in every iteration.

10.2.3.1 SOLVING THE LAGRANGIAN SUBPROBLEM

The Lagrangian subproblem (Equation 10.7) is again a geometric program with the box con-
straints x ∈ X. It can be solved by a simple descent algorithm. In [4], convergence was proven
for a simple greedy algorithm if gates are modeled as RC circuits. In fact, it converges with a
linear convergence rate [5,19,34]. These theoretical statements rely on a specific gate delay model.
Empirically, linear running times were also observed for general delay models. They give reasons
to apply LR in gate sizing [4,12,21,23,24,28,40].

10.2.3.2 SOLVING THE LAGRANGIAN DUAL PROBLEM

By the Lagrangian duality theory, L is a concave function in the domain N of nonnegative net-
work flows. It is continuous but not everywhere differentiable in N . A prominent method for
solving the Lagrangian dual problem (Equation 10.8) is the projected subgradient method [4].
Starting with some initial vector of multipliers λ 0 ≥ 0, it determines in iteration k ≥ 0, a
subgradient sk at L (l k ) in ascent direction and updates the multipliers by

(10.12) l k +1 = p N (l k + a k sk ),

where
αk > 0 is the step length
p N is the projection to the set N of nonnegative network flows

For the plain subgradient method without projection and Lipschitz continuous functions,
convergence rates have been established. However, L⋆ is not Lipschitz continuous at the transi-
E (G )
tions between N and R ³0 N . Polyak [29] showed that the solution sequence (l k )kÎ in the
projected subgradient method contains a subsequence that converges to the optimum, that is,
k ®¥
the limit superior lim k ®¥L(l k ) converges. To prove this, the series αk must fulfill a k ® 0 and
å ( )
¥
a k = ¥. The projected subgradient steps are iterated until the duality gap P total ( xk ) - L(l k )
k =0
and the maximum or total delay constraint violation are within some positive error bound. Note
that the optimum solution x⋆ will likely be located on the boundary of the feasible set and most
iteration values xk will not satisfy all delay constraints.
Computing the exact projection p N is a quadratic minimum-cost flow problem that can be
solved in polynomial time [25]. However, this would still be too time consuming in practice.
Therefore, essentially, all papers employing the subgradient method for gate sizing resort to a
heuristic [40] that maps the multipliers into N , node by node in a backward topological order.
Convergence results similar to that in [29] are lacking for most of the employed heuristics. In [33],
convergence was shown when projecting with the method of alternating projections. This implies
that one can proceed as in [40] and replace their heuristic local mapping into N by a projection.

© 2016 by Taylor & Francis Group, LLC

254 10.3 Practical Algorithms

10.3 PRACTICAL ALGORITHMS

The gate sizing problem is so complicated and involves so many details that theoretically elegant
approaches are inadequate. Many practical techniques have been developed and shown to be
effective.

10.3.1 SENSITIVITY-BASED HEURISTICS

If one iteratively sizes a single gate at a time, it is natural to choose a gate whose change achieves
the largest benefit at the lowest cost. The TILOS algorithm in [11] was the first to systematically
use this idea. Since then, it became the rationale behind many sensitivity-based gate sizing
heuristics. The idea is intuitive yet can be very effective if implemented well. One recent example
is [15], which will be described in this section. An earlier work [38] will also be introduced to
illustrate different flavors of this approach.
A sensitivity-based gate sizing heuristic usually consists of two phases—one for timing
improvement and the other for power reduction. It typically starts with the timing phase
followed by the power phase. Sometimes, the two phases can be interleaved and applied
iteratively.
The sensitivity in the timing phase points to changes with the largest timing improvement and
the least power overhead. In [15], it is defined by

DTNS
(10.13) Sensitivity timing =
Dpower power _ exponent

where
TNS is the total negative slack
Δ indicates change
Sizing based on this sensitivity aims to minimize the absolute TNS. Since TNS is expensive to
compute precisely, it is approximated by

(10.14) DTNS(mik ) = å -Ddelay

g j ÎN i
k
j × N Paths j

where mik means changing gate g i ÎG to implementation k. The set Ni is the neighbors of gi and
consists of gates that share fanin nodes with gi. The delay change Ddelay kj of g j ∈ Ni is due to the
size or Vt (threshold voltage) change at gi. The notation NPathsj tells the total number of fanin and
fanout gates of gate g j that is affected by the delay change at g j. The power exponent is between
0 and 3. Extensions to this approach including an integration with a complex yet relatively slow
sign-off timer can be found in [20].
In [38], the timing phase sensitivity is defined as

å slack
1 -Ddelay
Sensitivity timing =
(10.15) Dpower arcs arc - slack min + K

where
Δpower is the power increase
Δdelay is the delay change when upsizing a gate
K is a constant

The arcs are (falling and rising) input arcs to a gate. In the timing recovery phase, one or several
gates with the maximum sensitivity are selected to be sized up or assigned to a lower Vt level.
This is repeated until the timing constraints are satisfied or no more improvement can be found.
Upon each gate size or Vt change, an incremental timing analysis is performed to obtain timing
updates.

Chapter 10 – Gate Sizing 255

A power reduction phase starts when the overall timing slack is nonnegative. In [15], the
power-phase sensitivity is defined as
-Dpower × slack
Sensitivity power =
(10.16) Ddelay × # paths

where
slack is at the output pin of the gate
#paths is the total fanin and fanout of the gate

The power-phase sensitivity in [38] is defined by

å Ddelay
slack arc
Sensitivity power = -Dpower
(10.17) arcs

In the power reduction phase, gates with large sensitivity are selected for downsizing or for
increasing the Vt level. This is repeated until the timing slack reaches minimal acceptable values.
Specialized heuristics for power minimization maintaining timing feasibility are given in [31].
Here, not only one gate is powered down at a time, but an independent set of gates is chosen based
on its total sensitivity.

10.3.2 DEDICATED WORST-SLACK MAXIMIZATION

To improve the worst slack near the end of gate sizing, the focus of local search can be adjusted
by changing the objectives in Equations 10.13 and 10.15: Instead of taking the sum of sensitivities
among all neighbors or input arcs, the worst slack over all upstream driver gates is maximized.
Under this objective, a gate on the critical path is sized for minimum delay. Furthermore, the
size of a gate with a more critical upstream gate cannot increase because this would increase the
capacitance and worsen the slack at the upstream gate. Instead, its size rather decreases to reduce
the load capacitance and delay of the upstream gate.
For the second reason, it is important to consider not only the gates on the critical path but
also their less critical downstream gates in the local search [18,20] regardless of the local objec-
tive. In this setting, it can be beneficial to proceed in a forward topological order because an
increased gate size on the critical path improves the input slews in the fanout and might allow
a more aggressive downsizing. With such an extended gate selection and the local worst-slack
objective explained earlier, close to optimum global worst slacks are achievable [18].

10.3.3 CONTINUOUS OPTIMIZATION WITH SELF-SNAPPING

With guidance from the function gradient, continuous optimization is usually more capable of
handling large problems than discrete optimization. However, Vt levels are highly discrete, and
its assignment is very difficult to be cast into a continuous optimization problem. The work by
Shah et al. [36] overcomes this difficulty by implementing each gate with two parallel parts of dif-
ferent Vt levels (Figure 10.4) and by sizing the two parts continuously. Since gate sizes are available
at a finer granularity than Vt levels, the discretization of a continuous sizing solutions is relatively
manageable [16]. Interestingly, it is observed in [36] that solving this new sizing problem through
convex programming usually results in integer Vt assignment solutions. More specifically, usually
one part (either low Vt or high Vt) of a parallel gate has size 0. It is further mathematically proved
that Vt assignment always self-snaps to integer solutions if there is no constraint on the gate sizes.

10.3.4 PRACTICAL GATE SIZING BY LAGRANGIAN RELAXATION

LR is a popular approach in solving gate sizing problems [4,12,21,23,24,28,40] mainly due to its
theoretical advantage on the trade-off between conflicting objectives, such as performance and
power. The foundation framework of LR-based gate sizing is introduced in the pioneer work [4]
and summarized in Section 10.2.3.

256 10.3 Practical Algorithms

A B B
A

Out

A Low Vt A High Vt

B B

FIGURE 10.4 Implementing a NAND2 gate by a parallel connection of two logic gates with differ-
ent Vt levels.

Later on, many practical techniques have been proposed to improve or extend this framework.
The Lagrangian dual problem is usually solved by subgradient methods [4]. In [40], it is observed
that the practical convergence of subgradient methods is very sensitive to the initial solution and
step size. The convergence is faster if the initial solution satisfies timing constraints. The work
by Tennakoon and Sechen [40] suggests invoking the iterative greedy algorithm [4] at the begin-
ning so that a timing-feasible initial solution is obtained. It also shows how to estimate Lagrange
multiplier values corresponding to the initial solution. With such an initial solution, the LR can
focus more on area minimization subject to timing constraints rather than delay reduction.
Tennakoon and Sechen [40] also propose the following Lagrange multiplier update method that
does not rely on a constant step size:

ì ai
ïl ji A if i Î PO, j Î fanin(i)
ï 0
ï aj
(10.18) l ji ,new = íl ji if j Î fanin(i)
ï ai - Di
ïl Di if i Î PI , j Î fanin(i)
ïî ji ai

where
λji is the multiplier for timing arc (j, i)
ai is the arrival time at node i
A0 is the required arrival time at the primary output (PO)
Di is the delay of gate i

Please note that the primary input (PI) nodes of a combinational logic circuit are driven by
ip-flops and, therefore, they have fanin nodes.
fl
Note that the multipliers are updated multiplicatively in Equation 10.18 and not in subgradient
direction. In practice, multiplicative updates often lead to faster convergence than the subgradi-
ent update shown in Equation 10.12, as demonstrated by the results discussed in [12] on ISPD
2012 and 2013 gate sizing benchmarks.

Chapter 10 – Gate Sizing 257

LR is also combined with DP-like solution search [23] so that a discrete solution is obtained
directly without the error-prone rounding procedure. DP has been quite successful for problems
with tree topologies, such as buffer insertion. Since the topology of combinational logic circuits is
a directed acyclic graph, the solution propagation of DP faces the challenge of history consistency
constraint. When two candidate solutions are merged at a node, the choices on their common
ancestor nodes should be identical. This constraint entails either large memory space or long
computation time as a large volume of history information needs to be stored or traced. Moreover,
it complicates pruning, which is a key component of the DP. The work of Liu and Hu [23] suggests
an interesting idea to solve this challenge. It first runs DP with the history consistency constraint
relaxed. Although the solution may be illegal, it provides a timing bound for each node. Next, it
applies a greedy-like heuristic to obtain a history consistent solution. Since the heuristic utilizes
the timing bound obtained from the DP, it is not myopic as a stand-alone greedy algorithm. Then,
the algorithm continues solution search in a manner that is between DP and a greedy heuristic.
Overall, it strives to gain the greatest possible benefit from DP, while avoiding excessive enumera-
tion for resolving inconsistencies.
The algorithm in [28] is also based on LR and DP. Unlike most previous works that constrain
only the longest path slack to be nonnegative, the problem formulation of [28] attempts to
minimize the absolute TNS and includes it as a part of the objective function. For cases without
timing-feasible solutions, the result from [28] is more usable. The algorithm of this work is built
upon a graph model as illustrated in Figure 10.5. In this model, gate delay and power are captured
in node and edge weight. Then, the Lagrangian subproblem, which optimizes a weighted sum
of gate delay and power, is solved by DP. In order to avoid the troublesome history consistency
problem, the entire circuit is partitioned into a set of disjoint trees where DP is conducted on each
individual tree.
In [24], the Lagrangian subproblem is solved by a discrete greedy heuristic. The algorithm
proceeds in circuit traversals. When a gate is encountered during traversal, its implementation
options are enumerated and the one that minimizes the overall cost function is selected.
This heuristic is very simple yet effective.

10.3.5 HEURISTICS BASED ON TIMING BUDGETING

Another approach to handle the trade-off between circuit timing and power is through timing
budgeting [27]. First, a minimum delay sizing solution is obtained using a certain simple heuristic.
Next, the timing slack is allocated to each gate. Last, each gate is sized down or changed to higher

A B

(a) (b)

FIGURE 10.5 (a) Circuit and its (b) graph model assuming each gate has only two implementation
options.

258 10.4 Conclusions and Outlook

Vt level according to its slack budget. The key component of this approach is a systematic alloca-
tion of timing slack to each node, which can be formulated as follows:

(10.19) Max å vi ÎV
di × xi

(10.20) s.t. ai + d j + x j £ a j "vi , v j ÎV , vi Î fanin(v j )

(10.21) a j £ T "v j Î PO

(10.22) 0 £ xi £ U i "vi ÎV

where
ai, di, and xi are the arrival time, delay, and slack budget for node vi
T is the required arrival time at the PO
Ui is an upper bound for the budget at node vi
δi is the power reduction per delay increase

The budgeting problem can be solved by linear programming [27] or network flow algorithms [13].
Noticing the correlation between gate delay and signal slew rate, the work of [18] proposes
to size gates through slew budgeting instead of delay budgeting. The slew target for each gate is
initialized with the design rules. Then, in a reverse topological order traversal of the circuit, a size
is found for each gate to meet its slew target. When sizing a gate, its fanout gate sizes have already
been updated, and therefore, the load capacitance is known. Its input slew can be estimated
according to the slew target of its fanin gates. After one circuit traversal, slew targets are adjusted
according to timing criticality. Timing critical gates are assigned with tighter slew constraints,
and the slew-driven sizing is repeated. Since slew estimation is much faster than timing analysis,
this approach can easily deal with very large circuits using a relatively slow sign-off timer.

10.3.6 GATE SIZING IN ELECTRONIC DESIGN AUTOMATION TOOLS

Some commercial tools use global gate sizing algorithms based on LR (Section 10.3.4) or timing
budgeting (Section 10.3.5) early in the design flow to get good starting solutions. Such global
solutions are improved by local search, where sensitivity-based heuristics (Section 10.3.1) are
used predominantly.
Gate sizing is usually not performed as an isolated task but incrementally alternated or integrated
with interconnect optimization, timing-driven detailed placement, or logic restructuring.
Again sensitivity-based heuristics are mostly used in the interplay with other operations [30].
As LR does not guarantee fast convergence, there is always the danger of a timing degradation,
when applied late in the flow. Again sensitivity-based heuristics that do not introduce new path delay
violations are prevalent for timing final power reduction [31].

10.4 CONCLUSIONS AND OUTLOOK

Gate sizing is an important and challenging problem in very-large-scale integration design. The
continuous gate sizing problem without Vt assignment can be solved in polynomial time for
posynomial delay functions.
In practice, discrete gate libraries are used, and no polynomial time algorithm with a perfor-
mance guarantee is known, except for trees or path-like netlists. On the other hand, there has
been substantial practical improvement, in particular, after the ISPD gate sizing contests 2012
and 2013 [12,15,20,21,23,24]. It can be expected that several of these new heuristics will find their
way into electronic design automation tools. Thereby, the major challenge is the interaction with
sign-off timing engines that, due to the support of many features, tend to be much slower than the
slim static timing implementations used in research papers. It is not clear how far the currently

Chapter 10 – Gate Sizing 259

best solutions are away from the optimum. Even if LR is used, the Lagrangian subproblems are
not solved to optimality. A fast computation of good lower bounds for the power consumption is
an important open problem.
On the theoretical side, discrete gate sizing algorithms with good performance guarantees are
lacking. As explained in Section 10.2.1, it aims high to achieve a constant-factor approximation
algorithm. However, providing a polynomial-time algorithm with a logarithmic performance
guarantee would be a significant result.

REFERENCES
1. S. Boyd, S.-J. Kim, L. Vandenberghe, and A. Hassibi. A tutorial on geometric programming.
Optimization and Engineering, 8(1), 67–127, 2007.
2. S. P. Boyd, S.-J. Kim, D. D. Patil, and M. A. Horowitz. Digital circuit optimization via geometric
programming. Operations Research, 53(6), 899–932, 2005.
3. P. K. Chan. Algorithms for library-specific sizing of combinational logic. Proceedings of the 27th
ACM/IEEE Design Automation Conference, Orlando, FL, 1990, pp. 353–356.
4. C. P. Chen, C. C.-N. Chu, and D. F. Wong. Fast and exact simultaneous gate and wire sizing by
Lagrangian relaxation. IEEE Transactions on Computer-Aided Design, 18(7), 1014–1025, 1999.
5. C. Chu and D. F. Wong. VLSI circuit performance optimization by geometric programming. Annals
of Operations Research, 105, 37–60, 2001.
6. J. Cong, J. Lee, and G. Luo. A unified optimization framework for simultaneous gate sizing and
placement under density constraints. Proceedings of 2011 IEEE International Symposium on
Circuits and Systems (ISCAS), Rio De Janeiro, Brazil, 2011, pp. 1207–1210.
7. A. R. Conn, I. M. Elfadel, W. W. Molzen, Jr., P. R. O’Brien, P. N. Strenski, C. Visweswariah, and
C. B. Whan. Gradient-based optimization of custom circuits using a static-timing formulation.
Design Automation Conference (DAC), New Orleans, LA, 1999, pp. 452–459.
8. J. F. Croix and D. F. Wong. Blade and razor: Cell and interconnect delay analysis using current-
based model. Proceedings of 40th ACM/IEEE Design Automation Conference, Anaheim, CA, 2003,
pp. 386–389.
9. I. Dinur and S. Safra. On the hardness of approximating minimum vertex cover. Annals of
Mathematics, 162, 439–485, 2005.
10. W. C. Elmore. The transient response of damped linear networks with particular regard to wide-band
amplifiers. Journal of Applied Physics, 19(1), 55–64, 1948.
11. J. P. Fishburn and A. E. Dunlop. TILOS: A posynomial programming approach to transis-
tor sizing. Proceedings of the IEEE/ACM International Conference on Computer-Aided Design,
Santa Clara, CA, 1985, pp. 326–328.
12. G. Flach, T. Reimann, G. Posser, M. de O. Johann, R. Reis. Effective method for simultaneous gate
sizing and Vth assignment using Lagrangian relaxation. IEEE Transactions on Computer Aided
Design of Integrated Circuits and Systems, 33(4), 546–557, 2014.
13. S. Ghiasi, E. Bozorgzadeh, P.-K. Huang, R. Jafari, and M. Sarrafzadeh. A unified theory of timing
budget management. IEEE Transactions on Computer-Aided Design, 25(11), 2364–2375, 2006.
14. A. Grigoriev and G. J. Woeginger. Project scheduling with irregular costs: Complexity, approximability,
and algorithms. Acta Informatica, 41, 83–97, 2004.
15. J. Hu, A. B. Kahng, S. Kang, M.-C. Kim, and I. L. Markov. Sensitivity-guided metaheuristics for accu-
rate discrete gate sizing. Proceedings of the IEEE/ACM International Conference on Computer-Aided
Design, San Jose, CA, 2012, pp. 233–239.
16. S. Hu, M. Ketkar, and J. Hu. Gate sizing for cell-library-based designs. IEEE Transactions on Computer-
Aided Design, 28(6), 818–825, 2009.
17. S. Hu, Z. Li, and C. J. Alpert. A fully polynomial time approximation scheme for timing driven
minimum cost buffer insertion. Proceedings of the 46th Annual Conference on Design Automation,
San Francisco, CA, 2009, pp. 424–429.
18. S. Held. Gate sizing for large cell-based designs. Proceedings of Design, Automation and Test in Europe
Conference, Nice, France, 2009, pp. 827–832.
19. S. Joshi and S. Boyd. An efficient method for large-scale gate sizing. IEEE Transactions on Circuits and
Systems, 55(9), 2760–2773, 2008.
20. A. B. Kahng, S. Kang, H. Lee, I. L. Markov, and P. Thapar. High-performance gate sizing with a signoff
timer. Proceedings of the International Conference on Computer-Aided Design, San Jose, CA, 2013,
pp. 450–457.
21. L. Li, P. Kang, Y. Lu, and H. Zhou. An efficient algorithm for library-based cell-type selection in
high-performance low-power designs. Proceedings of the IEEE/ACM International Conference on
Computer-Aided Design, San Jose, CA, 2012, pp. 226–232.

260 References

22. C. Liao and S. Hu. Approximation scheme for restricted discrete gate sizing targeting delay minimi-
zation. Journal of Combinatorial Optimization, 21(4), 497–510, 2011.
23. Y. Liu and J. Hu. A new algorithm for simultaneous gate sizing and threshold voltage assignment.
IEEE Transactions on Computer-Aided Design, 29(2), 223–234, 2010.
24. V. S. Livramento, C. Guth, J. L. Güntzel, and M. de O. Johann. Fast and efficient Lagrangian
relaxation-based discrete gate sizing. ACM Transactions on Design Automation of Electronic
Systems, 19(4), Article No. 40, 2014.
25. M. Minoux. A polynomial algorithm for minimum quadratic cost flow problems. European Journal
of Operational Research, 18(3), 377–387, 1984.
26. L. W. Nagel and D. O. Pederson. SPICE (Simulation Program with Integrated Circuit Emphasis).
Technical Report M382, EECS Department, University of California, Berkeley, CA, 1973.
27. D. Nguyen, A. Davare, M. Orshansky, D. Chinnery, B. Thompson, and K. Keutzer. Minimization of
dynamic and static power through joint assignment of threshold voltages and sizing optimization.
Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design, Seoul,
Korea, 2003, pp. 158–163.
28. M. Ozdal, S. Burns, and J. Hu. Algorithms for gate sizing and device parameter selection for high-
performance designs. IEEE Transactions on Computer-Aided Design, 31(10), 1558–1571, 2012.
29. B. T. Polyak. A general method for solving extemum problems. Doklady Akademii Nauk SSSR, 174(1),
33–36, 1967. Translation in Soviet Mathematics Doklady, 8(3), 593–597, 1967.
30. R. Puri, M. Choudhury, H. Qian, and M. Ziegler. Bridging high performance and low power in processor
design. Proceedings of the 2014 International Symposium on Low Power Electronics and Design, La Jolla,
CA, 2014, pp. 183–188.
31. H. Qian and E. Acar. Timing-aware power minimization via extended timing graph methods. ASP
Journal of Low Power Electronics, 3(3), 318–326, 2007.
32. C. Ratzlaff and L. T. Pillage. RICE: Rapid interconnect circuit evaluation using asymptotic waveform
evaluation. IEEE Transactions on Computer-Aided Design, 13(6), 763–776, 1994.
33. D. Rautenbach and C. Szegedy. A subgradient method using alternating projections. Technical Report
04940, Research Institute for Discrete Mathematics, University of Bonn, Bonn, Germany, 2004.
34. D. Rautenbach and C. Szegedy. A class of problems for which cyclic relaxation converges linearly.
Computational Optimization and Applications, 41, 52–60, 2008.
35. S. S. Sapatnekar, V. B. Rao, P. M. Vaidya, and S.-M. Kang. An exact solution to the transis-
tor sizing problem for CMOS circuits using convex programming. IEEE Transactions on Computer-
Aided Design, 12(11), 1621–1634, 1993.
36. S. Shah, A. Srivastava, D. Sharma, D. Sylvester, D. Blaauw, and V. Zolotov. Discrete vt assignment and
gate sizing using a self-snapping continuous formulation. Proceedings of the IEEE/ACM International
Conference on Computer-Aided Design, 2005, pp. 705–712.
37. R. F. Sproull and I. E. Sutherland. Logical effort: Designing for speed on the back of an envelope. IEEE
Advanced Research in VLSI C, Sequin, ed., MIT Press, Cambridge, MA, 1991, pp. 1–16.
38. A. Srivastava, D. Sylvester, and D. Blaauw. Power minimization using simultaneous gate sizing, mini-
mization using simultaneous GatDual-Vdd and Dual-Vth assignment. ACM/IEEE Design Automation
Conference, San Diego, CA, 2004, pp. 783–787.
39. O. Svensson. Hardness of vertex deletion and project scheduling. Theory of Computing, 9(24),
759–781, 2013.
40. H. Tennakoon and C. Sechen. Gate sizing using Lagrangian relaxation combined with a fast
gradient-based pre-processing step. Proceedings of the IEEE/ACM International Conference on
Computer-Aided Design, San Jose, CA, 2002, pp. 395–402.
41. L. Trevillyan, D. S. Kung, R. Puri, L. N. Reddy, and M. A. Kazda. An integrated environment for
technology closure of deep-submicron IC designs. IEEE Design & Test of Computers, 21(1), 14–22,
2004.
42. A. Wächter, C. Visweswariah, and A. R. Conn. Large-scale nonlinear optimization in circuit tuning.
Future Generation Computer Systems, 21(8), 1251–1262, 2005.
43. J. Wang, D. Das, and H. Zhou. Gate sizing by Lagrangian relaxation revisited. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 28(7), 1071–1084, 2009.
44. N. H. E. Weste and D. Harris. CMOS VLSI Design: A Circuits and Systems Perspective, 4th edn.,
Addison-Wesley Publishing Company, Boston, MA, 2010.
45. T.-H. Wu and A. Davoodi PaRS: Parallel and near-optimal grid-based cell sizing for library-based
design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 28(11),
1666–1678, 2009.

Unit 2
No ratings yet
Unit 2
26 pages
Module 4 - Combination - Circuit - Design - Notes
No ratings yet
Module 4 - Combination - Circuit - Design - Notes
28 pages
Unit - 2: Asic Library Design O Delay Estimation
No ratings yet
Unit - 2: Asic Library Design O Delay Estimation
43 pages
Sutherland Ch1
No ratings yet
Sutherland Ch1
26 pages
CO221 KBS7 Timing Hazards
No ratings yet
CO221 KBS7 Timing Hazards
17 pages
Gditechnology 151220144934
No ratings yet
Gditechnology 151220144934
80 pages
Lect 6
No ratings yet
Lect 6
14 pages
541 6
No ratings yet
541 6
6 pages
Unit-2 2marks With Answers
No ratings yet
Unit-2 2marks With Answers
5 pages
Floor Plan
No ratings yet
Floor Plan
40 pages
VLSI Floorplanning & Placement Guide
No ratings yet
VLSI Floorplanning & Placement Guide
66 pages
Gated Clock Cloning For Timing Fixing 12 16
No ratings yet
Gated Clock Cloning For Timing Fixing 12 16
19 pages
Advanced VLSI Design: Dr. Premananda B.S
No ratings yet
Advanced VLSI Design: Dr. Premananda B.S
43 pages
Efficient Design of 1
No ratings yet
Efficient Design of 1
7 pages
Optimization of Power Consumption in VLSI Circuit
No ratings yet
Optimization of Power Consumption in VLSI Circuit
6 pages
ASIC Design Flow Guide
No ratings yet
ASIC Design Flow Guide
14 pages
What Is Physical Synthesis?: Zhuo Li and Charles J. Alpert, IBM
No ratings yet
What Is Physical Synthesis?: Zhuo Li and Charles J. Alpert, IBM
3 pages
Implementation of Synchronous Up Counter by Using Self Resetting Logic
No ratings yet
Implementation of Synchronous Up Counter by Using Self Resetting Logic
10 pages
FPGA Technology in VLSI Design
No ratings yet
FPGA Technology in VLSI Design
4 pages
A Linear Programming Based Static Power Optimization Scheme For Digital CMOS Circuits AMSC 662 - Project Proposal
No ratings yet
A Linear Programming Based Static Power Optimization Scheme For Digital CMOS Circuits AMSC 662 - Project Proposal
3 pages
Timing & Design for Engineers
100% (3)
Timing & Design for Engineers
42 pages
LNCS 3254 - Delay Evaluation of High Speed Data-Path Circuits Based On Threshold Logic
No ratings yet
LNCS 3254 - Delay Evaluation of High Speed Data-Path Circuits Based On Threshold Logic
8 pages
Data To Data Checks
No ratings yet
Data To Data Checks
37 pages
Optimising A D Flip Flop Through Delay and Power Estimation Using An RC Model and Transistor Sizing
No ratings yet
Optimising A D Flip Flop Through Delay and Power Estimation Using An RC Model and Transistor Sizing
7 pages
Advanced ASIC Design Techniques
No ratings yet
Advanced ASIC Design Techniques
5 pages
M2 VLSIb
No ratings yet
M2 VLSIb
20 pages
Modern ASIC - Design - Chapter 2 - Flow & Perspective
No ratings yet
Modern ASIC - Design - Chapter 2 - Flow & Perspective
57 pages
Design and Implementation of VLSI Systems
No ratings yet
Design and Implementation of VLSI Systems
46 pages
Timing Analysis
No ratings yet
Timing Analysis
22 pages
Efficient 3-Input XOR Parity Design
100% (1)
Efficient 3-Input XOR Parity Design
6 pages
Cell Design Delay Calculation Guide
No ratings yet
Cell Design Delay Calculation Guide
6 pages
Part3 Delay High Speed Design Logical Effort
No ratings yet
Part3 Delay High Speed Design Logical Effort
29 pages
Digital Electronics Lab 1 Guide
No ratings yet
Digital Electronics Lab 1 Guide
10 pages
De Paper Solution 2024 (Pran Tehare)
No ratings yet
De Paper Solution 2024 (Pran Tehare)
15 pages
Digital IC Design Essentials
100% (1)
Digital IC Design Essentials
16 pages
VLSI Design Flow (Study Material)
No ratings yet
VLSI Design Flow (Study Material)
3 pages
VLSI ARI-304 Unit 1
No ratings yet
VLSI ARI-304 Unit 1
45 pages
VLSI Design Automation and Methodologies
No ratings yet
VLSI Design Automation and Methodologies
284 pages
Clock Gating: Smart Use Ensures Smart Returns
No ratings yet
Clock Gating: Smart Use Ensures Smart Returns
4 pages
V.L.S.I. I: Page 1 of 50
No ratings yet
V.L.S.I. I: Page 1 of 50
50 pages
Razor Thesis
No ratings yet
Razor Thesis
13 pages
Slide Group2 - Power Consumption in CMOS Logic Gates
No ratings yet
Slide Group2 - Power Consumption in CMOS Logic Gates
19 pages
Logic Gates and Power in VLSI
No ratings yet
Logic Gates and Power in VLSI
36 pages
ECO Timing Optimization Using Spare Cells2
No ratings yet
ECO Timing Optimization Using Spare Cells2
31 pages
TC Lecture1
No ratings yet
TC Lecture1
45 pages
Dac80 338
No ratings yet
Dac80 338
2 pages
Adv Vlsi hw2 Static Timing
No ratings yet
Adv Vlsi hw2 Static Timing
5 pages
Designing of Combinational Logic Gates in Cmos: G.Susmitha Roll No:06
No ratings yet
Designing of Combinational Logic Gates in Cmos: G.Susmitha Roll No:06
49 pages
Timing Hazards in Digital Logic Circuits
No ratings yet
Timing Hazards in Digital Logic Circuits
105 pages
Genetic Synthesis Techniques For Low Power Digital Signal Processing Circuits PDF
No ratings yet
Genetic Synthesis Techniques For Low Power Digital Signal Processing Circuits PDF
5 pages
A Design Approach For Fine-Grained Run-Time Power Gating Using Locally Extracted Sleep Signals
No ratings yet
A Design Approach For Fine-Grained Run-Time Power Gating Using Locally Extracted Sleep Signals
7 pages
Essentials of Physical Design Process
No ratings yet
Essentials of Physical Design Process
111 pages
Latches vs. Flip-Flops: Key Differences
No ratings yet
Latches vs. Flip-Flops: Key Differences
24 pages
VLSIGuru Physical Design
No ratings yet
VLSIGuru Physical Design
30 pages
Lec 11
No ratings yet
Lec 11
42 pages
Multi-Objective Circuit Partitioning For Cutsize and Path-Based Delay Minimization
No ratings yet
Multi-Objective Circuit Partitioning For Cutsize and Path-Based Delay Minimization
5 pages
Transistor Sizing of Energy-Delay - Efficient Circuits
No ratings yet
Transistor Sizing of Energy-Delay - Efficient Circuits
7 pages
Static Timing Analysis
No ratings yet
Static Timing Analysis
22 pages
Physical Verification
No ratings yet
Physical Verification
1 page
Physical Design For 3D ICs
No ratings yet
Physical Design For 3D ICs
27 pages
Design and Analysis of Power Networks
No ratings yet
Design and Analysis of Power Networks
18 pages
Design Flows
No ratings yet
Design Flows
23 pages
Design Closure
No ratings yet
Design Closure
39 pages
Digital Signal Processing in Telecommunication
No ratings yet
Digital Signal Processing in Telecommunication
874 pages
Ransomware Impacting Pipeline Operations - CISA
No ratings yet
Ransomware Impacting Pipeline Operations - CISA
11 pages
Material Prices
No ratings yet
Material Prices
2 pages
SlatePlan Shingle Roof Fastening
No ratings yet
SlatePlan Shingle Roof Fastening
2 pages
Sales Pipeline and Business Forecast
No ratings yet
Sales Pipeline and Business Forecast
27 pages
5 Ci Sinif Word Definition-4-2025
No ratings yet
5 Ci Sinif Word Definition-4-2025
2 pages
Thesis Statement For Ford Motor Company
100% (3)
Thesis Statement For Ford Motor Company
4 pages
Important Telephone Nos
No ratings yet
Important Telephone Nos
1 page
G-33 (Rev. 1) : Indian Rail Ways
No ratings yet
G-33 (Rev. 1) : Indian Rail Ways
29 pages
Inam Ullah Khan (editor), Salma El Hajjami (editor), Mariya Ouai - Cognitive Machine Intelligence_ Applications, Challenges, and Related Technologies (Intelligent Data-Driven Systems and Artific (2024, CRC Press) -
No ratings yet
Inam Ullah Khan (editor), Salma El Hajjami (editor), Mariya Ouai - Cognitive Machine Intelligence_ Applications, Challenges, and Related Technologies (Intelligent Data-Driven Systems and Artific (2024, CRC Press) -
373 pages
Checklist For Electrical Safety Inspection
100% (5)
Checklist For Electrical Safety Inspection
1 page
Data Analytics & Mining Course Guide
No ratings yet
Data Analytics & Mining Course Guide
2 pages
AutoNEST FX V9.5.2.2 Update Guide
No ratings yet
AutoNEST FX V9.5.2.2 Update Guide
6 pages
Liquefaction of Natural Gas Using Single Stage Mixed Refrigerant PRICO Process
No ratings yet
Liquefaction of Natural Gas Using Single Stage Mixed Refrigerant PRICO Process
8 pages
Amazon Seller Acronyms Explained
No ratings yet
Amazon Seller Acronyms Explained
1 page
Bureaucracy
No ratings yet
Bureaucracy
29 pages
Theories of IR and Their Approach To Security Studies (2021) - ARTICLE (SK)
No ratings yet
Theories of IR and Their Approach To Security Studies (2021) - ARTICLE (SK)
8 pages
Address: Plot 71 Stadium Road and ISPON Secretariat, Igwuruta, R/S
No ratings yet
Address: Plot 71 Stadium Road and ISPON Secretariat, Igwuruta, R/S
4 pages
0.01.balance Verification Form
No ratings yet
0.01.balance Verification Form
1 page
DLL - English 3 - Q2 - W6
No ratings yet
DLL - English 3 - Q2 - W6
3 pages
Cryptobazaar
No ratings yet
Cryptobazaar
22 pages
About Earthyn
No ratings yet
About Earthyn
6 pages
Unani Medicine Expert's Profile
100% (2)
Unani Medicine Expert's Profile
7 pages
Latika Thapliyal
No ratings yet
Latika Thapliyal
5 pages
Tle9 q1 Mod5 Common OHS Hazards Risks and Its Control v3
No ratings yet
Tle9 q1 Mod5 Common OHS Hazards Risks and Its Control v3
23 pages
Lenovo Thinkpad L420 DAGC9EMB8E0 REV-E PDF
No ratings yet
Lenovo Thinkpad L420 DAGC9EMB8E0 REV-E PDF
53 pages
STGB30H60DFB, STGP30H60DFB: Trench Gate Field-Stop 600 V, 30 A High Speed HB Series IGBT
No ratings yet
STGB30H60DFB, STGP30H60DFB: Trench Gate Field-Stop 600 V, 30 A High Speed HB Series IGBT
21 pages
Synthesis and Electrochemistry of LiNiMn2 - O4
No ratings yet
Synthesis and Electrochemistry of LiNiMn2 - O4
9 pages
Backend Syllabus
No ratings yet
Backend Syllabus
12 pages
Brine Solution Power Bank
No ratings yet
Brine Solution Power Bank
14 pages
Importance of Lesson Planning
No ratings yet
Importance of Lesson Planning
20 pages
Indian Monsoon
No ratings yet
Indian Monsoon
14 pages

Gate Sizing

Uploaded by

Gate Sizing

Uploaded by

Gate Sizing 10

Stephan Held and Jiang Hu

© 2016 by Taylor & Francis Group, LLC

10.3.5 Heuristics Based on Timing Budgeting 257

© 2016 by Taylor & Francis Group, LLC

10.1.1 STATIC TIMING

Physical design optimization is usually based on static timing constraints, represented by a

(10.2) av + delay v ,w ( x) £ aw "(v , w) Î E(G ).

10.1.2 DELAY MODELS

FIGURE 10.1 Example of a timing graph and its underlying netlist.

© 2016 by Taylor & Francis Group, LLC

(a) Slew (b) Slew

10.1.2.1 THE RC DELAY MODEL

where C wire , rg , c g ¢ , c intrinsic

© 2016 by Taylor & Francis Group, LLC

Gate delay (ps)

Gate delay (ps)

10.1.2.2 MODEL-INDEPENDENT DELAY CHARACTERISTICS

10.1.3 IMPACT ON POWER AND AREA

Pdynamic ( g ) = Pswitching ( g ) + Pshort ( g ),

© 2016 by Taylor & Francis Group, LLC

(10.3) P total = Pstatic

10.1.4 PROBLEM FORMULATIONS

© 2016 by Taylor & Francis Group, LLC

10.2 THEORETICAL BACKGROUND

10.2.1 DISCRETE SIZING

10.2.2 CONTINUOUS SIZING

© 2016 by Taylor & Francis Group, LLC

10.2.3 LAGRANGIAN RELAXATION

(10.6) L(l, x, a) = P total ( x) + å

For given λ ≥ 0, the Lagrangian dual function

(10.8) sup L (l).

(10.9) L(l, x, a) = L1(l, x) + L2 (l, a)

(10.10) L1(l, x) = P total ( x) + å

© 2016 by Taylor & Francis Group, LLC

Otherwise, as the arrival time variables av (v ∈ V(G)) are unrestricted,

10.2.3.1 SOLVING THE LAGRANGIAN SUBPROBLEM

10.2.3.2 SOLVING THE LAGRANGIAN DUAL PROBLEM

© 2016 by Taylor & Francis Group, LLC

10.3 PRACTICAL ALGORITHMS

10.3.1 SENSITIVITY-BASED HEURISTICS

(10.14) DTNS(mik ) = å -Ddelay

© 2016 by Taylor & Francis Group, LLC

The power-phase sensitivity in [38] is defined by

10.3.2 DEDICATED WORST-SLACK MAXIMIZATION

10.3.3 CONTINUOUS OPTIMIZATION WITH SELF-SNAPPING

10.3.4 PRACTICAL GATE SIZING BY LAGRANGIAN RELAXATION

© 2016 by Taylor & Francis Group, LLC

© 2016 by Taylor & Francis Group, LLC

10.3.5 HEURISTICS BASED ON TIMING BUDGETING

© 2016 by Taylor & Francis Group, LLC

(10.20) s.t. ai + d j + x j £ a j "vi , v j ÎV , vi Î fanin(v j )

10.3.6 GATE SIZING IN ELECTRONIC DESIGN AUTOMATION TOOLS

10.4 CONCLUSIONS AND OUTLOOK

© 2016 by Taylor & Francis Group, LLC

© 2016 by Taylor & Francis Group, LLC

© 2016 by Taylor & Francis Group, LLC

You might also like

(10.8) sup L (l).