
Physical Design of CMOS Chips in Six Easy Steps

Sidney E. Benda

Intel Corp., Colorado Springs Design Center



Abstract. This paper is aimed at software professionals who are not familiar
with the design processes of the semiconductor elements that execute their
programs. It focuses on the algorithms and software involved in the design,
layout and verification of CMOS parts. The first section introduces the basic
building elements of a design, the transistor and the standard cell, and shows
their representation in the design database. The design flow is segmented into
six distinct steps: design capture, synthesis, floorplanning, placement and
routing, extraction and verification.

1 Design Database
The image of physical structures on a chip is typically stored as
hierarchically linked lists that can be accessed either programmatically via
a LISP-like interface or through a graphical front end that allows drawing and
manipulation of polygons, transistors, cells and blocks. Database access and
manipulation are shown by example in the following paragraphs.

1.1 MOS Transistor

Proceeding bottom-up in the physical layout of a chip, the smallest circuit
unit is a transistor.

[Figure: labels Source, Gate, Drain and Channel; terminal voltages Vs, Vd and
Vg ≥ Vt; n regions in a p substrate]

Fig. 1. The cross section of an n-channel MOS transistor

V. Hlaváč, K. G. Jeffery, and J. Wiedermann (Eds.): SOFSEM 2000, LNCS 1963, pp. 115–128, 2000.

© Springer-Verlag Berlin Heidelberg 2000

To fabricate an n-channel MOS (Metal Oxide Semiconductor) transistor, a p-type
substrate is covered with an insulating layer of silicon dioxide. Two windows
are cut into the oxide to allow diffusion to create two separate n-regions,
the source and the drain. A conductive layer of polysilicon is laid on top to
form the gate. When a positive voltage greater than the threshold voltage is
applied to the gate, an n-type conductive channel is created between the
source and the drain. To capture the footprint of a transistor in a database,
we need two rectangles on the respective layers “poly” and “diffusion”:

((objType "rectangle" ((-0.5 -1.0) (0.5 1.0))
  layer "poly")
 (objType "rectangle" ((-1.0 -0.5) (1.0 0.5))
  layer "diffusion")
)
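The two rectangles can also be examined programmatically. The following is a
minimal Python sketch, not the database's actual interface: the Rect class and
the overlap helper are hypothetical names introduced here. It computes the
region where poly crosses diffusion, which is where the transistor channel
forms.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle on one layer (hypothetical record type)."""
    layer: str
    x1: float
    y1: float
    x2: float
    y2: float


def overlap(a: Rect, b: Rect) -> Optional[Tuple[float, float, float, float]]:
    """Return the (x1, y1, x2, y2) intersection of two rectangles, or None."""
    x1, y1 = max(a.x1, b.x1), max(a.y1, b.y1)
    x2, y2 = min(a.x2, b.x2), min(a.y2, b.y2)
    if x1 < x2 and y1 < y2:
        return (x1, y1, x2, y2)
    return None


# Coordinates copied from the database record above.
poly = Rect("poly", -0.5, -1.0, 0.5, 1.0)
diffusion = Rect("diffusion", -1.0, -0.5, 1.0, 0.5)

# The poly-over-diffusion region marks the transistor channel.
channel = overlap(poly, diffusion)
```

The same overlap test reappears in layout verification (Sect. 7.2), where it
is used to recognize transistors in the finished layout.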

1.2 Standard Cell

Although individual transistors can be created in the layout of a chip, the
usual building unit is a standard cell composed of several transistors.
A common standard cell implementing the two-input logical function NAND is
shown in Fig. 2. The cell uses two n-channel transistors of the kind shown in
Fig. 1 and two p-channel transistors whose behavior is complementary, i.e.
they conduct when the n-channel transistors are cut off and vice versa. This
combination results in low power consumption, as the cell dissipates power
only while switching. Assuming that a low voltage on an input or output is
logical 0 and a high voltage is logical 1, the truth table for this cell is:
Input A   Input B   Output C
   0         0         1
   0         1         1
   1         0         1
   1         1         0
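The truth table follows from the complementary networks. The sketch below is
a switch-level model in Python under the usual CMOS NAND arrangement, assumed
here: the two n-channel pull-downs in series and the two p-channel pull-ups in
parallel. The function name is hypothetical.

```python
def nand_switch_level(a: int, b: int) -> int:
    """Evaluate a two-input CMOS NAND by checking which network conducts."""
    # n-channel pull-downs in series: path to GND only if both inputs are 1.
    pull_down = bool(a) and bool(b)
    # p-channel pull-ups in parallel: path to VDD if either input is 0.
    pull_up = (not a) or (not b)
    # Complementary behavior: exactly one network conducts in steady state,
    # so there is no static current path from VDD to GND.
    assert pull_up != pull_down
    return 1 if pull_up else 0


truth_table = [(a, b, nand_switch_level(a, b)) for a in (0, 1) for b in (0, 1)]
```

The internal assertion expresses the low-power property mentioned above: in
no input combination are both networks on at the same time.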
The database record for this cell contains polygons on layers “poly”, “diffu-
sion”, “metal1” and “contact” as well as properties describing connectivity and
location:

(objType "cellInst"
  (cellName "NAND")
  (instName "nx1")
  (shapes ((... list of cell geometries ...)))
  (xy nil)
  (orient "R0")
  (pins (input "A") (input "B") (output "C"))
)

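[Figure: layout and transistor schematic of the NAND cell with inputs A and B,
output C, p-channel pull-ups connected to VDD and n-channel pull-downs
connected to GND]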

Fig. 2. Layout and schematic of two-input NAND gate

1.3 Design Rules


As indicated in the previous discussion of the transistor and the standard
cell, each fabrication process puts several layers on the surface of the
silicon substrate. Besides the diffusion and poly layers used inside cells,
there are up to six metal layers for connections. The metal layers are
separated by oxide or connected, where necessary, by a pseudo-layer called
a via (a cut in the oxide). In order to guarantee acceptable yield of CMOS
fabrication, each process has a set of layer properties and constraints called
design rules. The rules are usually grouped into the following three types:
1. Size rules. For each layer, there is a minimum feature size below which
functionality is not guaranteed. For example, the poly width in a transistor
needs to be at least 0.35 microns, otherwise it may not be able to open the
conductive channel.
2. Separation rules. Different features on the same layer, e.g. two adjacent
wires on the same metal layer, must be spaced a certain distance apart so
that they do not accidentally short.
3. Overlap rules. These are necessary for proper function of transistors (poly
over diffusion) and for connection between two layers of metal (metal-via
overlap).
The set of rules, along with additional physical properties of the process, is
contained in a technology file that is an integral part of the design
database. The technology file also contains physical dimensions and properties
of the process that are used in later steps of the physical design:
((Technology "pseudoProcess"
   (timeUnit "ps")
   (capUnit "fF")
   (dielectric 3.9)
 )
 (Layer "metal1"
   (layerNum 8)
   (minWidth 0.35)
   (minSpacing 0.35)
   (pitch 1.0)
   (nomResistance 0.0615)
   (nomCapacitance 6.31e-3)
   (heightFromSub 1.50)
   (nomThickness 0.60)
 )
 (DesignRule "minSpacing" ("poly" "via1" 0.4))
 (DesignRule "minEnclosure" ("metal1" "via1" 0.2))
)
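To make the size and separation rules concrete, here is a minimal Python
sketch that checks rectangles on one layer against the metal1 minWidth and
minSpacing values from the technology file above. The function names, the
rectangle format (x1, y1, x2, y2) and the decision to treat touching shapes as
connected are assumptions of this sketch, not features of any particular tool.

```python
from itertools import combinations

MIN_WIDTH = 0.35    # metal1 minWidth from the technology file
MIN_SPACING = 0.35  # metal1 minSpacing from the technology file
EPS = 1e-9          # tolerance for floating-point comparisons


def width(rect):
    """Smaller dimension of an axis-aligned rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return min(x2 - x1, y2 - y1)


def spacing(a, b):
    """Gap between two rectangles; 0.0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def check_layer(rects):
    """Return a list of size-rule and separation-rule violations."""
    violations = []
    for i, r in enumerate(rects):
        if width(r) < MIN_WIDTH - EPS:
            violations.append(("minWidth", i))
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        gap = spacing(a, b)
        # Shapes with zero gap are treated as one connected wire (assumption).
        if 0.0 < gap < MIN_SPACING - EPS:
            violations.append(("minSpacing", i, j))
    return violations


# Three horizontal wires: the first two are too close, the third is too thin.
wires = [(0.0, 0.0, 5.0, 0.35), (0.0, 0.6, 5.0, 0.95), (0.0, 2.0, 5.0, 2.3)]
report = check_layer(wires)
```

The pairwise loop is quadratic in the number of shapes; it is kept here only
because it makes the rule itself easy to read.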

2 Design Capture
The original way of logic design using schematic capture is clearly not
feasible in projects containing up to 10^9 elements. The need for a
higher-level structured description led to the origin of hardware description
languages (HDL) such as Verilog and VHDL. The following Verilog code segment
illustrates several of the most common HDL constructs:
module multiplier(result, op_a, op_b);
  parameter size=8;
  input [size:1] op_a, op_b;
  output [2*size:1] result;
  reg [2*size:1] shift_opa, result;
  reg [size:1] shift_opb;
  always @(op_a or op_b)
    begin
      result = 0;
      shift_opa = op_a; // zero extend left
      shift_opb = op_b;
      repeat (size)
        begin
          #10 if(shift_opb[1]) result = result + shift_opa;
          shift_opa = shift_opa<<1; // logical left shift
          shift_opb = shift_opb>>1; // logical right shift
        end
    end
endmodule
The code above multiplies two operands op_a, op_b by repeated shifts of both
operands and conditional adding of shifted op_a.
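For readers more at home in software, the same shift-and-add algorithm can be
written as an ordinary function. This Python sketch mirrors the register
widths of the Verilog module; the function name is hypothetical and the #10
delay has no counterpart here.

```python
def shift_add_multiply(op_a: int, op_b: int, size: int = 8) -> int:
    """Multiply two size-bit operands by repeated shift and conditional add."""
    wide_mask = (1 << (2 * size)) - 1        # result and shift_opa: 2*size bits
    narrow_mask = (1 << size) - 1            # shift_opb: size bits
    result = 0
    shift_opa = op_a & narrow_mask           # zero extend left
    shift_opb = op_b & narrow_mask
    for _ in range(size):                    # repeat (size)
        if shift_opb & 1:                    # lowest bit of the multiplier
            result = (result + shift_opa) & wide_mask
        shift_opa = (shift_opa << 1) & wide_mask   # logical left shift
        shift_opb >>= 1                            # logical right shift
    return result
```

For example, shift_add_multiply(13, 11) walks through the bits 1, 1, 0, 1 of
the second operand and accumulates 13 + 26 + 104.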

2.1 Modules
The basic building block in Verilog is a module. The complete design is a tree
of module instances rooted in the top-level module. The module header
specifies the boundary of a logic block and the direction and size of its
ports. Component instantiation creates a new module instance inside a
higher-level module:
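module ALU (...);
  ...
  wire [15:0] OUT1, OUT2;  // instance outputs drive nets; result is 2*size bits
  wire [7:0] A, B;
  ...
  // positional port mapping
  multiplier MULT1 (OUT1, A, B);

  // named port mapping (port names as declared in the multiplier header)
  multiplier MULT2 (.op_a(A), .op_b(B), .result(OUT2));
endmodule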

2.2 Data Types


The register data type is declared using the keyword reg. This type of
variable actually stores data. The net data type is declared as wire. A wire
is a physical connection between structural entities of Verilog. It does not
store data, it only propagates data from register to register. The third data
type shown, parameter, is essentially a symbolic constant. All three data
types can take the form of a bit vector by specifying its range as
[most significant bit : least significant bit].

2.3 Behavior
Most of the operators and control structures in a module body resemble those
of conventional programming languages and therefore do not need to be
discussed in detail. The exception is the always block. Statements inside each
of these blocks are executed sequentially, but all always blocks run
concurrently. The @(...) part of each always block is a list of variables that
trigger its execution when their values change.
Another important HDL feature is the concept of sequencing events in time.
The notation #10 on line 14 of the multiplier example is a delay specification
indicating that this statement takes 10 time units to execute.
Let it be noted that the majority of designs are synchronous. In these
designs, stages of combinatorial logic feed into register-type variables that
are synchronized by a clock signal (Fig. 3).

3 Synthesis
Synthesis is a process in which the high-level HDL description of a design is
converted into primitives that can be directly laid out on a silicon die.
There are three tasks performed during synthesis:

[Figure: registers REG1, REG2 and REG3 with Stage 1, Stage 2 and Stage 3
logic, all registers driven by a common clock CLK]

Fig. 3. Example of synchronous logic: Three-stage pipeline

1. Mapping of logic-level HDL constructs onto target library components.
2. Logic optimization attempting to reduce the area and the number of
components.
3. Selecting sizes of components to meet the timing requirements.
Step 1 is similar to compilation of a high-level programming language. The HDL
operators and structures are mapped onto logical equations that are subject to
Boolean optimization in step 2. Step 3 considers the non-zero delay of
combinatorial logic components and tries to make sure that the overall delay
of each logic stage (see Fig. 3) satisfies the requirement given by the clock
rate (i.e. the signal delay through the logic has to be less than the period
of the clock). Selecting a faster cell to meet a timing requirement increases
power consumption, so these two requirements need to be carefully balanced.
A synthesis run typically needs to set boundary conditions (external clock
rate, input delay, load) and take into account various constraints imposed by
the technology (max fanout, max transition).
As an example, let us use a dataflow description of a device that selects one
of its two inputs A, B by the value of signal SEL:
module MUX2(OUT, A, B, SEL);
  output OUT;
  input A, B, SEL;
  // continuous assignment: OUT follows A when SEL is 0, otherwise B
  assign OUT = (SEL == 1'b0) ? A : B;
endmodule
The module synthesizes into the following gate-level netlist:
module MUX2(OUT, A, B, SEL);
  output OUT;
  input A,B,SEL;
  wire a1, b1, sel;    // sel is the inverted SEL (Verilog is case-sensitive)
  NOT not1(sel,SEL);
  AND and1(a1,A,sel);
  AND and2(b1,B,SEL);
  OR or1(OUT,a1,b1);
endmodule

In Verilog terms, this is a structural module that contains nothing but
instances of cells from the target library interconnected by wires.
The target library contains standard cells (as described in Sect. 1.2)
annotated with information on internal cell delays, operating conditions and
cell size. In our example, the target library contained only AND, OR and NOT
gates, while typical production libraries contain hundreds of cells of
different functions and sizes, including the 2-input multiplexer synthesized
above.
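That the gate-level netlist implements the same function as the dataflow
description can be confirmed by exhaustive comparison, which is practical for
a three-input block. The Python sketch below models both versions as
functions; the names are hypothetical and real tools use formal equivalence
checking rather than enumeration.

```python
from itertools import product


def mux_behavior(a: int, b: int, sel: int) -> int:
    """Dataflow description: OUT is A when SEL is 0, otherwise B."""
    return a if sel == 0 else b


def mux_netlist(a: int, b: int, sel: int) -> int:
    """Gate-level netlist, one line per cell instance."""
    sel_n = 1 - sel      # NOT not1(sel, SEL)
    a1 = a & sel_n       # AND and1(a1, A, sel)
    b1 = b & sel         # AND and2(b1, B, SEL)
    return a1 | b1       # OR  or1(OUT, a1, b1)


def equivalent(f, g, n_inputs: int) -> bool:
    """Compare two single-bit functions on every input combination."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))
```

Calling equivalent(mux_behavior, mux_netlist, 3) evaluates all eight input
combinations.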

4 Floorplanning
The physical representation of modules in the design hierarchy can be sorted
into several categories:
1. Bond pads that represent inputs and outputs of the root module in the
design. They are essentially rectangles of metal to which wires are attached
during packaging. Their location is always on the perimeter of a layout.
2. Re-used blocks. The layout of modules such as RAM, ROM, register files and
others may be imported from previous designs.
3. Custom blocks. Typical for analog circuitry (for example an A/D converter),
the layout of some modules needs to be created by custom drawing of circuit
components.
4. Standard cell blocks. These are gate-level netlist modules produced by
synthesis as shown in step 2. As opposed to the previous three categories,
which have fixed size, shape and connecting points (pins), standard cell
blocks (sometimes called soft blocks) need to be assigned these properties in
the first step of floorplanning.

The second step of floorplanning is the placement of the defined blocks. There
are a number of tools available, mostly based on minimizing the total length
of nets connecting the blocks. There are algorithms [7,11] that produce very
good results in terms of area optimization, yet there is an observed lack of
algorithms that would factor in the degree of rectilinearity (“number of
corners”), which results in routing penalties.
The final step of floorplanning creates the power/ground grid and performs
block-level signal routing.
Fig. 4 shows an intermediate stage of floorplanning with the pad ring in
place, unplaced hard blocks (left side) and defined but unplaced soft blocks
of standard cells.
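The net length objective used by block placement tools can be illustrated
with a common estimate, the half-perimeter of the bounding box around each
net's pins. The text does not say which estimate the cited tools use, so this
choice, the block names and the coordinates below are all assumptions of the
sketch.

```python
def hpwl(net, positions):
    """Half-perimeter of the bounding box around the blocks on one net."""
    xs = [positions[block][0] for block in net]
    ys = [positions[block][1] for block in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets, positions):
    """Sum of the half-perimeter estimates over all nets."""
    return sum(hpwl(net, positions) for net in nets)


# Hypothetical connectivity between four blocks, each reduced to a point.
nets = [("RAM", "ROM"), ("RAM", "REG", "STD"), ("ROM", "STD")]

# Two candidate floorplans that differ only in where ROM and STD sit.
plan_a = {"RAM": (0, 0), "ROM": (10, 0), "REG": (0, 10), "STD": (10, 10)}
plan_b = {"RAM": (0, 0), "ROM": (10, 10), "REG": (0, 10), "STD": (10, 0)}

length_a = total_wirelength(nets, plan_a)
length_b = total_wirelength(nets, plan_b)
```

A placement tool would evaluate many such candidates and keep the one with the
smaller total, here plan_a.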

5 Placement and Routing


5.1 Placement

A synthesized block contains anywhere between 10^3 and 10^6 standard cells.
The placement algorithm maps the set of cells onto a placement grid while
minimizing total net length and congestion. The congestion metric is usually
defined in terms of net connections per unit of area [8]. A significant number
of placement algorithms are discussed in [10,8], of which the first usable
ones were based on simulated annealing: the optimization of a cell placement
is analogous to the process of annealing melted material into a highly ordered
crystalline state. Besides being extremely time-consuming, these algorithms
did very little to remove congestion.

[Figure: pad ring of Pad1 to Pad20 around the perimeter, a standard cell block
and the hard blocks RAM, ROM and REG, with connectivity displayed between
blocks]

Fig. 4. Design blocks in initialized floorplan with display of connectivity

Significantly faster are placement tools based on min-cut. The min-cut
algorithm slices a block into two halves and swaps cells to minimize the
number of nets crossing the cut. An alternative is quad-cut, in which the
block is sliced horizontally and vertically into four sections. This is
repeated recursively inside the cut sections.
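One level of the min-cut idea can be sketched in a few lines of Python. The
version below is a naive greedy pairwise swap chosen for readability, not one
of the published partitioning heuristics, and the cell and net names are
invented for illustration.

```python
from itertools import product


def cut_size(nets, left):
    """Number of nets that have cells on both sides of the cut."""
    return sum(
        1 for net in nets
        if any(c in left for c in net) and any(c not in left for c in net)
    )


def improve_by_swaps(nets, left, right):
    """Swap one cell from each half whenever that lowers the cut size."""
    left, right = set(left), set(right)
    improved = True
    while improved:
        improved = False
        best = cut_size(nets, left)
        for a, b in product(sorted(left), sorted(right)):
            trial = (left - {a}) | {b}
            if cut_size(nets, trial) < best:
                # Accept the swap; both halves keep their size.
                left, right = trial, (right - {b}) | {a}
                improved = True
                break
    return left, right


# Four cells, three nets; the initial split cuts every net.
nets = [("a", "b"), ("c", "d"), ("a", "b", "c")]
left, right = improve_by_swaps(nets, {"a", "c"}, {"b", "d"})
```

Each accepted swap strictly lowers the cut size, so the loop terminates. The
same routine would then be applied recursively inside each half.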
The quadratic method stores the cost of connecting each pair of cells (net
length) in a connection matrix. The process of net length minimization and
a proof of existence of a non-trivial solution are discussed in [3].
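The quadratic cost can be illustrated in one dimension. The Python sketch
below does not reproduce the method of [3]; it assumes that some cells (pads)
have fixed positions, which excludes the trivial solution where all cells
collapse to one point. It then lowers the sum of weighted squared distances by
repeatedly moving each movable cell to the weighted mean of its neighbours.
All names and values are hypothetical.

```python
def quadratic_cost(weights, x):
    """Sum of w_ij * (x_i - x_j)^2 over all connected pairs of cells."""
    return sum(w * (x[i] - x[j]) ** 2 for (i, j), w in weights.items())


def place_1d(weights, fixed, movable, sweeps=200):
    """Iteratively place movable cells at the weighted mean of their neighbours."""
    x = dict(fixed)
    x.update({cell: 0.0 for cell in movable})
    for _ in range(sweeps):
        for cell in movable:
            num = den = 0.0
            for (i, j), w in weights.items():
                if cell == i:
                    num, den = num + w * x[j], den + w
                elif cell == j:
                    num, den = num + w * x[i], den + w
            if den:
                # Setting the derivative of the cost to zero for this cell
                # gives exactly the weighted mean of its neighbours.
                x[cell] = num / den
    return x


# A chain: left pad, two movable cells, right pad, all with unit weight.
weights = {("padL", "c1"): 1.0, ("c1", "c2"): 1.0, ("c2", "padR"): 1.0}
placed = place_1d(weights, {"padL": 0.0, "padR": 10.0}, ["c1", "c2"])
```

For this chain the cells settle at one third and two thirds of the distance
between the pads, which spreads the three connections evenly.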
The congestion problem typically occurs in sections of combinatorial logic
with high connectivity and low porosity of cells. Advanced placement tools can
display a congestion map, a color-coded indication of the required number of
net connections versus the available routing resources. An example of
a placement congestion map is in Fig. 5. In this design, most connections run
in the vertical direction, with congestion in the upper right corner.
Congestion is removed by inserting an appropriate amount of white space
between cell instances in the congested area.

5.2 Global Route


In the placement phase, the exact locations of cell instances and their
connecting points have been determined. The task of global routing is to
distribute connecting nets over the block as uniformly as possible to avoid
routing congestion. The area of a standard cell block is first divided into
switchboxes, rectangular regions similar to the area units used in evaluation
of placement congestion. When a net is routed from one connecting point to
another, each switchbox counts the number of nets crossing each of its sides.
The goal is to find a configuration of the nets yielding minimum congestion
and minimum total net length. Some global routers show a congestion map
similar to the one shown in Fig. 5.
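The counting of nets across switchbox sides can be sketched as follows. This
Python example assumes two-pin nets, a fixed L-shaped route (horizontal first,
then vertical) and a uniform capacity per switchbox side; a real global router
would try alternative routes for nets that cause overflow. All names are
hypothetical.

```python
from collections import Counter


def l_route(src, dst):
    """Switchbox sides crossed by an L-shaped route from src to dst."""
    (x0, y0), (x1, y1) = src, dst
    crossings = []
    step = 1 if x1 >= x0 else -1
    for x in range(x0, x1, step):          # horizontal leg along row y0
        crossings.append(frozenset({(x, y0), (x + step, y0)}))
    step = 1 if y1 >= y0 else -1
    for y in range(y0, y1, step):          # vertical leg along column x1
        crossings.append(frozenset({(x1, y), (x1, y + step)}))
    return crossings


def congestion(nets, capacity):
    """Return usage per switchbox side and the sides that exceed capacity."""
    usage = Counter()
    for src, dst in nets:
        usage.update(l_route(src, dst))
    overflow = {side: n for side, n in usage.items() if n > capacity}
    return usage, overflow


# Pins are given as (column, row) indices of the switchbox they fall into.
nets = [((0, 0), (2, 1)), ((0, 0), (2, 0)), ((1, 0), (2, 2))]
usage, overflow = congestion(nets, capacity=2)
```

Here all three nets cross the side between switchboxes (1, 0) and (2, 0), so
that side is reported as overflowing.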

5.3 Detail Route


Considering a net as an n-tuple (p1, p2, ..., pn) where pi is a pin of a cell,
detail routing is the step that actually creates the conductive path
connecting all pins belonging to the net. This is performed for all nets in
a given block.
In the dark ages of VLSI technology, when a maximum of two interconnecting
metal layers were available, all routing was done in channels, the white space
between rows of standard cells. Metal 1 used for connections inside cells
would represent a significant blockage, preventing much over-the-cell routing.
Utilization, in terms of the ratio of cell area to total block area, would
rarely exceed 50 %. More recent processes use up to six metal layers, which
allow much higher utilization (approaching 100 %) with all routing done over
the cells.
The detail routing algorithm works on a 3-dimensional routing grid extending
over the entire routed block. The z-dimension corresponds to the metal layers,
while the x- and y-dimensions specify possible locations for horizontal and
vertical wires. The pitch of the grid is given by the design rule specifying
the minimum spacing of wires, with an allowance for adjacent vias (transitions
from one metal layer to another) that are usually wider than wires. The
routing pitch also needs to be considered in the construction of standard
cells so that when cells are placed, all their pins lie on the routing grid.
[Figure: placement view of standard cells with a 2-D congestion map showing
overflow on edges of the placement grid, and 1-D congestion maps for vertical
and for horizontal routing]

Fig. 5. Placement congestion map

The first task of a detail router is to map blockages from underlying cells
into the routing grid. Space occupied by polygons on “metal1” or “metal2” used
for connections inside cells obviously cannot be used for net connections and
has to be blocked out of the routing grid (black circles in Fig. 6). This
reduces the number of available routing grid vertices on the corresponding
planes of the routing grid space. The main task of the routing algorithm is to
find a path between a pair of points on a net. The areas available for routing
are represented as unblocked vertices of a graph (white circles in Fig. 6).
The search algorithm for a path connecting points S and T in such a graph was
developed by Lee [5]. It can be visualized as a wave propagating from S to all
unblocked adjacent vertices. After labeling these vertices ‘1’, the algorithm
proceeds to propagate the wave to all their unblocked adjacent vertices,
labeling them ‘2’, and so on, until the target vertex T is reached or no
further expansion can be carried out. Due to the exhaustive nature of the
search, Lee’s maze router is guaranteed to find a solution if one exists.
Soukup [12] modified Lee’s algorithm to reduce search time by limiting the
wave propagation to the direction towards the target.
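The wave expansion described above is a breadth-first search followed by
a backtrace. The following Python sketch works on a single plane; the grid
encoding ('#' for a blocked vertex, '.' for a free one) and the function name
are assumptions made for this example.

```python
from collections import deque


def lee_route(grid, start, target):
    """Wave expansion from start; returns the vertex path or None if blocked."""
    rows, cols = len(grid), len(grid[0])
    label = {start: 0}                     # wave number of each reached vertex
    wave = deque([start])
    while wave and target not in label:
        r, c = wave.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in label):
                label[(nr, nc)] = label[(r, c)] + 1
                wave.append((nr, nc))
    if target not in label:
        return None                        # no further expansion possible
    # Backtrace: from the target, step to a neighbour whose label is one lower.
    path = [target]
    while path[-1] != start:
        r, c = path[-1]
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if label.get(nb) == label[(r, c)] - 1:
                path.append(nb)
                break
    return path[::-1]


# S is the top-left vertex, T the bottom-right one; '#' vertices are blocked.
GRID = [
    "..#.",
    ".##.",
    "....",
]
route = lee_route(GRID, (0, 0), (2, 3))
```

Because every vertex is labeled with its distance from S, the backtrace always
finds a predecessor and the returned path is a shortest one on the grid.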
[Figure: routing grid with blocked vertices (black circles) and unblocked
vertices (white circles) labeled with wave numbers 1 to 9 expanding from S
until T is reached]

Fig. 6. Maze route between points S and T on one plane

A variation of the above algorithms is the basis for most recent routing
tools. However, instead of operating on one plane, routing tools work in one
more dimension, utilizing all available metal layers. If a solution cannot be
found on one plane (layer), the router uses a via to get to one of the
adjacent planes and continues there.
Unlike a global router, which considers the entire layout of a block, a detail
router considers just one switchbox at a time. If we attempted to route entire
nets one after another, the first several nets would be easily routable, while
routing would get progressively more difficult because of the decreasing
number of available vertices in the routing grid. To even out the chances for
every net, the detail router first works on individual switchboxes that are
connected together later in several iterative search-and-repair cycles.
On occasion, the synthesis step may produce a block that is unroutable: the
internal high connectivity of the block exceeds the available routing
resources. An efficient placement routability modeling technique is described
in [1].

6 Extraction
When all blocks of a design are placed and routed, it is desirable to find out
the electrical properties of the design before it goes to actual fabrication.
More specifically, the designer is interested in whether the block meets its
timing specifications so that the chip will work at the desired speed or clock
rate. Signal delays on the chip have two components:
– Cell delays caused by the finite transition time of transistors. Before
synthesis, all the cells in the library are characterized, and the synthesis
tool uses these values to select the proper cell size or drive strength.
– Interconnect delays caused by the parasitic capacitance of nets. This
capacitive cell load means that every state transition takes time proportional
to the charging/discharging time of the related parasitic capacitor.
The second component, parasitic capacitance, is not known at synthesis time as
it depends on the placement of cells and the specific routing of connected
nets. The synthesis tool makes an assumption about placement and estimates
capacitance based on a wire model and the Manhattan distance between cells.

[Figure: wires on metal1, metal2 and metal3 above the substrate, with
capacitive couplings numbered 1 to 4]

Fig. 7. Coupling capacitance between conductive layers

Tools for parasitic extraction can provide fairly accurate values of
electrical properties by traversing the design database. Extraction of net
resistance is fairly simple: for a wire of fixed width on a given layer it is
directly proportional to net length. Capacitance extraction is more
computationally intensive, as suggested by Fig. 7, here limited to three metal
layers for simplicity. The geometry of wires on every conducting layer gives
rise to several types of capacitive coupling. The most dominant ones are:
1. wire bottom to substrate,
2. wire side to substrate,
3. wire top to wire bottom on upper level,
4. wire side to wire top/bottom of different layer,
5. wire side to another wire side (the same or different layer).
Thanks to the fixed pitch of the routing grid and fixed inter-layer distances,
capacitance per unit wire length can be pre-computed for every type of
coupling, yet the task of repeatedly traversing all nets in the database is
extensive.
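The use of pre-computed per-unit-length values can be sketched as a table
lookup summed over the wire segments of a net. In the Python example below,
the metal1 resistance and substrate capacitance echo the numbers in the
technology file of Sect. 1.3, but their interpretation as ohm per micron and
fF per micron is an assumption, the side-to-side value is invented, and the
lumped RC product is only a crude stand-in for the delay models real tools
use.

```python
# Per-unit-length tables (units assumed: ohm per micron, fF per micron).
RES_PER_UM = {"metal1": 0.0615}
CAP_PER_UM = {
    ("metal1", "bottom_to_substrate"): 6.31e-3,
    ("metal1", "side_to_side"): 2.0e-3,      # invented for illustration
}


def extract_net(segments):
    """Sum resistance (ohm) and capacitance (fF) over the segments of a net.

    Each segment is (layer, length_um, coupling types present along it).
    """
    r = c = 0.0
    for layer, length, couplings in segments:
        r += RES_PER_UM[layer] * length
        c += sum(CAP_PER_UM[(layer, kind)] for kind in couplings) * length
    return r, c


def lumped_delay_ps(r_ohm, c_ff):
    """Lumped RC time constant in ps (1 ohm * 1 fF = 1e-3 ps)."""
    return r_ohm * c_ff * 1e-3


# One net: 100 um over bare substrate, then 50 um next to a neighbouring wire.
segments = [
    ("metal1", 100.0, ["bottom_to_substrate"]),
    ("metal1", 50.0, ["bottom_to_substrate", "side_to_side"]),
]
net_r, net_c = extract_net(segments)
net_delay = lumped_delay_ps(net_r, net_c)
```

The expensive part in practice is not this arithmetic but finding, for every
segment, which neighbouring wires it actually couples to.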

Extracted values of resistance and capacitance are converted to time units
and then back-annotated into the gate-level netlist (see Sect. 3). Subsequent
simulation using this netlist gives a fairly accurate picture of circuit
delays. If timing constraints are not met, it is necessary to go back to
step 2 and re-synthesize in order to change the driving strength of cells. To
avoid a potentially infinite loop and to speed up timing closure, recent tools
attempt to do synthesis and placement in the same step. This allows for more
accurate modeling of parasitics at synthesis time, resulting in more accurate
sizing of cells.

7 Verification

Before shipping the design to a foundry, the design is subject to at least the
following two types of physical verification:

– design rule checking (DRC),
– layout versus schematic (LVS).

7.1 DRC

This is a relatively straightforward check of all polygons in a layout against
all design rules found in the technology file. If no design rules are
violated, the layout is manufacturable.

7.2 LVS

The task of LVS is to verify that the resulting layout geometries create the
circuitry corresponding to the gate-level netlist that was fed into the
placement and routing tool. It also examines connectivity by checking for
shorted or open nets.
The first part of the LVS input is the netlist itself. Using information from
the standard cell library, the tool builds a graph GN of the design at
transistor level. In GN, nets are the nodes of the graph, while transistors
correspond to edges connecting the nodes.
The layout information from the design database requires more extensive
processing. The tool looks for overlapping polygons on the diffusion and poly
layers, which indicate the presence of a transistor. Computational geometry
methods extract the transistors and determine their type and size. Following
the transistor connections on metal and via layers, the tool extracts nets.
This allows construction of a graph GL following the same rules as when
building GN [8].
The final step is matching GL to GN. If the graphs are found to be isomorphic,
the layout has been successfully verified. In the case of a mismatch, the tool
usually provides a graphical interface to locate, view and repair the problem.
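The graph matching step can be sketched for a circuit as small as the NAND
cell of Sect. 1.2. In this Python example each transistor is an edge between
its source and drain nets, labeled with its type and gate net (a modeling
assumption of this sketch). Port nets keep their names and internal nets are
matched by trying every renaming, which is only feasible for tiny circuits;
production tools use far more scalable matching. All names are hypothetical.

```python
from itertools import permutations


def canonical(transistors, rename):
    """Sorted edge list: (type, gate net, unordered source/drain pair)."""
    edges = []
    for kind, gate, source, drain in transistors:
        g, s, d = (rename.get(n, n) for n in (gate, source, drain))
        edges.append((kind, g, tuple(sorted((s, d)))))
    return sorted(edges)


def lvs_match(netlist, layout, ports):
    """True if layout and netlist graphs match up to internal net names."""
    def internal(ts):
        return sorted({n for _, g, s, d in ts for n in (g, s, d)} - set(ports))

    net_int, lay_int = internal(netlist), internal(layout)
    if len(netlist) != len(layout) or len(net_int) != len(lay_int):
        return False
    reference = canonical(netlist, {})
    return any(
        canonical(layout, dict(zip(lay_int, perm))) == reference
        for perm in permutations(net_int)
    )


PORTS = ["VDD", "GND", "A", "B", "C"]

# GN: NAND netlist as (type, gate, source, drain); n1 is an internal net.
NETLIST = [("p", "A", "VDD", "C"), ("p", "B", "VDD", "C"),
           ("n", "A", "C", "n1"), ("n", "B", "n1", "GND")]

# GL: the same circuit as extracted from layout, in a different order and
# with a tool-generated name for the internal net.
LAYOUT_OK = [("n", "B", "GND", "x7"), ("p", "B", "C", "VDD"),
             ("n", "A", "x7", "C"), ("p", "A", "C", "VDD")]

# A faulty layout: the n-channel transistors ended up in parallel.
LAYOUT_BAD = [("p", "A", "VDD", "C"), ("p", "B", "VDD", "C"),
              ("n", "A", "C", "GND"), ("n", "B", "C", "GND")]
```

With these inputs, lvs_match(NETLIST, LAYOUT_OK, PORTS) succeeds, while the
faulty layout is rejected because it has no internal net to match n1.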

References
1. E. Cheng. RISA: Accurate and efficient placement routability modeling. In
ICCAD ’94 Proceedings. IEEE Computer Society Press, 1994.
2. Design Compiler Reference Manual, 1997.
3. K. M. Hall. An r-dimensional quadratic placement algorithm. Management
Science, 17, 1970.
4. P. Kurup and T. Abbasi. Logic Synthesis Using Synopsys. Kluwer Academic
Publishers, 1997.
5. C. Y. Lee. An Algorithm for Path Connections and Its Applications. IRE
Transactions on Electronic Computers, 1961.
6. S. Malik and R. Rudell. Multi-level Logic Synthesis. ICCAD ’92 Tutorial.
IEEE Computer Society Press, 1992.
7. H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani. Rectangle-packing-based
module placement. In ICCAD ’95 Proceedings. IEEE Computer Society Press, 1995.
8. B. Preas and M. Lorenzetti. Physical Design Automation of VLSI Systems. The
Benjamin/Cummings Publishing Co., Inc., 1988.
9. G. Rabbat. Handbook of Advanced Semiconductor Technology and Computer
Systems. Van Nostrand Reinhold Company Inc., 1988.
10. N. Sherwani. Algorithms for VLSI Physical Design Automation. Kluwer
Academic Publishers, 1993.
11. W. Shi. An optimal algorithm for area minimization of slicing floorplans.
In ICCAD ’95 Proceedings. IEEE Computer Society Press, 1995.
12. J. Soukup. Fast maze router. In Proceedings of 15th Design Automation
Conference, pages 100–102, 1978.
13. Verilog HDL Synthesis Reference Manual, 1993.
14. Verilog-XL, 1996.
15. N. Weste and K. Eshraghian. Principles of CMOS VLSI Design. Addison-Wesley
Publishing Co., 1992.
