Reality-Driven Physical Synthesis
Patrick Groeneveld
Chief Technologist, Magma Design Automation, San Jose (soon: Synopsys Inc., Mountain View)
Chair, 49th Design Automation Conference, San Francisco
ISPD 2012, Napa
Kevin Trudeau, the king of Quacks
As seen at ISPD
Physical Design of Apple processors
Common technology: 45nm Samsung
A4 (2010): iPhone 4 & iPad 1, 7.3mm x 7.3mm
A5 (2011): iPhone 4s & iPad 2, 10.0mm x 12.5mm
A5x (2012): iPad 3, 12.9mm x 12.7mm = 3x as big as the A4
A closer look at Apple's physical design style
High-density region
(Near-)rectangular blocks
(Near-)slicing floorplan
Big macros are always at the border
No trace of data path regularity
Thin channels, so few cells at top level
PD: Many Objectives Simultaneously
Correct & manufacturable mask pattern
Congestion control
Big chip = good
Meets timing & electrical requirements
Battle parasitics: timing, voltage drop
Big gates = good, compact chip = good & a little bad
Low power
Leakage control, multi-voltage, sleep, etc.
Small gates = good, complex floorplan = necessary evil
Low part cost
Compact chip, dense wires = good
Low design effort
Robust design, short tool run times, re-use
Simple = good, pushbutton = good
Magma Flow: guided by best available data
Flow steps, in order:
fix time (logic synth)
fix cell (place, optimization)
fix cell_optimize
fix clock (CTS)
fix clock_optimize
fix wire (Route)
fix wire_optimize
Routing data available along the flow:
Global route: layer assignment, congestion, resource contention, detours
Track route: refines global route
Detail route: copies track route, fixes opens, rip-up & reroute
The only thing that matters is the quality at the end!
Layout Design at different levels of abstraction
Productive debugging between teams
What is the timing accuracy?
Flow steps, in order: fix time (logic synth), fix cell (place, optimization), fix cell_optimize, fix clock (CTS), fix clock_optimize, fix wire (Route), fix wire_optimize
Global Route: extract glr segments, delay calculator, timer
Detail Route: extract detailed wires, delay calculator, timer
GR-DR timing correlation?
Measuring correlation error: Experimental set-up
Take routed design:
Segments time in global mode, CCT Wires time in final mode, xtalk on = golden
Only compare 2-pin nets, > 40um length
Compare net delay, wire cap and slack between:
Circuit timed in FINAL mode (golden)
Circuit timed in GLOBAL mode
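The set-up above can be sketched in a few lines. This is a minimal illustration only: the `Net` record, its field names, the sample values and the sign convention of the error (global estimate relative to the golden final delay) are assumptions, not the actual tool's data model.

```python
from dataclasses import dataclass


@dataclass
class Net:
    # Hypothetical record; field names are illustrative assumptions.
    name: str
    pin_count: int
    length_um: float
    global_delay_ps: float  # timed in GLOBAL mode
    final_delay_ps: float   # timed in FINAL mode (golden)


def select_nets(nets, min_length_um=40.0):
    """Keep only 2-pin nets longer than min_length_um."""
    return [n for n in nets if n.pin_count == 2 and n.length_um > min_length_um]


def delay_error_pct(net):
    """Error of the global estimate relative to the golden final delay, in %."""
    return 100.0 * (net.global_delay_ps - net.final_delay_ps) / net.final_delay_ps


# Made-up sample nets, for illustration only.
nets = [
    Net("n1", 2, 120.0, 10.0, 20.0),
    Net("n2", 3, 300.0, 15.0, 15.0),  # dropped: not a 2-pin net
    Net("n3", 2, 25.0, 4.0, 5.0),     # dropped: shorter than 40um
    Net("n4", 2, 80.0, 30.0, 24.0),
]
selected = select_nets(nets)
errors = {n.name: delay_error_pct(n) for n in selected}
print(errors)
```

The same comparison would be repeated for wire cap and slack; only net delay is shown here.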
Observations on Global vs Final delay correlation
Over 7 real designs, net delay miscorrelates badly between global and final:
Average = roughly OK
88% standard deviation
So 33% of the net delays are off by more than 88%
97% of nets are worse than +-5% accurate
[Histogram: # of nets vs. net delay error (final delay vs. global delay), from -100% to +100%]
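The kind of statistics quoted above can be computed as in the sketch below. The function name, the dictionary keys and the sample error values are hypothetical; the numbers are not the measured data from the 7 designs. For reference, a normal distribution puts about 32% of samples more than one standard deviation from the mean, which matches the "33% of the net delays" reading.

```python
import statistics


def correlation_summary(errors_pct, tolerance_pct=5.0):
    """Summarize per-net delay errors (in %) between global and final timing."""
    mean = statistics.fmean(errors_pct)
    stdev = statistics.pstdev(errors_pct)  # population standard deviation
    n = len(errors_pct)
    # Fraction of nets more than one standard deviation away from the mean.
    beyond_sigma = sum(abs(e - mean) > stdev for e in errors_pct) / n
    # Fraction of nets whose error exceeds the +-tolerance band.
    beyond_tol = sum(abs(e) > tolerance_pct for e in errors_pct) / n
    return {
        "mean_pct": mean,
        "stdev_pct": stdev,
        "frac_beyond_1_sigma": beyond_sigma,
        "frac_beyond_tolerance": beyond_tol,
    }


# Made-up error values, for illustration only.
sample_errors = [-60.0, -20.0, 0.0, 3.0, 20.0, 60.0]
summary = correlation_summary(sample_errors)
print(summary)
```

A mean near zero with a large standard deviation is exactly the "average = roughly OK" situation: the errors cancel in aggregate while individual nets are badly mispredicted.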
Garbage in Garbage out ?
Modeling inaccuracies cause earlier opto to work on the wrong parts
Crosstalk noise could seriously randomize results
Global
Opto 1 -2% Opto 2 -1% Opto 2 -3%
Final
TNS=-321n WNS=-239p FEP=734
10%
TNS=-???n WNS=-???p FEP=???
20 %
40%
80%
Optimization based on GR
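For readers unfamiliar with the three metrics, the sketch below computes them from a list of endpoint slacks, assuming the usual reading of the abbreviations: TNS = total negative slack, WNS = worst negative slack, FEP = number of failing endpoints. The function name and the slack values are hypothetical.

```python
def timing_summary(endpoint_slacks_ps):
    """Compute TNS, WNS and FEP from per-endpoint slacks (picoseconds)."""
    negative = [s for s in endpoint_slacks_ps if s < 0]
    return {
        "TNS_ps": sum(negative),                # total negative slack
        "WNS_ps": min(negative, default=0.0),   # worst negative slack
        "FEP": len(negative),                   # failing endpoints
    }


# Made-up slacks, for illustration only.
slacks = [-239.0, -50.0, 12.0, 0.0, -11.0]
report = timing_summary(slacks)
print(report)
```

Evaluating this once on global-route timing and once on final timing for the same netlist shows how much of the reported improvement survives to the end of the flow.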
What can we do?
Attempt to increase accuracy of early timing:
Add xtalk estimate during Global Route extraction
Perform track routing as well
And/or: Live with the problem:
Spend less effort on early optimization
Carefully examine statistics of optimization effectiveness
Have a good way to patch up xtalk at the end
But!? But!? I need to optimize for something!!
Building a Layout Design Flow
Observation 1:
Need gradual refinement flow using many algorithms
Formal Verification, Global-level timer, Mapping, Buffering, Global placer, Global router, Gate resizing, Clock Tree S., Timer & Extractor, Gate rewiring, Gate buffering, Detailed placer, Sign-off DRC checker, Sign-off Timer, FineSim SPICE, Track router, Detailed router, Detailed opt.
Observation 2:
Synthesis algorithms need highly simplified models of reality
Observation 3:
Synthesis algorithms cannot deliver good multi-objective trade-offs
Observation 4:
Optimizing a single objective often makes other objectives worse.
The ABC of a solid EDA Design Flow
A: Avoid
Use pessimism to make problem unlikely, Correct by Construction
B: Build
Synthesize using an algorithm
C: Correct
Fix each failure by incremental modifications (ECOs).
Goal: Living on the edge
Avoid as little as possible
Such that the remaining failures can be Corrected incrementally
And accept the reality that Build algorithms offer little control
[Figure: distribution of # of nets across Pass, Needs correction and Fail]
ABC in action: Combating crosstalk delay
Avoid: using pessimism:
Size up all drivers: Costs cell area and power
Force double spacing NDR on many nets: Costs congestion = area
Build:
Some routing tricks to spread & jog wires
Correct using ECO:
Gate re-sizing, buffering
Re-routing
Wire cap: 50fF, of which 30-80% is to neighbors
Gate input cap: 4fF
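A quick worked calculation shows why these numbers matter. The capacitance values are the ones above; the Miller factor (0 when the neighbor switches in the same direction, 1 when it is quiet, 2 when it switches in the opposite direction) is a common first-order model and an assumption added here, not something stated on the slide.

```python
WIRE_CAP_FF = 50.0       # total wire cap, from the slide
GATE_INPUT_CAP_FF = 4.0  # receiver gate input cap, from the slide


def effective_load_ff(coupling_pct, miller_factor):
    """Effective load seen by the driver under a first-order Miller model.

    coupling_pct: share of the wire cap that couples to neighbors (30..80).
    miller_factor: 0, 1 or 2 (assumed model, see lead-in).
    """
    coupling = WIRE_CAP_FF * coupling_pct / 100.0
    ground = WIRE_CAP_FF - coupling
    return ground + miller_factor * coupling + GATE_INPUT_CAP_FF


for pct in (30, 80):
    loads = [effective_load_ff(pct, k) for k in (0, 1, 2)]
    print(f"{pct}% coupling: best {loads[0]:.0f} fF, "
          f"quiet {loads[1]:.0f} fF, worst {loads[2]:.0f} fF")
```

Under this model, a net with 80% coupling sees anywhere from 14 fF to 94 fF depending on what its neighbors do, while the gate input itself contributes only 4 fF.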
C routing improvement: pushing neighbors away
Might make other nets worse
Not always successful
Effect of this layout push on timing
[Plot: actual wire delay as reported by Tekton STA, crosstalk = on; axes run from better to worse]
Average: -12% neighbor length, -13% delay
Medical tools
New drug: biological model of cause, actions and side-effects
Develop it
Test tube test
Test on animals: efficacy, side effects
Clinical trials: large double-blind placebo-controlled tests
FDA approval
Deployment
vs.
EDA tools
New method/algorithm: based on electrical/physical plausibility
Program it (C++/Tcl)
Unit test
Test on small testcases
Debug program
Get a results table
Publish at ISPD
Go for it!
Lack of Evidence = Quackery
EDA is not exempt:
Datapath placement
Thermal-driven placement
DFM-driven design
Plug n play tool interoperability
Hybrid GPU/CPU EDA tools
Gridless routing
X-Architecture
Skeptical wisdom for Electronic Design
Humans are amazingly good at self-deception
This looks soooo good, therefore this must work
If it has no side effects, it probably has no effects either
Example: improving temperature gradients will cost you timing! Are you really willing to pay based on the evidence?
Do not confuse association with causation
I took this Airborne pill, and I did not get sick
I used this DFM optimizer, and the chip yields!
The plural of anecdote is anecdotes, not data
Result could be a random effect, or another side effect
No substitute for unbiased placebo-controlled tests
Only large data sets are statistically relevant
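One way to make the sample-size point concrete is an exact paired sign-flip (randomization) test: run each design with and without the new method, and ask how often a result at least as extreme would appear if the method had no effect. The choice of this particular test, the function name and the sample numbers are assumptions made for illustration; the slides do not prescribe a specific statistical procedure.

```python
from itertools import product


def paired_sign_flip_p_value(deltas):
    """Exact two-sided sign-flip test on paired differences.

    deltas: per-design improvement (metric with method minus metric without).
    Under the null hypothesis the sign of each delta is arbitrary, so every
    sign pattern is equally likely.
    """
    observed = abs(sum(deltas))
    total = 0
    at_least_as_extreme = 0
    for signs in product((1, -1), repeat=len(deltas)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, deltas))) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


# Made-up improvements, for illustration only (integers keep the sums exact).
three_designs = [1, 2, 3]                  # method "won" on all 3 designs
eight_designs = [1, 2, 3, 4, 5, 6, 7, 8]   # method "won" on all 8 designs

p3 = paired_sign_flip_p_value(three_designs)
p8 = paired_sign_flip_p_value(eight_designs)
print(p3, p8)
```

With three designs, even a clean sweep gives a p-value of 0.25, so it cannot be told apart from chance; the same clean sweep over eight designs gives roughly 0.008.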
Summary: observations from practice
Layout is a multi-objective optimization problem
DRC, Manufacturability, timing, power, cost, design effort
Timing is poorly predictable early in the flow
The only thing that counts is the result at the end
Intermediate data is a poor indicator
Need hard evidence that the trade-off is worthwhile
Beware of XX-driven synthesis/place/route
Is the gain worth the side effects?
Optimal is irrelevant, while greedy is pretty good
Simple A-B-C flows are proven in practice