
DNN-Opt: An RL Inspired Optimization for Analog Circuit Sizing using Deep Neural Networks

Ahmet F. Budak1*, Prateek Bhansali2, Bo Liu3, Nan Sun1, David Z. Pan1 and Chandramouli V. Kashyap2†
1 ECE Department, The University of Texas at Austin, 2 Intel Corp., 3 James Watt School of Eng., University of Glasgow
*[email protected], †[email protected]

Abstract—Analog circuit sizing takes a significant amount of manual effort in a typical design cycle. With rapidly developing technology and tight schedules, bringing automated solutions for sizing has attracted great attention. This paper presents DNN-Opt, a Reinforcement Learning (RL) inspired Deep Neural Network (DNN) based black-box optimization framework for analog circuit sizing. The key contributions of this paper are a novel sample-efficient two-stage deep learning optimization framework leveraging RL actor-critic algorithms, and a recipe to extend it to large industrial circuits using critical device identification. Our method shows 5–30x sample efficiency compared to other black-box optimization methods, both on small building blocks and on large industrial circuits, with better performance metrics. To the best of our knowledge, this is the first application of DNN-based circuit sizing to industrial-scale circuits.

Index Terms—Analog Circuit Sizing Automation, Blackbox Optimization, Reinforcement Learning, Deep Neural Network

I. INTRODUCTION

Analog Integrated Circuit (IC) design is a complex process involving multiple steps. Billions of nanoscale transistor devices are fabricated on a silicon die and connected via intricate metal layers during those steps. The final product is an IC, which powers much of our life today. An essential aspect of IC design is analog design, which continues to suffer from long design cycles and high design complexity due to the lack of automation in analog Electronic Design Automation (EDA) tools compared to digital flows. In particular, "circuit sizing" tends to consume a significant portion of analog designers' time. In order to tackle this labor-intensive task and reduce time-to-market, analog circuit sizing automation has attracted high interest in recent years.

Prior work on analog circuit sizing automation can be divided into two categories: knowledge-based and optimization-based methods. In the knowledge-based approach, design experts transcribe their domain knowledge into algorithms and equations [1], [2]. However, such methods create dependency on expert human designers, circuit topology, and technology nodes. Thus, these methods are highly time-consuming and not scalable.

Optimization-based methods are further categorized into two classes: equation-based and simulation-based methods. Equation-based methods try to express circuit performance via posynomial equations or regression models using simulation data. Then equation-based optimization methods such as Geometric Programming [3], [4] or Semidefinite Programming (SDP) relaxations [5] are applied to the convex or non-convex formulated problems to find an optimal solution. Although these methods are generally fast, developing accurate expressions for circuit performances is not easy, and the expressions deviate largely from the actual values. On the other hand, simulation-based methods employ black-box or learning-based optimization techniques to explore the design space. These methods make guided exploration in the search space and target a global minimum using real evaluations from circuit simulators.

Traditionally, there have existed various model-free optimization methods such as particle swarm optimization (PSO) [6] and advanced differential evolution [7]. Although these methods have good convergence behavior, they are known to be sample-inefficient (i.e., SPICE simulation intensive). Recently, surrogate model-based and learning-based methods have become increasingly popular due to their efficiency in exploring the solution space. In surrogate model-based methods, Gaussian Process Regression (GPR) [8] is generally used for design space modeling, and the next design point is determined through model predictions. For example, the GASPAD method is introduced into Radio Frequency (RF) IC synthesis, where GPR predictions guide evolutionary search [9]. The WEIBO method proposed a GPR-based Bayesian Optimization [10] algorithm where a blend of weighted Expected Improvement (wEI) and the probability of feasibility is selected as the acquisition function to handle the constrained nature of analog sizing [11]. The main drawback of Bayesian Optimization methods is scalability, as GP modeling has cubic complexity in the number of samples, O(N^3).

Recently, reinforcement learning algorithms have been applied in this area as learning-based methods. GCN-RL [12] leverages Graph Neural Networks (GNN) and proposes a transferable framework. Despite reporting superior results over various methods and human designers, a) it requires thousands of simulations for convergence (without transfer learning) and b) it suffers from the engineering effort needed to determine the observation vector, architecture selection, and reward engineering. AutoCkt [13] is a sparse sub-sampling RL technique optimizing the circuit parameters by taking discrete actions in the solution space. AutoCkt shows more efficiency than random RL agents and Differential Evolution. Still, it requires training with thousands of SPICE simulations before deployment, which is costly.

In this paper we introduce DNN-Opt, a two-stage deep learning black-box optimization scheme, where we merge the strengths of Reinforcement Learning (RL), Bayesian Optimization (BO), and population-based techniques in a novel way.



[Fig. 1. DNN-Opt Framework (blocks: Circuit Info (Topology, Specs, Bounds), two-stage DNN (Actor and Critic networks), Pseudo-sample Generation, Critic and Actor Training, Next Sample, Circuit Simulator)]

The key features of the DNN-Opt framework are listed below.

• We tailored a two-stage Deep Neural Network (DNN) architecture for black-box optimization tasks, inspired by the actor-critic algorithms developed in the RL community.
• To leverage the convergence behavior of population-based methods, DNN-Opt adopts a population-based search space control mechanism.
• We introduce a recipe for extending our work to large industrial designs using sensitivity analysis. In collaboration with a design house, we demonstrate that our work can also efficiently size large circuits with tens of thousands of devices in addition to small building blocks.

The rest of the paper is organized as follows. We formulate the analog circuit sizing problem in Section II and introduce DNN-Opt with its RL core and other details. In Section III, the performance of DNN-Opt is demonstrated on small building blocks and large industrial circuits. We also provide performance comparisons of DNN-Opt with other optimization methods. The conclusions are provided in Section IV.

II. DNN-OPT FRAMEWORK

A. Analog Circuit Sizing: Problem Formulation

We formulate the analog circuit sizing task as a constrained optimization problem, stated succinctly as

\min_x f_0(x) \quad \text{subject to} \quad f_i(x) \le 0 \ \ \text{for} \ i = 1, \ldots, m        (1)

where x ∈ D^d is the parameter vector and d is the number of design variables of the sizing task; thus, D^d is the design space. f_0(x) is the objective performance metric we aim to minimize. Without loss of generality, we denote the ith constraint by f_i(x).
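As a concrete illustration (ours, not the paper's), a spec such as "DC Gain > 60 dB" can be recast in the f_i(x) <= 0 form of Eq. 1. A minimal Python sketch follows, where evaluate_specs() is a hypothetical stand-in for a SPICE testbench:

    import numpy as np

    def evaluate_specs(x):
        # Hypothetical stand-in for a SPICE testbench: returns raw measurements
        # for a design vector x (here: power in mW, DC gain in dB).
        power = 0.5 + 0.1 * np.sum(x ** 2)
        dc_gain = 55.0 + 10.0 * x[0]
        return {"power": power, "dc_gain": dc_gain}

    def f(x):
        # Map raw specs to the Eq. 1 form: f[0] is the objective f_0(x),
        # f[1:] are constraints rewritten as f_i(x) <= 0.
        m = evaluate_specs(x)
        f0 = m["power"]               # objective: minimize power
        f1 = 60.0 - m["dc_gain"]      # "DC Gain > 60 dB"  ->  60 - gain <= 0
        return np.array([f0, f1])

    print(f(np.array([0.3, 1.2])))

This kind of rewriting is what the paper alludes to later when it notes that the constraint expressions of Eq. 9 and Eq. 10 can be trivially readjusted to fit the form of Eq. 1.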
B. DNN-Opt Core: RL Inspired Two-Stage DNN Architecture

The overall framework of DNN-Opt is shown in Figure 1. DNN-Opt comprises a two-stage deep neural network architecture that interacts with a circuit simulator during the optimization process. The flow starts from generated samples in the design space; then, a critic-network is used to predict any new design point's performance. This prediction is used by the actor network to propose new candidates for simulation. This search scheme efficiently mimics BO behavior in space exploration. In addition, the sample generation is further optimized by adopting a population control scheme.

The two-stage network architecture of our work borrows its structure from the Deep Deterministic Policy Gradient (DDPG) algorithm [14], an RL actor-critic algorithm [15] developed for continuous action spaces. However, actor-critic algorithms are not directly applicable to analog circuit sizing since it is not a Markov Decision Process (MDP) [16], which is a necessary condition for any RL problem. Therefore we adapt the DDPG algorithm with significant modifications tailored to analog circuit sizing.

In the context of analog circuit sizing, we keep some of the RL notation but replace much of it for simplicity and clarity.

Design: A design is a set of circuit parameters, which we denote by x; it is a vector of size d where each element corresponds to a particular design variable. The optimization goal is to find the optimal x_opt which satisfies Eq. 1.

Population: A population is a set of multiple designs.

Design Population Matrix: We define a design population matrix as X ∈ R^{N×d}, where N is the population size. The parameters of the ith design form a row of the design population matrix X, denoted x_i.

State Space: Our work maps the optimization parameters (circuit design variables) to the state representation in RL notation. The state of the kth design is s_k = x_k.

Action Space: Each action a_k in our new architecture corresponds to a change in the optimization parameter vector x_k, denoted a_k = Δx_k. An intuitive explanation of this choice is that an ideal action for an optimization task should propose a change in each design variable that leads to a better design.

Critic-Network: Originally, a critic-network parameterized by θ^Q approximates the return value of an MDP, Return = Q(s_t, a_t | θ^Q). We modify its role and use this network as a proxy in lieu of the expensive SPICE simulator. Our modified critic-network provides a vector-to-vector mapping by taking (x, Δx) ∈ D^{2d} as input and producing performance predictions Q(x, Δx | θ^Q) ∈ R^{m+1} at the output: one dimension for the objective specification and m for the constraint specifications.

Actor-Network: An actor-network parameterized by θ^µ takes a state as its input and determines an action to take, a_k = µ(s_k | θ^µ). In the context of analog circuit sizing, the actor-network provides the change in the design parameter vector for design k as Δx_k = a_k = µ(x_k | θ^µ).

Critic-Network Training: We utilize the critic-network to model the design-variable-to-circuit-performance relationship. For effective training, we use data augmentation to generate N^2 pseudo-samples (ps) from the original N samples. To generate pseudo-samples, we use two samples x_i and x_j and their corresponding spec vectors f(x_i) and f(x_j), as follows:

x^{ps}_{ij} = [x_i, \Delta x_{ij}] = [x_i, x_j - x_i], \qquad f^{ps}(x^{ps}_{ij}) = f(x_j)        (2)
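A minimal NumPy sketch of the pseudo-sample construction in Eq. 2 (our illustration; X and F are hypothetical names for the N x d design matrix and the N x (m+1) matrix of simulated specs):

    import numpy as np

    def make_pseudo_samples(X, F):
        # X: (N, d) simulated designs, F: (N, m+1) their SPICE-measured specs.
        # Returns N*N critic inputs [x_i, x_j - x_i] with targets f(x_j) (Eq. 2).
        N, d = X.shape
        xi = np.repeat(X, N, axis=0)        # (N*N, d): every x_i repeated N times
        xj = np.tile(X, (N, 1))             # (N*N, d): every x_j cycled
        inputs = np.hstack([xi, xj - xi])   # (N*N, 2d) = [x_i, delta_x_ij]
        targets = np.tile(F, (N, 1))        # (N*N, m+1): label is f(x_j)
        return inputs, targets

    # Example with random placeholder data (5 designs, 3 variables, 4 specs).
    X = np.random.rand(5, 3)
    F = np.random.rand(5, 4)
    Xps, Fps = make_pseudo_samples(X, F)
    print(Xps.shape, Fps.shape)   # (25, 6) (25, 4)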
This leads to a change in the input dimensionality of the critic-network from d to 2d, since we now use (x, Δx) instead of x or (x + Δx). Our experiments on Bayesmark [17] benchmark problems showed that using 2d inputs and training with pseudo-samples boosts the critic-network's accuracy significantly over a network trained with d inputs and the original samples.

For a batch size of N_b pseudo-samples, the following Mean Squared Error (MSE) loss function is used to train the critic network:

L(\theta^Q) = \frac{1}{N_b (m+1)} \sum_{k=1}^{N_b} \sum_{l=1}^{m+1} \big( Q(x_k, \Delta x_k)_l - f(x_k + \Delta x_k)_l \big)^2        (3)

where Q(x_k, Δx_k)_l is the critic-network's approximation of the kth pseudo-sample's lth performance and f(x_k + Δx_k)_l is the SPICE-simulated value for the same design-performance pair. To clarify, we have SPICE simulation values for the pseudo-samples because of the way they are constructed.
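Below is a hedged PyTorch sketch of such a critic and its MSE training loop (Eq. 3); the layer widths, depth, and optimizer settings are illustrative assumptions, not the paper's tuned hyperparameters:

    import torch
    import torch.nn as nn

    class Critic(nn.Module):
        # Maps (x, delta_x) in R^{2d} to m+1 predicted performances.
        def __init__(self, d, m, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * d, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, m + 1),
            )

        def forward(self, x_dx):
            return self.net(x_dx)

    def train_critic(critic, inputs, targets, epochs=200, lr=1e-3):
        # inputs: (N*N, 2d) pseudo-samples, targets: (N*N, m+1) SPICE specs,
        # both as float32 tensors; the mean squared error over all outputs
        # corresponds to the loss in Eq. 3.
        opt = torch.optim.Adam(critic.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(critic(inputs), targets)
            loss.backward()
            opt.step()
        return critic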
Actor-Network Training: The actor-network is trained after the critic-network has been trained and its hyperparameters are fixed. The training of the actor-network corresponds to a search in the design space for better designs. We define a Figure of Merit (FoM) function, g(·), based on the performance vector to objectively quantify how good a design is with respect to others:

g[f(x)] = w_0 f_0(x) + \sum_{i=1}^{m} \min\big(1, \max(0, w_i f_i(x))\big)        (4)

where w_i is the weighting factor. Note that the max(·) clipping equates designs once a constraint is met, and the min(·) clipping is used for practical purposes to prevent a single constraint violation from dominating the g(·) value. We train the actor-network parameters by using the g(·) function and replacing the SPICE simulation values f(·) with the critic-network predictions Q(x, Δx). We further use a population of "elite" solutions (es) of size N_es to restrict the search space for the actor network. The population of elite solutions is a subset of the total population determined by the FoM ranking.
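A small NumPy sketch (ours) of the FoM in Eq. 4 and the FoM-ranked elite population; the weight vector w is a user-chosen placeholder:

    import numpy as np

    def fom(specs, w):
        # specs: (m+1,) vector [f_0(x), f_1(x), ..., f_m(x)]; w: (m+1,) weights.
        # Eq. 4: weighted objective plus clipped constraint penalties.
        penalty = np.sum(np.minimum(1.0, np.maximum(0.0, w[1:] * specs[1:])))
        return w[0] * specs[0] + penalty

    def elite_population(X, F, w, n_es):
        # Keep the n_es designs with the smallest (best) FoM.
        scores = np.array([fom(f_row, w) for f_row in F])
        idx = np.argsort(scores)[:n_es]
        return X[idx], F[idx]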
For a batch size of N_b samples, the following loss function is used to train the actor network:

L(\theta^\mu) = \frac{1}{N_b} \sum_{k=1}^{N_b} \Big( g\big[Q(x_k, \mu(x_k \mid \theta^\mu))\big] + \lVert \lambda * viol_k \rVert^2 \Big)        (5)

where µ(x_k | θ^µ) is the parameter change vector Δx_k proposed by the actor network, and (λ ∗ viol_k) is an element-wise vector multiplication in which the weighting coefficient λ is chosen to be very large to prevent any boundary violation and keep the search in the restricted search region. The total boundary violation viol_k for action k is defined as follows:

viol_k = \max\big(0, lb^{rest} - (x_k + \Delta x_k)\big) + \max\big(0, (x_k + \Delta x_k) - ub^{rest}\big)        (6)

where lb^{rest} and ub^{rest} are the restriction boundary vectors for the design variables, determined by the population of elite solutions as

lb_i^{rest} = \min(x_i), \quad ub_i^{rest} = \max(x_i), \quad \forall i = 1, \ldots, d

where x_i is the column vector of size N_es consisting of the ith parameter of all designs in the elite population.
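A hedged PyTorch sketch of the actor update in Eqs. 5-6, reusing a trained critic as a frozen surrogate. The Tanh output layer, layer sizes, λ, and the restriction bounds passed in as tensors are illustrative assumptions rather than the authors' exact implementation:

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        # Proposes a parameter change delta_x = mu(x) for a design x in R^d.
        def __init__(self, d, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(d, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, d), nn.Tanh(),
            )

        def forward(self, x):
            return self.net(x)

    def train_actor(actor, critic, X, w, lam, lb_rest, ub_rest, epochs=200, lr=1e-3):
        # X: (Nb, d) batch of designs; critic is used only as a frozen surrogate
        # (its weights are not updated here). Implements Eq. 5 with the boundary
        # violation of Eq. 6 against the elite-population box [lb_rest, ub_rest].
        opt = torch.optim.Adam(actor.parameters(), lr=lr)
        for _ in range(epochs):
            dx = actor(X)
            q = critic(torch.cat([X, dx], dim=1))          # (Nb, m+1) predicted specs
            # Differentiable FoM (Eq. 4) evaluated on critic predictions.
            g = w[0] * q[:, 0] + torch.clamp(w[1:] * q[:, 1:], 0.0, 1.0).sum(dim=1)
            x_new = X + dx
            viol = torch.relu(lb_rest - x_new) + torch.relu(x_new - ub_rest)
            loss = (g + (lam * viol).pow(2).sum(dim=1)).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return actor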
The hyperparameters (number of layers, number of nodes, learning rate, etc.) of the actor and critic network architectures were found through empirical studies.

C. Sensitivity Analysis

We use sensitivity analysis to prune the design search space so that an optimized solution can be found efficiently. A blind search space exploration may lead to wasted circuit simulations during optimization. For example, in a classical seven-transistor Operational Amplifier (OpAmp) [4], power dissipation does not depend on the differential pair devices once they are in saturation. Thus, if we want to size a circuit for lower power, we should not make the device properties of the differential pair variables. To use sensitivity analysis in practice for any generic circuit, we first traverse the circuit hierarchy and collect all unique device design variables, d. Then, we perform sensitivity analysis by perturbing each design variable around its nominal value and observing its impact on the objective and constraints, f_i. More formally, we compute the sensitivity S_ij as

S_{ij} = \frac{\delta f_i}{\delta d_j}, \quad \forall i = 0, \ldots, m; \; j = 1, \ldots, d.        (7)

We only need to consider design variables for which S_ij > thresh, where thresh is a user-defined number. Empirically, this analysis prunes the design search space effectively, allowing us to work on large-scale circuits.
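A NumPy sketch (ours) of the finite-difference sensitivity screen of Eq. 7; simulate() is a hypothetical stand-in for the circuit testbench returning the m+1 performances:

    import numpy as np

    def sensitivity_screen(simulate, x_nominal, rel_step=0.05, thresh=1e-3):
        # Perturb each design variable around its nominal value and measure the
        # change in every objective/constraint (Eq. 7). Keep variables whose
        # sensitivity magnitude exceeds thresh for at least one performance.
        f_nom = simulate(x_nominal)                       # (m+1,) performances
        keep = []
        for j in range(len(x_nominal)):
            x_pert = x_nominal.copy()
            step = rel_step * max(abs(x_nominal[j]), 1e-9)
            x_pert[j] += step
            S_j = (simulate(x_pert) - f_nom) / step       # jth column of S
            if np.any(np.abs(S_j) > thresh):
                keep.append(j)
        return keep                                       # indices of critical variables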
We are now ready to present the overall framework of DNN-Opt in the next subsection.

D. DNN-Opt: Overall Framework

The overall framework of DNN-Opt is provided in Algorithm 1. As a prerequisite, we apply sensitivity analysis for a large design and reduce the number of design variables to a workable range. We then randomly sample N_init points from the design search space to build the initial population. For optimization iteration t, the first step is to initialize the actor-critic parameters, followed by pseudo-sample generation. Next, the actor-network and critic-network are trained. After this, an elite population is constructed based on the FoM of the total population (this elite population is updated over the optimization iterations). The next query point is generated from the elite population, X_es, using the pre-trained actor-critic as follows. We use every design x_i^es in the pool of the elite population as input to the actor-network. The output of the actor-network, Δx_i^es = µ(x_i^es), is the proposed change in design parameters in search of an optimal solution. With the imposed exploration noise N, a candidate design point is naturally formed as x_i^ca = x_i^es + µ(x_i^es) + N. At this step, we have exactly the same number of proposed candidates, X^ca = [x_1^ca, ..., x_{N_es}^ca], as the size of the elite population. Once the population pairs X^es and X^ca are formed, the next sample point for iteration t is selected using Eq. 8.

x_t^{sample} = x_k^{ca} \quad \text{for} \quad k = \arg\min_i \; g\big[Q(x_i^{es}, x_i^{ca} - x_i^{es})\big]        (8)
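A NumPy sketch (ours) of the candidate generation and the Eq. 8 selection rule; actor and critic here are assumed to be callables with the interfaces sketched earlier (hypothetical signatures), and sigma sets the exploration-noise scale:

    import numpy as np

    def propose_next_sample(actor, critic, X_es, w, sigma=0.01, rng=None):
        # X_es: (N_es, d) elite designs. Each elite design is pushed through the
        # actor, exploration noise is added, and the candidate with the smallest
        # predicted FoM is returned as the next SPICE query point (Eq. 8).
        rng = np.random.default_rng() if rng is None else rng
        dx = actor(X_es)                                   # (N_es, d) proposed changes
        X_ca = X_es + dx + sigma * rng.standard_normal(X_es.shape)
        q = critic(np.hstack([X_es, X_ca - X_es]))         # (N_es, m+1) predicted specs
        g = w[0] * q[:, 0] + np.clip(w[1:] * q[:, 1:], 0.0, 1.0).sum(axis=1)
        return X_ca[np.argmin(g)]                          # next design to simulate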
Algorithm 1 DNN-Opt Algorithm
Require: Dimensionality reduction with sensitivity analysis if the design is large
Require: An initial sample set X_init of N_init designs and their evaluations f(X_init)
 1: Define total population X_tot = X_init
 2: for t = 1, 2, ..., t_max do
 3:     Initialize actor & critic network parameters θ^µ and θ^Q
 4:     Generate pseudo-samples using existing designs X_tot → Eqn. 2
 5:     Train critic-network → Eqn. 3
 6:     Train actor-network → Eqn. 5
 7:     Calculate FoM for each design by FoM = g[f(X_tot)]
 8:     Choose the N_es designs with the smallest FoM to form the population of elite solutions X_es
 9:     Find the query point (next sample) x_t^sample using the actor model → Eqn. 8
10:     Simulate the query point and obtain specs f(x_t^sample) via SPICE simulations
11:     if the return condition holds (e.g., specs are met) then
12:         break
13:     end if
14:     X_tot.append(x_t^sample)
15:     Go back to line 3
16: end for
17: return The design with the best FoM

[Fig. 2. Schematic of the folded-cascode OTA (Gain Stage, Common-mode Feedback, Bias)]

III. EXPERIMENTAL RESULTS

To demonstrate the reliability and efficiency of DNN-Opt, we apply it to two sets of experiments using six circuit examples. The first experiment set is on small building blocks where every transistor is parameterized and sized, and the second experiment set includes larger industrial circuits with thousands of nodes and devices.

A. Experiments with Small Building Blocks

We tested DNN-Opt on two small building blocks: a folded-cascode amplifier and a strong-arm latch comparator. We included the majority of the circuit performances in the constraint list to mimic real-world design experience. Both designs are implemented in 180nm CMOS technology.

We compare our algorithm with three other well-known methods: a) a Differential Evolution (DE) method, which is a conventional population-based model-free algorithm, b) Bayesian Optimization with weighted Expected Improvement (BO-wEI) [11], which is a modified version of Bayesian Optimization for constrained problems, and c) the GASPAD method [9], a surrogate model (GP) assisted evolutionary framework. To account for the randomized techniques involved in all these methods, we repeat each experiment ten times and report each method's findings. We determine the simulation budgets for our experiments by considering the convergence nature of the methods. DE has a simulation budget of 10000, while BO-wEI, GASPAD, and DNN-Opt are limited to 500 simulations. All the experiments are run on a workstation with an Intel Xeon CPU and 128GB RAM, using a commercial SPICE simulator. We used several metrics to compare the algorithms. We provide statistics of the methods for each example, and we denote the number of times a feasible solution is found by the success rate. We also share the evolution of the FoM value, calculated based on Eq. 4, to demonstrate each algorithm's convergence during runtime. The constraint expressions given in Eq. 9 and Eq. 10 can be trivially readjusted to fit the form of Eq. 1.

Folded Cascode OTA: The first test case is a two-stage folded-cascode Operational Transconductance Amplifier (OTA) (Figure 2). It has 20 design variables, and the designer-provided search ranges are shown in Table I.

TABLE I
DESIGN PARAMETERS AND RANGES FOR THE FOLDED-CASCODE OTA
Parameter Name             Unit     LB    UB
L1-L2-L3-L4-L5-L6-L7       µm       0.18  2
W1-W2-W3-W4-W5-W6-W7       µm       0.24  150
N1-N2-N8-N9                integer  1     20
MCAP                       fF       100   2000
Cf                         fF       100   10000
W: device width; L: device length; UB: upper bound; LB: lower bound

The sizing problem is defined as follows:

minimize Power
s.t.  DC Gain > 60 dB          Settling Time < 30 ns
      CMRR > 80 dB             Saturation Margin > 50 mV
      PSRR > 80 dB             Unity Gain Freq. > 30 MHz        (9)
      Out. Swing > 2.4 V       Out. Noise < 30 mVrms
      Static error < 0.1       Phase Margin > 60 deg.

In our experiment, the following transistors are required to operate in the saturation region: M1, M3, M4, M7, M9, M10, M12, M13, and [M15-M26]. The total number of design constraints thus becomes 29.

The statistical results for all the reference algorithms are shown in Table II. DNN-Opt shows high reliability and finds a feasible solution in all of its trials. In contrast, the other model-based methods, BO-wEI and GASPAD, fail to achieve similar behavior. DE can also find feasible results, but DNN-Opt is 24x more efficient in the number of simulations required to find the first feasible result. Table II also shows that, on average, the final design proposed by DNN-Opt draws up to 43% less power. The modeling time required by DNN-Opt is up to 50x smaller than that of the other model-based methods. This results in 2.5–16x better total runtime.
[Fig. 3. The average FoM (lower is better) curve for 500 simulations: Folded Cascode Amplifier (DNN-Opt, GASPAD, BO-wEI, DE; x-axis: number of simulations, y-axis: FoM)]
[Fig. 4. The average FoM (lower is better) curve for 500 simulations: Strong-Arm Latch Comparator (DNN-Opt, GASPAD, BO-wEI, DE; x-axis: number of simulations, y-axis: FoM)]

TABLE II
STATISTICS FOR DIFFERENT ALGORITHMS: FOLDED CASCODE OTA
Algorithm            DE     BO-wEI  GASPAD  DNN-Opt
success rate         10/10  2/10    4/10    10/10
# of simulations     3200   >500    >500    132
Min power (mW)       0.75   0.91    0.72    0.62
Max power (mW)       1.53   1.62    1.75    0.77
Mean power (mW)      1.14   1.25    0.96    0.71
Modeling time (h)    NA     30      6.5     0.6
Simulation time (h)  54     2.7     2.7     2.7
Total runtime (h)    54     32.7    8.2     3.3

Figure 3 shows the FoM curve over iterations, where DNN-Opt exhibits strong convergence behavior and outperforms the other methods. DNN-Opt finds a feasible solution within 205 iterations (marked with a vertical dashed line) across all of its ten trials. Although it is slow, GASPAD shows convergence toward the optimal FoM, but we observed that BO-wEI is often trapped in local optima.

Strong-Arm Latch Comparator: The second test case is the SA-Latch Comparator, shown in Figure 5. It has 13 design variables, and their names and bounds are shown in Table III.

[Fig. 5. Schematic of SA-Latch Comparator]

TABLE III
DESIGN PARAMETERS AND THEIR RANGES FOR SA-LATCH COMPARATOR
Parameter Name     Unit     LB    UB
L1-L2-L3-L4-L5-L6  µm       0.18  10
W1-W2-W3-W4-W5-W6  µm       0.22  50
CL finger          integer  10    300

The constrained optimization problem consists of 10 constraints in total:

minimize Power
s.t.  Set Delay < 10 ns
      Reset Delay < 6.5 ns
      Area < 26 µm²
      Input-referred Noise < 50 µVrms
      Differential Reset Voltage < 1 µV                        (10)
      Differential Set Voltage > 1.195 V
      Positive-Integration Node Reset Voltage < 60 µV
      Negative-Integration Node Reset Voltage < 60 µV
      Positive-Output Node Reset Voltage < 0.35 µV
      Negative-Output Node Reset Voltage < 0.35 µV

The statistical results for all the reference algorithms are shown in Table IV. Due to the relatively tighter constraints of the SA-Latch Comparator, the methods typically needed a larger number of simulations to converge. DNN-Opt is the only method that finds a feasible solution in all trials, and our method shows more than 30x efficiency compared to DE. GASPAD shows relatively competitive results, but DNN-Opt finds a solution with 25% better power consumption than the successful runs of GASPAD. The runtime observations are similar to the folded-cascode case.

FoM curves are shown in Figure 4 for the different methods. DNN-Opt finds a feasible solution within 348 simulations, which is much earlier than the others. BO-wEI shows a similar convergence trend in the initial iterations but then fails to model one of the constraints properly. Our observations showed that all the runs with the BO-wEI method were unable to meet input-referred noise, and some failed on set delay.

TABLE IV
SA LATCH COMPARATOR RESULTS
Algorithm            DE      BO-wEI  GASPAD  DNN-Opt
success rate         5/10    0/10    6/10    10/10
# of simulations     >10000  >500    >500    330
Min power (µW)       2.98    NA      3.05    2.50
Max power (µW)       4.22    NA      3.75    2.75
Mean power (µW)      3.57    NA      3.45    2.65
Modeling time (h)    NA      17      3       0.3
Simulation time (h)  72      3.6     3.6     3.6
Total runtime (h)    72      20.6    6.6     3.9
B. Experiments with Industrial Scale Circuits

We tested DNN-Opt on four industrial circuits designed at a very advanced technology node. These circuits were already in the process of manual sizing by expert analog designers and needed some fine-tuning. For these industrial circuits, we did not have access to the other algorithms (DE, GASPAD, BO-wEI), and hence our baseline is a commercial black-box optimizer based on Simulated Annealing. As will be demonstrated in this section, DNN-Opt performs well on large circuits and is not limited to small examples. Analog designers assisted in selecting permissible parameter ranges for the devices, considering layout impacts and process rules. For the industrial cases, we identify critical devices based on Eq. 7 for the failing constraints (the f_i's of Eq. 1). Note that MLParest [18] was used in the loop of DNN-Opt, which helps the analog designer estimate post-layout effects early in the design.
Inverter Chain: The first case is a simple inverter chain used mainly for tool development and flow testing. We used all the devices (8) in the four-stage inverter chain. There were only two specs, delay and power.

Level Shifter: Sensitivity analysis identified ten critical devices impacting the failing performances, leading to a design space of 3.9 × 10^15. There were 60 total specs such as delay, rise, fall, power, current, etc.

Low-Dropout (LDO) Regulator: We used sensitivity analysis to identify six critical devices, leading to a search space of 1.6 × 10^13. The circuit had PSRR, Gain Margin, Phase Margin, DC Gain, GBW, etc., as part of nine constraints. The number of devices is high due to the arrayed instances used by the analog engineer.

Continuous-Time Linear Equalizer (CTLE): Sensitivity analysis identified eight critical devices impacting the failing performances. With the design parameters and ranges identified by analog designers, we had a design space of 3.3 × 10^25. There were a total of 14 constraints such as DC Gain, offset, Nyquist Gain, Fpeak, Peaking Max, Power, etc.

TABLE V
DNN-OPT RESULTS ON INDUSTRIAL CIRCUITS
Circuit         MOS   Nodes  Simulated Annealing (SA)  DNN-Opt
Inverter Chain  8     7      >1000                     90
Level Shifter   1.2k  3.9k   1200                      195
LDO             167k  2.8k   552                       112
CTLE            173k  63k    587                       150
Number of SPICE simulations required to meet the constraints is shown in the SA and DNN-Opt columns (lower is better).
As illustrated in Table V, DNN-Opt outperforms the commercial optimizer available in the industry by 5x in terms of the number of simulations required to meet the constraints. We would like to emphasize that we can handle the fairly complex CTLE circuit using a 4x smaller number of costly SPICE simulations. Additionally, the optimal solution proposed by DNN-Opt consumed 8% less power than simulated annealing. Our examples represent real use cases where designers had already spent several days' worth of human time fixing constraints. Had we started with designs without any knowledge of human designers baked in, we would have seen even greater gains in sample efficiency, as in Section III-A.

IV. CONCLUSION

In this work, we presented DNN-Opt, a novel sample-efficient black-box optimization algorithm that combines the strengths of deep neural networks and the reinforcement learning paradigm. We also give a recipe to extend our work to large circuits with thousands of devices. Our algorithm's effectiveness has been successfully demonstrated on various circuit building blocks and large industrial circuits, leading to 5–30x sample efficiency while finding feasible solutions for all circuit sizing tasks and showing superior convergence curves compared to other methods.

ACKNOWLEDGEMENT

This work is supported in part by NSF under Grant No. 1704758.

REFERENCES

[1] N. Horta, "Analogue and mixed-signal systems topologies exploration using symbolic methods," Analog Integr. Circuits Signal Process., 2002.
[2] N. Jangkrajarng, S. Bhattacharya, R. Hartono, and C.-J. Shi, "IPRAIL—intellectual property reuse-based analog IC layout automation," Integration, 2003, Analog and Mixed-signal IC Design and Design Methodologies.
[3] W. Daems, G. Gielen, and W. Sansen, "Simulation-based generation of posynomial performance models for the sizing of analog integrated circuits," IEEE TCAD, 2003.
[4] M. d. Hershenson, S. P. Boyd, and T. H. Lee, "Optimal design of a CMOS op-amp via geometric programming," IEEE TCAD, 2001.
[5] Y. Wang, M. Orshansky, and C. Caramanis, "Enabling efficient analog synthesis by coupling sparse regression and polynomial optimization," in DAC, 2014.
[6] R. Acar Vural and T. Yildirim, "Analog circuit sizing via swarm intelligence," AEU - International Journal of Electronics and Communications, 2012.
[7] B. Liu, G. Gielen, and F. V. Fernandez, Automated Design of Analog and High-frequency Circuits: A Computational Intelligence Approach. Springer, 2013.
[8] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, 2005.
[9] B. Liu, D. Zhao, P. Reynaert, and G. G. E. Gielen, "GASPAD: A general and efficient mm-wave integrated circuit synthesis method based on surrogate model assisted evolutionary algorithm," IEEE TCAD, Feb. 2014.
[10] J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," in NIPS, 2012.
[11] W. Lyu, F. Yang, C. Yan, D. Zhou, and X. Zeng, "Multi-objective Bayesian optimization for analog/RF circuit synthesis," in DAC, 2018.
[12] H. Wang, K. Wang, J. Yang, L. Shen, N. Sun, H. Lee, and S. Han, "GCN-RL circuit designer: Transferable transistor sizing with graph neural networks and reinforcement learning," DAC, 2020.
[13] K. Settaluri, A. Haj-Ali, Q. Huang, K. Hakhamaneshi, and B. Nikolić, "AutoCkt: Deep reinforcement learning of analog circuit designs," DATE, 2020.
[14] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," in ICLR, 2016.
[15] V. Konda and J. Tsitsiklis, "Actor-critic algorithms," in SIAM Journal on Control and Optimization. MIT Press, 2000.
[16] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: A Bradford Book, 2018.
[17] R. Turner and D. Eriksson, "Bayesmark," https://github.com/uber/bayesmark, 2020.
[18] B. Shook, P. Bhansali, C. Kashyap, C. Amin, and S. Joshi, "MLParest: Machine learning based parasitic estimation for custom circuit design," in DAC, 2020.
