Local Learning and Search in Memetic Algorithms
Abstract— The use of local search in evolutionary techniques is believed to enhance the performance of the algorithms, giving rise to memetic or hybrid algorithms. However, in many continuous optimization problems the additional cost required by local search may be prohibitive. We therefore propose the local learning of the objective and constraint functions prior to the local search phase of memetic algorithms, based on the samples gathered by the population through the evolutionary process. The local search operator is then applied over this approximated model. We perform experiments combining our approach with a real-coded genetic algorithm. The results demonstrate the benefit of the proposed methodology for costly black-box functions.

F.G. Guimarães, E.F. Wanner, and J.A. Ramírez are with the Department of Electrical Engineering, Federal University of Minas Gerais, Av. Antônio Carlos, 6627, Belo Horizonte, MG, 31270-010, Brazil (e-mail: fgg@[Link], elizabeth@[Link] and jramirez@[Link]).
R.H.C. Takahashi is with the Department of Mathematics, Federal University of Minas Gerais, Av. Antônio Carlos, 6627, Belo Horizonte, MG, Brazil (e-mail: taka@[Link]).
D.A. Lowther is with the Department of Electrical and Computer Engineering, McGill University, Montreal, Canada (e-mail: [Link]@[Link]).
F. Campelo and H. Igarashi are with the Research Group of Informatics for System Synthesis, Graduate School of Information Science and Technology, Hokkaido University, Sapporo 060-0814, Japan (fax: +81 11-706-7670; e-mail: pinto@[Link]).

I. INTRODUCTION

Nowadays, the combination of local search operators with evolutionary techniques is argued to greatly improve the performance of the basic evolutionary technique, by coupling the global search ability of these methods with the advantages provided by local search techniques. This class of hybrid methods is known as memetic algorithms (MAs) [1], [2], [3]. The main idea of memetic evolution is that a given individual in the population may be improved through individual evolution.

Initially, MAs were developed for combinatorial optimization problems [4], [5], [6], exploring local search operators highly specific to the problem being solved. In this context, MAs have outperformed the basic evolutionary techniques in many applications. Nonetheless, it did not take long for works dealing with continuous search spaces to appear in the literature [7], [8], [9]. However, the local search phase consumes a great number of function evaluations. For optimization problems in which the objective function evaluation is fast, this characteristic is not critical; in general, combinatorial optimization problems and some continuous optimization problems fall into this class. Conversely, there are many real-world problems, particularly engineering problems, whose objective function demands much time to evaluate, i.e., some seconds to minutes. Optimization problems associated with computer-aided design (CAD), in which the designer needs to model electromagnetic, thermal or fluid phenomena that lead to the implicit solution of differential or integral equations, often fall into this class [10]. Multiplying the time of a single evaluation by the thousands of evaluations required for a single run of an evolutionary algorithm, we get a computationally expensive optimization process. In this context, the use of local search operators elevates the computational cost in such a way that it makes the employment of MAs prohibitive.

We provide an alternative methodology for using MAs with continuous and costly optimization problems. It is based on the employment of local approximations before the local search phase. When an individual is selected for local search, it "builds" a local model of the function behavior. The local search operator then uses the estimates provided by the local model to enhance the individual. The local model is generated by learning the input-output mapping performed by the black-box function, based on the current and past samples gathered during the evolutionary process. Evolutionary algorithms can be viewed as adaptive sampling techniques, in the sense that they sample the search space while seeking the optimal solution. This adaptive sampling process is guided by the heuristic operators of the algorithm, which direct the search to the most promising regions. Since we are dealing with an expensive-to-evaluate function, each sample is very valuable. We may store all samples in an input-output data set, which represents all the knowledge acquired by the algorithm about the problem. Points from this data set in the vicinity of the individual may be used to fit a given parameterized model, for instance a neural network model, that provides a local approximation to the function.

The use of approximations to deal with costly functions is not new. A traditional approach is to sample the functions, generally in a random manner, prior to the optimization [11], [12], [13]. A global approximation is built and the optimization is performed over this global model. The global model can be static or dynamic, in the sense that a dynamic model is further refined during the optimization process. However, this approach is very limited: for more complex functions and for higher dimensions, it is necessary to use many samples to produce a good approximation. Additionally, the complexity of the model must increase in order to capture the global behavior of the function.
2937
vicinity of x^(i). The local search operator aims at obtaining a solution x ∈ V(x^(i), ε) such that x is a local optimal solution in the local vicinity of x^(i):

Definition 1 (Local optimal solution): A solution x* ∈ F is said to be a local optimal solution in the vicinity of the point x^(i) if there is no x ∈ V(x^(i), ε) such that f(x) < f(x*).

Nonetheless, MAs do not necessarily need to employ local searchers that achieve this goal, since even a mere local improvement in the solution may be helpful. Following this definition, the local search problem within an MA can be posed in the standard form:

Algorithm 3 (Standard local search operator):
1) Given x^(i), ε, f, g, and h;
2) Find an x such that it is better than x^(i) in solving:
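The displayed minimization problem that completed this step appears to have been lost at the page break. Consistent with Definition 1 and with the approximate subproblem (12) stated later in Section B, it is presumably the restriction of the original problem to the vicinity:

```latex
\min_{x} \; f(x) \quad \text{subject to:} \quad x \in \mathcal{V}(x^{(i)}, \varepsilon) \cap \mathcal{F}
```

where F is the feasible set defined by the constraints g and h.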
the algorithm. In an unconstrained problem S contains only the input points and their objective function evaluations. Suppose that the individual p^(j) ∈ P(t), j ∈ {1, ..., µ}, was selected for local search. We select from the current data set S all neighbours of p^(j), i.e., those that belong to V(x^(j), ε), as data points for generating the local approximation. Using these neighbour points, the individual makes a local data set named S_V^j, which contains all points from S that belong to V(x^(j), ε). The number of data points (neighbour points) in S_V^j is given by N_j. Then, we adjust a parameterized model using the local data set.
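As a concrete sketch of this step (function and variable names here are our own, not from the paper), the local data set S_V^j can be gathered from the archive S and used to fit the multiquadric RBF model that the paper adopts, by solving the augmented linear system described in the following paragraphs; for simplicity a single shape parameter h is assumed for all centers:

```python
import numpy as np

def gather_neighbours(X, y, x_j, eps):
    """Select from the archive (X, y) the points inside the vicinity V(x_j, eps)."""
    mask = np.linalg.norm(X - x_j, axis=1) <= eps
    return X[mask], y[mask]

def fit_multiquadric_rbf(centers, f_vals, h=1.0):
    """Fit psi(x) = sum_i w_i * sqrt(||x - c_i||^2 + h^2) + beta by solving
    the augmented system [[Psi, 1], [1^T, 0]] [w; beta] = [f; 0]."""
    N = len(centers)
    diff = centers[:, None, :] - centers[None, :, :]   # pairwise differences
    Psi = np.sqrt(np.sum(diff**2, axis=2) + h**2)      # multiquadric matrix
    A = np.block([[Psi, np.ones((N, 1))],
                  [np.ones((1, N)), np.zeros((1, 1))]])
    b = np.concatenate([f_vals, [0.0]])
    sol = np.linalg.solve(A, b)
    w, beta = sol[:N], sol[N]

    def psi(x):
        """Evaluate the surrogate at a single point x."""
        r = np.sqrt(np.sum((x - centers)**2, axis=1) + h**2)
        return float(r @ w + beta)
    return psi
```

By construction, the fitted surrogate interpolates the archived objective values exactly at the neighbour points, which is what makes it usable as a stand-in for the costly function during local search.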
Many approximation techniques are available for designing a memetic algorithm with local approximate models; see [14] for an overview of the use of approximation techniques in evolutionary methods. In our experiments, we use the radial basis function (RBF) network [19], which is capable of generating a nonlinear input-output mapping and of capturing the local multimodality of the function. The RBF network is a parameterized model which can be adjusted to the data using either the supervised learning paradigm or the direct solution of a linear system of equations.

The mapping performed by the RBF network is given by:

$$\psi(x) = \sum_{i=1}^{m} w_i \theta_i(x) + \beta \qquad (8)$$

where we have m neurons in the hidden layer, w_i represents the synaptic connection weight, β is a bias term, and θ_i is the output of the i-th nonlinear neuron in the network. We adopt the multiquadric function as the radial basis function, which is given by:

$$\theta_i(x) = \sqrt{\|x - c_i\|^2 + h_i^2} \qquad (9)$$

where c_i is the center of the multiquadric function and h_i is the shape parameter.

Making m = N_j and setting each center equal to one of the data points in S_V^j, it is possible to obtain an approximation by solving:

$$\begin{bmatrix} \Psi & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} w \\ \beta \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix} \qquad (10)$$

where:

$$\Psi_{ij} = \sqrt{\|x_i - x_j\|^2 + h_i^2} \qquad (11)$$

The total number of points in the vicinity should be greater than 0.5(n+1)(n+2), which is the minimum number required to fit an n-dimensional quadratic function. If this condition is not met, the local approximation is not generated and no local search is made. When it is met, we train the RBF model using S_V^j as training data.

For constrained problems, the procedure described here is repeated for each nonlinear constraint function; that is, we obtain an approximation for each function involved in the original optimization problem.

Fig. 2. General diagram of the evolution strategy.

B. Local optimization

The next step is to proceed to the optimization of the approximate problem, which is described as follows:

$$\min \tilde{f}(x) \quad \text{subject to:} \quad x \in \mathcal{V}(x^{(i)}, \varepsilon) \cap \tilde{\mathcal{F}} \qquad (12)$$

where f̃(·) is the approximation generated by the neural network, and F̃ is the approximated feasible set considering the approximations of the constraints:

$$\tilde{\mathcal{F}} = \tilde{\mathcal{G}} \cap \tilde{\mathcal{H}} \cap \mathcal{D} \qquad (13)$$

and:

$$\tilde{\mathcal{G}} = \{x \in \mathbb{R}^n : \tilde{g}_i(x) \leq 0, \; i = 1, \ldots, p\} \qquad (14)$$

$$\tilde{\mathcal{H}} = \{x \in \mathbb{R}^n : \tilde{h}_i(x) = 0, \; i = 1, \ldots, q\} \qquad (15)$$

In this paper, we employ the evolution strategy (ES) [18], [20] as the local optimizer. This algorithm is capable of dealing with multimodality, hence it can "jump" over local minima captured or modeled by the local approximation technique, which cannot be done by gradient-based search methods.

The evolution strategy works with a population of µ individuals, which generates an offspring population of λ individuals through the successive application of recombination and mutation operators. After that, the population of the next generation is formed by the µ best individuals, selected either from the offspring population only or from both the parent and offspring populations. These two selection schemes give rise to the nomenclature ES(µ, λ) and ES(µ + λ), respectively. A general diagram of the evolution strategy algorithm is depicted in Fig. 2, which shows its iterative cycle. The ES performs the local search over the local approximations and within the vicinity of the point selected for local search. An improved individual is returned by the local search and substitutes for the original individual. Finally, since the basic ES does not have any explicit mechanism for dealing with constraints, we deal with constrained problems by using a penalty function; for other ways to treat constraints in ES, see [21].

A. Unconstrained test problems

The test functions used in our experiments are described next:

1) Rosenbrock function:
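The displayed function definitions did not survive extraction here. For reference, the generalized Rosenbrock function commonly used in these benchmarks (and consistent with the n = 2 and n = 10 settings below) is f(x) = Σ_{i=1}^{n-1} [100(x_{i+1} − x_i²)² + (1 − x_i)²], which a sketch implements as:

```python
import numpy as np

def rosenbrock(x):
    """Generalized Rosenbrock function; global minimum f = 0 at x = (1, ..., 1)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2))
```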
TABLE I
SETTINGS FOR THE UNCONSTRAINED PROBLEMS

Dimension         n = 2    n = 10
population size   20       50
crossover rate    0.80     0.80
mutation rate     0.05     0.05
max generations   100      300

TABLE II
SETTINGS FOR THE CONSTRAINED PROBLEM

Dimension         n = 4
population size   50
crossover rate    0.80
mutation rate     0.05
max generations   200

Fig. 3. Mean values of the convergence curve with respect to the number of function calls for the Rosenbrock function with n = 2.
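The run settings above can be captured in a small configuration structure (the type and field names here are our own, not from the paper's code; the values are taken from Tables I and II):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GASettings:
    """Run settings for the (H-)RGA, per Tables I and II."""
    dimension: int
    population_size: int
    crossover_rate: float
    mutation_rate: float
    max_generations: int

# Table I: unconstrained problems, keyed by dimension.
UNCONSTRAINED = {
    2:  GASettings(2, 20, 0.80, 0.05, 100),
    10: GASettings(10, 50, 0.80, 0.05, 300),
}
# Table II: constrained problem.
CONSTRAINED = GASettings(4, 50, 0.80, 0.05, 200)
```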
C. Results

Both RGA and H-RGA are set up according to Table I for

Figs. 6-8 show the same graphs for the unconstrained
Fig. 6. Mean values of the convergence curve with respect to the number of function calls for the Rosenbrock function with n = 10.

Fig. 7. Mean values of the convergence curve with respect to the number of function calls for the Schwefel function with n = 10.

Fig. 8. Mean values of the convergence curve with respect to the number of function calls for the Griewank function with n = 10.

Fig. 9. Mean values of the convergence curve at some generation values for the constrained problem.

Fig. 10. SMES configuration, showing the two coils, the 21 control points for evaluating the strayed magnetic field and the 2D planar version of the problem.

axisymmetric nature of the problem, it is possible to reduce the geometry to a 2D problem, hence saving computation. The objective of the problem is to find the optimal values for the radius (x1), the height (x2) and the width (x3) of the coil in its 2D section, which minimize the stray magnetic field evaluated at the 21 control points shown in Fig. 10. Additionally, the optimal configuration must provide 180 MJ (30 kWh) of stored energy and must not violate the

subject to:

$$g(x) = B_{max} - 4.92\,\mathrm{T} \leq 0$$

$$h(x) = \frac{|E - 180\,\mathrm{MJ}|}{180\,\mathrm{MJ}}$$

where r_i is one of the 21 control points, B_max is the maximum value of the magnitude of the flux density at the coils, and E is the stored energy for the given configuration. The evaluation of these objective and constraint functions involves the solution of an implicit numerical field analysis problem, which demands great computational effort.

For this problem, we adopt the same configuration used for the constrained problem, see Table II, except that the maximum number of generations is limited to 50, i.e., the
number of real function calls is limited to 2,500. The electromagnetic analyzer takes about 2.65 seconds to solve the numerical field problem. The time to evaluate the entire population within a generation is hence about 2 minutes and 12.5 seconds; thus the computational cost is dominated by the evaluation step.

Each algorithm is executed 5 times on this problem. The best solutions achieved by both RGA and H-RGA are shown in Table III. On average, a very good solution is achieved by H-RGA from the 20th generation on, while RGA achieves its solution only 23 generations later. Therefore, H-RGA saves 1,150 evaluations, which means approximately 50 minutes in function evaluations. Until the 20th generation, the local search phase was performed only 4 times, at generations 5, 10, 15, and 20. The time spent generating the local approximations and performing the local search over the approximated functions at these 4 times is insignificant compared to 50 minutes.

TABLE III
SOLUTIONS FOR THE ELECTROMAGNETIC PROBLEM

        strayed field   energy     B_max    x1     x2     x3
RGA     1.04 mT         180.0 MJ   3.86 T   3.30   1.06   0.16
H-RGA   1.05 mT         180.9 MJ   4.71 T   3.06   0.52   0.38

VI. CONCLUSIONS

This paper presented the use of local approximations in the context of local evolution within memetic algorithms for costly optimization problems. The local search operator is performed by the evolution strategy, restricted to the local vicinity of the individual selected for memetic evolution. The local evolution is performed over the approximations built for the objective function and all nonlinear constraints present in the optimization problem.

The usage of the evolution strategy as local optimizer has the benefit of jumping over local minima modeled by the local approximate models; thus the memetic evolution does not get stuck in local minima within the vicinity. The results demonstrate the benefit of the proposed local learning and search for memetic algorithms in unconstrained and constrained optimization problems.

The proposed methodology is very promising for tackling costly optimization problems, as demonstrated in the electromagnetic design problem investigated.

ACKNOWLEDGMENT

REFERENCES

[2] P. Merz and B. Freisleben, "Fitness landscapes and memetic algorithm design," in New Ideas in Optimization, D. Corne, M. Dorigo, and F. Glover, Eds. McGraw-Hill, 1999.
[3] N. Krasnogor, "Studies on the theory and design space of memetic algorithms," Ph.D. dissertation, Faculty of Computing, Engineering and Mathematical Sciences, University of the West of England, Bristol, UK, June 2002.
[4] P. Merz and B. Freisleben, "Genetic local search for the TSP: New results," in Proc. IEEE International Conference on Evolutionary Computation. IEEE Press, Apr. 13–16, 1997, pp. 159–164.
[5] ——, "A comparison of memetic algorithms, tabu search, and ant colonies for the quadratic assignment problem," in Proc. IEEE Congress on Evolutionary Computation, vol. 3. IEEE Press, July 6–9, 1999, pp. 2063–2070.
[6] H. Ishibuchi and T. Murata, "Local search procedures in a multi-objective genetic local search algorithm for scheduling problems," in Proc. IEEE International Conference on Systems, Man, and Cybernetics, vol. 1. IEEE Press, Oct. 12–15, 1999, pp. 665–670.
[7] X. Hu, Z. Huang, and Z. Wang, "Hybridization of the multi-objective evolutionary algorithms and the gradient-based algorithms," in Proc. IEEE Congress on Evolutionary Computation, vol. 2. IEEE Press, Dec. 8–12, 2003, pp. 870–877.
[8] S. Kimura and A. Konagaya, "High dimensional function optimization using a new genetic local search suitable for parallel computers," in Proc. IEEE International Conference on Systems, Man, and Cybernetics, vol. 1. IEEE Press, Oct. 5–8, 2003, pp. 335–342.
[9] Y. W. Ong and A. J. Keane, "Meta-Lamarckian learning in memetic algorithms," IEEE Trans. Evol. Comput., vol. 8, no. 2, pp. 99–110, Apr. 2004.
[10] G. Renner and A. Eckart, "Genetic algorithms in computer aided design," Computer Aided Design, vol. 35, pp. 709–726, 2003.
[11] T. Ebner, C. Magele, B. R. Brandstatter, and K. R. Richter, "Utilizing feedforward neural networks for acceleration of global optimization procedures," IEEE Trans. Magn., vol. 34, no. 5, pp. 2928–2931, Sept. 1998.
[12] A. Canova, G. Gruosso, and M. Repetto, "Magnetic design optimization and objective function approximation," IEEE Trans. Magn., vol. 39, no. 5, pp. 2154–2162, Sept. 2003.
[13] K. R. Davey, "Examination of various techniques for the acceleration of multivariable optimization problems," IEEE Trans. Magn., vol. 39, no. 3, pp. 1293–1296, May 2003.
[14] Y. Jin, "A comprehensive survey of fitness approximation in evolutionary computation," Soft Computing - A Fusion of Foundations, Methodologies and Applications, vol. 9, no. 1, pp. 3–12, Jan. 2005.
[15] R. G. Regis and C. A. Shoemaker, "Local function approximation in evolutionary algorithms for the optimization of costly functions," IEEE Trans. Evol. Comput., vol. 8, no. 5, pp. 490–505, Oct. 2004.
[16] R. Dawkins, "Memes: The new replicators," in The Selfish Gene. Oxford University Press, 1976, ch. 11.
[17] P. Moscato, "On evolution, search, optimization, genetic algorithms and martial arts: Towards memetic algorithms," Caltech Concurrent Computation Program, Tech. Rep. C3P 826, 1989.
[18] H. P. Schwefel, Evolution and Optimum Seeking. Wiley, 1995.
[19] J. Park and I. W. Sandberg, "Universal approximation using radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1991.
[20] T. Bäck, Evolutionary Algorithms in Theory and Practice. Oxford University Press, 1996.
[21] E. M. Montes and C. A. C. Coello, "A simple multimembered evolution strategy to solve constrained optimization problems," IEEE Trans. Evol. Comput., vol. 9, no. 1, pp. 1–17, 2005.
[22] K. Deb and R. B. Agrawal, "Simulated binary crossover for continuous search space," Complex Systems, vol. 9, pp. 115–148, 1995.
[23] B. Brandstaetter. (1996) TEAM workshop benchmark problem 22 definition. [Online]. Available: [Link]