
Genetic Programming

INTRODUCTION

Genetic programming (GP) is an evolutionary computation technique that automatically solves problems without requiring the user to know or specify the form or structure of the solution in advance. At the most abstract level GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Since its inception, GP has attracted the interest of myriads of people around the globe. Things continue to change rapidly in genetic programming as investigators and practitioners discover new methods and applications.

GENETIC PROGRAMMING IN A NUTSHELL

In genetic programming we evolve a population of computer programs. That is, generation by generation, GP stochastically transforms populations of programs into new, hopefully better, populations of programs. GP, like nature, is a random process, and it can never guarantee results. GP’s essential randomness, however, can lead it to escape traps which deterministic methods may be captured by. Like nature, GP has been very successful at evolving novel and unexpected ways of solving problems.

Figure: The basic control flow for genetic programming, where survival of the fittest is used to
find solutions.

The basic steps in a GP system are shown in the algorithm below:

1: Randomly create an initial population of programs from the available primitives.
2: repeat
3: Execute each program and ascertain its fitness.
4: Select one or two program(s) from the population with a probability based on fitness to participate in genetic operations.
5: Create new individual program(s) by applying genetic operations with specified probabilities.
6: until an acceptable solution is found or some other stopping condition is met (e.g., a maximum number of generations is reached).
7: return the best-so-far individual.

GP finds out how well a program works by running it, and then comparing its behaviour to some ideal. We
might be interested, for example, in how well a program predicts a time series or controls an
industrial process. This comparison is quantified to give a numeric value called fitness. Those
programs that do well are chosen to breed and produce new programs for the next generation.
The primary genetic operations that are used to create new programs from existing ones are:

• Crossover: The creation of a child program by combining randomly chosen parts from two
selected parent programs.

• Mutation: The creation of a new child program by randomly altering a randomly chosen part
of a selected parent program.
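
As a sketch (not a reference implementation) of how these pieces fit together, the basic generational loop might look like the following Python skeleton. The helper callables (random_program, evaluate_fitness, crossover, mutate, good_enough) are hypothetical placeholders for the problem-specific parts described in the rest of this report, and tournament selection is just one common way of selecting "with a probability based on fitness":

import random

def run_gp(random_program, evaluate_fitness, crossover, mutate,
           good_enough=None, pop_size=200, generations=50, p_crossover=0.9):
    """Skeleton of the basic GP loop (steps 1-7 above)."""
    # Step 1: randomly create an initial population of programs.
    population = [random_program() for _ in range(pop_size)]
    best = None

    for gen in range(generations):
        # Step 3: execute each program and ascertain its fitness (higher = fitter).
        scored = [(evaluate_fitness(p), p) for p in population]
        scored.sort(key=lambda fp: fp[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        # Step 6: stop early if an acceptable solution has been found.
        if good_enough is not None and good_enough(best[1]):
            break

        # Step 4: select parents with a probability based on fitness
        # (here: tournament selection among three randomly drawn individuals).
        def select():
            return max(random.sample(scored, 3), key=lambda fp: fp[0])[1]

        # Step 5: create the next generation by crossover and mutation.
        next_population = []
        while len(next_population) < pop_size:
            if random.random() < p_crossover:
                next_population.append(crossover(select(), select()))
            else:
                next_population.append(mutate(select()))
        population = next_population

    # Step 7: return the best-so-far individual.
    return best[1]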

WHY GENETIC PROGRAMMING?

Genetic programming is a machine learning technique which, its adherents would claim, is the most general and flexible available. It saves time by freeing the human from having to
design complex algorithms, and not merely from designing the algorithms but from creating ones that give
optimal solutions. It is also a form of artificial intelligence, and it has already been applied to a wide
variety of problem domains and may well have real-world utility.

HISTORY OF GP

The first results on the GP methodology were reported by Stephen F. Smith (1980) and
Nichael L. Cramer (1985). In 1981 Forsyth reported the evolution of small programs in forensic
science for the UK police. John R. Koza is a main proponent of GP and has
pioneered the application of genetic programming in various complex optimization and
search problems. GP is very computationally intensive and so in the 1990s it was mainly used to
solve relatively simple problems. But more recently, thanks to improvements in GP technology
and to the exponential growth in CPU power, GP produced many novel and outstanding results
in areas such as quantum computing, electronic design, game playing, sorting, searching and
many more. These results include the replication or development of several post-year-2000
inventions. GP has also been applied to evolvable hardware as well as computer programs. There
are several GP patents listed on the genetic-programming.com website (see References).
Developing a theory for GP has been very difficult and so in the 1990s GP was considered a sort
of outcast among search techniques. But after a series of breakthroughs
in the early 2000s, the theory of GP has had a formidable and rapid development. So
much so that it has been possible to build exact probabilistic models of GP (schema
theories and Markov chain models).

REPRESENTATION

In GP, programs are usually expressed as syntax trees rather than as lines of code. Figure below
shows the tree representation of the program max(x+x,x+3*y). The variables and constants in the
program (x, y and 3) are leaves of the tree. In GP they are called terminals, whilst the arithmetic
operations (+, * and max) are internal nodes called functions. The sets of allowed functions and
terminals together form the primitive set of a GP system.

GP syntax tree representing max(x+x,x+3*y).

In more advanced forms of GP, programs can be composed of multiple components (e.g., subroutines). In this case the representation used in GP is a set of trees (one for each component) grouped together under a special root node that acts as glue. We will call these (sub)trees branches. The number and type of the branches in a program, together with certain other features of their structure, form the architecture of the program.

It is common in the GP literature to represent expressions in a prefix notation similar to that used in Lisp or Scheme. For example, max(x+x,x+3*y) becomes (max (+ x x) (+ x (* 3 y))). This notation often makes it easier to see the relationship between (sub)expressions and their corresponding (sub)trees.

How one implements GP trees will obviously depend a great deal on the programming languages and libraries being used. Languages that provide automatic garbage collection and dynamic lists as fundamental data types make it easier to implement expression trees and the necessary GP operations. Most traditional languages used in AI research (e.g., Lisp and Prolog), many recent languages (e.g., Ruby and Python), and the languages associated with several scientific programming tools (e.g., MATLAB and Mathematica) have these facilities. In other languages, one may have to implement lists/trees or use libraries that provide such data structures.

In high-performance environments, the tree-based representation of programs may be too inefficient, since it requires the storage and management of numerous pointers. In some cases, it may be desirable to use GP primitives which accept a variable number of arguments (a quantity we call arity). An example is a sequencing instruction, which accepts any number of arguments, executes them one at a time and then returns the value returned by the last argument. However, it is now extremely common in GP applications for all functions to have a fixed number of arguments. If this is the case, then the brackets in prefix-notation expressions are redundant, and trees can efficiently be represented as simple linear sequences. In effect, the function’s name gives its arity, and from the arities the brackets can be inferred. For example, the expression (max (+ x x) (+ x (* 3 y))) could be written unambiguously as the sequence max + x x + x * 3 y.

The choice of whether to use such a linear representation or an explicit tree representation is typically guided by questions of convenience, efficiency, the genetic operations being used (some may be more easily or more efficiently implemented in one representation), and other data one may wish to collect during runs. (It is sometimes useful to attach additional information to nodes, which may be easier to implement if they are explicitly represented.)
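
As a concrete (and purely illustrative) sketch of this representation, the tree for max(x+x,x+3*y) can be encoded in Python as nested tuples in prefix form, evaluated recursively, and flattened into the bracket-free linear sequence mentioned above; the names and encoding are arbitrary choices, not those of any particular GP system:

import operator

# Primitive set: functions with their implementations and arities,
# plus terminals (the variables x, y and the constant 3).
FUNCTIONS = {'+': (operator.add, 2), '*': (operator.mul, 2), 'max': (max, 2)}

# (max (+ x x) (+ x (* 3 y))) written as nested tuples in prefix form.
tree = ('max', ('+', 'x', 'x'), ('+', 'x', ('*', 3, 'y')))

def evaluate(node, env):
    """Recursively evaluate a prefix-form expression tree."""
    if isinstance(node, tuple):                 # internal node: a function
        func, _arity = FUNCTIONS[node[0]]
        return func(*(evaluate(child, env) for child in node[1:]))
    if isinstance(node, str):                   # leaf: a variable (terminal)
        return env[node]
    return node                                 # leaf: a constant (terminal)

def to_linear(node):
    """Flatten a tree into the bracket-free linear sequence described above."""
    if isinstance(node, tuple):
        return [node[0]] + [s for child in node[1:] for s in to_linear(child)]
    return [node]

print(evaluate(tree, {'x': 2, 'y': 5}))   # max(2+2, 2+3*5) -> 17
print(to_linear(tree))                    # ['max', '+', 'x', 'x', '+', 'x', '*', 3, 'y']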
These tree representations are the most common in GP, e.g., numerous high-quality, freely
available GP implementations use them. Non-tree representations have been suggested and
successfully implemented, such as the simpler linear genetic programming, which suits the more
traditional imperative languages. The commercial GP software Discipulus uses automatic
induction of binary machine code (AIM) to achieve better performance. MicroGP uses a representation
similar to linear GP to generate programs that fully exploit the syntax of a given assembly
language.

GENETIC PRINCIPLES

The main operators used in evolutionary algorithms such as GP are crossover and mutation.

Mutation

In theory, the task of mutation in GP is the same as in all other branches of evolutionary computing: creating a new individual from an old one through some small random variation. The most common implementation works by replacing the subtree starting at a randomly selected node by a randomly generated tree. The newly created tree is usually generated the same way as the trees in the initial population.

Note that the size (tree depth) of the child can exceed that of the parent tree. The figure below illustrates how a parse tree is mutated by replacing one of its subtrees with a newly generated one.

Figure: GP mutation illustrated: the node designated by a circle in the tree on the left is selected for mutation. The subtree starting at that node is replaced by a randomly generated tree, which here happens to be a single leaf.

Mutation in GP has two parameters:

• The probability of choosing mutation at the junction with recombination,
• The probability of choosing an internal point within the parent as the root of the subtree to be replaced.
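
A minimal sketch of subtree mutation, using the nested-tuple encoding from the representation section. The primitive set, the depth limit and the 30% chance of cutting a branch short are arbitrary illustrative choices, and random_tree grows trees the same way trees could be grown for the initial population:

import random

FUNCTIONS = {'+': 2, '-': 2, '*': 2}     # function name -> arity
TERMINALS = ['x', 'y', 1.0]

def random_tree(max_depth):
    """Grow a random tree, as for the initial population."""
    if max_depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    name = random.choice(list(FUNCTIONS))
    return (name,) + tuple(random_tree(max_depth - 1) for _ in range(FUNCTIONS[name]))

def count_nodes(tree):
    return 1 + sum(count_nodes(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def replace_node(tree, index, new_subtree, counter=None):
    """Copy `tree`, replacing the node at depth-first position `index` by `new_subtree`."""
    counter = counter if counter is not None else [0]
    if counter[0] == index:
        counter[0] += 1
        return new_subtree
    counter[0] += 1
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(replace_node(c, index, new_subtree, counter)
                                  for c in tree[1:])
    return tree

def subtree_mutation(parent, max_depth=3):
    """Replace the subtree rooted at a randomly chosen node by a randomly generated tree."""
    mutation_point = random.randrange(count_nodes(parent))
    return replace_node(parent, mutation_point, random_tree(max_depth))

print(subtree_mutation(('+', 'x', ('*', 'y', 'x'))))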

It is remarkable that Koza's classic book on GP from 1992, cf. [5], advises setting the
mutation rate to 0, i.e., it suggests that GP works without mutation. More recently, Banzhaf et al.
suggested a rate of 5% [2]. In giving mutation such a limited role, GP differs from other EA streams. The
reason for this is the generally shared view that crossover has a large shuffling effect, acting in
some sense as a macromutation operator [1]. Current GP practice uses low, but positive,
mutation frequencies, even though some studies indicate that the common wisdom favoring an
(almost) pure crossover approach might be misleading.

GP Crossover

During biological sexual reproduction, the genetic material from both mother and father is combined in such a way that genes in the child are in approximately the same position as they were in its parents. This is quite different from traditional tree-based GP crossover, which can move a subtree to a totally different position in the tree structure.

Crossover operators that tend to preserve the position of genetic material are called homologous, and several notions of homologous crossover have been proposed for GP. It is fairly straightforward to realise homologous crossover when using linear representations, and homologous operators are widely used in linear GP. Various forms of homologous crossover have also been proposed for tree-based GP. The oldest homologous crossover in tree-based GP is one-point crossover. This works by selecting a common crossover point in the parent programs and then swapping the corresponding subtrees. To allow for the two parents having different shapes, one-point crossover analyses the two trees from the root nodes and restricts the choice of the crossover point to the common region, i.e., the part of the two trees where the parents have the same shape. The common region is related to homology, in the sense that it represents the result of a matching process between the parent trees. Within the common region, the transfer of homologous primitives can happen much as it does in a linear bit-string genetic algorithm.

Uniform crossover for trees works (in the common region) like uniform crossover in GAs. That is, the offspring are created by visiting the nodes in the common region and flipping a coin at each locus to decide whether the corresponding offspring node should be picked from the first or the second parent. If a node to be inherited belongs to the base of the common region and is a function, then the subtree rooted there is inherited as well. With this form of crossover, there can be a greater mixing of the code near the root than with other operators.

In context-preserving crossover, the crossover points are constrained to have the same coordinates, like in one-point crossover; note, however, that here the crossover points are not limited to the common region. In size-fair crossover the first crossover point is selected randomly, as with standard crossover.

Example of one-point crossover between parents of different sizes and shapes.

Then the size of the subtree to be removed from the first parent is calculated. This is used to
constrain the choice of the second crossover point so as to guarantee that the subtree excised
from the second parent will not be “unfairly” big. Harries and Smith (1997) suggested five new
crossover operators that are like standard crossover but with probabilistic restrictions on the
depth of crossover points within the parent trees. Since crossover and mutation are specific to the
representation used in GP, each new representation tends to need new crossover and mutation
operators. For example “ripple crossover” is a way of looking at crossover in grammatical
evolution.
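
For comparison with the homologous operators above, here is a sketch of plain subtree crossover (the operator that can move a subtree to a completely different position in the child), again on the illustrative nested-tuple encoding; it is not an implementation of one-point or size-fair crossover:

import random

def all_subtrees(tree, path=()):
    """Yield (path, subtree) pairs for every node, in depth-first order."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from all_subtrees(child, path + (i,))

def set_subtree(tree, path, new_subtree):
    """Return a copy of `tree` with the subtree at `path` replaced by `new_subtree`."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (set_subtree(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def subtree_crossover(parent1, parent2):
    """Child = parent1 with a randomly chosen subtree replaced by a randomly
    chosen subtree taken from parent2."""
    path1, _ = random.choice(list(all_subtrees(parent1)))
    _, donated = random.choice(list(all_subtrees(parent2)))
    return set_subtree(parent1, path1, donated)

parent_a = ('+', 'x', ('*', 'x', 'y'))
parent_b = ('max', ('-', 'y', 1), 'x')
print(subtree_crossover(parent_a, parent_b))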

HOW GENETIC PROGRAMMING WORKS?

Genetic programming starts with a primordial ooze of thousands of randomly created computer programs. This population of programs is progressively evolved over a series of generations. The evolutionary search uses the Darwinian principle of natural selection (survival of the fittest) and analogs of various naturally occurring operations, including crossover (sexual recombination), mutation, gene duplication, and gene deletion. Genetic programming sometimes also employs developmental processes by which an embryo grows into a fully developed organism. In addition, genetic programming can automatically create, in a single run, a general (parameterized) solution to a problem in the form of a graphical structure whose nodes or edges represent components and where the parameter values of the components are specified by mathematical expressions containing free variables. That is, genetic programming can automatically create a general solution to a problem in the form of a parameterized topology.

PREPARATORY STEPS OF GP

Genetic programming starts from a high-level statement of the requirements of a problem and
attempts to produce a computer program that solves the problem. The human user communicates
the high-level statement of the problem to the genetic programming system by performing
certain well-defined preparatory steps.
The five major preparatory steps for the basic version of genetic programming require the human
user to specify

(1) The set of terminals (e.g., the independent variables of the problem, zero-argument functions,
and random constants) for each branch of the to-be-evolved program,

(2) The set of primitive functions for each branch of the to-be-evolved program,

(3) The fitness measure (for explicitly or implicitly measuring the fitness of individuals in the
population),

(4) Certain parameters for controlling the run, and

(5) The termination criterion and method for designating the result of the run.

The figure below shows the five major preparatory steps for the basic version of genetic
programming. The preparatory steps (shown at the top of the figure) are the human-supplied
input to the genetic programming system. The computer program (shown at the bottom) is the
output of the genetic programming system.

Preparatory steps of Genetic Programming

The first two preparatory steps specify the ingredients that are available to create the computer
programs. A run of genetic programming is a competitive search among a diverse population of
programs composed of the available functions and terminals.

Function Set and Terminal Set

The identification of the function set and terminal set for a particular problem (or category of
problems) is usually a straightforward process. For some problems, the function set may consist
of merely the arithmetic functions of addition, subtraction, multiplication, and division as well as
a conditional branching operator. The terminal set may consist of the program’s external inputs
(independent variables) and numerical constants. This function set and terminal set is useful for a
wide variety of problems (and corresponds to the basic operations found in virtually every
general-purpose digital computer).
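
As an illustration only, the first two preparatory steps for a symbolic-regression-style problem could be written down as follows; protected division is a common convention (not something mandated by GP) for keeping the primitives safe to execute on any input:

import operator

def protected_div(a, b):
    """Division that returns 1.0 when the denominator is (near) zero,
    so evolved programs never raise an exception."""
    return a / b if abs(b) > 1e-9 else 1.0

# Preparatory step 1: the terminal set (independent variable and constants).
TERMINALS = ['x', -1.0, 0.0, 1.0]

# Preparatory step 2: the function set (name -> (implementation, arity)).
FUNCTIONS = {
    '+': (operator.add, 2),
    '-': (operator.sub, 2),
    '*': (operator.mul, 2),
    '/': (protected_div, 2),
}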

For many other problems, the ingredients include specialized functions and terminals. For
example, if the goal is to get genetic programming to automatically program a robot to mop the
entire floor of an obstacle-laden room, the human user must tell genetic programming what the
robot is capable of doing. For example, the robot may be capable of executing functions such as
moving, turning, and swishing the mop. If the goal is the automatic creation of a controller, the
function set may consist of signal processing functions that operate on time-domain signals,
including integrators, differentiators, leads, lags, gains, adders, subtractors, and the like. The
terminal set may consist of signals such as the reference signal and plant output. Once the human
user has identified the primitive ingredients for a problem of controller synthesis, the same
function set and terminal set can be used to automatically synthesize a wide variety of different
controllers.

If a complex structure, such as an antenna, is to be designed, it may be desirable to use functions that cause a turtle to draw the complex structure.

If the goal is the automatic synthesis of an analog electrical circuit, the function set may enable
genetic programming to construct circuits from components such as transistors, capacitors, and
resistors. This construction (developmental) process typically starts with a very simple
embryonic structure, such as a single modifiable wire. If, additionally, such a function set is
geographically aware, a circuit’s placement and routing can be synthesized at the same time as its
topology and sizing. Once the human user has identified the primitive ingredients for a problem
of circuit synthesis, the same function set and terminal set can be used to automatically
synthesize an amplifier, computational circuit, active filter, voltage reference circuit, or any other
circuit composed of these ingredients.

Fitness Measure

The third preparatory step concerns the fitness measure for the problem. The fitness measure
specifies what needs to be done. The fitness measure is the primary mechanism for
communicating the high-level statement of the problem’s requirements to the genetic
programming system. For example, if the goal is to get genetic programming to automatically
synthesize an amplifier, the fitness function is the mechanism for telling genetic programming to
synthesize a circuit that amplifies an incoming signal (as opposed to, say, a circuit that
suppresses the low frequencies of an incoming signal or a circuit that computes the square root of
the incoming signal). The first two preparatory steps define the search space whereas the fitness
measure implicitly specifies the search’s desired goal.
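
For a simple symbolic regression problem, a fitness measure might be the total error of a candidate program over a set of fitness cases (here lower is better, and zero is a perfect fit). The target function and sample points below are illustrative assumptions, and evaluate(program, env) is assumed to run the candidate program, e.g. the expression-tree evaluator sketched in the representation section:

def target(x):
    # The (normally unknown) behaviour we want the evolved program to match.
    return x * x + x + 1

FITNESS_CASES = [i / 10.0 for i in range(-10, 11)]   # 21 sample points in [-1, 1]

def fitness(program, evaluate):
    """Sum of absolute errors over the fitness cases (0.0 = perfect fit)."""
    return sum(abs(evaluate(program, {'x': x}) - target(x)) for x in FITNESS_CASES)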

Control Parameters

The fourth and fifth preparatory steps are administrative. The fourth preparatory step entails
specifying the control parameters for the run. The most important control parameter is the
population size. In practice, the user may choose a population size that will produce a reasonably
large number of generations in the amount of computer time they are willing to devote to a
problem (as opposed to, say, analytically choosing the population size by somehow analyzing a
problem’s fitness landscape). Other control parameters include the probabilities of performing
the genetic operations, the maximum size for programs, and other details of the run.
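
A typical (purely illustrative) set of control parameters might be recorded like this; the numbers show common orders of magnitude rather than recommended values:

# Illustrative control parameters for a run; the values are examples, not advice.
RUN_PARAMETERS = {
    'population_size':  500,    # usually the most important parameter
    'max_generations':  50,     # doubles as part of the termination criterion
    'p_crossover':      0.90,   # probability of creating a child by crossover
    'p_mutation':       0.05,   # probability of creating a child by mutation
    'p_reproduction':   0.05,   # probability of plain copying
    'max_tree_depth':   17,     # limit on the size of evolved programs
}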

Termination

The fifth preparatory step consists of specifying the termination criterion and the method of
designating the result of the run. The termination criterion may include a maximum number of
generations to be run as well as a problem-specific success predicate. In practice, one may
manually monitor and manually terminate the run when the values of fitness for numerous
successive best-of-generation individuals appear to have reached a plateau. The single best-so-far
individual is then harvested and designated as the result of the run.

Running Genetic Programming

After the human user has performed the preparatory steps for a problem, the run of genetic
programming can be launched. Once the run is launched, a series of well-defined, problem-
independent executional steps is executed.

EXECUTIONAL STEPS OF GENETIC PROGRAMMING

Genetic programming typically starts with a population of randomly generated computer programs composed of the available programmatic ingredients. Genetic programming iteratively
transforms a population of computer programs into a new generation of the population by
applying analogs of naturally occurring genetic operations. These operations are applied to
individual(s) selected from the population. The individuals are probabilistically selected to
participate in the genetic operations based on their fitness (as measured by the fitness measure
provided by the human user in the third preparatory step). The iterative transformation of the
population is executed inside the main generational loop of the run of genetic programming.
The executional steps of genetic programming are as follows:

(1) Randomly create an initial population (generation 0) of individual computer programs composed of the available functions and terminals.

(2) Iteratively perform the following sub-steps (called a generation) on the population until the
termination criterion is satisfied:
(a) Execute each program in the population and ascertain its fitness (explicitly or implicitly)
using the problem’s fitness measure.

(b) Select one or two individual program(s) from the population with a probability based on fitness (with reselection allowed) to participate in the genetic operations in (c); one common selection scheme is sketched after this list.

(c) Create new individual program(s) for the population by applying the following genetic
operations with specified probabilities:

(i) Reproduction: Copy the selected individual program to the new population.
(ii) Crossover: Create new offspring program(s) for the new population by recombining
randomly chosen parts from two selected programs.
(iii) Mutation: Create one new offspring program for the new population by randomly mutating
a randomly chosen part of one selected program.
(iv) Architecture-altering operations: Choose an architecture-altering operation from the
available repertoire of such operations and create one new offspring program for the new
population by applying the chosen architecture-altering operation to one selected program.
(3) After the termination criterion is satisfied, the single best program in the population produced during the run (the best-so-far individual) is harvested and designated as the result of the run. If the run is successful, the result may be a solution (or approximate solution) to the problem.
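
Step 2(b) leaves the selection scheme open; fitness-proportionate (roulette-wheel) selection is one classic choice, sketched below with reselection allowed (fitnesses are assumed to be non-negative, with higher meaning better):

import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total == 0:                      # degenerate case: fall back to a uniform choice
        return random.choice(population)
    r = random.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= r:
            return individual
    return population[-1]               # guard against floating-point rounding

# Because reselection is allowed, the same individual may be picked repeatedly.
parents = [roulette_select(['a', 'b', 'c'], [1.0, 1.5, 2.0]) for _ in range(5)]
print(parents)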

Flowchart (Executional Steps) of Genetic Programming



EXAMPLES OF GP

• Symbolic regression - is the process of discovering both the functional form of a target
function and all of its necessary coefficients, or at least an approximation to these. This is distinct
from other forms of regression such as polynomial regression in which you are merely trying to
find the coefficients of a polynomial of a prespecified order. In some senses, all GP problems are
symbolic regression problems, but the term is usually used to apply to finding real-valued
functions.

• Analog circuit design - starting from an embryonic circuit (a simple initial circuit), the circuit is modified to create new circuits that are evaluated against functionality criteria.

META GP
Meta-genetic programming is the proposed meta-learning technique of evolving a genetic
programming system using genetic programming itself. It suggests that, since crossover and mutation
were themselves evolved, they, like their real-life counterparts, should be allowed to change
on their own rather than being determined by a human programmer. Meta-GP was proposed by
Jürgen Schmidhuber in 1987; it is a recursive but terminating algorithm, allowing it to avoid
infinite recursion.
Critics of this idea often say this approach is overly broad in scope. However, it might be
possible to constrain the fitness criterion onto a general class of results, and so obtain an evolved
GP that would more efficiently produce results for sub-classes. This might take the form of a
meta-evolved GP for producing human walking algorithms, which is then used to evolve human
running, jumping, etc. The fitness criterion applied to the Meta GP would simply be one of
efficiency.
For general problem classes there may be no way to show that Meta GP will reliably produce
results more efficiently than a created algorithm other than exhaustion. The same holds for
standard GP and other search algorithms.

APPLICATIONS OF GP

There are numerous applications of genetic programming. Some of them are:

• “Black Art Problems,” such as the automated synthesis of analog electrical circuits, controllers, antennas, networks of chemical reactions, optical systems, and other areas of design.

• “Programming the Unprogrammable” (PTU), involving the automatic creation of computer programs for unconventional computing devices such as cellular automata, multi-agent systems, parallel programming systems, field-programmable gate arrays, field-programmable analog arrays, ant colonies, swarm intelligence, distributed systems, and the like.

• “Commercially Useful New Inventions” (CUNI), involving the use of genetic programming as an automated "invention machine" for creating commercially usable new inventions.

LIMITATIONS

Genetic programming is limited by the following factors:

1. Conventional programs are extremely fragile with respect to random changes. Possible direction: use a less fragile representation.

2. The fitness function can be hard to define (many solutions are binary in nature; how bad is a wrong answer?). Possible direction: concentrate on application domains where fitness is easy to define.

CONCLUSION

Today, we can see that indeed GP has started fulfilling dreams by providing us with a systematic method, based on Darwinian evolution, for getting computers to automatically solve hard real-life problems. To do so, it simply requires a high-level statement of what needs to be done and enough computing power. At present GP is unable to produce computer programs that would pass the full Turing test for machine intelligence, and it might not be ready for this immense task for centuries. Nonetheless, thanks to the constant improvements in GP technology, in its theoretical foundations and in computing power, GP has been able to solve dozens of difficult problems with human-competitive results and to provide valuable solutions to many other problems. It is reasonable to predict that in a few years' time GP will be able to routinely and competently solve important problems for us, in a variety of application domains, with human-competitive performance. Genetic programming will then become an essential collaborator for many human activities. This will be a remarkable step forward towards achieving true human-competitive machine intelligence.

REFERENCES

Links
http://www.genetic-programming.com/
http://en.wikipedia.org/wiki/Genetic_programming
http://www.massey.ac.nz/~mgwalker/gp/intro/introduction_to_gp.pdf

Books
Genetic Programming: On the Programming of Computers by Means of Natural Selection
By: John R. Koza
A Field Guide to Genetic Programming
By: Riccardo Poli, William B. Langdon and Nicholas F. McPhee
