Gene Expression Programming
CS678 - Machine Learning
Grand Valley State University
by James Bund
GEP is like GA and GP
GA: Linear strings of fixed length
GP: Nonlinear entities of different sizes and shapes
GEP: Linear strings of fixed length and
Nonlinear entities of different sizes and shapes
GEP Algorithm
Genotype and Phenotype
Attempts to represent nature more faithfully.
In GA and GP genotype and phenotype are the same.
Genotype: Genetic representation
Phenotype: Physical representation
Genotype -> Translation -> Phenotype
String chromosome -> Expression Tree
Fitness is a judgment of the phenotype
Reproduction involves the genotype
Encoding Genotype & Phenotype
( a b ) (c d ) 01234567
Q*+-abcd
Genotype is a fixed length string.
Resulting genotype is guaranteed to translate into a valid
expression tree.
Open Reading Frames (ORF)
Biology: Start codons and Stop codons
GEP: Start at the beginning of the coding sequence
Stops when the functions have been expressed,
so usually there are non-coding regions at the end.
0123456789
Q*+-abcdef
Head and Tail Regions
t = h (n-1) + 1
t = tail length
h = head length
n = maximum number of arguments to a function
012345678901234567890
+Q-/b*aaQbaabaabbaaab
Multigenic Chromosomes
Hierarchical discovery technique
012345678012345678
Q*Q+bbaaa*-babaabb
Multigenic Chromosomes
012345678901201234567890120123456789012
IIAIca3aa2acuNNAOab2u3c31cAu12ua3112cac
13 = h + t
t = h (n-1) + 1
h = 4, t = 9
Replication
Replication is copying without modification.
Mutation
012345678012345678012345678
-+-+abaaa/bb/ababb*Q*+aaaba
012345678012345678012345678
Q+-+abaaa/bbQababb*b*+aaaba
Resulting ET will be valid as long as the tail doesnt contain functions.
Insertion Sequence (IS)
Transposition
012345678901234567890012345678901234567890
*-+*a-+a*bbabbaabababQ**+abQbb*aabbaaaabba
012345678901234567890012345678901234567890
*-+*a-bba+babbaabababQ**+abQbb*aabbaaaabba
The position and the length of the IS must be determined.
Resulting ET will be valid, because the change is upstream.
Root Insertion Sequence (RIS)
Transposition
012345678901234567890012345678901234567890
-ba*+-+-Q/abababbbaaaQ*b/+bbabbaaaaaaaabbb
012345678901234567890012345678901234567890
-ba*+-+-Q/abababbbaaa+bbQ*b/+bbaaaaaaaabbb
The position and the length of the RIS must be determined.
Resulting ET will be valid, because the change is upstream.
Gene Transposition
012345678012345678012345678
*a-*abbab-QQ/aaabbQ+abababb
012345678012345678012345678
-QQ/aaabb*a-*abbabQ+abababb
No effect on genes, but can change the linking of genes.
1-Point Recombination
012345678012345678
-b+Qbbabb/aQbbbaab
/-a/ababb-ba-abaaa
012345678012345678
-b+/ababb-ba-abaaa
/-aQbbabb/aQbbbaab
Second gene was not modified.
Both offspring will always be valid.
2-Point Recombination
0123456789001234567890
+*a*bbcccac*baQ*acabab
*cbb+cccbcc++**bacbaab
0123456789001234567890
+*a*bbccbcc++*Q*acabab
*cbb+ccccac*ba*bacbaab
Both offspring will always be valid.
Gene Recombination
012345678012345678012345678
/aa-abaaa/a*bbaaab/Q*+aaaab
/-*/abbabQ+aQbabaa-Q/Qbaaba
012345678012345678012345678
/aa-abaaaQ+aQbabaa/Q*+aaaab
/-*/abbab/a*bbaaab-Q/Qbaaba
Block Stacking (SpellBot) Example
EQ=E TOS=t
DO=D TCB=c
MTT=T NBN=n
MTS=S
NOT=N
EQ(DO(MTT(TOS), NOT(TOS)), DO(MTS(NBN), NOT(NBN)))
012345601234560123456
DTNttttDSNnnnncNStcnc
Advantages of GEP over GA and GP
GAs have less complexity in what they can express.
GPs are harder to work with because their operators must
be smart enough to ensure syntactically correct offspring.
GEP combine these two features by having the ease of
working with them as GAs and the complexity of GPs.
Generated phenotypes are always syntactically correct and
dont need to be corrected.
The genetic operators dont need to handle complex
language syntax.
Multigenic Chromosome Benefits
Evolving components of a solution not just solutions.
Separation of tasks and abstraction of details like in the
engineering done by humans.
Similar to:
Structured Programming
Functions
Object Oriented
Design Patterns
You Can Have Extra Genes
Best Fitness vrs. Average Fitness
References
Gene Expression Programming: A New Adaptive Algorithm for Solving Problems
by Candida Ferreira
http://www.gene-expression-programming.com/webpapers/gep.pdf
Gene Expression Programming
http://www.gene-expression-programming.com
GeneXproTools
http://www.gepsoft.com