

Recent Researches in Computers and Computing

A Genetic Algorithm for Classification

RAUL ROBU, ŞTEFAN HOLBAN


Department of Automation and Applied Informatics, Department of Computers
“Politehnica” University of Timisoara
Vasile Parvan Blvd.2, 300223, Timisoara
ROMANIA

Abstract: - The paper presents aspects regarding genetic algorithms, their use in data mining and especially their use in the discovery of classification rules. A synthetic overview of the fitness functions of the genetic algorithms used for mining classification rules is given. A genetic algorithm with a new fitness function for mining classification rules is proposed. The proposed algorithm was tested on the classic datasets Car, Zoo and Mushrooms. The same datasets were also classified with the classic algorithms NaiveBayes and J48. The results obtained by applying the three algorithms are presented.

Key-Words: - genetic algorithm, data mining, classification

1 Introduction
Genetic algorithms are adaptive techniques that can be successfully used to solve complex search and
optimization problems [1]. They are based on the principles of genetics and Darwin's theory of natural selection
("survival of the fittest"). Genetic algorithms have been successfully used in data mining
to determine classification rules [2], to search for appropriate cluster centers [1], and to select the
attributes of interest in predicting the value of a target attribute [3]. Classification of instances has also been
performed with hybrid algorithms that combine genetic algorithms with particle swarm optimization [4]
and with Naïve Bayes and k-Nearest Neighbors [5]. A few applications in which genetic algorithms were
successfully applied to solve classification problems are fingerprint classification, heart disease classification,
classification of emotions on the human face, etc.
The fitness functions of the genetic algorithms used for mining classification rules may contain metrics
concerning predictive accuracy, rule comprehensibility and rule interestingness [2][3]. Various studies
propose genetic algorithms whose fitness functions weigh these metrics in different ways. The paper
synthetically presents the fitness functions suggested for mining classification
rules and proposes a new fitness function. The genetic algorithm that incorporates the suggested function was
implemented in Java and tested on the 3 classic datasets Zoo, Car and Mushrooms. The paper presents the
steps of the proposed algorithm. The 3 datasets were also classified using the NaiveBayes and J48
algorithms from WEKA. The results obtained with the three algorithms are comparable and are presented in
section 4.

2 Fitness functions used in mining classification rules


Genetic algorithms are used to discover classification rules for data that will be used for predictions. The
discovered rules must have high prediction accuracy and must be comprehensible and interesting [2][3]. The
rules have the form IF X THEN Y, where X is the antecedent of the rule, formed from a conjunction of
conditions, and Y is the consequent of the rule, that is, the predicted class. The fitness functions used for the
discovery of classification rules often use the following factors from the confusion matrix:
• True positive (tp): the actual class is Y and the predicted class is also Y.
• False positive (fp): the actual class is not Y, but the predicted class is Y.


• True negative (tn): the actual class is not Y and the predicted class is also not Y.
• False negative (fn): the actual class is Y, but the predicted class is not Y.
In [3] the following fitness function is suggested:

Fitness=w1 x (CF x Comp) + w2 x Simp (1)

The first term of relation (1) measures prediction accuracy and the second one measures rule
comprehensibility. CF = tp / (tp + fp) is the confidence factor and Comp = tp / (tp + fn) is the completeness factor.
Simp represents the simplicity of the rule (it is inversely proportional to the number of conditions in the
antecedent of the rule). w1 and w2 are weights defined by the user.
In [6] the fitness function is defined as:

Fitness = tp / (tp + A x fn) * tn / (tn + B x fp) (2)

where 0.2 < A < 2 and 1 < B < 20. A and B are user-defined parameters that determine the convergence rate
of the solution and are determined experimentally according to the requirements.
In [7] the fitness function is calculated using the formula:

Fitness = w1 x (|A & C| - 1/2) / |A| + w2 x CM (3)

where |A & C| is the number of instances that satisfy both the antecedent and the consequent of the rule and |A| is the
number of instances that satisfy the antecedent of the rule. Comprehensibility is denoted by CM and is
calculated using the formula in relation (4).

CM = (L - n) / (L - 1) (4)

where L is the maximum number of conditions of the rule and n is the length of the rule.
A more complete fitness function, which considers not only prediction accuracy and rule comprehensibility but also
the interestingness of the rule, was suggested by Dehuri et al. in [2] and is presented in relation (5).

Fitness = (w1 x C(R) + w2 x PredAcc + w3 x RInt) / (w1 + w2 + w3) (5)

where C(R) is comprehensibility, PredAcc is predictive accuracy and RInt is the interestingness of the rule.
Comprehensibility is computed as in (6), where M is the maximum number of conditions of a rule.

C(R) = M - number of conditions(R) (6)

Prediction accuracy is calculated with relation (7).

PredAcc = |A & C| / |A| (7)

Rule interestingness is computed as in (8), where InfoGain(Ai) is the information gain brought by attribute Ai and
|dom(G)| is the domain cardinality of the class attribute G.


RInt = 1 - [ (sum over i=1..n-1 of InfoGain(Ai)) / (n - 1) ] / log2(|dom(G)|) (8)
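
To make the survey concrete, the sketch below (in Java, the language used later for the proposed algorithm) computes the fitness of relation (5) from relations (6)-(8); the InfoGain values are assumed to be precomputed, and all names and parameters are illustrative, not from [2].

// Sketch of the multi-criterion fitness in relation (5), with C(R),
// PredAcc and RInt computed as in relations (6)-(8). infoGains holds
// InfoGain(Ai) for the n-1 antecedent attributes; domG is |dom(G)|.
public class MultiCriterionFitness {
    static double fitness(double w1, double w2, double w3,
                          int maxConditions, int conditions,  // C(R), relation (6)
                          int coveredAC, int coveredA,        // |A & C|, |A|, relation (7)
                          double[] infoGains, int domG) {     // RInt, relation (8)
        double cr = maxConditions - conditions;
        double predAcc = coveredAC / (double) coveredA;
        double meanGain = java.util.Arrays.stream(infoGains).average().orElse(0.0);
        double rInt = 1.0 - meanGain / (Math.log(domG) / Math.log(2));
        return (w1 * cr + w2 * predAcc + w3 * rInt) / (w1 + w2 + w3);
    }
}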

3 The genetic algorithm


The general form of a genetic algorithm is the following [8]:

Algorithm
  Compute initial population;
  WHILE stopping condition not fulfilled DO
  BEGIN
    Select individuals for reproduction;
    Create offspring by crossing individuals;
    Possibly mutate some individuals;
    Compute new generation
  END
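
The loop above can be expressed as a compact Java skeleton. The sketch below only illustrates the control flow; the problem-specific steps (initial population, fitness, selection, crossover, mutation) are left abstract and are detailed in the rest of this section. All names are ours, not from the paper.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Skeleton of the general genetic algorithm loop. Population members
// are boolean arrays (rule chromosomes).
public abstract class GeneticAlgorithm {
    protected final Random rnd = new Random();

    abstract List<boolean[]> initialPopulation();
    abstract boolean[] select(List<boolean[]> population);     // e.g. roulette wheel
    abstract boolean[][] crossover(boolean[] a, boolean[] b);  // two offspring
    abstract void mutate(boolean[] chromosome);                // in-place bit flips

    public List<boolean[]> run(int generations, double pc) {
        List<boolean[]> population = initialPopulation();
        for (int g = 0; g < generations; g++) {        // stopping condition
            List<boolean[]> offspring = new ArrayList<>();
            while (offspring.size() < population.size()) {
                boolean[] parent = select(population); // fitness-proportional
                if (rnd.nextDouble() < pc) {           // cross with another parent
                    boolean[][] children = crossover(parent, select(population));
                    offspring.add(children[0]);
                    offspring.add(children[1]);
                } else {                               // copy parent unchanged
                    offspring.add(parent.clone());
                }
            }
            offspring.forEach(this::mutate);           // occasional bit flips
            population = offspring;                    // new generation
        }
        return population;
    }
}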

Data classification using a genetic algorithm involves the steps presented below. The dataset is
divided into a training set and a test set. The initial population is made of the rules obtained by codifying the
instances from the training set. The rules from the initial population are expressed as follows:

IF attribute1=value1 AND attribute2=value2 AND … THEN class=c1

A chromosome is represented by a classification rule. A gene is represented by the value of an attribute.


Each rule (chromosome) is codified in the form of a bit array [6].
An attribute that can take m values is codified on m+1 bits: one bit for each possible value and another
bit that marks the presence or absence of the attribute in the rule. A class attribute with m possible values is
codified on m bits (no presence bit is needed, since the consequent is always present).
If we have 3 attributes, A1 with 4 possible values, A2 with 2 possible values and A3 with 3 possible values,
and a class attribute with 3 possible values, the codification of a rule can be made as in Table 1 (the bits are
numbered from right to left). If the dataset contains numeric attributes, they are discretized.

Table 1 – Codification of chromosomes


Attribute   A1      A2    A3     C
Code        11000   001   1010   100
Rule        IF (A1=value4) AND (A3=value2) THEN C=value3
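
As an illustration of this encoding, the Java sketch below rebuilds the chromosome of Table 1. The method names are ours, and the handling of absent attributes assumes the value bits are irrelevant when the presence bit is 0 (Table 1 shows 001 for the absent A2, while the sketch emits 000).

// Sketch of the m+1-bit encoding from Table 1: for an attribute with m
// possible values, the leftmost bit marks presence in the rule and the
// remaining m bits (numbered right to left) one-hot encode the value;
// the class attribute uses m bits with no presence bit.
public class RuleEncoding {
    // value is 1-based; 0 means the attribute is absent from the rule.
    static String encodeAttribute(int numValues, int value) {
        char[] bits = new char[numValues + 1];
        java.util.Arrays.fill(bits, '0');
        if (value > 0) {
            bits[0] = '1';                      // presence bit (leftmost)
            bits[numValues + 1 - value] = '1';  // bits numbered right to left
        }
        return new String(bits);
    }

    static String encodeClass(int numValues, int value) {
        char[] bits = new char[numValues];
        java.util.Arrays.fill(bits, '0');
        bits[numValues - value] = '1';
        return new String(bits);
    }

    public static void main(String[] args) {
        // IF (A1=value4) AND (A3=value2) THEN C=value3, as in Table 1
        String rule = encodeAttribute(4, 4)  // A1 -> 11000
                    + encodeAttribute(2, 0)  // A2 absent -> 000
                    + encodeAttribute(3, 2)  // A3 -> 1010
                    + encodeClass(3, 3);     // C  -> 100
        System.out.println(rule);            // 110000001010100
    }
}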

In the following stage, the fitness of each rule is calculated. The function suggested for calculating the fitness
is the one in relation (9).

Fitness = w1 * PA + w2 * CPH + w3 * S (9)

where,
• PA is the predictive accuracy, PA = tp / (tp + fp)
• CPH is the comprehensibility, CPH = MNC - ANC
• MNC = maximum number of conditions
• ANC = actual number of conditions
• S is the sensitivity, S = tp / (tp + fn)


• w1, w2 and w3 are user-defined weights
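
A direct Java transcription of relation (9) might look as follows. The confusion-matrix counts are assumed to have been obtained beforehand by matching the rule against the training instances; the names and the example values in main() are illustrative.

// Sketch of the proposed fitness (relation 9).
public class RuleFitness {
    static double fitness(int tp, int fp, int fn,
                          int actualConditions, int maxConditions,
                          double w1, double w2, double w3) {
        double pa  = tp / (double) (tp + fp);          // predictive accuracy
        double cph = maxConditions - actualConditions; // comprehensibility
        double s   = tp / (double) (tp + fn);          // sensitivity
        return w1 * pa + w2 * cph + w3 * s;
    }

    public static void main(String[] args) {
        // Weights as chosen in section 4: w1=0.6, w2=0.2, w3=0.2
        System.out.println(fitness(40, 10, 5, 2, 6, 0.6, 0.2, 0.2));
    }
}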

Some chromosomes from the current population are selected to produce new descendants. The selection of
chromosomes is proportional to their fitness (roulette-wheel selection). For each selected
chromosome a number between 0 and 1 is generated. If this number is lower than pc (the crossover probability), the
chromosome is crossed with another one to form two descendants (the crossing point is chosen randomly);
otherwise a descendant identical to the parent is created.
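
A minimal sketch of roulette-wheel selection, assuming fitness values are positive; all names are illustrative.

import java.util.List;
import java.util.Random;

// Fitness-proportional selection: each chromosome occupies a slice of
// [0, totalFitness) proportional to its fitness, and a random "spin"
// picks the slice it lands in.
public class RouletteSelection {
    static final Random RND = new Random();

    static int select(List<Double> fitnesses) {
        double total = fitnesses.stream().mapToDouble(Double::doubleValue).sum();
        double spin = RND.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < fitnesses.size(); i++) {
            cumulative += fitnesses.get(i);
            if (spin < cumulative) return i;  // chromosome i is selected
        }
        return fitnesses.size() - 1;          // guard against rounding
    }

    public static void main(String[] args) {
        // The fitter chromosome (index 1) is picked roughly twice as often.
        System.out.println(select(List.of(1.0, 2.0, 1.0)));
    }
}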
An example of crossing after position 6 can be seen in Table 2.

Table 2 – Crossing chromosomes


Chromosome 1    100010011010100
Chromosome 2    110001011001010
Chromosome 1'   110001011010100
Chromosome 2'   100010011001010

Through crossing, new chromosomes (new rules) are created. The number of chromosomes from the
population expected to cross (nr_crom_cross) is determined as follows:

nr_crom_cross = nr_crom_pop * pc (10)

where nr_crom_pop is the number of chromosomes in the population and pc is the crossing probability.
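
The sketch below reproduces the single-point crossover of Table 2 on bit strings, with the crossing point counted from the right to match the right-to-left bit numbering used above; the names are illustrative.

// Single-point crossover on bit-string chromosomes.
public class Crossover {
    static String[] cross(String a, String b, int pointFromRight) {
        int cut = a.length() - pointFromRight;  // index of the split
        String child1 = b.substring(0, cut) + a.substring(cut);
        String child2 = a.substring(0, cut) + b.substring(cut);
        return new String[] { child1, child2 };
    }

    public static void main(String[] args) {
        String[] children = cross("100010011010100", "110001011001010", 6);
        System.out.println(children[0]);  // 110001011010100 (Chromosome 1')
        System.out.println(children[1]);  // 100010011001010 (Chromosome 2')
    }
}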
The mutation is performed by randomly choosing a bit from a chromosome and flipping its value. pm denotes
the mutation probability, a parameter usually set by the user. For each
gene of a chromosome a number between 0 and 1 is generated, and if the generated number is lower than pm,
the gene suffers a mutation.
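
A minimal sketch of the per-gene mutation just described:

import java.util.Random;

// Bit-flip mutation: every gene gets an independent chance pm of
// being flipped.
public class Mutation {
    static final Random RND = new Random();

    static void mutate(boolean[] chromosome, double pm) {
        for (int i = 0; i < chromosome.length; i++) {
            if (RND.nextDouble() < pm) {
                chromosome[i] = !chromosome[i];  // the gene suffers a mutation
            }
        }
    }

    public static void main(String[] args) {
        boolean[] c = { true, false, true, false };
        mutate(c, 0.03);
        System.out.println(java.util.Arrays.toString(c));
    }
}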
After the descendants are created, the n best-performing parents and children are selected to enter the
new population (n is specified by the user). The rest of the chromosomes of the new population are
chosen from the children. The algorithm repeats until a set number of generations is reached or until the
performance does not change over a number of generations.
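
The replacement step might be sketched as below; this is one possible reading of the description above (keep the n fittest among parents and children, then fill the rest with children), not the paper's exact implementation.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Elitist replacement: the n fittest chromosomes among parents and
// children are kept; the rest of the new population is filled from
// the children.
public class Replacement {
    static List<boolean[]> newPopulation(List<boolean[]> parents,
                                         List<boolean[]> children,
                                         int n,
                                         ToDoubleFunction<boolean[]> fitness) {
        List<boolean[]> all = new ArrayList<>(parents);
        all.addAll(children);
        all.sort(Comparator.comparingDouble(fitness).reversed()); // best first
        List<boolean[]> next = new ArrayList<>(all.subList(0, n)); // n elites
        for (boolean[] child : children) {                         // fill with children
            if (next.size() >= parents.size()) break;
            if (!next.contains(child)) next.add(child);  // skip elite children
        }
        return next;
    }
}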
The user must choose the size of the initial population (P), the crossing probability (pc), the mutation
probability (pm), the number n of elite chromosomes and the number of generations for which the algorithm
runs.
The determined rules are used to classify the instances from the test set in order to find out the accuracy
with which the model performs the classification. If this is satisfactory, the built model is
used to make predictions.
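
The paper does not detail how the discovered rules are applied to a test instance; the sketch below assumes a simple first-match strategy over rules sorted by fitness, with a default class when no rule fires. All names are illustrative.

import java.util.List;
import java.util.Map;

// Classifying a test instance with the discovered rules: the first
// rule whose antecedent matches the instance predicts the class.
public class RuleClassifier {
    record Rule(Map<String, String> antecedent, String predictedClass) {
        boolean matches(Map<String, String> instance) {
            // every condition attribute=value must hold in the instance
            return antecedent.entrySet().stream()
                    .allMatch(c -> c.getValue().equals(instance.get(c.getKey())));
        }
    }

    static String classify(List<Rule> rulesByFitness, Map<String, String> instance,
                           String defaultClass) {
        for (Rule r : rulesByFitness) {
            if (r.matches(instance)) return r.predictedClass();
        }
        return defaultClass;  // no rule fires
    }

    public static void main(String[] args) {
        Rule r = new Rule(Map.of("safety", "high", "persons", "4"), "acc");
        System.out.println(classify(List.of(r),
                Map.of("safety", "high", "persons", "4", "doors", "2"), "unacc"));
    }
}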

4 Experimental results
The suggested algorithm was implemented in Java and tested on the 3 datasets Car, Zoo and Mushrooms. The
chosen crossing probability was 0.9, the mutation probability 0.03, the population size 100 and the number of generations
30. The weights of the fitness function were chosen as w1=0.6, w2=0.2 and w3=0.2,
emphasizing in this way the accuracy of the prediction. Tests were also made on
the same datasets using WEKA [9] and the classic algorithms NaiveBayes and J48. Each dataset was
divided equally into a training set and a test set. The results are presented in Table 3.
The Car dataset contains 1728 instances and 7 categorical attributes. These are buying (the purchasing price
of the car, with the possible values vhigh, high, med, low), maint (the price of the car maintenance, with the
possible values vhigh, high, med, low), doors (the number of doors: 2, 3, 4, 5more), persons (the number of persons
it may carry: 2, 4, more), lug_boot (the size of the luggage boot: small, med, big), safety (low, med, high), and
class (unacc, acc, good, vgood).


The Zoo dataset contains 101 instances and 18 attributes. These are animal name, hair, feathers, eggs, milk,
airborne, aquatic, predator, toothed, backbone, breathes, venomous, fins, legs, tail, domestic, catsize and type (the
class attribute, with 7 possible values).
The class attribute in the Mushrooms dataset indicates whether a mushroom is edible or poisonous. The dataset has 8124
instances and 23 attributes such as cap-shape, cap-color, odor, gill-attachment, gill-spacing, gill-size, and so on.

Table 3 – Testing the algorithms (classification accuracy, %)


                     Car   Zoo   Mushrooms
NaiveBayes            86    90          94
J48                   85    82         100
Genetic Algorithm     89    85          96

As can be observed from Table 3, the results obtained by the three algorithms are good and comparable;
on different datasets, different algorithms perform better.

5 Conclusion
Data mining techniques are used to extract useful information and knowledge from databases. Classification
is a data mining technique that explores the data from the training set and builds a classifier model, based on
these data, that can be used for performing predictions. Genetic algorithms are adaptive techniques that can
be successfully used to solve complex search and optimization problems. A database can be viewed as a very
large search space and a genetic algorithm as a search strategy over that space. The classification
rules that are searched for are the ones that will constitute the classifier model.
The present paper presents the way in which genetic algorithms have been used to discover classification
rules. A synthesis of the fitness functions used in classification rule mining was presented and a new
fitness function was suggested. The genetic algorithm with the suggested fitness function was implemented in
Java and tested on the 3 classic datasets Car, Zoo and Mushrooms, and the results were good. The suggested fitness
function accounts for prediction accuracy and rule comprehensibility. The interestingness degree is another
important metric for classification rules. A future development direction would be to develop a metric for
the interestingness of the rule and to include it in the fitness function. Another future development direction
would be to determine the most appropriate weight coefficients for a given dataset.

References:
1. U. Maulik and S. Bandyopadhyay, Genetic algorithm-based clustering technique, Pattern Recognition 33,
2000, pp. 1455-1465
2. S. Dehuri, A. Ghosh and R. Mall, Genetic Algorithms for Multi-Criterion Classification and Clustering in
Data Mining, International Journal of Computing & Information Sciences, 2006, pp. 143-154
3. A.A. Freitas, A survey of evolutionary algorithms for data mining and knowledge discovery, Advances in
Evolutionary Computation, Springer-Verlag, 2002, pp. 819-845.
4. R. Ding, H. Dong, X. Feng and G. Yin, A Hybrid Particle Swarm Genetic Algorithm for Classification,
Proceedings of the 2009 International Joint Conference on Computational Sciences and Optimization,
vol.2, 2009, pp. 301
5. M. Aci, C. Inan and M. Avci, A hybrid classification method of k nearest neighbor, Bayesian methods and
genetic algorithm, Expert Systems with Applications, Vol. 37, 2010, 5061-5067
6. P. Vivekanandan and D. R. Nedunchezhian, A New Incremental Genetic Algorithm Based Classification
Model to Mine Data With Concept Drift, Journal of Theoretical and Applied Information Technology,
2010, pp. 36-42
7. P. Vivekanandan and D. R. Nedunchezhian, A Fast Genetic Algorithm for Mining Classification Rules in
Large Datasets, International Journal on Soft Computing (IJSC), Vol.1, 2010, pp. 10-20
8. D.M. Mukhopadhyay, M.O. Balitanas, A. Farkhod, S.H. Jeon, and D. Bhattacharyya, Genetic Algorithm:
A Tutorial Review, International Journal of Grid and Distributed Computing, Vol.2, No.3, 2009, pp.
25-32
9. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann and I.H. Witten, The WEKA Data Mining
Software: An Update, ACM SIGKDD Explorations Newsletter, Volume 11 , Issue 1, 2009, pp. 10-18


