
Find S Algorithm

The Find-S algorithm is a concept learning method in machine learning that iteratively refines a hypothesis based on positive training examples while ignoring negative ones. It starts with the most specific hypothesis and generalizes it by replacing differing attribute values with a 'don't care' symbol. Limitations include the inability to ensure hypothesis consistency and the lack of a backtracking technique for improving the hypothesis.

Uploaded by

rajkirannaidu123

What is Find-S Algorithm in Machine Learning?

1.​ Concept Learning


2.​ General Hypothesis
3.​ Specific Hypothesis
1. Concept Learning

●​ Training Data
●​ Target Concept
●​ Actual Data Objects

In the Find-S algorithm, the following symbols are commonly used to represent
different concepts and operations −
● ∅ (Empty Set) − This symbol means that no value is acceptable for an
attribute. It is used to initialize the hypothesis as the most specific
concept, one that covers no example at all.
● ? (Don't Care) − The question mark means that any value is acceptable
for an attribute. It is used when the hypothesis needs to generalize
over different attribute values that are present in positive examples.
●​ Positive Examples (+) − The plus symbol represents positive examples,
which are instances labeled as the target class or concept being learned.
●​ Negative Examples (-) − The minus symbol represents negative
examples, which are instances labeled as non-target classes or concepts
that should not be covered by the hypothesis.
●​ Hypothesis (h) − The variable h represents the hypothesis, which is the
learned concept or generalization based on the training data. It is refined
iteratively throughout the algorithm.
● Initialization − The algorithm starts with the most specific hypothesis,
denoted as h. This initial hypothesis is the most restrictive concept: it
covers no examples at all. It may be represented as h =
<∅, ∅, ..., ∅>, where ∅ means that no value is acceptable for that
attribute.
●​ Iterative Process − The algorithm iterates through each training
example and refines the hypothesis based on whether the example is
positive or negative.
o​ For each positive training example (an example labeled as the
target class), the algorithm updates the hypothesis by
generalizing it to include the attributes of the example. The
hypothesis becomes more general as it covers more positive
examples.
o For each negative training example (an example labeled as a
non-target class), the algorithm does nothing. Negative examples
are simply skipped, so the hypothesis remains unchanged.
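● Generalization − After processing all the training examples, the
algorithm produces a final hypothesis that covers all positive examples.
Because negative examples are never checked, it is guaranteed to exclude
them only if the target concept can be expressed in the hypothesis space
and the training data contain no errors. This final hypothesis represents
the generalized concept that the algorithm has learned from the training
data.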
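The update applied to a single positive example can be sketched as follows. This is a minimal illustration, not a reference implementation: the names EMPTY, ANY and generalize are assumptions introduced here, and the three-attribute examples are made up for brevity.

```python
EMPTY = "∅"  # no value is acceptable (most specific constraint)
ANY = "?"    # any value is acceptable ("don't care")

def generalize(h, x):
    """Return the minimal generalization of hypothesis h that covers example x."""
    new_h = []
    for h_val, x_val in zip(h, x):
        if h_val == EMPTY:
            new_h.append(x_val)   # first positive example: adopt its value
        elif h_val == ANY or h_val == x_val:
            new_h.append(h_val)   # constraint already satisfied, keep it
        else:
            new_h.append(ANY)     # values disagree: relax to "don't care"
    return tuple(new_h)

# Hypothetical three-attribute examples, used only to show the rule.
h = (EMPTY, EMPTY, EMPTY)
h = generalize(h, ("Sunny", "Warm", "Strong"))
h = generalize(h, ("Sunny", "Cold", "Strong"))
print(h)  # ('Sunny', '?', 'Strong')
```

Returning a new tuple instead of modifying h in place keeps every intermediate hypothesis available for inspection.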
2. General Hypothesis
A hypothesis, in general, is an explanation for something. A general hypothesis
states only the broad relationship between the major variables. For
example, a general hypothesis for ordering food would be "I want a burger."
In concept learning, the most general hypothesis accepts any value for every
attribute:
G = {'?', '?', '?', ..., '?'}
3. Specific Hypothesis

The specific hypothesis fills in all the important details about the variables
given in the general hypothesis. A more specific version of the example
given above would be "I want a cheeseburger with a chicken pepperoni filling
and a lot of lettuce." In concept learning, the most specific hypothesis
accepts no value for any attribute:
S = {'∅', '∅', '∅', ..., '∅'}
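To make the two extremes concrete, here is a small sketch of how G and S behave when tested against an instance. The covers helper and the constant names are assumptions made for illustration; the instance is the first row of the data set used later in this document.

```python
ANY = "?"    # any value is acceptable
EMPTY = "∅"  # no value is acceptable

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(h_val == ANY or h_val == x_val for h_val, x_val in zip(h, x))

n_attributes = 6
G = (ANY,) * n_attributes    # most general hypothesis: accepts everything
S = (EMPTY,) * n_attributes  # most specific hypothesis: accepts nothing

x = ("Morning", "Sunny", "Warm", "Yes", "Mild", "Strong")
print(covers(G, x))  # True
print(covers(S, x))  # False
```

The sketch assumes that no attribute ever takes the literal value "∅"; under that assumption S rejects every instance.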
Now, let's talk about the Find-S algorithm in machine learning.
The Find-S algorithm follows the steps below:
1.​ Initialize ‘h’ to the most specific hypothesis.
2. The Find-S algorithm considers only the positive examples and ignores the
negative ones. For each positive example, the algorithm checks each
attribute in the example. If the attribute value is the same as the
hypothesis value, the algorithm moves on without any changes. If the
hypothesis value is still ∅, it is replaced by the attribute value.
Otherwise, if the attribute value differs from the hypothesis value, the
algorithm changes the hypothesis value to '?'.
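The two steps above can be sketched as one short function. This is an illustrative version under stated assumptions: the function name find_s, the positive label "Yes", and the toy data are chosen here, and None stands in for the all-∅ hypothesis.

```python
def find_s(examples, labels, positive="Yes"):
    """Return the most specific hypothesis covering all positive examples.

    Returns None when there is no positive example, which corresponds to the
    initial all-∅ hypothesis.
    """
    h = None  # stands for <∅, ∅, ..., ∅>
    for x, label in zip(examples, labels):
        if label != positive:
            continue  # negative examples are ignored
        if h is None:
            h = list(x)  # first positive example replaces every ∅
        else:
            # keep matching values, relax mismatches to "?"
            h = [hv if hv == xv else "?" for hv, xv in zip(h, x)]
    return h

# Toy data, made up for illustration only.
examples = [("Sunny", "Warm"), ("Rainy", "Cold"), ("Sunny", "Cold")]
labels = ["Yes", "No", "Yes"]
print(find_s(examples, labels))  # ['Sunny', '?']
```

Once a position holds "?", it never becomes specific again, so the hypothesis only ever moves from specific toward general.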
How Does It Work?

1. The process starts by initializing 'h' with the most specific hypothesis;
in practice this is usually the first positive example in the data set.
2. We go through each training example. If the example is negative, we
move on to the next example, but if it is positive we consider it in
the next step.
3.​ We will check if each attribute in the example is equal to the hypothesis
value.
4.​ If the value matches, then no changes are made.
5.​ If the value does not match, the value is changed to ‘?’.
6.​ We do this until we reach the last positive example in the data set.
Limitations of Find-S Algorithm
There are a few limitations of the Find-S algorithm, listed below:
1. There is no way to determine whether the final hypothesis is consistent
with all of the training data, because negative examples are never checked.
2. Inconsistent training sets (for example, ones with mislabeled examples)
can mislead the Find-S algorithm, since it ignores the negative examples.
3. The Find-S algorithm does not provide a backtracking technique to determine
the best possible changes that could be made to improve the resulting
hypothesis.
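Since Find-S never looks at negative examples, a consistency check has to be done separately. The sketch below shows one possible check; the helper names are assumptions, the rows come from the data set in the implementation section, and the extra negative example at the end is hypothetical.

```python
ANY = "?"

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

def covered_negatives(h, examples, labels, positive="Yes"):
    """Return the negative examples that h wrongly classifies as positive."""
    return [x for x, label in zip(examples, labels)
            if label != positive and covers(h, x)]

h = ("?", "Sunny", "?", "Yes", "?", "?")
examples = [
    ("Morning", "Sunny", "Warm", "Yes", "Mild", "Strong"),
    ("Evening", "Rainy", "Cold", "No", "Mild", "Normal"),
    ("Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal"),
    ("Evening", "Sunny", "Cold", "Yes", "High", "Strong"),
]
labels = ["Yes", "No", "Yes", "Yes"]
print(covered_negatives(h, examples, labels))  # []

# Hypothetical extra negative example (not in the source data) that h covers.
noisy_examples = examples + [("Evening", "Sunny", "Warm", "Yes", "Mild", "Normal")]
noisy_labels = labels + ["No"]
print(covered_negatives(h, noisy_examples, noisy_labels))
```

An empty list means the hypothesis is consistent with the negatives it was checked against; a non-empty list signals either noisy data or a target concept that this hypothesis form cannot express.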

Implementation of Find-S Algorithm

Time     Weather  Temperature  Company  Humidity  Wind    Goes
Morning  Sunny    Warm         Yes      Mild      Strong  Yes
Evening  Rainy    Cold         No       Mild      Normal  No
Morning  Sunny    Moderate     Yes      Normal    Normal  Yes
Evening  Sunny    Cold         Yes      High      Strong  Yes


Looking at the data set, we have six attributes and a final attribute, Goes,
that marks each example as positive or negative. In this case, Yes is a
positive example, which means the person will go for a walk.
We initialize the hypothesis with the first positive example, which gives the
most specific hypothesis that covers it:
h0 = {'Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong'}
Now we consider each example one by one, but only the positive examples. The
second row is negative and is skipped. The third row differs from h0 in
Temperature, Humidity and Wind, so those values become '?':
h1 = {'Morning', 'Sunny', '?', 'Yes', '?', '?'}
The fourth row differs from h1 in Time, so that value becomes '?' as well:
h2 = {'?', 'Sunny', '?', 'Yes', '?', '?'}
Replacing every differing value in the specific hypothesis gives the resulting
hypothesis h2.
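The trace above can be reproduced without any external files. The following sketch hard-codes the table rows and records the hypothesis after each positive example; the names DATA and find_s_trace are assumptions introduced for this illustration.

```python
# Rows of the table above; the last field is the Goes label.
DATA = [
    ("Morning", "Sunny", "Warm", "Yes", "Mild", "Strong", "Yes"),
    ("Evening", "Rainy", "Cold", "No", "Mild", "Normal", "No"),
    ("Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal", "Yes"),
    ("Evening", "Sunny", "Cold", "Yes", "High", "Strong", "Yes"),
]

def find_s_trace(rows, positive="Yes"):
    """Return the hypothesis after each positive example, in order."""
    trace = []
    h = None
    for *attrs, label in rows:
        if label != positive:
            continue  # negative rows leave the hypothesis unchanged
        if h is None:
            h = list(attrs)  # first positive example
        else:
            h = [hv if hv == xv else "?" for hv, xv in zip(h, attrs)]
        trace.append(tuple(h))
    return trace

for i, h in enumerate(find_s_trace(DATA)):
    print(f"h{i} = {h}")
# h0 = ('Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong')
# h1 = ('Morning', 'Sunny', '?', 'Yes', '?', '?')
# h2 = ('?', 'Sunny', '?', 'Yes', '?', '?')
```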

import pandas as pd
import numpy as np

# Read the training data from the CSV file
data = pd.read_csv("data.csv")
print(data, "\n")

# Array of attribute values (every column except the last)
d = np.array(data)[:, :-1]
print("\n The attributes are: ", d)

# Separate the target column that holds the positive/negative labels
target = np.array(data)[:, -1]
print("\n The target is: ", target)

# Training function implementing the Find-S algorithm
def train(c, t):
    # Start from the first positive example
    specific_hypothesis = None
    for i, val in enumerate(t):
        if val == "Yes":
            specific_hypothesis = c[i].copy()
            break

    # No positive example: nothing to generalize from
    if specific_hypothesis is None:
        return None

    # Generalize on every positive example; negative examples are skipped
    for i, val in enumerate(c):
        if t[i] == "Yes":
            for x in range(len(specific_hypothesis)):
                if val[x] != specific_hypothesis[x]:
                    specific_hypothesis[x] = '?'

    return specific_hypothesis

# Obtain the final hypothesis
print("\n The final hypothesis is:", train(d, target))
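Output (final hypothesis, assuming data.csv holds the four rows of the table
above):
['?' 'Sunny' '?' 'Yes' '?' '?']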
