UNIT-2 Notes Part 2

Uploaded by ashishguptarnq

Concept Learning

Concept learning can be formulated as a problem of searching through a
predefined space of potential hypotheses for the hypothesis that best fits
the training examples.

OR

"A task of acquiring a potential hypothesis (solution) that best fits the
given training examples."
In both artificial intelligence (AI) and human learning, concept learning helps in identifying the common features of a set
of objects or situations, forming generalizations that can be used to predict and classify future instances.

Here are a few examples where it is used:


❖ In medical diagnosis, a doctor (or AI system) can learn the concept of a disease by understanding the symptoms (e.g.,
fever, cough, etc.). The learned concept helps make a quick and accurate diagnosis when presented with a new patient
showing similar symptoms.

❖ In email filtering, machine learning algorithms learn the concept of “spam” emails by analyzing features like certain
keywords, email structure, and sender information. Once the concept is learned, the system can automatically classify
incoming emails as spam.

❖ In weather prediction, concept learning can be used to identify patterns in weather data (temperature, humidity,
pressure, etc.). Once the concept of "storm" is learned, it can be used to predict when future storms are likely to occur
under similar conditions.

❖ When learning the concept of "furniture," you may be shown a few examples like chairs, tables, and sofas. Once the
concept is learned, you can recognize a new object as furniture even if you haven’t seen that specific object before.

Etc……
Concept Learning – Instance Space

Given that the attribute Sky has three possible values, and that AirTemp,
Humidity, Wind, Water, and Forecast each have two possible values, the
instance space X contains exactly 3 · 2 · 2 · 2 · 2 · 2 = 96 distinct
instances.
Enjoy Sport – Hypothesis Representation
• Each hypothesis consists of a conjunction of constraints on the instance attributes.

• Each hypothesis will be a vector of six constraints, specifying the values of the six
attributes : (Sky, AirTemp, Humidity, Wind, Water, and Forecast)

• Each attribute constraint will be one of:

  ? – indicating any value is acceptable for the attribute (don't care)
  single value – specifying a single required value (e.g., Warm) (specific)
  ø – indicating no value is acceptable for the attribute (no value)
Instance Space

Let's assume there are two features F1 and F2, where F1 has A and B as
possible values and F2 has X and Y as possible values.
• F1 –> A, B
• F2 –> X, Y
Instance Space: (A, X), (A, Y), (B, X), (B, Y) : 4 instances
Hypothesis Space including '?' and 'ø':
(A, X), (A, Y), (A, ø), (A, ?), (B, X), (B, Y), (B, ø), (B, ?), (ø, X), (ø, Y),
(ø, ø), (ø, ?), (?, X), (?, Y), (?, ø), (?, ?) : 16
Hypothesis Space including only '?': (A, X), (A, Y), (A, ?), (B, X), (B, Y),
(B, ?), (?, X), (?, Y), (?, ?) : 9
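The counts above can be reproduced by enumerating the spaces directly. A minimal sketch in Python (F1 and F2 are the hypothetical features from this example):

```python
from itertools import product

# Hypothetical two-feature example from the notes: F1 in {A, B}, F2 in {X, Y}.
F1 = ["A", "B"]
F2 = ["X", "Y"]

# Instance space: every combination of concrete attribute values.
instance_space = list(product(F1, F2))
print(len(instance_space))  # 4

# Syntactic hypothesis space: each attribute may also take '?' or 'ø'.
syntactic = list(product(F1 + ["?", "ø"], F2 + ["?", "ø"]))
print(len(syntactic))  # 16

# Hypothesis space with only '?' added (no 'ø'):
with_any = list(product(F1 + ["?"], F2 + ["?"]))
print(len(with_any))  # 9
```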
Hypothesis Representation

• A hypothesis:

  Sky AirTemp Humidity Wind Water Forecast
  <Sunny, ?, ?, Strong, ?, Same>

• The most general hypothesis – that every day is a positive example:
  <?, ?, ?, ?, ?, ?>

• The most specific hypothesis – that no day is a positive example:
  <ø, ø, ø, ø, ø, ø>
Enjoy Sport Concept Learning Task

Given

Instances X: set of all possible days, each described by the attributes
• Sky – (values: Sunny, Cloudy, Rainy)
• AirTemp – (values: Warm, Cold)
• Humidity – (values: Normal, High)
• Wind – (values: Strong, Weak)
• Water – (values: Warm, Cold)
• Forecast – (values: Same, Change)

Target Concept (Function) c : EnjoySport : X → {Yes, No}

Hypotheses H: Each hypothesis is described by a conjunction of constraints
on the attributes.

Training Examples D: positive and negative examples of the target function

Determine

A hypothesis h in H such that h(x) = c(x) for all x in D.
Enjoy Sport – Hypothesis Space

• Sky has 3 possible values, and the other 5 attributes have 2 possible
  values each.
• There are 96 (= 3·2·2·2·2·2) distinct instances in X.
• There are 5120 (= 5·4·4·4·4·4) syntactically distinct hypotheses in H,
  because each attribute admits two more values: ? and ø.
• Every hypothesis containing one or more ø symbols represents the empty set
  of instances; that is, it classifies every instance as negative.
• There are 973 (= 1 + 4·3·3·3·3·3) semantically distinct hypotheses in H.
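The three counts above can be checked numerically. A minimal sketch:

```python
# Counts quoted in the notes for the EnjoySport task.
value_counts = [3, 2, 2, 2, 2, 2]  # Sky has 3 values, the other 5 have 2

# Distinct instances: product of the attribute value counts.
instances = 1
for n in value_counts:
    instances *= n
print(instances)  # 96

# Syntactically distinct hypotheses: each attribute also admits '?' and 'ø'.
syntactic = 1
for n in value_counts:
    syntactic *= n + 2
print(syntactic)  # 5120

# Semantically distinct hypotheses: every hypothesis containing 'ø' denotes
# the same empty concept, so count ø-free hypotheses (values plus '?') and
# add 1 for the single empty hypothesis.
semantic = 1
for n in value_counts:
    semantic *= n + 1
semantic += 1
print(semantic)  # 973
```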
Find-S Algorithm

The FIND-S algorithm is a concept learning algorithm used in machine learning
and artificial intelligence to learn a concept (hypothesis) from a set of
training examples. It is primarily employed in "dichotomous" (binary) concept
learning, where the goal is to find the most specific hypothesis that is
consistent with the positive training examples (negative examples are
ignored).
Find-S Algorithm

The FIND-S algorithm works as follows:

1. Initialize the hypothesis to the most specific hypothesis possible:
   h = <ø, ø, ø, …>
2. Generalize the initial hypothesis to match the first positive instance.
3. For each subsequent instance:
   if it is a positive instance,
       compare each attribute value in the instance with the hypothesis h:
           if the attribute value is the same as the hypothesis value, do nothing;
           else change that attribute to '?' in the hypothesis h.
   else (negative instance)
       ignore it.
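The steps above can be sketched as a short Python function. The training data here is an assumed EnjoySport-style dataset used only for illustration:

```python
def find_s(examples):
    """Find-S: return the most specific hypothesis consistent with the
    positive examples. 'ø' is the empty constraint, '?' means any value."""
    n = len(examples[0][0])
    h = ["ø"] * n                    # most specific hypothesis
    for x, label in examples:
        if label != "Yes":           # negative examples are ignored
            continue
        for i, value in enumerate(x):
            if h[i] == "ø":          # first positive example: copy its values
                h[i] = value
            elif h[i] != value:      # mismatch: generalize to '?'
                h[i] = "?"
    return h

# Assumed dataset matching the classic EnjoySport example.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
print(find_s(D))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```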
Find-S Algorithm Example

Example (EnjoySport training data)

Solution:
h0 = <ø, ø, ø, ø, ø, ø>
h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
h2 = <Sunny, Warm, ?, Strong, Warm, Same>
h3 = <Sunny, Warm, ?, Strong, Warm, Same>  (negative example – hypothesis unchanged)
h4 = <Sunny, Warm, ?, Strong, ?, ?>
Example 2:

Solution: h = <≥9, Yes, ?, Good, ?, ?>

• Limited to Boolean Functions
Find-S • Cannot Handle Noise
Algorithm • Deterministic Output
Disadvantages • Overfitting
• Binary Attributes Only
Find-S Algorithm Disadvantages

1. Limited to Boolean Functions: The FIND-S algorithm can only learn binary (yes/no)
concepts, making it unsuitable for more complex, multi-class problems.
2. Cannot Handle Noise: It struggles with noisy data, as it tries to precisely match each example,
which can lead to incorrect generalizations.
3. Deterministic Output: It produces a single deterministic hypothesis, which may not capture the
full complexity of the concept space.
4. Overfitting: FIND-S can overfit the training data by becoming overly specific, which may not
generalize well to new, unseen examples.
5. Binary Attributes Only: It's designed for binary attribute values and may not work with
continuous or multi-valued attributes without modifications.
Candidate Elimination Algorithm

The Candidate Elimination Algorithm is a concept learning approach that


maintains a set of consistent hypotheses and refines them as it encounters
training examples. It narrows down the hypothesis space by eliminating
inconsistent hypotheses. It is used for binary concept learning and is often
applied in machine learning and artificial intelligence.
Candidate Elimination Algorithm
The Candidate Elimination Algorithm finds all describable hypotheses that are consistent
with the observed training examples. In order to define this algorithm precisely, we begin
with a few basic definitions. First, let us say that a hypothesis is consistent with the training
examples if it correctly classifies these examples.

Definition: A hypothesis h is consistent with a set of training examples D if and only if h(x) =
c(x) for each example (x, c(x)) in D.
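This definition translates directly into code. A minimal sketch, assuming hypotheses are tuples of constraints and training examples are (x, label) pairs:

```python
def classify(h, x):
    # h labels x positive iff every constraint is '?' or equals the
    # corresponding attribute value; a 'ø' constraint matches no value,
    # so any hypothesis containing 'ø' rejects every instance.
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, D):
    # h is consistent with D iff h(x) == c(x) for every example (x, c(x)).
    return all(classify(h, x) == (label == "Yes") for x, label in D)

# Illustrative check with an EnjoySport-style hypothesis and examples.
h = ("Sunny", "Warm", "?", "Strong", "?", "?")
D = [(("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
     (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No")]
print(consistent(h, D))  # True
```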
Candidate Elimination Algorithm
The Candidate Elimination Algorithm represents the set of all hypotheses consistent
with the observed training examples. This subset of all hypotheses is called the
version space with respect to the hypothesis space H and the training examples D,
because it contains all plausible versions of the target concept.

Definition: The version space, denoted VS_{H,D}, with respect to hypothesis
space H and training examples D, is the subset of hypotheses from H
consistent with the training examples in D.
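For a small hypothesis space, the version space can be computed by brute force: enumerate every syntactic hypothesis and keep those consistent with D. A sketch using the earlier two-feature example (F1 in {A, B}, F2 in {X, Y}); the training examples are assumed for illustration:

```python
from itertools import product

def classify(h, x):
    # '?' matches any value; 'ø' matches none, so it never equals v or '?'.
    return all(c == "?" or c == v for c, v in zip(h, x))

def version_space(attr_values, D):
    # Enumerate all syntactic hypotheses, keep the consistent ones.
    hypotheses = product(*[vals + ["?", "ø"] for vals in attr_values])
    return [h for h in hypotheses
            if all(classify(h, x) == (label == "Yes") for x, label in D)]

D = [(("A", "X"), "Yes"), (("B", "X"), "No")]
vs = version_space([["A", "B"], ["X", "Y"]], D)
print(vs)  # [('A', 'X'), ('A', '?')]
```

Only 2 of the 16 syntactic hypotheses survive: the observed examples rule out everything that either rejects (A, X) or accepts (B, X).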
Candidate Elimination Algorithm Example
