
Pattern Recognition

Lecturer

Prof. Dr. Khaled M. Mahr


email: [email protected]
Contents

⬥ Introduction to Patterns and Pattern Recognition
⬥ Clustering vs. Classification
⬥ Template Matching
⬥ Features
⬥ Classification
Object recognition and interpretation
⬥ Having segmented regions of interest out of an image, the next task
is to identify them. Identification is usually divided into two tasks:
evaluating suitable quantitative descriptors and then matching the
descriptors to known objects. In this lecture we will look at
matching the descriptors to known objects.
⬥ Sub topics
■ Pattern matching
■ Pattern classification
■ Decision theoretic methods
■ Correlation matching
■ Structural methods
Object recognition and interpretation

■ The principal goal of image analysis by computer is to endow a
machine with the capability to approximate, in some sense, the
analytical ability of humans.
■ Thus, an automated image analysis system (e.g. on a robot) should
be capable of exhibiting a degree of intelligence.
■ What do we mean by intelligence?
- The capability to learn from examples and to generalize this
knowledge so that it will apply in new and different
circumstances.
- The ability to make inferences from incomplete information.
(Gonzalez and Woods)
Pattern

A pattern is the opposite of chaos; it is an entity that can be given a name.
Recognition

⬥ Identification of a pattern as a member of a category
■ Classification (Supervised: known categories)
■ Clustering (Unsupervised: learning categories)

Classification vs. Clustering

[Figure: two panels, Classification (samples labeled Category “A” and Category “B”) and Clustering (the same kind of samples grouped without labels).]
Pattern Recognition

Given an input pattern, make a decision about the “category” or “class” of the pattern

Pattern Class

⬥ A collection of similar (not necessarily identical) objects
⬥ Intra-class variability
⬥ Inter-class similarity

Pattern Class
Intra-class variability

The letter “T” in different typefaces

Same face under different expression, pose, illumination
Pattern Class
Inter-class similarity

Characters that look similar

Identical twins

Object Recognition
■ We can define a pattern as an arrangement of descriptors (‘features’).
■ A pattern class is a set of patterns ω1, ω2, …, ωn which share a
common set of characteristics.
■ Object recognition can be described as the task of automatically
assigning a pattern to a pattern class which is labelled with the object
name.

Three common structures are used to define a pattern:


• Vectors (for quantitative descriptions)
• Strings (for structural descriptions)
• Trees (for structural descriptions)

Here we will concentrate on the descriptor vector.


Object Recognition Approaches

⬥ Statistical PR:
■ Pattern classes represented by statistical measures.
⬥ Structural (or Syntactic) PR:
■ Pattern classes represented by means of formal structures such as
grammars, automata, strings, etc.

Template Matching

The image is converted into a 12×12 bitmap.

Template Matching
The bitmap is represented by a 12×12 matrix or by a 144-element vector
with 0 and 1 coordinates.
0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 0 1 1 0 0 0
0 0 0 0 1 1 0 1 1 0 0 0
0 0 0 0 1 0 0 0 1 0 0 0
0 0 0 1 1 0 0 0 1 1 0 0
0 0 0 1 1 0 0 0 1 1 0 0
0 0 1 1 1 1 1 1 1 1 1 0
0 0 1 1 0 0 0 0 0 1 1 0
0 1 1 0 0 0 0 0 0 1 1 0
0 1 1 0 0 0 0 0 0 0 1 1
1 1 0 0 0 0 0 0 0 0 1 1

Template Matching
Training samples – templates with corresponding class:

Template of the image to be recognized:


?

Algorithm:

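A minimal sketch of one way the matching step can be implemented, assuming the 12×12 bitmaps are flattened to 144-element binary vectors as above and that closeness is measured by Hamming distance (all names are illustrative):

```python
import numpy as np

def nearest_template(x, templates, labels):
    """Classify a 144-element binary vector by the closest stored template,
    where closeness is the Hamming distance (number of differing pixels)."""
    x = np.asarray(x, dtype=np.uint8)
    best_label, best_dist = None, np.inf
    for template, label in zip(templates, labels):
        dist = np.count_nonzero(x != np.asarray(template, dtype=np.uint8))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Illustrative usage: two stored templates ("A", "B") and a slightly corrupted "A".
rng = np.random.default_rng(0)
templates = [rng.integers(0, 2, 144), rng.integers(0, 2, 144)]
labels = ["A", "B"]
unknown = templates[0].copy()
unknown[:5] ^= 1                             # flip a few pixels
print(nearest_template(unknown, templates, labels))   # -> A
```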
Template Matching
Number of templates to store:
If fewer templates are stored, some images might not
be recognized.

Improvements?

Features

⬥ Features are the individual measurable properties of the signal being observed.
⬥ The set of features used for pattern recognition is called the feature vector.
⬥ The number of features used is the dimensionality of the feature vector.
⬥ n-dimensional feature vectors can be represented as points in an n-dimensional feature space.

The Feature vector
Assuming we are using a set of descriptors x1, x2, …, xn to describe
a particular object, we can define the descriptor vector as x = (x1, x2, …, xn)T.

The use of discriminant analysis was pioneered by Fisher in 1936 and his test data on separating out three species of iris flowers is still a classic illustration.

In this example the vector has two elements: petal width and length.
Only the pattern class for Iris setosa is clearly discriminated.
Petal width versus length for 3 types of iris
Feature Vector

[Figure: samples from Class 1 and Class 2 plotted in a height-weight feature space.]
Features
⬥ Feature extraction aims to create discriminative features that are good for classification.
⬥ Good Features
• Objects from the same class have similar feature values.
• Objects from different classes have different values.

[Figure: “good” features give well-separated class distributions; “bad” features give overlapping ones.]
Features

⬥ Use fewer features if possible


⬥ Use features that differentiate classes well
⬥ Character recognition example (a small feature-extraction sketch follows below)
■ Good features: aspect ratio, presence of loops
■ Bad features: number of black pixels, number of connected components

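As a small illustration of measuring such features, the hypothetical sketch below computes two of the quantities mentioned above (bounding-box aspect ratio and normalized black-pixel count) from a binary character bitmap:

```python
import numpy as np

def character_features(bitmap):
    """Return [aspect ratio, ink fraction] for a binary character bitmap.
    The aspect ratio is listed above as a 'good' feature, the black-pixel
    count (here normalised) as a 'bad' one."""
    bitmap = np.asarray(bitmap, dtype=bool)
    rows = np.flatnonzero(bitmap.any(axis=1))      # rows containing ink
    cols = np.flatnonzero(bitmap.any(axis=0))      # columns containing ink
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    aspect_ratio = width / height
    ink_fraction = bitmap.mean()                   # black pixels / total pixels
    return np.array([aspect_ratio, ink_fraction])

# Illustrative usage: a crude 5x5 vertical bar (an 'I'-like glyph).
glyph = np.zeros((5, 5), dtype=int)
glyph[:, 2] = 1
print(character_features(glyph))                   # -> [0.2 0.2]
```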
Classifier

Identify the class of a given pattern

Classifier
Distance between Feature Vectors
⬥ Instead of finding a template exactly matching the input pattern, look
at how close the feature vectors are
⬥ Nearest neighbor classification algorithm:
1. Find the template closest to the input pattern.
2. Classify the pattern to the same class as the closest template.

Classifier
K-Nearest Neighbor Classifier
⬥ Use the k nearest neighbors instead of 1 to classify the pattern.

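A minimal sketch of the k-nearest-neighbor rule, assuming training feature vectors are stored row-wise in a NumPy array (data and names are illustrative):

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=3):
    """Assign x to the class that is most common among its k nearest
    training samples (Euclidean distance in feature space)."""
    dists = np.linalg.norm(train_X - x, axis=1)      # distance to every sample
    nearest = np.argsort(dists)[:k]                  # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Illustrative usage with 2-D feature vectors from two classes.
train_X = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]])
train_y = ["class 1", "class 1", "class 2", "class 2"]
print(knn_classify(np.array([1.2, 0.8]), train_X, train_y, k=3))  # -> class 1
```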
Classifier
A classifier partitions feature space X into class-labeled regions X1, X2, …, XM such that:

X = X1 ∪ X2 ∪ … ∪ XM   and   Xi ∩ Xj = ∅ for i ≠ j

The classification consists of determining to which region a feature vector x belongs.
Borders between regions are called decision boundaries.
A (Simplified) PR System
Two modes:

Classification mode: test pattern → preprocessing → feature measurement → classification

Training mode: training pattern → preprocessing → feature extraction/selection → learning
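A skeletal sketch of the two modes, with placeholder functions standing in for the preprocessing, feature extraction, learning, and classification blocks of the diagram (all names are illustrative):

```python
def preprocess(pattern):
    """Noise removal / normalisation (placeholder: identity)."""
    return pattern

def extract_features(pattern):
    """Feature measurement / extraction (placeholder: identity)."""
    return pattern

def train(training_patterns, training_labels, learn):
    """Training mode: preprocessing -> feature extraction/selection -> learning."""
    features = [extract_features(preprocess(p)) for p in training_patterns]
    return learn(features, training_labels)          # 'learn' returns a classifier

def classify(test_pattern, classifier):
    """Classification mode: preprocessing -> feature measurement -> classification."""
    return classifier(extract_features(preprocess(test_pattern)))

# Illustrative usage with a trivial one-nearest-neighbour 'learn' rule on 1-D features.
def learn_nearest(features, labels):
    def classifier(x):
        return min(zip(features, labels), key=lambda fl: abs(fl[0] - x))[1]
    return classifier

model = train([1.0, 2.0, 8.0, 9.0], ["A", "A", "B", "B"], learn_nearest)
print(classify(8.4, model))                          # -> B
```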
Classification - Example
⬥ Sorting incoming fish on a conveyor according to species using optical sensing

⬥ Salmon or Sea Bass?

Classification - Example
⬥ Set up a camera and take some sample images to
extract features
■ Length
■ Lightness
■ Width
■ Number and shape of fins
■ Position of the mouth, etc…

Classification - Example

The length is a poor feature alone!

Select the lightness as a possible feature.

Classification - Example

⬥ Customers do not want sea bass in their cans of salmon

Classification - Example
⬥ Decision boundary threshold and cost relationship
⬥ Move our decision boundary toward smaller values of lightness in order to
minimize the cost (reduce the number of sea bass that are classified as
salmon!); a small sketch of this idea follows below

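A small sketch of that idea: sweep the lightness threshold and keep the value that minimizes a cost in which a sea bass classified as salmon is penalized more heavily than the reverse error. The measurements, costs, and threshold grid below are illustrative assumptions:

```python
import numpy as np

# Illustrative lightness measurements (sea bass tend to be lighter here).
salmon_lightness = np.array([2.0, 3.1, 3.5, 4.0, 4.6])
seabass_lightness = np.array([4.2, 5.0, 5.5, 6.1, 7.0])

COST_BASS_AS_SALMON = 5.0   # customers finding sea bass in salmon cans
COST_SALMON_AS_BASS = 1.0   # less serious error

def total_cost(threshold):
    """Classify 'salmon' if lightness < threshold, else 'sea bass'."""
    bass_as_salmon = np.count_nonzero(seabass_lightness < threshold)
    salmon_as_bass = np.count_nonzero(salmon_lightness >= threshold)
    return (COST_BASS_AS_SALMON * bass_as_salmon
            + COST_SALMON_AS_BASS * salmon_as_bass)

thresholds = np.linspace(1.0, 8.0, 71)
best = thresholds[np.argmin([total_cost(t) for t in thresholds])]
print(f"chosen lightness threshold: {best:.1f}")
```

With the heavier penalty on sea bass errors, the chosen threshold sits at a smaller lightness value than the unweighted optimum, as the bullet above describes.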
Classification - Example

FISH: x = [x1, x2] = [lightness, width]
Classification - Example
⬥ We might add other features that are not correlated with the ones we
already have. A precaution should be taken not to reduce the performance
by adding such “noisy features”.
⬥ Ideally, the best decision boundary should be the one which provides
optimal performance.

Classification - Example

Generalization?

Decision-Theoretic methods

■ The main problem is: how do we use such pattern arrangements in the
recognition stage of our image analysis system?
■ Decision-theoretic approaches to recognition are based on the use of
decision (or discriminant) functions.
■ If x = (x1, x2, …, xn)T is an n-dimensional pattern vector, then for M
pattern classes ω1, ω2, …, ωM the basic problem in decision-theoretic
pattern recognition is to find M decision functions d1(x), d2(x), …, dM(x)
with the property that if pattern x belongs to class ωi then:
di(x) > dj(x)   j = 1, 2, …, M; i ≠ j
■ The decision boundary separating class ωi from ωj is given by the
values of x for which di(x) = dj(x), i.e. di(x) − dj(x) = 0
■ So how do we find the required decision function?
Pattern matching
• To assign a specific vector to a particular class we need to define a
distance classifier.
• One approach (but not the only one) is to define the prototype of a
class by the mean vector representing all the instances of the class:

mj = (1/Nj) Σ x   (sum over all x in class ωj),   j = 1, 2, …, W

where Nj is the number of pattern vectors in class ωj and W is the number of classes

For an unknown vector x we can assign it to the class of its closest
prototype. Often we interpret ‘closest’ as the minimum Euclidean distance:

Dj(x) = ||x − mj||,   j = 1, 2, …, W
Evaluating the Euclidean decision boundary
For any vector a, ||a|| = (aTa)^(1/2), so

Dj(x) = ||x − mj|| = [(x − mj)T(x − mj)]^(1/2)

Since Dj(x) > 0 we can ignore the square root factor, then

Dj(x)² = xTx − 2(xTmj − ½ mjTmj)

Note that the first term is independent of ‘j’ and so choosing the smallest
value of ‘D’ is equivalent to choosing the largest value of the second term,

dj(x) = xTmj − ½ mjTmj

which is normally taken as the decision variable.
The decision boundary

If the above function records a maximum for a particular ‘j’ then we can
assign the vector x to class ωj. If we have only two classes ‘j’ and ‘k’ we
can describe the boundary of minimum distance between the classes as:

djk(x) = dj(x) − dk(x) = xT(mj − mk) − ½ (mj − mk)T(mj + mk) = 0
For two-element vectors the decision boundary is a line, for three-element
vectors it is a plane, and for more than three elements it is a hyperplane.
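A minimal sketch of a minimum distance classifier built on the decision variable dj(x) = xTmj − ½ mjTmj derived above (the example means and test vectors are illustrative):

```python
import numpy as np

def min_distance_classify(x, means):
    """Assign x to the class whose mean gives the largest decision value
    d_j(x) = x.m_j - 0.5*m_j.m_j, i.e. the class of the nearest prototype."""
    x = np.asarray(x, dtype=float)
    scores = [x @ m - 0.5 * (m @ m) for m in means]
    return int(np.argmax(scores))

# Illustrative usage with two hypothetical 2-D class means.
means = [np.array([1.0, 1.0]), np.array([4.0, 5.0])]
print(min_distance_classify([1.2, 0.9], means))   # -> 0 (nearest to the first mean)
print(min_distance_classify([3.8, 4.7], means))   # -> 1 (nearest to the second mean)
```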
Example of calculating a decision boundary
In the original data of Fisher, if the two classes Iris versicolor (ω1)
and Iris setosa (ω2) have class mean vectors of approximately
m1 = (4.3, 1.3)T and m2 = (1.5, 0.3)T, show that the equation of the
decision boundary line is:

2.8x1 + 1.0x2 – 8.9 = 0


Solution

d1(x) = xTm1 − ½ m1Tm1 = 4.3x1 + 1.3x2 − 10.1

Similarly:

d2(x) = xTm2 − ½ m2Tm2 = 1.5x1 + 0.3x2 − 1.2

Hence:

d12(x) = d1(x) − d2(x) = 2.8x1 + 1.0x2 − 8.9 = 0
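As a quick numerical check of this result (a sketch, using the approximate class means assumed above):

```python
import numpy as np

# Approximate class means assumed above (Iris versicolor and Iris setosa).
m1 = np.array([4.3, 1.3])
m2 = np.array([1.5, 0.3])

# Boundary d12(x) = x.(m1 - m2) - 0.5*(m1 - m2).(m1 + m2) = 0
w = m1 - m2                              # coefficients of x1, x2: [2.8, 1.0]
b = -0.5 * np.dot(m1 - m2, m1 + m2)      # constant term: about -8.9
print(w, round(b, 2))                    # -> [2.8 1. ] -8.92
```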
Graphical representation
[Plot: the two class means, at approximately (4.3, 1.3) and (1.5, 0.3), with the decision boundary line 2.8x1 + 1.0x2 − 8.9 = 0 between them.]
The minimum distance classifier works well when the distance between
means is large compared to the spread of each class with respect to its mean.
Correlation matching

■ In some circumstances, if we have available a copy of the object(s)
we are attempting to locate within an image, we can bypass the use
of descriptors and attempt a direct correlation matching.
■ For example consider the task of finding all the occurrences of the
letter ‘a’ in the following image with the available standard mask.
[Figure: the image and the standard mask of the letter ‘a’.]

An obvious approach would be to pass the mask over the image looking for points at which the fit is best.
Correlation of f(x,y) and w(x,y) at point (s,t)

[Figure: the image f(x,y), of size M × N, with the image origin at its top-left corner; the subimage w(x − s, y − t) is positioned at (s,t).]
Notes about previous figure
■ The origin of subimage w(x,y) is at its centre.
■ In its simplest form, the correlation between f(x,y) and w(x,y) is
given by:

c(s,t) = Σx Σy f(x,y) w(x − s, y − t)

where the summation is taken over the image region where w and f overlap.
■ As s and t are varied, w(x, y) moves around the image area giving one
value for c(s, t) for every position (s, t).
■ The maximum value of c(s, t) indicates the position where w(x, y) best
matches f(x, y).
■ Disadvantage: sensitive to changes in amplitude of f(x, y) and w(x, y).
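A deliberately naive sketch of this search, sliding the mask over the image and recording the offset at which c(s,t) is largest; f and w are assumed to be 2-D NumPy arrays, and the planted-mask test data are illustrative:

```python
import numpy as np

def correlation_match(f, w):
    """Slide mask w over image f, compute c(s,t) = sum f(x,y)*w(x-s,y-t)
    at every valid offset, and return the offset of the best match."""
    M, N = f.shape
    J, K = w.shape
    best_pos, best_c = None, -np.inf
    for s in range(M - J + 1):                 # all positions where w fits in f
        for t in range(N - K + 1):
            c = np.sum(f[s:s + J, t:t + K] * w)
            if c > best_c:
                best_pos, best_c = (s, t), c
    return best_pos, best_c

# Illustrative usage: plant a copy of the mask inside a small random image.
rng = np.random.default_rng(0)
f = rng.random((20, 20))
w = rng.random((4, 4)) + 1.0                   # bright mask so the true spot dominates
f[8:12, 5:9] = w
print(correlation_match(f, w)[0])              # -> (8, 5)
```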
Definition of correlation coefficient
An approach frequently used to overcome this disadvantage is to
perform matching via the correlation coefficient, given by:

γ(s,t) = Σx Σy [f(x,y) − f̄(x,y)][w(x − s, y − t) − w̄] / { Σx Σy [f(x,y) − f̄(x,y)]² Σx Σy [w(x − s, y − t) − w̄]² }^(1/2)

where w̄ is the average value of the mask (computed only once) and f̄(x,y) is the
average value of f in the region coincident with the current location of w. The
coefficient γ(s,t) is scaled to the range [−1, 1], independent of amplitude
changes in f and w.
Note that when using this function both matrices must be the same size;
this can be achieved by padding out the w array with an appropriate
number of zeros.
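A sketch of matching with the correlation coefficient under the same assumptions as the earlier correlation sketch, removing the mask mean once and the local image mean for every window; the test data are again illustrative:

```python
import numpy as np

def correlation_coefficient_match(f, w):
    """Return gamma(s,t) for every valid offset: zero-mean, normalised
    correlation, insensitive to amplitude changes in f and w."""
    J, K = w.shape
    w0 = w - w.mean()                                  # mask mean, removed once
    out = np.zeros((f.shape[0] - J + 1, f.shape[1] - K + 1))
    for s in range(out.shape[0]):
        for t in range(out.shape[1]):
            patch = f[s:s + J, t:t + K]
            p0 = patch - patch.mean()                  # local image mean removed
            denom = np.sqrt(np.sum(p0 ** 2) * np.sum(w0 ** 2))
            out[s, t] = np.sum(p0 * w0) / denom if denom > 0 else 0.0
    return out

# Illustrative test: an amplitude-scaled copy of the mask is still found.
rng = np.random.default_rng(1)
f = rng.random((20, 20))
w = rng.random((4, 4))
f[8:12, 5:9] = 3.0 * w + 1.0                           # brighter, offset copy
gamma = correlation_coefficient_match(f, w)
best = np.unravel_index(np.argmax(gamma), gamma.shape)
print(best)                                            # best offset: (8, 5)
```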
