0% found this document useful (1 vote)
440 views20 pages

PR - L1-Introduction To Pattern Recognition PDF

Uploaded by

cooldoubtless
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (1 vote)
440 views20 pages

PR - L1-Introduction To Pattern Recognition PDF

Uploaded by

cooldoubtless
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

LECTURE 1: Course introduction

g Introduction
n Course organization
n Grading policy
n Outline
g What is Pattern Recognition?
n Definitions from the literature
n Related fields and applications
g Components of a Pattern Recognition system
n Pattern Recognition problems
n Features and Patterns
n The Pattern Recognition design cycle
g Pattern Recognition approaches
n Statistical
n Neural
n Structural

Introduction to Pattern Analysis 1


Ricardo Gutierrez-Osuna
Texas A&M University
Course organization
g Instructor
n Ricardo Gutierrez-Osuna
n Office: 520A HRRB
n Tel: (979) 845-2942
n E-mail: [email protected]
n URL: http://faculty.cs.tamu.edu/rgutier/
g Grading
n Homework
g 3 assignments, every 3 weeks
Weight (%)
n Tests
Homework 30
g 1 midterm, 1 final (comprehensive)
Project 40
n Term project Midterm 15
g Open-ended Final Exam 15
g Public presentation

Introduction to Pattern Analysis 2


Ricardo Gutierrez-Osuna
Texas A&M University
Course outline
g Introduction (3 lectures)
n Pattern Recognition
n Probability and Statistics
n MATLAB and Linear Algebra
g Statistical Pattern Recognition (7 lectures)
n Bayesian Decision Theory
n Quadratic Classifiers
n Parameter and Density Estimation
n Nearest Neighbors
n Linear Discriminants
n Validation
g Dimensionality Reduction (4 lectures)
n Principal Components Analysis
n Fishers Discriminants Analysis
n Feature Subset Selection
g Clustering (3 lectures)
n Mixture models
n Hierarchical Clustering
n On-line Clustering
n Self-Organizing Maps
g Neural network approaches (4 lectures)
n Multilayer Perceptrons
n Radial Basis Functions
n Associative Memories
g Advanced topics (5 lectures)
n Support Vector Machines
n Hidden Markov Models
n Ensemble Learning

Introduction to Pattern Analysis 3


Ricardo Gutierrez-Osuna
Texas A&M University
What is pattern recognition?
g Definitions from the literature
n The assignment of a physical object or event to one of several
pre-specified categories Duda and Hart
n A problem of estimating density functions in a high-dimensional
space and dividing the space into the regions of categories or
classes Fukunaga
n Given some examples of complex signals and the correct
decisions for them, make decisions automatically for a stream of
future examples Ripley
n The science that concerns the description or classification
(recognition) of measurements Schalkoff
n The process of giving names to observations x, Schrmann
n Pattern Recognition is concerned with answering the question
What is this? Morse

Introduction to Pattern Analysis 4


Ricardo Gutierrez-Osuna
Texas A&M University
Examples of pattern recognition problems
g Machine vision
n Visual inspection, ATR
n Imaging device detects ground target
n Classification into friend or foe
g Character recognition
n Automated mail sorting, processing bank
checks
n Scanner captures an image of the text
n Image is converted into constituent
characters
g Computer aided diagnosis
n Medical imaging, EEG, ECG signal
analysis
n Designed to assist (not replace)
physicians
n Example: X-ray mammography
g 10-30% false negatives in x-ray
mammograms
g 2/3 of these could be prevented with
proper analysis
g Speech recognition
n Human Computer Interaction, Universal 0.2

Access 0.1

n Microphone records acoustic signal 0

-0.1
n Speech signal is classified into phonemes -0.2
and/or words -0.3
1 2 3 4 5 x 104 samples

Introduction to Pattern Analysis 5


Ricardo Gutierrez-Osuna
Texas A&M University
Related fields and application areas for PR
g Related fields g Applications
n Adaptive Signal Processing n Image Preprocessing / Segmentation
n Machine Learning n Computer Vision
n Artificial Neural Networks n Speech Recognition
n Robotics and Vision n Automated Target Recognition
n Cognitive Sciences n Optical Character Recognition
n Mathematical Statistics n Seismic Analysis
n Nonlinear Optimization n Man and Machine Diagnostics
n Exploratory Data Analysis n Fingerprint Identification
n Fuzzy and Genetic systems n Industrial Inspection
n Detection and Estimation Theory n Financial Forecast
n Formal Languages n Medical Diagnosis
n Structural Modeling n ECG Signal Analysis
n Biological Cybernetics
n Computational Neuroscience

Introduction to Pattern Analysis 6


Ricardo Gutierrez-Osuna
Texas A&M University
Components of a pattern recognition system
g A basic pattern classification system contains
n A sensor
n A preprocessing mechanism
n A feature extraction mechanism (manual or automated)
n A classification algorithm
n A set of examples (training set) already classified or described

Measuring Dimensionality Model


Preprocessing reduction Prediction selection
devices

r r r f2
u=v v
The Analysis
real world R results
R0 f1

Sensors Feature selection Cross-validation


Cameras Feature projection Bootstrap
Databases
Noise filtering Classification
Feature extraction Regression
Normalization Clustering
Description

Introduction to Pattern Analysis 7


Ricardo Gutierrez-Osuna
Texas A&M University
Types of prediction problems
g Classification
n The PR problem of assigning an object to a class
n The output of the PR system is an integer label
g e.g. classifying a product as good or bad in a quality control test
g Regression
n A generalization of a classification task
n The output of the PR system is a real-valued number
g e.g. predicting the share value of a firm based on past performance and
stock market indicators
g Clustering
n The problem of organizing objects into meaningful groups
n The system returns a (sometimes hierarchical) grouping of objects
g e.g. organizing life forms into a taxonomy of species
g Description
n The problem of representing an object in terms of a series of primitives
n The PR system produces a structural or linguistic description
g e.g. labeling an ECG signal in terms of P, QRS and T complexes

Introduction to Pattern Analysis 8


Ricardo Gutierrez-Osuna
Texas A&M University
Features and patterns (1)
g Feature
n Feature is any distinctive aspect, quality or characteristic
g Features may be symbolic (i.e., color) or numeric (i.e., height)
n Definitions
g The combination of d features is represented as a d-dimensional column vector called a feature vector
g The d-dimensional space defined by the feature vector is called the feature space
g Objects are represented as points in feature space. This representation is called a scatter plot

x3

Feature 2
Class 1

x1 Class 3
x
x
x = 2


x d
x1 x2
Class 2

Feature 1

Feature vector Feature space (3D) Scatter plot (2D)


g Pattern
n Pattern is a composite of traits or features characteristic of an individual
n In classification tasks, a pattern is a pair of variables {x,} where
g x is a collection of observations or features (feature vector)
g is the concept behind the observation (label)

Introduction to Pattern Analysis 9


Ricardo Gutierrez-Osuna
Texas A&M University
Features and patterns (2)
g What makes a good feature vector?
n The quality of a feature vector is related to its ability to discriminate examples
from different classes
g Examples from the same class should have similar feature values
g Examples from different classes have different feature values

Good features Bad features

g More feature properties

Linear separability Non-linear separability Highly correlated features Multi-modal

Introduction to Pattern Analysis 10


Ricardo Gutierrez-Osuna
Texas A&M University
Classifiers
g The task of a classifier is to partition feature
space into class-labeled decision regions R1
R1
n Borders between decision regions are called R2 R3
decision boundaries
R3 R2
n The classification of feature vector x consists of R4
determining which decision region it belongs to,
and assign x to this class

g A classifier can be represented as a set of discriminant functions


n The classifier assigns a feature vector x to class if gi ( x ) > g j ( x ) j i

Class assignment

Select
Select max
max

Costs
Costs

Discriminant functions gg11(x)


(x) gg22(x)
(x) ggCC(x)
(x)

Features xx11 xx22 xx33 xxdd

Introduction to Pattern Analysis 11


Ricardo Gutierrez-Osuna
Texas A&M University
Pattern recognition approaches
g Statistical (StatPR)
n Patterns classified based on an underlying statistical model of the features
g The statistical model is defined by a family of class-conditional probability density
functions Pr(x|ci) (Probability of feature vector x given class ci)
g Neural (NeurPR)
n Classification is based on the response of a network of processing units
(neurons) to an input stimuli (pattern)
g Knowledge is stored in the connectivity and strength of the synaptic weights
n NeurPR is a trainable, non-algorithmic, black-box strategy
n NeurPR is very attractive since
g it requires minimum a priori knowledge
g with enough layers and neurons, an ANN can create any complex decision region
g Syntactic (SyntPR)
n Patterns classified based on measures of structural similarity
g Knowledge is represented by means of formal grammars or relational descriptions
(graphs)
n SyntPR is used not only for classification, but also for description
g Typically, SyntPR approaches formulate hierarchical descriptions of complex patterns
built up from simpler sub patterns

Introduction to Pattern Analysis 12


Ricardo Gutierrez-Osuna
Texas A&M University
Example: neural, statistical and structural OCR

A
From [Schalkoff, 1992]

Neural* Statistical Structural

Feature extraction:
# intersections
# right oblique lines +
# left oblique lines
# horizontal lines
# holes
+
0 Probabilistic
x 2 = [3 2 1 2 1] model
T
1 p(x 2 |" A" )

1
+
P(f1, f2| i)
0
0
x 1=
1
M

0 To
1 x3 =
parser
Fe


a
tu

0 *Neural approaches may also


re

e #1
#2

employ feature extraction Featur

Introduction to Pattern Analysis 13


Ricardo Gutierrez-Osuna
Texas A&M University
A simple pattern recognition problem
g Consider the problem of recognizing the letters L,P,O,E,Q
n Determine a sufficient set of features
n Design a tree-structured classifier

Start

Features YES NO
V>0?
Character Vertical Horizontal Oblique
Curved
straight straight straight
lines
lines lines lines
L 1 1 0 0 YES
C>0? YES NO
P 1 0 0 1 O>0?
O 0 0 0 1
E 1 3 0 0 YES NO
P H>0? Q O
Q 0 0 1 1

E L

Introduction to Pattern Analysis 14


Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (1)
g Data collection
n Probably the most time-intensive component of a PR project
n How many examples are enough?
g Feature choice
n Critical to the success of the PR problem
g Garbage in, garbage out
n Requires basic prior knowledge
g Model choice
n Statistical, neural and structural approaches
n Parameter settings
g Training
n Given a feature set and a blank model, adapt the model to explain the data
n Supervised, unsupervised and reinforcement learning
g Evaluation
n How well does the trained model do?
n Overfitting vs. generalization

Introduction to Pattern Analysis 15


Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (2)
g Consider the following scenario
n A fish processing plan wants to automate the process of sorting incoming fish
according to species (salmon or sea bass)
n The automation system consists of
g a conveyor belt for incoming products
g two conveyor belts for sorted products
g a pick-and-place robotic arm
g a vision system with an overhead CCD camera
g a computer to analyze images and control the robot arm

CCD
camera
Conveyor
belt (salmon)

computer
Conveyor
belt

Robot
arm Conveyor
belt (bass)

From [Duda, Hart and Stork, 2001]


Introduction to Pattern Analysis 16
Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (3)
g Sensor
n The vision system captures an image as a new fish enters the sorting area
g Preprocessing
n Image processing algorithms
g adjustments for average intensity levels
g segmentation to separate fish from background
g Feature Extraction
n Suppose we know that, on the average, sea bass is larger than salmon
g From the segmented image we estimate the length of the fish

g Classification
n Collect a set of examples from both species count
Decision
boundary

n Compute the distribution of lengths for both


classes
n Determine a decision boundary (threshold) Salmon Sea bass

that minimizes the classification error


n We estimate the classifiers probability of
error and obtain a discouraging result of 40% length
n What do we do now?

Introduction to Pattern Analysis 17


Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (4)
g Improving the performance of our PR system
n Determined to achieve a recognition rate of 95%, we try a number of features
g Width, Area, Position of the eyes w.r.t. mouth...
g only to find out that these features contain no discriminatory information
n Finally we find a good feature: average intensity of the scales

Decision
count boundary

Sea bass Salmon

Avg. scale intensity

length
Decision
boundary
n We combine length and average
intensity of the scales to improve class
separability
n We compute a linear discriminant function
to separate the two classes, and obtain a
classification rate of 95.7% Sea bass Salmon

Avg. scale intensity

Introduction to Pattern Analysis 18


Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (5)
g Cost Versus Classification rate
n Our linear classifier was designed to minimize the overall misclassification rate
n Is this the best objective function for our fish processing plant?
g The cost of misclassifying salmon as sea bass is that the end customer will
occasionally find a tasty piece of salmon when he purchases sea bass
g The cost of misclassifying sea bass as salmon is an end customer upset when he
finds a piece of sea bass purchased at the price of salmon
n Intuitively, we could adjust the decision boundary to minimize this cost function

New

length
length

Decision
boundary Decision
boundary

Sea bass Salmon Sea bass


Salmon
Avg. scale intensity Avg. scale intensity

Introduction to Pattern Analysis 19


Ricardo Gutierrez-Osuna
Texas A&M University
The pattern recognition design cycle (6)
g The issue of generalization
n The recognition rate of our linear classifier (95.7%) met the design specs, but we
still think we can improve the performance of the system
g We then design an artificial neural network with five hidden layers, a combination of
logistic and hyperbolic tangent activation functions, train it with the Levenberg-Marquardt
algorithm and obtain an impressive classification rate of 99.9975% with the following
decision boundary

length

Sea bass Salmon

Avg. scale intensity

n Satisfied with our classifier, we integrate the system and deploy it to the fish
processing plant
g After a few days, the plant manager calls to complain that the system is misclassifying an
average of 25% of the fish
g What went wrong?

Introduction to Pattern Analysis 20


Ricardo Gutierrez-Osuna
Texas A&M University

You might also like