Detailed Notes on Linear Discriminant Functions
1. Introduction
Linear Discriminant Functions (LDF) are used to classify data points by finding a linear decision
boundary that separates different classes. They are widely applied in pattern recognition, machine
learning, and statistical classification.
2. Mathematical Formulation
A linear discriminant function is given by:
g(x) = w^T * x + w_0
where:
- x is the feature vector (input data),
- w is the weight vector (learned parameters),
- w_0 is the bias term.
The decision boundary is the hyperplane where g(x) = 0. For two classes, classification is done as:
If g(x) > 0, assign Class 1
If g(x) < 0, assign Class 2
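The two-class decision rule above can be sketched in a few lines. The weight vector and bias below are illustrative values chosen by hand, not learned parameters:

```python
# Minimal sketch of the decision rule g(x) = w^T x + w_0.
# w and w_0 are illustrative values (assumed, not learned from data).

def g(x, w, w0):
    """Linear discriminant: dot product of w and x, plus the bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

w = [1.0, -2.0]   # example weight vector
w0 = 0.5          # example bias term

def classify(x):
    return "Class 1" if g(x, w, w0) > 0 else "Class 2"

print(classify([3.0, 1.0]))   # g = 3 - 2 + 0.5 = 1.5 > 0 -> Class 1
print(classify([0.0, 2.0]))   # g = 0 - 4 + 0.5 = -3.5 < 0 -> Class 2
```

Points with g(x) = 0 lie exactly on the decision boundary; in practice they can be assigned to either class by convention.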
3. Geometric Interpretation
The decision boundary is a hyperplane defined as:
w1*x1 + w2*x2 + ... + wn*xn + w0 = 0
For a 2D space, this represents a straight line, while for higher dimensions, it defines a hyperplane.
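A useful geometric fact follows directly from this form: the signed distance of a point x from the hyperplane is g(x) / ||w||. A small sketch, with made-up values of w and w0:

```python
import math

# Sketch: the signed distance of a point x from the hyperplane g(x) = 0
# is g(x) / ||w||. The values of w and w0 below are illustrative.

w = [3.0, 4.0]    # example weights; ||w|| = 5
w0 = -5.0

def signed_distance(x):
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    norm = math.sqrt(sum(wi * wi for wi in w))
    return g / norm

print(signed_distance([1.0, 2.0]))  # (3 + 8 - 5) / 5 = 1.2
```

The sign of the distance tells you which side of the hyperplane the point is on, which is exactly the classification rule from Section 2.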
4. Fisher's Linear Discriminant
Fisher's Linear Discriminant aims to maximize the separation between two classes by projecting data
onto a single axis. The optimal projection direction w (unique only up to a scale factor) is:
w = S_w^(-1) * (m1 - m2)
where:
- S_w is the within-class scatter matrix,
- m1 and m2 are the means of the two classes.
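The formula translates directly into NumPy. The two toy sample arrays below are made up for illustration; note that solving a linear system is preferred over forming S_w^(-1) explicitly:

```python
import numpy as np

# Sketch of Fisher's direction w = S_w^{-1} (m1 - m2) on toy 2-D data.
# X1 and X2 are illustrative samples, three per class.

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])  # class 1 samples
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])  # class 2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

# Within-class scatter: sum of the per-class scatter matrices.
S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# Solve S_w w = (m1 - m2) instead of inverting S_w explicitly.
w = np.linalg.solve(S_w, m1 - m2)
print(w)
```

Projecting both classes onto w should pull them apart: every projected class-1 sample ends up on one side of every projected class-2 sample for this toy data.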
5. Linear Discriminant Analysis (LDA)
LDA extends Fisher's Linear Discriminant to multiple classes. It assumes that each class follows a
Gaussian distribution with a shared covariance matrix. LDA finds a projection (with at most C - 1
useful directions for C classes) that maximizes class separability.
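One common way to compute the multi-class projection is via the eigenvectors of S_w^(-1) S_b, where S_b is the between-class scatter. A hedged sketch on synthetic three-class data (the class locations and sample counts are arbitrary choices):

```python
import numpy as np

# Sketch of multi-class LDA: projection directions are the leading
# eigenvectors of S_w^{-1} S_b. Toy 2-D data for three classes; the
# class centers and sample counts below are arbitrary.

rng = np.random.default_rng(0)
classes = [rng.normal(loc=c, scale=0.5, size=(20, 2))
           for c in ([0, 0], [4, 0], [0, 4])]

overall_mean = np.vstack(classes).mean(axis=0)
S_w = np.zeros((2, 2))   # within-class scatter
S_b = np.zeros((2, 2))   # between-class scatter
for X in classes:
    m = X.mean(axis=0)
    S_w += (X - m).T @ (X - m)
    d = (m - overall_mean).reshape(-1, 1)
    S_b += X.shape[0] * (d @ d.T)

# Eigen-decompose S_w^{-1} S_b; at most (num_classes - 1) directions
# carry between-class information.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order].real   # columns sorted by decreasing eigenvalue
print(W)
```

Data projected as X @ W then lives in a lower-dimensional space where a simple per-class rule (e.g. nearest projected mean) can classify.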
6. Training a Linear Discriminant Classifier
Training involves:
1. Computing the class means and the within-class scatter (covariance) matrix.
2. Finding the optimal projection w.
3. Transforming data using w.
4. Applying a threshold to classify new samples.
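The four steps above can be sketched end to end. As an assumption, the threshold here is placed at the midpoint of the two projected class means (other choices are possible); the data are illustrative:

```python
import numpy as np

# Sketch of the full training procedure with Fisher's direction and a
# midpoint threshold. X1 and X2 are illustrative toy samples.

X1 = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 3.0]])   # class 1
X2 = np.array([[6.0, 7.0], [7.0, 6.0], [8.0, 8.0]])   # class 2

# 1. Class means and within-class scatter.
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

# 2. Optimal projection direction.
w = np.linalg.solve(S_w, m1 - m2)

# 3. Project the class means onto w.
# 4. Threshold at the midpoint of the projected means (an assumption).
t = 0.5 * (w @ m1 + w @ m2)

def predict(x):
    return 1 if w @ x > t else 2

print(predict([1.5, 2.0]))   # sample near the class-1 mean
print(predict([7.0, 7.0]))   # sample near the class-2 mean
```

Note that w @ m1 > w @ m2 always holds here, since their difference equals (m1 - m2)^T S_w^(-1) (m1 - m2), which is positive when S_w is positive definite.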
7. Optimization and Regularization
To handle overfitting and improve generalization, techniques like ridge regression (L2 regularization)
and feature selection can be applied.
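One concrete regularization, sketched below under the assumption of a ridge-style shrinkage: adding lam * I to S_w keeps the linear solve stable when S_w is singular or ill-conditioned (e.g. more features than samples). The hyperparameter lam is hypothetical and would normally be chosen by cross-validation:

```python
import numpy as np

# Sketch: ridge-style shrinkage S_w + lam * I for a stable solve when
# S_w is singular. lam is a hypothetical hyperparameter.

def fisher_direction(X1, X2, lam=1e-3):
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_w = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    S_w_reg = S_w + lam * np.eye(S_w.shape[0])   # regularized scatter
    return np.linalg.solve(S_w_reg, m1 - m2)

# Two samples per class in 3-D: the unregularized S_w here is singular,
# so the plain formula w = S_w^{-1} (m1 - m2) would fail.
X1 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
X2 = np.array([[5.0, 5.0, 5.0], [6.0, 5.0, 5.0]])
print(fisher_direction(X1, X2))
```

Without the shrinkage term, np.linalg.solve would raise an error on this rank-deficient scatter matrix.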
8. Applications in Machine Learning and Pattern Recognition
- Face recognition
- Handwritten digit recognition
- Medical diagnostics
- Speech recognition
9. Comparison with Other Classifiers
- Perceptron: The learning algorithm is guaranteed to converge only when the data are linearly separable.
- Support Vector Machines: Maximizes the margin between classes.
- Neural Networks: Can model complex decision boundaries but require more data.
10. Case Studies and Real-World Examples
Example: Fisher's Linear Discriminant underlies the "Fisherfaces" approach to face recognition. It
reduces dimensionality while preserving discriminative information.