Machine Learning
(ICS – 235)
Evolution of Machine Learning Inspired by Biological Learning
•Biological Inspiration:
• Machine Learning (ML) draws significant inspiration from how
living organisms learn and adapt to their environments.
• Early ML algorithms were modeled after neural processes
observed in the human brain.
•Neural Networks:
• Conceptualized based on the structure and function of
biological neurons.
• The foundation for deep learning, mirroring the brain's layered
architecture for processing information.
Human Perception
• Humans have developed highly sophisticated skills for
sensing their environment and taking actions according to
what they observe, e.g.,
• Recognizing a face (a priori knowledge).
• Understanding spoken words.
• Reading handwriting.
• Distinguishing fresh food by its smell.
We would like to give similar capabilities to machines.
Human Perception
• Humans learn from past experience.
• Patterns can be recognized based on that past experience.
[Images: Temple @ Chicago, USA; Temple @ Udupi; Taj Mahal @ Agra]
Cognitive Psychology
• Theoretical orientation emphasizing mental structures and processes.
• How sensory information is acquired, stored, transformed, and used.
• Mental activity for the acquisition, storage, transformation, and use of
knowledge.
Human and Machine Perception
• When we develop pattern recognition algorithms, we are often
influenced by our knowledge of how patterns are modeled and
recognized in nature.
• Pattern - a complex composition of sensory stimuli that a human
observer may recognize.
• Issue - which cognitive mechanisms need to be inferred to describe
this process of recognition?
Machine Learning?
• Machine Learning is the study of how machines can:
• represent the environment,
• learn to distinguish patterns of interest,
• make reasonable decisions about the categories of the
patterns.
Definition:
• Machine Learning (ML) is a subset of artificial
intelligence (AI) focused on developing algorithms
and statistical models that enable computers to
perform tasks without explicit instructions.
• ML systems learn from and make predictions or
decisions based on data.
• Data is extracted from an object.
Introduction
• To a machine, a physical object is an abstract notion.
• It can be represented by a set of descriptors:
• Example: a man is a pattern/object: color, height, weight, etc.
• A ball is a pattern/object: shape, color, size, etc.
Introduction
• The abstraction of a pattern takes the form of a feature set.
• A pattern is represented as a vector of feature values.
• The features used to represent patterns are important.
• But how do we choose the most discriminative features to
create the data abstraction?
Example:
• Let us take an example where humans are categorized into two
groups, "tall" and "short".
• The classes are represented using the feature "weight", as shown in
the table:

Weight of Human (in kg)    Class Label
40                         Tall
50                         Short
60                         Tall
70                         Short

• If a newly encountered person weighs 46 kg, it is difficult to make a
proper decision (see the sketch below).
• This attribute may not represent the data appropriately.
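A minimal sketch in Python (using only the slide's toy table) showing that no simple threshold on weight separates the two classes, which is why the 46 kg query is ambiguous:

# Toy data from the slide: the labels alternate as weight grows,
# so "weight" alone cannot discriminate "Tall" from "Short".
weights = [40, 50, 60, 70]                     # kg
labels  = ["Tall", "Short", "Tall", "Short"]

for t in (35, 45, 55, 65, 75):                 # candidate thresholds
    tall_above = ["Tall" if w > t else "Short" for w in weights]
    tall_below = ["Short" if w > t else "Tall" for w in weights]
    best = max(sum(p == y for p, y in zip(preds, labels))
               for preds in (tall_above, tall_below))
    print(f"threshold={t} kg: best accuracy = {best}/4")
# No threshold (in either orientation) classifies all 4 examples correctly,
# so a 46 kg query cannot be decided reliably from this feature.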
Pattern and class:
Example 1:
• Classification of flowers, features used Petal width, Petal length, Sepal width,
Sepal length
Example 2:
• Classification of Plant, to monitor the growth of a plant:
Example 3:
• Classification of CAT or DOG: the model outputs a CAT score and a
DOG score for each image.
VC Dimension:
Role of VC Dimension in Classification:
•VC Dimension (Vapnik-Chervonenkis Dimension):
• A measure of a model's capacity to classify data points
correctly.
• Indicates the maximum number of points that can be
shattered (separated correctly under every possible labeling)
by the hypothesis space of the model.
VC Dimension:
• The VC dimension is a measure of the capacity, or complexity, of the
set of functions that a model can learn.
• It is a concept from statistical learning theory that helps to understand the
model's ability to fit various patterns in the data.
• The VC dimension is defined as the maximum number of points that can
be shattered by the model.
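In symbols (standard statistical-learning notation, added here for precision): a hypothesis class H shatters a finite set S = {x_1, ..., x_n} if every one of the 2^n labelings of S is realized by some h in H, i.e.

\[
\bigl|\{\,(h(x_1),\dots,h(x_n)) : h \in H\,\}\bigr| = 2^{n},
\qquad
\mathrm{VCdim}(H) = \max\{\, n : \text{some set of } n \text{ points is shattered by } H \,\}.
\]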
Consider a simple example with a linear classifier in two-dimensional space
• For any two points, we can label them in four possible ways (00, 01, 10, 11).
• A line can always be drawn to separate these points correctly according to any of these labelings.
• For three points, there are (2^3 = 8) possible labelings.
• However, a line can correctly separate these points for any labeling except when the
points are collinear and labeled alternately (like 010 or 101).
Hence, a linear classifier can shatter some sets of three points, but not all.
Consider a simple example with a linear classifier in 2D space
For four points, there are (2^4 = 16) possible labelings.
A line can’t separate all these labelings correctly.
For example, if the four points form the vertices of a convex quadrilateral and
are labeled alternately, no line can separate them.
Consider a simple example with a linear classifier in two-dimensional space
• From this example, we can conclude that the VC dimension of a linear classifier in 2D is 3.
• A linear classifier can shatter some set of three points (any three non-collinear points),
but no set of four points (see the sketch below).
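A brute-force check of this conclusion (a sketch, assuming scikit-learn is available; the point coordinates are illustrative):

# Enumerate all labelings of a point set and test whether each one is
# realizable by a linear separator (a linear SVM with a large C behaves
# like a hard-margin separator on separable data).
from itertools import product
import numpy as np
from sklearn.svm import SVC

def realizable_by_line(X, y):
    """True if some line classifies X with labels y perfectly."""
    if len(set(y)) == 1:                # all-positive / all-negative: trivial
        return True
    clf = SVC(kernel="linear", C=1e6).fit(X, y)
    return (clf.predict(X) == y).all()

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # non-collinear
print(all(realizable_by_line(triangle, np.array(lab))
          for lab in product([0, 1], repeat=3)))            # True: shattered

square = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(realizable_by_line(square, np.array([0, 0, 1, 1])))   # False: XOR labeling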
Solution for such problems:
• Use a richer hypothesis class with a larger VC dimension (recall that the
VC dimension measures the capacity of a classification model).
• For axis-aligned rectangles in the plane, the VC dimension is 4.
• There exists a set of 4 points (e.g., the vertices of a diamond) that can be
shattered (separated under every possible labeling) by axis-aligned
rectangles, but no set of 5 points can be.
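The 4-point claim can be verified with a short script (a sketch; the key fact is that a labeling is realizable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point):

from itertools import product

diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]      # N, S, E, W vertices

def realizable_by_rect(points, labels):
    """True if some axis-aligned rectangle contains exactly the positives."""
    pos = [p for p, lab in zip(points, labels) if lab == 1]
    if not pos:
        return True                                # a tiny empty rectangle works
    xs, ys = zip(*pos)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    # The tightest box around the positives must exclude every negative;
    # any larger rectangle containing the positives contains this box too.
    return not any(lo_x <= x <= hi_x and lo_y <= y <= hi_y
                   for (x, y), lab in zip(points, labels) if lab == 0)

print(all(realizable_by_rect(diamond, lab)
          for lab in product([0, 1], repeat=4)))   # True: 4 points shattered

For any 5 points, labeling the leftmost, rightmost, topmost, and bottommost points positive forces their bounding box to contain the remaining point, so no set of 5 points can be shattered.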
Summary:
• VC Dimension is a crucial concept for understanding the capacity and
complexity of classification models in ML.
• Balancing model complexity (VC dimension) against the amount of
training data and the inherent complexity of the data distribution is
crucial for building models that generalize well to new data.
• The VC dimension helps us understand the trade-off between the
complexity of the model and its ability to generalize to unseen data,
as the classical bound below makes precise.
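One classical form of this trade-off is Vapnik's generalization bound (the notation is added here, it is not on the slide): with probability at least 1 − δ over an i.i.d. sample of size n, every hypothesis h in a class of VC dimension d satisfies

\[
R(h) \;\le\; \hat{R}_n(h) + \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}},
\]

where R(h) is the true error and \hat{R}_n(h) the training error; a larger d loosens the bound, while a larger sample size n tightens it.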
Limitations of the VC Dimension
• However, there are some limitations to the VC dimension.
• First, it is defined for binary classifiers; extensions (such as the
Natarajan dimension) are needed for multi-class classification or
regression problems.
• Second, the resulting bounds are worst-case and distribution-free, so
they are often very loose on real-world datasets.
• Third, it does not take into account the distribution of the data or
the noise level in the dataset.
Why VC Dimension Matters:
•Model Complexity:
• A high VC dimension implies a more complex model with a higher
capacity to fit the data.
• A low VC dimension indicates a simple model with limited capacity.
Such models may underfit the data if the patterns are complex.
•Overfitting and Underfitting:
• High VC Dimension can lead to overfitting, where the model learns
noise in the training data.
• Low VC Dimension can result in underfitting, where the model fails
to capture the underlying patterns.
Bias and Variance:
Underfitting, overfitting, and best fitting in classification problems.
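A small sketch of this trade-off (synthetic 1-D data, assumed setup; polynomial degree plays the role of model capacity here, and regression is used for simplicity):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30))[:, None]            # 30 noisy samples
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)
X_test = np.linspace(0, 1, 200)[:, None]               # clean test grid
y_test = np.sin(2 * np.pi * X_test).ravel()

for degree in (1, 4, 15):          # too simple / about right / too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    train_mse = np.mean((model.predict(X) - y) ** 2)
    test_mse = np.mean((model.predict(X_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
# Typically: degree 1 underfits (both errors high), degree 15 overfits
# (train error near zero, test error blows up), degree 4 fits best.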
Activity 1:
• Instructions: Please answer in one or two words.
1. An object can be represented by using _________
2. The data is grouped based on __________
3. Name one application that uses pattern recognition ______
4. Name one soft computing tool used for classification _______
Activity 1:
• Instructions: Please answer in one or two words.
1. An object can be represented by using: Features
2. The data is grouped based on: Similarity / Proximity
3. Name one application that uses pattern recognition: Iris Detection /
Face Recognition / Gesture Recognition / Crop Grading / etc.
4. Name one soft computing tool used for classification: SVM / NN /
BNN / DT / RF / etc.
Learning Process:
• Learning is a two-phase process:
1. Training/Learning:
Learning is hard and time-consuming
System must be exposed to several examples of each class
Creates a “model” for each application
Once learned, it becomes natural
2. Detecting/Classifying
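A minimal sketch of the two phases with scikit-learn (the feature values and class names are illustrative assumptions):

from sklearn.neighbors import KNeighborsClassifier

# Phase 1: Training/Learning -- expose the system to several examples
# of each class and build a model.
X_train = [[20], [25], [30], [35], [60], [70], [80], [90]]   # weight in kg
y_train = ["child", "child", "child", "child",
           "adult", "adult", "adult", "adult"]
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Phase 2: Detecting/Classifying -- apply the learned model to new inputs.
print(model.predict([[28], [75]]))                           # ['child' 'adult']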
Features:
• Take a group of graphical objects
• Possible features:
• Shape
• Color
• Size ...
• These features allow us to group the objects into different classes.
Feature Vector:
• Usually a single object can be represented using several features, e.g.:
• x1 = shape (e.g. no. of sides)
• x2 = size (e.g. some numeric value)
• x3 = color (e.g. rgb values) ...
• xd = some other (numeric) feature.
• x = (x1, x2, ..., xd) becomes a feature vector.
• x is a point in a d-dimensional feature space.
Example of a 2D Feature Space
• x1 = shape (e.g. no. of sides)
• x2 = size (e.g. some numeric value)
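A tiny sketch of the idea (the feature names and values are illustrative, not from the slides):

import numpy as np

# x1 = shape (number of sides), x2 = size (some numeric value):
# each object becomes a point x = (x1, x2) in a 2-D feature space.
triangle_small = np.array([3, 1.0])
triangle_big   = np.array([3, 5.0])
square_small   = np.array([4, 1.2])

# Distances between feature vectors can drive grouping -- but note that
# with raw, unscaled features the small square ends up closer to the
# small triangle than the big triangle is, so feature choice and scaling matter.
print(np.linalg.norm(triangle_small - triangle_big))    # 4.0
print(np.linalg.norm(triangle_small - square_small))    # ~1.02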
Example of Applications:
Applications:
•Image and Speech Recognition:
• Identifying objects in images or converting speech to text.
•Natural Language Processing:
• Understanding and generating human language (e.g.,
chatbots, translation).
•Recommendation Systems:
• Suggesting products or content based on user behavior (e.g.,
Netflix, Amazon).
•Predictive Analytics:
• Forecasting future trends from historical data (e.g., stock
market, weather).
Applications List
1. Air traffic control
2. Animal behaviour
3. Appraisal and valuation of properties,
automobiles, etc
4. Betting on stock, sport, etc
5. Criminal sentencing
6. Complex physical and chemical processes
7. Data mining cleaning and validation
8. Digital marketing
9. Echo Patterns
10. Economic modeling
Applications List
11. Employee hiring
12. Expert consultants
13. Fraud detection
14. Face/fingerprint/voice recognition
15. Lake/River water levels
16. Machinery controls
17. Medical diagnosis / Image annotation
18. Medical research
19. Music composition
20. Recipes and chemical formulations
Applications List
21. Retail inventories
22. Scheduling of travel, staff and tasks
23. Strategies for games, business, etc.
24. Traffic flow
25. Weather prediction
Hybrid: Autonomous driving
Medical Imaging - Retinal Imaging
● Google partnered with Aravind Eye Hospital in India for
detecting diabetic retinopathy
● Used a dataset of 128,000 ophthalmologist-evaluated
images of the interiors of eyeballs
● CNN - U-Net architecture
Driver Occupancy Monitoring System - COVID-19
Digital Marketing
● Predicting customer behaviour
● Creating and understanding more sophisticated
buyer segments
● Marketing automation
● Sales forecasting
● Personalized content experience - Content
Curation
● Automatic content creation - Automatic
conversation
Game Design and Gaming Environment
Music Composition
MAC-Net (Melody and Accompaniment Composer Network)
Word to Sense Embeddings