Machine Learning
(ICS – 235)
Evolution of Machine Learning Inspired by Biological Learning
•Biological Inspiration:
• Machine Learning (ML) draws significant inspiration from how
living organisms learn and adapt to their environments.
• Early ML algorithms were modeled after neural processes
observed in the human brain.
•Neural Networks:
• Conceptualized based on the structure and function of
biological neurons.
• The foundation for deep learning, mirroring the brain's layered
architecture for processing information.
Human Perception
• Humans have developed highly sophisticated skills for
sensing their environment and taking actions according to
what they observe, e.g.,
• Recognizing a face (a priori knowledge).
• Understanding spoken words.
• Reading handwriting.
• Distinguishing fresh food by its smell.
We would like to give similar capabilities to machines.
Human Perception
• Humans learn from past experience.
• Patterns can be recognized based on that past experience.
[Images: Temple @ Chicago, USA; Temple @ Udupi; Taj Mahal @ Agra]
Cognitive Psychology
• Theoretical orientation emphasizing mental structures and processes.
• How sensory information is acquired, stored, transformed, and used.
• Mental activity for the acquisition, storage, transformation, and use of
knowledge.
Human and Machine Perception
• When we develop pattern recognition algorithms, we are often
influenced by our knowledge of how patterns are modeled and
recognized in nature.
• Pattern - a complex composition of sensory stimuli that a human
observer may recognize.
• Issue - which cognitive mechanisms need to be inferred to describe
this process of recognition?
Machine Learning?
• Machine Learning is the study of how machines can:
• represent the environment,
• learn to distinguish patterns of interest,
• make reasonable decisions about the categories of the
patterns.
Definition:
• Machine Learning (ML) is a subset of artificial
intelligence (AI) focused on developing algorithms
and statistical models that enable computers to
perform tasks without explicit instructions.
• ML systems learn from and make predictions or
decisions based on data.
• Data is extracted from an object.
Introduction
• To a machine, a physical object is an abstract notion.
• It can be represented by a set of descriptors:
• Example: a man is a pattern/object: color, height, weight, etc.
• A ball is a pattern/object: shape, color, size, etc.
Introduction
• The abstraction of a pattern takes the form of a feature set.
• A pattern is represented as a vector of feature values.
• The features used to represent patterns are important.
• But how do we choose the most discriminative features to
create the data abstraction?
Example:
• Let us take an example where humans are categorized into two
groups, "tall" and "short".
• The classes are represented using the feature "weight", as shown in
the table:

Weight of Human (in kg)    Class Label
40                         Tall
50                         Short
60                         Tall
70                         Short

• If a newly encountered person weighs 46 kg, it is difficult to make a
proper decision (see the sketch below).
• This attribute may not represent the data appropriately.
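A minimal sketch in Python (using only the slide's toy table) showing that no simple threshold on weight separates the two classes, which is why the 46 kg query is ambiguous:

# Toy data from the slide: the labels alternate as weight grows,
# so "weight" alone cannot discriminate "Tall" from "Short".
weights = [40, 50, 60, 70]                     # kg
labels  = ["Tall", "Short", "Tall", "Short"]

for t in (35, 45, 55, 65, 75):                 # candidate thresholds
    tall_above = ["Tall" if w > t else "Short" for w in weights]
    tall_below = ["Short" if w > t else "Tall" for w in weights]
    best = max(sum(p == y for p, y in zip(preds, labels))
               for preds in (tall_above, tall_below))
    print(f"threshold={t} kg: best accuracy = {best}/4")
# No threshold (in either orientation) classifies all 4 examples correctly,
# so a 46 kg query cannot be decided reliably from this feature.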
Pattern and class:
Example 1:
• Classification of flowers, features used Petal width, Petal length, Sepal width,
Sepal length
Example 2:
• Classification of Plant, to monitor the growth of a plant:
Example 3:
• Classification of CAT or DOG: the model outputs a CAT score and a
DOG score for each image.
VC Dimension:
Role of VC Dimension in Classification:
•VC Dimension (Vapnik-Chervonenkis Dimension):
• A measure of a model's capacity to classify data points
correctly.
• Indicates the maximum number of points that can be
shattered (separated correctly under every possible labeling)
by the hypothesis space of the model.
VC Dimension:
• The VC dimension is a measure of the capacity, or complexity, of the
set of functions that a model can learn.
• It is a concept from statistical learning theory that helps to understand the
model's ability to fit various patterns in the data.
• The VC dimension is defined as the maximum number of points that can
be shattered by the model.
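In symbols (standard statistical-learning notation, added here for precision): a hypothesis class H shatters a finite set S = {x_1, ..., x_n} if every one of the 2^n labelings of S is realized by some h in H, i.e.

\[
\bigl|\{\,(h(x_1),\dots,h(x_n)) : h \in H\,\}\bigr| = 2^{n},
\qquad
\mathrm{VCdim}(H) = \max\{\, n : \text{some set of } n \text{ points is shattered by } H \,\}.
\]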
Consider a simple example with a linear classifier in two-dimensional space
• For any two points, we can label them in four possible ways (00, 01, 10, 11).
• A line can always be drawn to separate these points correctly according to any of these labelings.
• For three points, there are (2^3 = 8) possible labelings.
• However, a line can correctly separate these points for any labeling except when the
points are collinear and labeled alternately (like 010 or 101).
Hence, a linear classifier can shatter some sets of three points, but not all.
Consider a simple example with a linear classifier in 2D space
For four points, there are (2^4 = 16) possible labelings.
A line can’t separate all these labelings correctly.
For example, if the four points form the vertices of a convex quadrilateral and
are labeled alternately, no line can separate them.
Consider a simple example with a linear classifier in two-dimensional space
• From this example, we can conclude that the VC dimension of a linear classifier in 2D is 3.
• A linear classifier can shatter some set of three points (any three non-collinear points),
but no set of four points (see the sketch below).
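A brute-force check of this conclusion (a sketch, assuming scikit-learn is available; the point coordinates are illustrative):

# Enumerate all labelings of a point set and test whether each one is
# realizable by a linear separator (a linear SVM with a large C behaves
# like a hard-margin separator on separable data).
from itertools import product
import numpy as np
from sklearn.svm import SVC

def realizable_by_line(X, y):
    """True if some line classifies X with labels y perfectly."""
    if len(set(y)) == 1:                # all-positive / all-negative: trivial
        return True
    clf = SVC(kernel="linear", C=1e6).fit(X, y)
    return (clf.predict(X) == y).all()

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # non-collinear
print(all(realizable_by_line(triangle, np.array(lab))
          for lab in product([0, 1], repeat=3)))            # True: shattered

square = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(realizable_by_line(square, np.array([0, 0, 1, 1])))   # False: XOR labeling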
Solution for such problems:
• Use a richer hypothesis class with a larger VC dimension (recall that the
VC dimension measures the capacity of a classification model).
• For axis-aligned rectangles in the plane, the VC dimension is 4.
• There exists a set of 4 points (e.g., the vertices of a diamond) that can be
shattered (separated under every possible labeling) by axis-aligned
rectangles, but no set of 5 points can be.
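The 4-point claim can be verified with a short script (a sketch; the key fact is that a labeling is realizable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point):

from itertools import product

diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]      # N, S, E, W vertices

def realizable_by_rect(points, labels):
    """True if some axis-aligned rectangle contains exactly the positives."""
    pos = [p for p, lab in zip(points, labels) if lab == 1]
    if not pos:
        return True                                # a tiny empty rectangle works
    xs, ys = zip(*pos)
    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    # The tightest box around the positives must exclude every negative;
    # any larger rectangle containing the positives contains this box too.
    return not any(lo_x <= x <= hi_x and lo_y <= y <= hi_y
                   for (x, y), lab in zip(points, labels) if lab == 0)

print(all(realizable_by_rect(diamond, lab)
          for lab in product([0, 1], repeat=4)))   # True: 4 points shattered

For any 5 points, labeling the leftmost, rightmost, topmost, and bottommost points positive forces their bounding box to contain the remaining point, so no set of 5 points can be shattered.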
Summary:
• VC Dimension is a crucial concept for understanding the capacity and
complexity of classification models in ML.
• Balancing model complexity (VC dimension) against the amount of
training data and the inherent complexity of the data distribution is
crucial for building models that generalize well to new data.
• The VC dimension helps us understand the trade-off between the
complexity of the model and its ability to generalize to unseen data,
as the classical bound below makes precise.
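One classical form of this trade-off is Vapnik's generalization bound (the notation is added here, it is not on the slide): with probability at least 1 − δ over an i.i.d. sample of size n, every hypothesis h in a class of VC dimension d satisfies

\[
R(h) \;\le\; \hat{R}_n(h) + \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}},
\]

where R(h) is the true error and \hat{R}_n(h) the training error; a larger d loosens the bound, while a larger sample size n tightens it.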
Limitations of the VC Dimension
• However, there are some limitations to the VC dimension.
• First, it is defined for binary classifiers; extensions (such as the
Natarajan dimension) are needed for multi-class classification or
regression problems.
• Second, the resulting bounds are worst-case and distribution-free, so
they are often very loose on real-world datasets.
• Third, it does not take into account the distribution of the data or
the noise level in the dataset.
Why VC Dimension Matters:
•Model Complexity:
• A high VC dimension implies a more complex model with a higher
capacity to fit the data.
• A low VC dimension indicates a simple model with limited capacity.
Such models may underfit the data if the patterns are complex.
•Overfitting and Underfitting:
• High VC Dimension can lead to overfitting, where the model learns
noise in the training data.
• Low VC Dimension can result in underfitting, where the model fails
to capture the underlying patterns.
Bias and Variance:
Underfitting, overfitting, and best fitting in classification problems.
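A small sketch of this trade-off (synthetic 1-D data, assumed setup; polynomial degree plays the role of model capacity here, and regression is used for simplicity):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30))[:, None]            # 30 noisy samples
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)
X_test = np.linspace(0, 1, 200)[:, None]               # clean test grid
y_test = np.sin(2 * np.pi * X_test).ravel()

for degree in (1, 4, 15):          # too simple / about right / too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    train_mse = np.mean((model.predict(X) - y) ** 2)
    test_mse = np.mean((model.predict(X_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
# Typically: degree 1 underfits (both errors high), degree 15 overfits
# (train error near zero, test error blows up), degree 4 fits best.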
Activity 1:
• Instructions: Please answer in one or two words.
1. An object can be represented by using _________
2. The data is grouped based on __________
3. Name one application that uses pattern recognition ______
4. Name one soft computing tool used for classification _______
Activity 1:
• Instructions: Please answer in one or two words.
1. An object can be represented by using: Features
2. The data is grouped based on: Similarity / Proximity
3. Name one application that uses pattern recognition: Iris Detection /
Face Recognition / Gesture Recognition / Crop Grading / etc.
4. Name one soft computing tool used for classification: SVM / NN /
BNN / DT / RF / etc.
Learning Process:
• Learning is a two-phase process:
1. Training/Learning:
Learning is hard and time-consuming
System must be exposed to several examples of each class
Creates a “model” for each application
Once learned, it becomes natural
2. Detecting/Classifying
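A minimal sketch of the two phases with scikit-learn (the feature values and class names are illustrative assumptions):

from sklearn.neighbors import KNeighborsClassifier

# Phase 1: Training/Learning -- expose the system to several examples
# of each class and build a model.
X_train = [[20], [25], [30], [35], [60], [70], [80], [90]]   # weight in kg
y_train = ["child", "child", "child", "child",
           "adult", "adult", "adult", "adult"]
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Phase 2: Detecting/Classifying -- apply the learned model to new inputs.
print(model.predict([[28], [75]]))                           # ['child' 'adult']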
Features:
• Take a group of graphical objects
• Possible features:
• Shape
• Color
• Size ...
• These features allow us to group the objects into different classes.
Feature Vector:
• Usually a single object can be represented using several features, e.g.:
• x1 = shape (e.g. no. of sides)
• x2 = size (e.g. some numeric value)
• x3 = color (e.g. rgb values) ...
• xd = some other (numeric) feature.
• x = (x1, x2, ..., xd) becomes a feature vector.
• x is a point in a d-dimensional feature space.
Example of a 2D Feature Space
• x1 = shape (e.g. no. of sides)
• x2 = size (e.g. some numeric value)
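A tiny sketch of the idea (the feature names and values are illustrative, not from the slides):

import numpy as np

# x1 = shape (number of sides), x2 = size (some numeric value):
# each object becomes a point x = (x1, x2) in a 2-D feature space.
triangle_small = np.array([3, 1.0])
triangle_big   = np.array([3, 5.0])
square_small   = np.array([4, 1.2])

# Distances between feature vectors can drive grouping -- but note that
# with raw, unscaled features the small square ends up closer to the
# small triangle than the big triangle is, so feature choice and scaling matter.
print(np.linalg.norm(triangle_small - triangle_big))    # 4.0
print(np.linalg.norm(triangle_small - square_small))    # ~1.02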
Example of Applications:
Applications:
•Image and Speech Recognition:
• Identifying objects in images or converting speech to text.
•Natural Language Processing:
• Understanding and generating human language (e.g.,
chatbots, translation).
•Recommendation Systems:
• Suggesting products or content based on user behavior (e.g.,
Netflix, Amazon).
•Predictive Analytics:
• Forecasting future trends from historical data (e.g., stock
market, weather).
Applications List
1. Air traffic control
2. Animal behaviour
3. Appraisal and valuation of properties,
automobiles, etc
4. Betting on stock, sport, etc
5. Criminal sentencing
6. Complex physical and chemical processes
7. Data mining cleaning and validation
8. Digital marketing
9. Echo Patterns
10. Economic modeling
Applications List
11. Employee hiring
12. Expert consultants
13. Fraud detection
14. Face/fingerprint/voice recognition
15. Lake/River water levels
16. Machinery controls
17. Medical diagnosis / Image annotation
18. Medical research
19. Music composition
20. Recipes and chemical formulations
Applications List
21. Retail inventories
22. Scheduling of travel, staff and tasks
23. Strategies for games, business, etc.
24. Traffic flow
25. Weather prediction
Hybrid: Autonomous driving
Medical Imaging - Retinal Imaging
● Google partnered with Aravind Eye Hospital in India for
detecting diabetic retinopathy
● Used a dataset of 128,000 ophthalmologist-evaluated
images of the interiors of eyeballs
● CNN - U-Net architecture
Driver Occupancy Monitoring System - COVID-19
Digital Marketing
● Predicting customer behaviour
● Creating and understanding more sophisticated
buyer segments
● Marketing automation
● Sales forecasting
● Personalized content experience - Content
Curation
● Automatic content creation - Automatic
conversation
Game Design and Gaming Environment
Music Composition
MAC-Net (Melody and Accompaniment Composer Network)
Word to Sense Embeddings