ML Unit-1

The document discusses key concepts in machine learning, including VC dimensions, PAC learning, hypothesis spaces, inductive bias, and generalization. It highlights the challenges of high-dimensional data, such as sparsity and overfitting, and emphasizes the importance of dimensionality reduction techniques. Additionally, it explains the significance of inductive bias in guiding model selection and the balance between bias and variance for effective generalization.


 VC DIMENSIONS

 PAC
 HYPOTHESIS SPACES
 INDUCTIVE BIAS
 GENERALIZATION
 BIAS-VARIANCE TRADE-OFF
VAPNIK-CHERVONENKIS (VC) DIMENSIONS
• The VC dimension provides a measure of the complexity of a space of functions, which allows the probably approximately correct (PAC) framework to be extended to spaces containing an infinite number of functions.

• The VC dimension is a measure of the complexity or capacity of a class of functions f(α).

• The VC dimension measures the largest number of examples that can be explained (shattered) by the family f(α).
DATA SET

• As the number of features (dimensions) in a dataset increases, the amount of data needed to generalize accurately grows.

• In the context of data analysis and machine learning, dimensions refer to the features or attributes of the data.

• Example dataset: houses - price, size, number of bedrooms, location.

• If we add more dimensions to the dataset, the volume of the space increases and the data becomes sparse.

 1D - a line of points
 2D - an area
 3D - a volume
Problems

• Data sparsity - clustering and classification become challenging.

• Increased computation - more resources and time are needed.

• Overfitting - reduces the model's ability to generalize to new data.

• Euclidean distance - the difference in distances between data points tends to become negligible (see the sketch after this list).

• Performance degradation - algorithms such as k-nearest neighbors can drop in performance.

• Visualization challenges - high-dimensional data is hard to visualize, making EDA more difficult.
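A small, hedged experiment illustrating the Euclidean-distance point above: with synthetic uniform data (an assumption made purely for illustration), the relative gap between the nearest and farthest point from a query shrinks as the number of dimensions grows.

# Distance concentration in high dimensions (illustrative sketch).
import numpy as np

rng = np.random.default_rng(42)
for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(500, d))        # 500 random points in d dimensions
    q = rng.uniform(size=d)               # a random query point
    dists = np.linalg.norm(X - q, axis=1)
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:4d}  relative contrast={contrast:.3f}")  # shrinks as d grows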


Solution

• In high-dimensional data, the data points sit at the edges or corners of the space, making the data sparse.

• The curse of dimensionality refers to the challenges and complications that arise when analysing and organising data in high-dimensional spaces (100-1000 dimensions).

• The solution to the curse of dimensionality is "dimensionality reduction".

Dimensionality reduction

• Dimensionality reduction is a process that reduces the number of random variables under consideration by obtaining a set of principal variables.

• By reducing dimensionality, we can retain the most important information in the dataset while discarding the redundant or less important features.

• Dimensionality reduction methods:

1. PCA - Principal Component Analysis
2. LDA - Linear Discriminant Analysis
3. t-SNE - t-Distributed Stochastic Neighbor Embedding
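A minimal sketch of dimensionality reduction with PCA. The dataset below is synthetic, and the choice of 10 input features and 2 retained components is an assumption made only for illustration.

# Reduce a 10-dimensional dataset to its 2 principal components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                     # 200 samples, 10 features
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)     # make two features redundant

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                   # (200, 2)
print(pca.explained_variance_ratio_)     # fraction of variance each component keeps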
Vapnik-Chervonenkis (VC) dimension

• The Vapnik-Chervonenkis (VC) dimension is a measure of the size (capacity, complexity, expressive power, richness, or flexibility) of a class of sets.

• The VC dimension is a measure of the capacity of a hypothesis set to fit different data sets.

• The VC dimension is a measure of the complexity of a machine learning model.

• The VC dimension is a measure of a model's capacity, which is used to guide the model selection process while developing machine learning applications.

• The VC dimension is also a measure of the difficulty of the machine learning problem. It is the cardinality of the largest set of points that the algorithm can shatter.
(VC) - Shattering

• Shattering is the ability of a model to classify a set of points perfectly, for every possible combination of labels on those points.

• The VC dimension of a model is the size of the largest set of points that the model can shatter.

For two points m and n, the four possible labelings are:

 h(m)=0; h(n)=0
 h(m)=0; h(n)=1
 h(m)=1; h(n)=0
 h(m)=1; h(n)=1

If the model can realize all four labelings (for example, by dividing the line into two segments), the two points in the dataset are shattered and the VC dimension is at least 2.
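A brute-force way to check shattering as just described: a hypothesis class shatters a point set if every possible labeling of the points can be classified perfectly. The sketch below assumes the hypothesis class of linear classifiers, approximated by scikit-learn's LogisticRegression with very weak regularization; the specific point sets are illustrative.

# Check whether a linear classifier can shatter a given set of points.
from itertools import product

import numpy as np
from sklearn.linear_model import LogisticRegression

def can_shatter(points):
    """True if some linear classifier realizes every labeling of `points`."""
    n = len(points)
    for labels in product([0, 1], repeat=n):
        if len(set(labels)) < 2:       # all-0 or all-1 labelings are trivial
            continue
        clf = LogisticRegression(C=1e6, max_iter=10_000)
        clf.fit(points, labels)
        if clf.score(points, labels) < 1.0:
            return False               # this labeling cannot be realized
    return True

three_points = np.array([[0, 0], [1, 0], [0, 1]])        # non-collinear
four_xor = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])    # XOR layout
print(can_shatter(three_points))   # True  -> linear classifiers shatter these 3 points
print(can_shatter(four_xor))       # False -> these 4 points are not shattered

This matches the classical result that linear classifiers in the plane have VC dimension 3.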
(VC)

• r = 1 if x is a positive example; r = 0 if x is a negative example.

• For a dataset containing N points, the N points can be labeled in 2^N ways as positive and negative.

• If, for every one of these labelings, we can find a hypothesis h ∈ H that is consistent with it, we say that H shatters the N points.

• The maximum number of points that can be shattered by H is called the Vapnik-Chervonenkis (VC) dimension.

• The VC dimension of H is denoted VC(H), and it measures the capacity of H.
(VC)

• The VC dimension is the capacity of a machine learning algorithm.

• Capacity - its ability to learn from a given dataset.

• Accuracy - its ability to correctly identify labels for a given dataset.

• The VC dimension acts as a guiding light in model selection: the capacity of a classification model reflects the complexity of the model.

• Example: after some finite number of examples, the learner will have learned the correct concept (though it might not even know it!). "Correct" means it agrees with the target concept on the labels for all data.
Probably Approximately Correct (PAC) learning

• In PAC learning, the goal is to find a hypothesis that performs well on unseen examples, given a sample of labeled training examples drawn independently and identically from an unknown probability distribution.

Applications of PAC learning in machine learning

• Supervised Learning: PAC learning is particularly applicable to supervised learning problems, where a model is trained on labeled examples to make predictions on new, unseen examples.

• Sample Selection: PAC learning guides the process of selecting representative training samples, ensuring that the selected samples are informative and cover the underlying distribution adequately.

• Model Selection and Evaluation: PAC learning provides a theoretical framework for selecting and evaluating models based on their generalization performance.

• Active Learning: Active learning strategies use PAC learning principles to actively query and select the most informative or uncertain instances from an unlabeled dataset.

• Computational Learning Theory: PAC learning provides insights into the feasibility and complexity of learning tasks.
PAC

• PAC learning provides a theoretical framework for understanding the sample complexity, generalization performance, and guarantees of learning from data.

• It plays a crucial role in shaping the design, evaluation, and analysis of machine learning algorithms.

• PAC analysis relates: the probability of successful learning, the number of training examples, the complexity of the hypothesis space, the accuracy to which the target function is approximated, and the manner in which training examples are presented.

• Instances X (the set of instances or objects in the world)

• Target concept c (a subset of the instance space)

• Hypothesis space H (a collection of concepts over X)

• Training data D (examples from the instance space)


Probably Approximately Correct (PAC) learning

• PAC learning is a framework for analyzing the efficiency of machine learning algorithms.

• The goal of PAC learning is to design algorithms that can learn a target concept with high probability and high accuracy, given a finite amount of labeled training data.

• Hypothesis Class (H): the set of possible hypotheses or classifiers that the learning algorithm can output.

• Concept Class (C): the set of all possible target concepts. The goal is for the learning algorithm to output a hypothesis that approximates the true concept.

• A concept is a function that maps instances to binary labels (0 or 1).

(PAC)

• Sample complexity: the number of labeled examples needed so that the learned hypothesis is "probably approximately correct" with high probability.

Error and Confidence:

• Error (ε): the maximum allowable error rate for the learned hypothesis. The hypothesis is considered correct if its error is less than ε.

• Confidence (δ): the desired confidence parameter; 1 - δ is the probability that the hypothesis is "probably approximately correct."

• A learning algorithm is PAC if, for any ε > 0 and δ > 0, with probability at least 1 - δ (the confidence level), the algorithm outputs a hypothesis h such that the error of h is at most ε.
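For a finite hypothesis class and a consistent learner, a standard PAC bound states that m >= (1/ε)(ln|H| + ln(1/δ)) examples suffice. The sketch below simply evaluates that bound; the values of |H|, ε, and δ are illustrative assumptions.

# Evaluate the PAC sample-complexity bound for a finite hypothesis class.
from math import ceil, log

def pac_sample_bound(h_size, epsilon, delta):
    """Examples sufficient so that, with probability >= 1 - delta,
    a consistent hypothesis has true error at most epsilon."""
    return ceil((log(h_size) + log(1.0 / delta)) / epsilon)

print(pac_sample_bound(h_size=2**10, epsilon=0.05, delta=0.01))  # about 231 examples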
(PAC)

• Theoretical results in PAC learning provide bounds on the number of training examples needed to achieve a certain level of confidence and error.

• Imagine we are doing classification with categorical inputs, where all inputs and outputs are binary (e.g., gender as a binary attribute). There is a machine f(x, h) with hypotheses h1, h2, ... ∈ H.

Example hypotheses (over binary attributes such as citizenship):

• X1 ∧ X2

• If there are 3 attributes, what is the complete set of hypotheses in f?

(PAC)

• In probably approximately correct (PAC) learning, given a concept class C, an error rate (ε), and a confidence parameter (δ), the goal is to achieve a low error rate with high confidence.

Noise Tolerance:

• PAC learning often assumes that the training data may contain some amount of noise or errors.

• Noise tolerance refers to the ability of a learning algorithm to still learn the underlying concept in the presence of such noise.

PAC

• PAC learning provides a rigorous theoretical framework for analyzing the performance of learning
algorithms in terms of their ability to generalize from limited data, control error rates, and achieve
high confidence in their predictions.
HYPOTHESIS SPACES
Hypothesis spaces

• Hypothesis space (H): the hypothesis space is defined as the set of all possible legal hypotheses; hence it is also known as the hypothesis set.

• It is used by supervised machine learning algorithms to determine the best possible hypothesis to describe the target function, i.e., the one that best maps inputs to outputs.

• It is often constrained by the framing of the problem, the choice of model, and the choice of model configuration.

• A hypothesis (h) is a single hypothesis that maps inputs to proper outputs; it can be evaluated and used to make predictions.

• y = mx + b
Hypothesis spaces

y = mx + b

Where:
y - range (the output)
m - slope of the line that divides the data
x - domain (the input)
b - intercept (constant)

• Example: Let's understand the hypothesis (h) and hypothesis space (H) with a two-dimensional coordinate plane showing the distribution of data.

• Hypothesis space (H) is the set of all legal, possible ways to divide the coordinate plane so that it best maps inputs to proper outputs; a sketch of a single hypothesis drawn from this space follows.
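A minimal sketch of one hypothesis h drawn from the hypothesis space H of all lines y = mx + b, chosen here by least squares; the data is synthetic and the true slope and intercept are assumptions made for illustration.

# Fit a single hypothesis h (one particular m and b) from the space of all lines.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * x[:, 0] + 1.0 + rng.normal(scale=1.0, size=50)   # true m = 2, b = 1, plus noise

h = LinearRegression().fit(x, y)          # picks one hypothesis out of H
print("m ~", h.coef_[0], " b ~", h.intercept_)
print("prediction at x = 4:", h.predict([[4.0]])[0])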
Inductive bias
Bias

• Bias is the systematic error of a model: how far its average prediction is from the true values (often tracked through the average squared difference between predictions and true values).

• It is a measure of how well your model fits the data.

• Zero bias would mean that the model captures the true data-generating process perfectly, and both your training and validation loss would go to (nearly) zero. That is unrealistic.

Inductive Bias

• Every machine learning model requires some type of architecture design and some initial assumptions about the data it analyzes.

• Every belief that we make about the data is a form of inductive bias.

• Inductive biases play an important role in the ability of machine learning models to generalize to unseen data.
Inductive bias

• Given a training dataset, we need some additional constraints or criteria to help us better fit the
training samples, so that the trained model can make better predictions on the unseen samples (i.e.,
generalize beyond the training data).

• The additional constraints or criteria here are called inductive bias.

• In traditional machine learning, every algorithm has its own inductive biases.

• Inductive bias refers to the set of assumptions that a learning algorithm makes to predict outputs for inputs it has never seen.

• It's the bias of a model towards making a particular kind of assumption in order to generalize from its
training data to unseen situations.
Importance of inductive bias

• Learning from Limited Data - Inductive bias helps models generalize to unseen data based on the assumptions they carry.

• Guiding Learning- Given a dataset, there can be countless hypotheses that fit the data. Inductive bias
helps the algorithm choose one plausible hypothesis.

• Preventing Overfitting

• A model with no bias or assumptions might fit the training data perfectly, capturing every minute
detail, including noise. This is known as overfitting.

• An inductive bias can prevent a model from overfitting by making it favor simpler hypotheses.
Types of Inductive Bias

Preference Bias - It expresses a preference for some hypotheses over others. For example, in decision tree algorithms like ID3, the preference is for shorter trees over longer trees.

Restriction Bias- It restricts the set of hypotheses considered by the algorithm. For instance, a linear
regression algorithm restricts its hypothesis to linear relationships between variables.
Examples of Inductive Bias

Decision Trees- a bias towards shorter trees and splits that categorize the data most distinctly at each
level.

k-Nearest Neighbors (k-NN)-The algorithm assumes that instances that are close to each other in the
feature space have similar outputs.

Neural Networks: They have a bias towards smooth functions. The architecture itself (number of layers, number of neurons) can also impose bias.

Linear Regression: Assumes a linear relationship between the input features and the output.
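An illustrative comparison of two of these biases on the same synthetic curve: linear regression (restriction bias, only straight lines) versus a shallow decision tree (preference for a few axis-aligned splits). The data and tree depth are assumptions made for the example.

# Two inductive biases fitting the same nonlinear target.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = np.linspace(0, 6, 200).reshape(-1, 1)
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=200)        # a nonlinear target

linear = LinearRegression().fit(x, y)                   # can only draw a straight line
tree = DecisionTreeRegressor(max_depth=4).fit(x, y)     # piecewise-constant fit

print("linear R^2:", round(linear.score(x, y), 3))      # poor: the bias is too restrictive here
print("tree   R^2:", round(tree.score(x, y), 3))        # much better on this curve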
GENERALIZATION
Generalization

• Generalization refers to your model's ability to adapt properly to new, previously unseen data, drawn
from the same distribution as the one used to create the model.

• Supervised learning in the domain of machine learning refers to a way for the model to learn and understand data.

• Based on this training data, the model learns to make predictions.

• The term 'generalization' refers to the model's capability to adapt and react properly to previously
unseen, new data, which has been drawn from the same distribution as the one used to build the
Model.

• Generalization examines how well a model can digest new data and make correct predictions after being trained on a training set.
• Whether a model is able to generalize is the key to its success.

• If you train a model too well on training data, it will be incapable of generalizing. In such cases, it
will end up making erroneous predictions when it's given new data. This would make the model
ineffective even though it's capable of making correct predictions for the training data set. This is
known as overfitting.

• The inverse (underfitting) is also true, which happens when you train a model with inadequate data. In
cases of underfitting, your model would fail to make accurate predictions even with the training data.
This would make the model just as useless as overfitting.

• Generalization is a measure of how your model performs on predicting unseen data, So, it is
important to come up with the best-generalized model to give better performance against future data.
Let us first understand what is underfitting and overfitting, and then see what are the best practices to
train a generalized model.
What is Underfitting?

• Underfitting is a state where the model cannot fit the training data and is also not able to generalize to new data. You can notice it with the help of the loss function during training. A simple rule of thumb: if both the training loss and the cross-validation loss are high, your model is underfitting.

• Lack of data, not enough features, lack of variance in the training data, or a high regularization rate can cause underfitting. A simple solution is to add more shuffled data to your training.

UNDERFITTING: TRAINING LOSS HIGH, TESTING LOSS HIGH.


What is Overfitting?

• Overfitting is a situation where your model learns the training data too closely, including its variance; experts describe it as the model starting to memorize the noise instead of learning. A simple rule of thumb to identify overfitting: if your training loss is low and your cross-validation loss is high, your model is overfitting.

• Uncleaned data, training for too many steps, or higher complexity of the model (due to larger weights) can cause overfitting. It is always recommended to preprocess the data, create a good data pipeline, and select only necessary and meaningful features with good variance.

OVERFITTING: TRAINING LOSS LOW, TESTING LOSS HIGH.
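A common way to see both regimes at once, using the rules of thumb above: fit polynomials of increasing degree and compare training error with held-out error. The dataset, degrees, and split below are illustrative assumptions.

# Under- and overfitting with polynomial regression.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(60, 1))
y = np.cos(2 * np.pi * x[:, 0]) + 0.2 * rng.normal(size=60)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    tr = mean_squared_error(y_tr, model.predict(x_tr))
    te = mean_squared_error(y_te, model.predict(x_te))
    # degree 1: both errors high (underfitting)
    # degree 15: training error tiny, test error large (overfitting)
    print(f"degree={degree:2d}  train MSE={tr:.3f}  test MSE={te:.3f}")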


Best practices to get a generalized model

• It is important to have a training dataset with good variance (i.e., a shuffled dataset).

• Split the data into training and evaluation sets.

• The evaluation set is used to cross-validate the trained model. It is always good to ensure that the distribution across all the datasets is stationary (the same). To achieve this goal, you can track the performance of a machine learning algorithm over time as it works with a set of training data, plotting both the skill on the training data and the skill on a test dataset that you have held back from the training process.

• Training the model for too long causes a continual decrease in the error on the training dataset; at the same time, due to the model's decreasing ability to generalize, the error on the test set starts to increase again (overfitting).

Regularization

• Regularization is a method to avoid high variance and overfitting, as well as to increase generalization. Without going into details, regularization aims to keep coefficients close to zero; a sketch follows.
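A minimal sketch of this idea using Ridge regression, which penalizes large coefficients and pulls them toward zero; the dataset and the alpha value are illustrative assumptions.

# Regularization shrinks coefficients and reduces variance.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))                    # few samples, many features
y = X[:, 0] + 0.1 * rng.normal(size=30)          # only the first feature matters

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

print("OLS   sum of |coefficients|:", round(np.abs(ols.coef_).sum(), 3))    # large, noisy
print("Ridge sum of |coefficients|:", round(np.abs(ridge.coef_).sum(), 3))  # shrunk toward zero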
Low Bias: A low bias model will make fewer assumptions about the form of the target function.

High Bias: A model with a high bias makes more assumptions, and the model becomes unable to capture the important features of our dataset. A high bias model also cannot perform well on new data.

Variance: tells how much a random variable differs from its expected value.

Low variance means there is a small variation in the prediction of the target function with changes in the training data set.

High variance shows a large variation in the prediction of the target function with changes in the training dataset.
Ways to Reduce High Variance:

• Reduce the number of input features or parameters, since the model is overfitted.

• Do not use an overly complex model.

• Increase the training data.

• Increase the regularization term.


Bias-Variance Trade-Off

• While building the machine learning model, it is really important to take care of bias and
variance in order to avoid overfitting and underfitting in the model.

• If the model is very simple with fewer parameters, it may have low variance and high bias.

• Whereas, if the model is complex, with a large number of parameters, it will have high variance and low bias.

• So, it is required to strike a balance between bias and variance errors, and this balance between the bias error and the variance error is known as the bias-variance trade-off.
The properties of inductive bias

• The strength of an inductive bias describes how much it limits the size of the hypothesis space that the learner can search.

• A strong inductive bias gives the learner a relatively small search space, while a weak inductive bias provides a broader search space for the learner.

• How to measure it? VC dimension theory (Vapnik-Chervonenkis dimension).

• Correctness: only the correct inductive bias can ensure that the learner successfully learns the target concept.

• Conversely, under an incorrect inductive bias, the learner cannot learn the correct target concept no matter how many training samples are used.

• How to measure it? PAC learning theory (Probably Approximately Correct).


Trade-offs

• While inductive bias helps models generalize from training data, there's a trade-off.

• A strong inductive bias means the model might not be flexible enough to capture all patterns in the
data.

• On the other hand, too weak a bias could lead the model to overfit the training data.

• Inductive bias is the "background knowledge" or set of assumptions that guides a machine learning algorithm.

• It's essential for generalization, especially when the training data is sparse or noisy.

• However, choosing the right type and amount of inductive bias for a particular problem is an art and is
crucial for the success of the model.
VARIANCE

• A model is said to have high variance if its predictions are sensitive to small changes in the input. When a model does not perform as well on new data as it does on the training data set, there is a possibility that the model has high variance.

• It basically tells how scattered the predicted values are from the actual values.

• Bias: Error in training data


• Variance: Error in test data

• A statistical model is said to be overfitted when it captures much more detail from the training data (including noise) than necessary.

• Overfitting: training data accuracy is high and test data accuracy is low.

• To avoid overfitting, we could stop the training at an earlier stage (early stopping).

• Underfitting: training data accuracy is low and test data accuracy is low. Underfitting implies that the model still has capacity to learn, so you would simply train for more iterations or collect more data.
Bias-Variance Tradeoff

• The bias-variance tradeoff is a stand-alone theory that provides a different perspective on generalization.

• The bias-variance tradeoff in machine learning involves a tradeoff between approximation and
generalization, aiming to minimize the error in learning.

• The bias-variance analysis quantifies how well the best hypothesis performs in approximating the
target function, taking into account the overall ability of the hypothesis set to approximate the
function.

• The decomposition of the out-of-sample error into approximation and generalization components can
help understand the behavior of the hypothesis and its performance on different data sets.
• The variance term in the bias-variance tradeoff arises from the fact that we only have access to one
dataset at a time, resulting in different outcomes for each dataset.

• The bias-variance tradeoff can be quantified by two terms: variance, the expected squared difference between the predicted values and their own average (expected) prediction, and bias, the difference between that average prediction and the true values; a compact decomposition is given at the end of this section.

• Increasing the size of the hypothesis set reduces bias but increases variance, while decreasing the size of the hypothesis set increases bias but reduces variance.

• The bias-variance tradeoff highlights the importance of finding the right balance between model
complexity and data resources in a learning situation.

• Overfitting occurs when a complex model with many degrees of freedom fits the training set perfectly
but fails to generalize well, resulting in a high out-of-sample error and no real learning.

• Ensemble learning methods, such as bagging, rely on reducing variance by averaging multiple models or predictions, leading to improved performance.
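The statements above can be summarized by the standard bias-variance decomposition. Assuming y = f(x) + ε with noise variance σ², and writing f̂_D for the hypothesis learned from a training set D, the expected squared error at a point x decomposes as:

\mathbb{E}_{D,\varepsilon}\big[(y - \hat{f}_D(x))^2\big]
  = \big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2
  + \mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]
  + \sigma^2
  = \text{bias}^2 + \text{variance} + \text{irreducible noise}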
