
PARUL INSTITUTE OF ENGINEERING & TECHNOLOGY

FACULTY OF ENGINEERING & TECHNOLOGY

Unit 4
Classification and Regression

Supervised Learning vs. Unsupervised Learning


Supervised learning and unsupervised learning are two fundamental paradigms in
machine learning, each with its own unique characteristics and applications.
1. Supervised Learning:

- Definition: In supervised learning, the algorithm is trained on a labeled dataset, where each input data point is associated with a corresponding target or output. The goal is to learn a mapping from inputs to outputs.

- Objective: The primary objective of supervised learning is to make predictions or classify new, unseen data accurately based on the patterns learned from the labeled data.

- Examples: Classification and regression are common tasks in supervised learning. Examples include image classification (assigning labels to images), spam email detection (categorizing emails as spam or not), and predicting house prices based on features like square footage and location.

2. Unsupervised Learning:

- Definition: In unsupervised learning, the algorithm is trained on an unlabeled dataset, where no target outputs are provided. The algorithm must find patterns or structure in the data on its own.

- Objective: The main objective of unsupervised learning is to discover hidden patterns, group similar data points together, or reduce the dimensionality of the data.

- Examples: Clustering and dimensionality reduction are common tasks in unsupervised learning. Examples include clustering customers based on their purchasing behavior (customer segmentation), topic modeling in text data, and reducing the dimensionality of image data for easier visualization or processing.

Here are some key differences between the two:

Data Type:
- Supervised learning requires labeled data (input-output pairs).
- Unsupervised learning works with unlabeled data (only input data).

Objective:
- Supervised learning aims to predict or classify based on existing knowledge
from labeled data.
- Unsupervised learning aims to discover patterns or structure in data when no
prior information is available.

Applications:
- Supervised learning is used for tasks that involve making predictions or decisions, such as classification and regression.
- Unsupervised learning is used for tasks like clustering, dimensionality reduction, and data exploration.

Evaluation:
- In supervised learning, performance can be evaluated by comparing the model's predictions to the true labels using metrics like accuracy, precision, recall, or mean squared error.
- In unsupervised learning, evaluation is often less straightforward since there are no target labels. Evaluation may involve assessing the quality of clusters or the effectiveness of dimensionality reduction.
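Because supervised learning has true labels to compare against, its metrics can be computed directly. A minimal sketch of accuracy, precision, and recall for a binary classifier (the function name and example labels are illustrative, not from the notes; 1 is taken as the positive class):

```python
def classification_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

For example, `classification_metrics([1, 1, 0, 0], [1, 0, 1, 0])` gives 0.5 for all three metrics: one true positive, one false positive, one false negative, and one true negative.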

Examples:
- Supervised: Spam email detection, image classification, sentiment analysis.
- Unsupervised: Customer segmentation, anomaly detection, principal component analysis (PCA).
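The contrast can be illustrated with two toy one-dimensional tasks (everything below, names included, is a simplified sketch, not a standard algorithm or API):

```python
# Supervised: a 1-nearest-neighbour classifier needs (feature, label) pairs.
def nn_classify(labeled_points, query):
    nearest = min(labeled_points, key=lambda pair: abs(pair[0] - query))
    return nearest[1]  # predict the label of the closest training point

# Unsupervised: split unlabeled points into two groups around the midpoint
# of their range. No labels are ever seen.
def two_clusters(points):
    midpoint = (min(points) + max(points)) / 2
    low = [p for p in points if p <= midpoint]
    high = [p for p in points if p > midpoint]
    return low, high
```

The supervised function cannot run without labels; the unsupervised one never uses any.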

| Parameter | Supervised machine learning | Unsupervised machine learning |
| --- | --- | --- |
| Input data | Algorithms are trained using labelled data. | Algorithms are used against data that is not labelled. |
| Computational complexity | Simpler method. | Computationally complex. |
| Accuracy | Highly accurate. | Less accurate. |
| No. of classes | No. of classes is known. | No. of classes is not known. |
| Data analysis | Uses offline analysis. | Uses real-time analysis of data. |
| Algorithms used | Linear and logistic regression, Random Forest, Support Vector Machine, neural networks, etc. | K-means clustering, hierarchical clustering, Apriori algorithm, etc. |
| Output | Desired output is given. | Desired output is not given. |
| Training data | Uses training data to infer the model. | No training data is used. |
| Complex models | Cannot learn models as large and complex as unsupervised learning can. | Can learn larger and more complex models than supervised learning. |
| Model testing | We can test our model. | We cannot test our model. |
| Also called | Supervised learning is also called classification. | Unsupervised learning is also called clustering. |
| Example | Optical character recognition. | Finding a face in an image. |

Supervised Learning

Supervised learning is one of the primary paradigms in machine learning. In supervised learning, an algorithm learns a mapping from input data to output labels or targets by using a labeled dataset for training. Here are the key components and steps involved in supervised learning:

1. Input Data (Features): This is the set of data points or observations that the
algorithm uses to make predictions or classifications. Each data point is
represented by a set of features or attributes that describe it. Features can be
numeric, categorical, or even more complex data types, depending on the
problem.

2. Output Labels (Targets): In supervised learning, each data point in the training dataset is associated with a corresponding output label or target. These labels represent the desired outcome or prediction that the algorithm should aim to achieve.

3. Training Dataset: The training dataset is the labeled dataset used to train the
supervised learning model. It consists of a collection of input data samples and
their corresponding output labels. The model learns to make predictions by
finding patterns and relationships within this data.

4. Model Selection: Choose a machine learning algorithm or model that is suitable for the problem at hand. Common supervised learning algorithms include linear regression, decision trees, support vector machines, neural networks, and more. The choice of model depends on factors such as the nature of the data and the problem's requirements.

5. Training (Learning): The selected model is trained on the training dataset. During training, the model adjusts its internal parameters or weights to minimize the difference between its predictions and the actual output labels in the training data. This process typically involves optimization techniques such as gradient descent.
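For simple linear regression, this training loop can be sketched as plain gradient descent on the mean squared error. The function name, learning rate, and epoch count below are arbitrary illustrative choices:

```python
def train_linear_model(xs, ys, lr=0.05, epochs=5000):
    m, b = 0.0, 0.0  # start from an uninformative line
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to m and b
        grad_m = (-2 / n) * sum(x * (y - (m * x + b)) for x, y in zip(xs, ys))
        grad_b = (-2 / n) * sum(y - (m * x + b) for x, y in zip(xs, ys))
        m -= lr * grad_m  # step against the gradient
        b -= lr * grad_b
    return m, b
```

On data generated from y = 2x + 1, such as xs = [0, 1, 2, 3] and ys = [1, 3, 5, 7], the loop converges to m ≈ 2 and b ≈ 1.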

Linear Regression

Linear regression is a fundamental statistical and machine learning technique used for modeling the relationship between a dependent variable (target) and one or more independent variables (features or predictors) by fitting a linear equation to the observed data. It is one of the simplest and most widely used regression methods and is often employed for tasks like predicting numerical values (regression problems).

Here are the key concepts and components of linear regression:

1. Linear Equation:

- In simple linear regression, which deals with one independent variable, the linear
equation is represented as: `y = mx + b`, where:
- `y` is the dependent variable (the one you want to predict).
- `x` is the independent variable (the feature).
- `m` is the slope of the line, representing the relationship between `x` and `y`.
- `b` is the y-intercept, which is the value of `y` when `x` is 0.

2. Multiple Linear Regression:

- In multiple linear regression, you have more than one independent variable, and the equation becomes: `y = b0 + b1*x1 + b2*x2 + ... + bn*xn`, where:
- `y` is still the dependent variable.
- `x1`, `x2`, ..., `xn` are the independent variables.
- `b0` is the y-intercept.
- `b1`, `b2`, ..., `bn` are the coefficients associated with each independent variable, representing their respective contributions to `y`.
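Once the coefficients are known, a prediction is just the weighted sum in the equation above. A minimal sketch (the function name and all numbers are made up for illustration):

```python
def predict(b0, coefficients, features):
    # y = b0 + b1*x1 + b2*x2 + ... + bn*xn
    return b0 + sum(b * x for b, x in zip(coefficients, features))
```

For example, with a hypothetical house-price model where `b0 = 50000`, the coefficient for square footage is 100, and the coefficient for bedroom count is 20000, `predict(50000, [100, 20000], [1500, 3])` returns 260000.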

3. Assumptions of Linear Regression:

- Linear relationship: There should be a linear relationship between the independent and dependent variables.
- Independence: The residuals (the differences between predicted and actual values) should be independent of each other.
- Homoscedasticity: The variance of the residuals should be constant across all levels of the independent variables.
- Normality: The residuals should follow a normal distribution.

4. Least Squares Method:

- Linear regression aims to find the best-fitting line by minimizing the sum of squared differences (residuals) between the predicted and actual values. This method is called the least squares method.
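For the single-variable case, the least-squares slope and intercept have a closed form that can be computed directly (a pure-Python sketch; the function name is illustrative):

```python
def least_squares_fit(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x  # the fitted line passes through the means
    return m, b
```

Fitting xs = [1, 2, 3, 4] against ys = [3, 5, 7, 9] recovers the line y = 2x + 1 exactly, since the data is perfectly linear.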

5. Coefficient Estimation:

- The coefficients (`m` and `b`, or `b0`, `b1`, `b2`, ...) are estimated during the training process to find the best-fitting line that minimizes the sum of squared residuals.

6. Model Evaluation:

- Common metrics for evaluating linear regression models include Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²), which measures the proportion of variance in the dependent variable explained by the model.
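All four metrics follow directly from the residuals, so they can be computed in a few lines (the function name is illustrative):

```python
import math

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n           # Mean Absolute Error
    mse = sum(e * e for e in errors) / n            # Mean Squared Error
    rmse = math.sqrt(mse)                           # Root Mean Squared Error
    mean_y = sum(y_true) / n
    ss_total = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - sum(e * e for e in errors) / ss_total  # R-squared
    return mae, mse, rmse, r2
```

A perfect prediction gives MAE, MSE, and RMSE of 0 and R² of 1; a model no better than predicting the mean gives R² of 0.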

7. Overfitting and Underfitting:

- Overfitting occurs when the model is too complex and fits the training data too
closely, leading to poor generalization to new data.
- Underfitting happens when the model is too simple and cannot capture the
underlying patterns in the data.

8. Regularization:

- Regularization techniques, like Ridge and Lasso regression, can be applied to prevent overfitting and improve model stability.
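A Ridge (L2) penalty can be folded directly into a gradient-descent fit: the extra `2 * lam * m` term in the gradient pulls the slope toward zero. This is a sketch under assumed hyperparameter values, not a reference implementation:

```python
def ridge_fit(xs, ys, lam=0.0, lr=0.05, epochs=5000):
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_m = (-2 / n) * sum(x * (y - (m * x + b)) for x, y in zip(xs, ys))
        grad_b = (-2 / n) * sum(y - (m * x + b) for x, y in zip(xs, ys))
        m -= lr * (grad_m + 2 * lam * m)  # L2 penalty shrinks the slope
        b -= lr * grad_b                  # the intercept is usually not penalized
    return m, b
```

With `lam=0` this recovers the ordinary least-squares fit; increasing `lam` shrinks the slope, trading a little training error for a more stable model.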

Overfitting
Definition: Overfitting occurs when a model learns the training data too well, including
its noise and outliers, rather than capturing the underlying patterns. This leads to
excellent performance on the training set but poor performance on new, unseen data.

Symptoms:
 High accuracy or low error on the training set.
 Poor accuracy or high error on the validation or test set.
Causes:
 A model that is too complex relative to the amount and variability of the training
data (e.g., too many parameters or a very flexible model).
 Too many features or interactions that lead to a model that fits the noise in the data.
Prevention/Mitigation:
 Simplify the Model: Use a less complex model with fewer parameters.
 Regularization: Techniques like L1 (Lasso) or L2 (Ridge) regularization can
penalize large weights, helping to avoid overfitting.
 Cross-Validation: Use techniques like k-fold cross-validation to assess the model’s
performance on different subsets of the data.
 Early Stopping: Monitor the performance on a validation set and stop training when
performance starts to degrade.

 More Data: Increasing the size of the training dataset can help the model generalize
better.
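The k-fold cross-validation mentioned above can be sketched in a few lines; each data point lands in exactly one validation fold. The round-robin fold assignment here is one simple choice among many (shuffled or stratified splits are common alternatives):

```python
def k_fold_splits(items, k):
    # Assign items to k folds round-robin, then use each fold once as validation.
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation
```

Averaging the model's score over the k validation folds gives a more honest estimate of generalization than a single train/test split, which helps detect overfitting early.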
