Unit II : Regression
What is Regression? Explain the types of Regression.
1. What is Regression?
Regression is a supervised learning technique in machine learning used to predict a
continuous numerical value (quantity) based on one or more input features.
Goal: Find the relationship between a dependent variable (target) and independent
variables (predictors).
Example: Predicting a house price using features like size, location, and number of
rooms.
Key Terms:
1. Dependent Variable (Target) – The value we want to predict (e.g., house price).
2. Independent Variables (Features) – Input factors affecting the prediction (e.g.,
locality, rooms).
Need for Regression:
Price prediction (houses, stocks, etc.)
Trend forecasting (sales, demand)
Risk analysis (medical or financial risk)
Decision-making based on patterns
2. Types of Regression
There are several types, but the main ones covered in your syllabus are:
A. Linear Regression
Definition: Models the relationship between dependent and independent variables
with a straight-line equation.
Formula:
y = b_0 + b_1 x
where b_0 = intercept, b_1 = slope.
Types:
1. Simple Linear Regression:
One independent variable.
Example: Predicting marks based on study hours.
2. Multiple Linear Regression:
Two or more independent variables.
Example: Predicting house price using size, location, and number of
bedrooms.
Advantages: Easy to interpret, works well for linear data.
Limitations: Cannot model non-linear relationships well.
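A minimal sketch of simple linear regression, assuming scikit-learn and a small made-up study-hours dataset (multiple linear regression uses the same API, just with more than one column in X):

```python
# Minimal sketch of simple linear regression (scikit-learn assumed;
# the study-hours -> marks values are made up for illustration).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])   # hours studied (one feature)
y = np.array([35, 50, 60, 72, 85])        # marks obtained

model = LinearRegression()
model.fit(X, y)

print("intercept b0:", model.intercept_)
print("slope b1   :", model.coef_[0])
print("predicted marks for 6 hours:", model.predict([[6]])[0])
```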
B. Non-Linear Regression
Definition: Models situations where the relationship between variables is not a
straight line.
Formula: Could be polynomial, exponential, logarithmic, etc.
Example: Population growth, disease spread curves.
Advantages: Can handle complex patterns.
Limitations: More complex, harder to interpret, may require iterative methods.
C. Polynomial Regression
Definition: A special case of non-linear regression where the model is a polynomial
of the independent variable(s).
Formula:
y = b_0 + b_1 x + b_2 x^2 + ... + b_n x^n
Example: Predicting traffic flow across different times of the day.
Advantage: Fits curves better than linear regression.
Limitation: Risk of overfitting if degree is too high.
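A minimal polynomial regression sketch, assuming scikit-learn; the degree-2 choice and the toy traffic values are illustrative assumptions:

```python
# Polynomial regression sketch: expand x into [x, x^2, ...] and fit an
# ordinary linear model on the expanded features (scikit-learn assumed).
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

X = np.array([[1], [2], [3], [4], [5], [6]])   # hour of day (toy values)
y = np.array([10, 30, 70, 90, 60, 20])         # traffic flow (toy values)

# degree=2 keeps the curve simple; a very high degree risks overfitting
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X, y)
print("predicted flow at 3.5:", poly_model.predict([[3.5]])[0])
```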
D. Stepwise Regression
Definition: Iteratively adds or removes variables to find the most relevant predictors.
Types:
1. Forward Selection – Start with no variables, add one by one.
2. Backward Elimination – Start with all variables, remove the least useful
ones.
Advantages: Reduces complexity, focuses on important variables.
Limitations: Can lead to overfitting, may miss the best combination of features.
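Stepwise selection can be approximated in code with greedy sequential feature selection; the sketch below assumes scikit-learn's SequentialFeatureSelector and synthetic data:

```python
# Forward selection sketch via greedy sequential feature selection
# (scikit-learn assumed; data are synthetic).
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       random_state=0)

selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=3,   # keep the 3 most useful predictors
    direction="forward",      # "backward" approximates backward elimination
)
selector.fit(X, y)
print("selected feature indices:", selector.get_support(indices=True))
```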
E. Decision Tree Regression
Definition: Uses a tree-like model to split data into smaller groups based on feature
values, predicting the average of the group.
Advantages: Easy to interpret, handles non-linear data.
Limitations: Can overfit; predictions are sensitive to small changes in the data.
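A minimal decision tree regression sketch, assuming scikit-learn and made-up salary data:

```python
# Decision tree regression sketch (scikit-learn assumed, toy data).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])   # years of experience
y = np.array([30, 32, 40, 45, 60, 62, 80, 85])           # salary (toy values)

# max_depth limits the number of splits, which also helps against overfitting
tree = DecisionTreeRegressor(max_depth=2, random_state=0)
tree.fit(X, y)
print(tree.predict([[5.5]]))   # prediction = average of the matching leaf group
```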
F. Random Forest Regression
Definition: An ensemble method that combines many decision trees to improve
accuracy.
Advantages: High accuracy, handles missing data, less overfitting.
Limitations: More complex, less interpretable than a single tree.
✅ Summary Table:
| Type | Relationship Shape | Complexity | Handles Non-Linear Data? | Example |
|---|---|---|---|---|
| Simple Linear | Straight line | Low | ❌ No | Marks vs Study Hours |
| Multiple Linear | Straight plane | Medium | ❌ No | House Price |
| Polynomial | Curved line | Medium | ✅ Yes | Traffic Flow |
| Stepwise | Variable | Medium | ✅ Sometimes | Feature Selection |
| Decision Tree | Piecewise splits | Medium | ✅ Yes | Salary Prediction |
| Random Forest | Many trees | High | ✅ Yes | Stock Price |
Differentiate multivariate regression and univariate regression.
| Aspect | Univariate Regression | Multivariate Regression |
|---|---|---|
| Number of Variables Considered | Deals with only one dependent variable and one independent variable (Simple Linear Regression), or one dependent variable and multiple independent variables (Multiple Regression is still univariate if there is only one dependent variable). | Deals with more than one dependent variable and multiple independent variables. |
| Purpose | Studies the relationship between a single dependent variable and predictors. | Studies relationships among multiple dependent variables simultaneously. |
| Complexity | Less complex, easier to visualize and interpret. | More complex, requires advanced statistical techniques. |
| Equation Form | y = b_0 + b_1 x (or extended for multiple predictors, but still one y). | Multiple equations, one for each dependent variable, e.g., y_1 = b_{01} + b_{11} x_1 + ... and y_2 = b_{02} + b_{12} x_1 + ... |
| Output | Predicts one output value. | Predicts multiple output values at once. |
| Example | Predicting a student's marks based on study hours. | Predicting the height and weight of a person based on age, diet, and exercise. |
In Short:
Univariate regression → 1 dependent variable
Multivariate regression → 2 or more dependent variables
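A small sketch of this difference, assuming scikit-learn (whose LinearRegression accepts a multi-column target) and made-up data:

```python
# Univariate vs multivariate regression sketch (scikit-learn assumed, toy data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: age, diet score, exercise hours (made-up values)
X = np.array([[20, 5, 2], [25, 6, 3], [30, 4, 1], [35, 7, 4]])

# Univariate: one dependent variable (e.g., weight)
y_weight = np.array([60, 65, 72, 70])
uni_model = LinearRegression().fit(X, y_weight)

# Multivariate: two dependent variables predicted together (height, weight)
Y = np.array([[165, 60], [170, 65], [168, 72], [175, 70]])
multi_model = LinearRegression().fit(X, Y)
print(multi_model.predict([[28, 5, 2]]))   # returns [height, weight]
```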
Explain Bias-Variance Trade-off with respect to Machine Learning.
1. What is Bias?
Definition: The error caused by wrong assumptions in the learning algorithm.
High Bias → Underfitting
o Model is too simple.
o Misses important patterns in the data.
o Performs poorly on both training and test data.
Example: Trying to fit a straight line to curved data.
2. What is Variance?
Definition: The error caused by model sensitivity to small changes in the training
data.
High Variance → Overfitting
o Model is too complex.
o Fits noise as well as actual patterns.
o Performs well on training data but poorly on new data.
Example: Very deep decision tree memorizing the training set.
3. Bias–Variance Trade-off
Definition: The balance between bias and variance to achieve the best generalization
on unseen data.
Goal: Find the "sweet spot" where total error is minimal.
Reason for Trade-off:
o If a model is too simple → High bias, low variance → Underfits.
o If a model is too complex → Low bias, high variance → Overfits.
o We need a model that’s just complex enough to capture patterns without
memorizing noise.
4. Graphical Understanding
Imagine a curve showing:
Bias decreases as model complexity increases.
Variance increases as model complexity increases.
Total error is minimized at a middle point → This is the ideal trade-off.
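This curve can be reproduced numerically; the sketch below assumes scikit-learn, synthetic curved data, and polynomial degree as the stand-in for model complexity:

```python
# Bias-variance sketch: train vs validation error as complexity grows
# (scikit-learn assumed; data generated synthetically for illustration).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=80)   # curved data + noise

X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):   # too simple, about right, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(degree,
          mean_squared_error(y_tr, model.predict(X_tr)),    # training error
          mean_squared_error(y_val, model.predict(X_val)))  # validation error
```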
5. Summary Table
| Aspect | Low Bias & High Variance | High Bias & Low Variance |
|---|---|---|
| Model Complexity | Too complex | Too simple |
| Error on Training Data | Low | High |
| Error on Test Data | High | High |
| Problem Type | Overfitting | Underfitting |
| Example | High-degree polynomial curve | Straight line for curved data |
✅ Key Tip for Exams:
Think of it like Goldilocks’ porridge:
Too simple → underfit (high bias).
Too complex → overfit (high variance).
Just right → good trade-off, best performance.
Differentiate Ridge and Lasso Regression techniques
1. Basic Idea
Both Ridge and Lasso are regularization techniques used in regression to:
Reduce overfitting
Improve model generalization
Work by adding a penalty term to the regression cost (loss) function
3. Summary in Simple Words
Ridge → "Shrink but don’t delete" coefficients.
Lasso → "Shrink and sometimes delete" coefficients.
4. Quick Example
Imagine predicting house prices with 100 features:
Ridge will keep all features but reduce the importance of less useful ones.
Lasso will completely remove irrelevant features and keep only the most important
ones.
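This behaviour can be checked directly: Ridge applies an L2 penalty (sum of squared coefficients) and Lasso an L1 penalty (sum of absolute coefficients). The sketch below assumes scikit-learn, synthetic data with 100 features, and an arbitrary penalty strength:

```python
# Ridge vs Lasso sketch: Ridge shrinks all coefficients, Lasso sets many
# of them exactly to zero (scikit-learn assumed; data are synthetic).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

X, y = make_regression(n_samples=200, n_features=100, n_informative=10,
                       noise=10, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("Ridge coefficients exactly zero:", np.sum(ridge.coef_ == 0))  # usually 0
print("Lasso coefficients exactly zero:", np.sum(lasso.coef_ == 0))  # many removed
```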
Explain three evaluation metrics used for regression models.
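Three commonly chosen metrics are MAE, MSE, and R²; treating these as the intended three is an assumption. A minimal sketch computing them with scikit-learn on toy values:

```python
# Three common regression metrics on toy predictions (scikit-learn assumed;
# which three metrics are intended is an assumption).
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.8, 5.4, 2.0, 6.5]

print("MAE :", mean_absolute_error(y_true, y_pred))   # average absolute error
print("MSE :", mean_squared_error(y_true, y_pred))    # average squared error
print("R^2 :", r2_score(y_true, y_pred))              # fraction of variance explained
```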
Explain the Random forest Regression in detail.
1. What is Random Forest Regression?
Definition: A machine learning algorithm that predicts continuous numerical values
by combining results from multiple decision trees (an ensemble method).
Idea: Instead of relying on one decision tree (which might overfit), build many trees
and average their predictions.
Type: Supervised learning algorithm.
2. How It Works
Random Forest builds multiple decision trees in four main steps:
1. Bootstrap Sampling (Bagging)
o Randomly select samples with replacement from the dataset to train each
tree.
o Ensures each tree gets slightly different data.
2. Feature Sampling
o At each split in a tree, only a random subset of features is considered.
o Helps make trees diverse and less correlated.
3. Tree Building
o Each tree is grown independently using its sampled data and features.
o Uses Mean Squared Error (MSE) as splitting criterion for regression tasks.
4. Prediction Aggregation
o For regression, predictions from all trees are averaged to get the final output.
3. Example
Suppose we want to predict house price:
Tree 1 predicts ₹52 lakh
Tree 2 predicts ₹50 lakh
Tree 3 predicts ₹55 lakh
Final Prediction = (52 + 50 + 55) / 3 = ₹52.33 lakh
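A minimal Random Forest Regression sketch, assuming scikit-learn and made-up house data; the averaging over trees happens inside predict():

```python
# Random Forest Regression sketch (scikit-learn assumed, toy house data).
# Each tree trains on a bootstrap sample with random feature subsets;
# the final prediction is the average over all trees.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Features: size (sq. ft), bedrooms, age of house (made-up values)
X = np.array([[1000, 2, 10], [1500, 3, 5], [2000, 4, 2],
              [1200, 2, 8], [1800, 3, 3]])
y = np.array([50, 65, 90, 55, 80])   # price in lakh (toy values)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)
print("predicted price:", forest.predict([[1600, 3, 4]])[0])
print("feature importances:", forest.feature_importances_)
```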
4. Advantages
High Accuracy: Averaging multiple trees reduces error.
Handles Non-linear Relationships: Works well with complex patterns.
Robustness: Less affected by noise or missing values.
Feature Importance: Can tell which features impact predictions most.
Less Overfitting: Bagging and feature sampling reduce variance.
5. Disadvantages
Complexity: More difficult to interpret compared to a single tree.
Computation Time: Slower to train and predict if there are many trees.
Memory Usage: Requires storing multiple trees in memory.
6. When to Use
Large datasets with many features.
Problems with non-linear or complex relationships.
When avoiding overfitting is important.
✅ Quick Summary Table
| Aspect | Random Forest Regression |
|---|---|
| Type | Ensemble (Bagging) |
| Base Learner | Decision Tree |
| Output | Average of tree outputs |
| Strength | High accuracy, robust |
| Weakness | Less interpretable, slower |
Differentiate between Regression and Correlation.
| Aspect | Correlation | Regression |
|---|---|---|
| Meaning | Measures the strength and direction of the relationship between two variables. | Models the relationship between dependent and independent variables to make predictions. |
| Purpose | To see if variables are related and how strongly. | To predict the value of a dependent variable based on one or more independent variables. |
| Output | A single value (correlation coefficient, e.g., Pearson's r) between -1 and +1. | An equation that describes the relationship, e.g., y = b_0 + b_1 x. |
| Direction of Relationship | Shows positive, negative, or no correlation. | Shows how much the dependent variable changes when the independent variable changes. |
| Prediction | ❌ Cannot be used for prediction. | ✅ Can be used for prediction. |
| Causation | Does not imply causation. | Does not prove causation, but can help investigate possible causal effects. |
| Mathematical Expression | Single coefficient r. | Equation with coefficients (slope, intercept). |
| Example | Correlation between ice cream sales and temperature. | Predicting house price based on size and location. |
✅ Key Tip to Remember:
Correlation → "Are they related?" (strength & direction only)
Regression → "How are they related?" + "Can we predict?"
What is underfitting and overfitting in Machine Learning? Explain the
techniques to reduce overfitting.
1. Underfitting
Definition:
Happens when a model is too simple to capture the underlying patterns in data.
Performs poorly on both training data and test data.
Causes:
Model complexity is too low.
Not enough training time (early stopping too soon).
Missing important features in the dataset.
Incorrect assumptions (e.g., using linear regression for non-linear data).
Characteristics:
High Bias, Low Variance.
Predictions are inaccurate even on training data.
Example:
Using a straight line (linear model) to fit a dataset with a clear curve.
2. Overfitting
Definition:
Happens when a model memorizes the training data, including noise and outliers.
Performs well on training data but poorly on unseen (test) data.
Causes:
Model complexity is too high.
Too many features without proper regularization.
Training for too many epochs without monitoring performance.
Small dataset with high model capacity.
Characteristics:
Low Bias, High Variance.
Training error is low, but test error is high.
Example:
Very deep decision tree fitting every point in training data, including noise.
3. Bias–Variance View
Underfitting → High Bias, Low Variance.
Overfitting → Low Bias, High Variance.
Goal: Find the right bias–variance trade-off for best generalization.
4. Techniques to Reduce Overfitting
Here are the main methods used in practice:
A. Simplify the Model
Reduce the number of features (Feature Selection).
Use fewer parameters.
B. Regularization
Add penalty terms to control coefficient size:
o Ridge Regression (L2 penalty)
o Lasso Regression (L1 penalty)
o Elastic Net (combination of L1 & L2).
C. Cross-Validation
Use k-fold cross-validation to check performance on different subsets of data and
prevent reliance on a single train/test split.
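A minimal k-fold cross-validation sketch, assuming scikit-learn, a Ridge model, and synthetic data:

```python
# k-fold cross-validation sketch to check generalization
# (scikit-learn assumed; data are synthetic).
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=150, n_features=20, noise=15, random_state=0)

# 5-fold CV: the model is trained/validated on 5 different splits,
# so the score does not depend on a single lucky train/test split.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
print("R^2 per fold:", scores)
print("mean R^2   :", scores.mean())
```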
D. Early Stopping
Stop training when validation error starts increasing, even if training error is
decreasing.
E. Pruning (in Decision Trees)
Remove unnecessary branches to simplify the tree.
F. Dropout (in Neural Networks)
Randomly drop some neurons during training to prevent over-dependence on certain
paths.
G. Data Augmentation
Create more training data artificially (especially for images, text) by transformations
like rotation, flipping, cropping, etc.
H. Increase Training Data
More diverse training samples help the model generalize better.
5. Quick Comparison Table
| Feature | Underfitting | Overfitting |
|---|---|---|
| Model Complexity | Too simple | Too complex |
| Bias | High | Low |
| Variance | Low | High |
| Training Error | High | Low |
| Test Error | High | High |
| Fix | Increase complexity | Reduce complexity / Regularize |
Explain Elastic Net regression in Machine Learning.
1. What is Elastic Net Regression?
Definition:
Elastic Net is a regularization technique that combines Ridge Regression (L2
penalty) and Lasso Regression (L1 penalty) into a single model.
Purpose:
To handle limitations of both Ridge and Lasso and work well when:
o There are many correlated features.
o We need both feature selection and coefficient shrinkage.
3. How It Works
L1 (Lasso) part → forces some coefficients to exactly zero (feature selection).
L2 (Ridge) part → shrinks remaining coefficients smoothly (reduces variance).
This combination helps when:
o Some features are irrelevant.
o Some features are highly correlated.
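A minimal Elastic Net sketch, assuming scikit-learn and synthetic correlated features; the alpha and l1_ratio values are arbitrary choices for illustration:

```python
# Elastic Net sketch (scikit-learn assumed; data are synthetic).
# l1_ratio controls the L1/L2 mix: 1.0 = pure Lasso, 0.0 = pure Ridge.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=50, n_informative=8,
                       noise=10, random_state=0)

enet = ElasticNet(alpha=1.0, l1_ratio=0.5)   # alpha = overall penalty strength
enet.fit(X, y)

print("coefficients set exactly to zero:", np.sum(enet.coef_ == 0))
print("largest remaining coefficient  :", np.max(np.abs(enet.coef_)))
```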
4. Advantages
Handles multicollinearity (like Ridge).
Performs feature selection (like Lasso).
Works well when:
o Number of predictors > Number of observations.
o Features are highly correlated.
More stable than Lasso when predictors are correlated.
5. Disadvantages
Slightly more complex to tune because there are two hyperparameters (λ and α).
Requires careful cross-validation to find the best values.
6. Example Use Case
Genomics: Thousands of gene features, many correlated, but only some relevant for
predicting a disease risk.
Finance: Predicting stock returns where many economic indicators are correlated.
7. Quick Summary Table
| Feature | Ridge | Lasso | Elastic Net |
|---|---|---|---|
| Penalty | L2 | L1 | L1 + L2 |
| Feature Selection | No | Yes | Yes |
| Handles Multicollinearity | Yes | No | Yes |
| Coefficient Shrinking | Yes | Yes (some to zero) | Yes |
| Best When | All features useful | Few features important | Many correlated features & selection needed |