Experiment No. 2
Aim: To study and implement Simple Linear Regression in Python.
Software used: Python
Theory: Simple Linear Regression is one of the simplest and most widely used statistical
techniques in machine learning and predictive modeling. It is used to establish a linear
relationship between two variables: an independent variable (x) and a dependent variable (y).
The goal is to model the dependent variable as a linear function of the independent variable.
Key Objectives
1. Understand the relationship between the two variables: how does y change when x
changes?
2. Predict future values of y based on new observations of x.
3. Optimize the model by minimizing prediction errors.
Assumptions of Simple Linear Regression
1. Linearity: The relationship between the independent and dependent variable must be
linear.
2. Independence of Errors: The residuals (differences between observed and predicted
values) should be independent.
3. Homoscedasticity: The variance of residuals should remain constant across all levels
of the independent variable.
4. Normality of Errors: The residuals should be approximately normally distributed.
Mathematical Representation
The equation of a simple linear regression is:
y = β0 + β1·x + ϵ
Where:
• y: Dependent variable (response variable or target).
• x: Independent variable (predictor or feature).
• β0: Intercept (the predicted value of y when x = 0).
• β1: Slope (rate of change in y per unit change in x).
• ϵ: Error term (captures noise or variability not explained by the model).
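To make the roles of the terms concrete, the sketch below generates synthetic data from the model y = β0 + β1·x + ϵ. The parameter values (β0 = 2.0, β1 = 0.5), the noise scale, and the sample size are assumptions chosen only for illustration; they do not come from the experiment.

```python
import numpy as np

# Hypothetical "true" parameters, chosen only for illustration.
beta0, beta1 = 2.0, 0.5

rng = np.random.default_rng(seed=0)   # fixed seed so the run is repeatable
x = np.linspace(0, 10, 50)            # 50 evenly spaced values of x
epsilon = rng.normal(loc=0.0, scale=1.0, size=x.shape)  # error term

# Observed response: straight line plus random noise.
y = beta0 + beta1 * x + epsilon

# Without the error term every point lies exactly on the line.
y_noise_free = beta0 + beta1 * x
print("first and last noise-free values:", y_noise_free[0], y_noise_free[-1])
```

The difference y - y_noise_free is exactly the error term, which is the part of y that the straight line cannot explain.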
Coefficients Calculation
Using the Ordinary Least Squares (OLS) method:
β1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
β0 = ȳ − β1·x̄
Where:
• x̄: Mean of x.
• ȳ: Mean of y.
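A minimal sketch of the manual OLS calculation is given here: the slope is the sum of the products of the deviations of x and y from their means, divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The function name fit_simple_ols and the five data points are made up for illustration.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (beta0, beta1) for the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: makes the line pass through the point of means.
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

# Made-up sample data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta0, beta1 = fit_simple_ols(x, y)
print(f"intercept = {beta0:.2f}, slope = {beta1:.2f}")  # intercept = 2.20, slope = 0.60
```

For this data the means are 3 and 4, the cross-deviation sum is 6 and the squared-deviation sum is 10, so the slope is 0.6 and the intercept is 4 − 0.6 × 3 = 2.2.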
Model Performance Metrics
1. Mean Squared Error (MSE): Measures the average squared difference between
observed and predicted values.
2. Root Mean Squared Error (RMSE): Square root of MSE, representing errors in the
same units as y.
3. R² (Coefficient of Determination): Measures the proportion of variance in y that is
explained by x.
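The three metrics can be computed directly from the observed and predicted values, as in the sketch below. The helper name regression_metrics and the sample values are assumptions for illustration; R² is computed here as 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (mse, rmse, r2) for observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)          # average squared error
    rmse = np.sqrt(mse)                    # same units as y
    ss_res = np.sum(residuals ** 2)        # variation left unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation in y
    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2

# Made-up observations and the predictions of the line y = 2.2 + 0.6x at x = 1..5.
y_true = [2, 4, 5, 4, 5]
y_pred = [2.8, 3.4, 4.0, 4.6, 5.2]
mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE = {mse:.2f}, RMSE = {rmse:.3f}, R^2 = {r2:.2f}")
```

Here the squared residuals sum to 2.4 over 5 points, so MSE is 0.48, and the total sum of squares is 6, so R² is 0.6.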
Application Scenarios
• Predictive Analytics: Forecast sales, stock prices, or weather.
• Risk Assessment: Estimate probabilities of events based on influencing factors.
• Scientific Research: Explore relationships between variables in experiments.
Python Implementation Outline
• Import Libraries: Use libraries like numpy, pandas, and scikit-learn.
• Prepare Data: Load and preprocess the dataset.
• Fit the Model: Use scikit-learn's LinearRegression or manually compute β0 and β1.
• Make Predictions: Use the model to predict y values for new x inputs.
• Evaluate: Assess performance using metrics like MSE or R².
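The steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the code submitted for this experiment: the five data points are made up, and the data is built in memory with numpy rather than loaded from a file with pandas.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Prepare data: made-up values; a real dataset would typically be loaded with pandas.
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)
X = x.reshape(-1, 1)   # scikit-learn expects a 2-D array of shape (n_samples, n_features)

# Fit the model.
model = LinearRegression()
model.fit(X, y)
print(f"intercept = {model.intercept_:.2f}, slope = {model.coef_[0]:.2f}")

# Make a prediction for a new x value.
x_new = np.array([[6.0]])
y_new = model.predict(x_new)
print(f"predicted y at x = 6: {y_new[0]:.2f}")

# Evaluate on the data used for fitting.
y_fit = model.predict(X)
mse = mean_squared_error(y, y_fit)
r2 = r2_score(y, y_fit)
print(f"MSE = {mse:.2f}, R^2 = {r2:.2f}")
```

Evaluating on the same points used for fitting keeps the sketch short; with a real dataset, holding out a separate test set gives a more honest estimate of prediction error.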
Diagram of Simple Linear Regression
Conclusion: Thus, we have studied and implemented Simple Linear Regression in Python.
Name: Madhura Kanse ROLL NO: 619
Code:
Output: