Introduction to Regression Analysis
Muhammad Naveed Aman
School of Computing
About Me
• B. Sc. Computer Systems Engineering
• UET Peshawar, Pakistan.
• M. Sc. Computer Engineering
• CASE Islamabad, Pakistan.
• M. Engg. Industrial and Management Engineering
• RPI, Troy, NY, USA
• Ph. D. Electrical Engineering
• RPI, Troy, NY, USA
Pearson Correlation
Introduction to Linear Regression
• Regression is a statistical procedure that determines the equation for the straight line that best fits a specific set of data.
• It assumes a linear relationship between the dependent and independent variables.
• Nonlinear estimation is extremely difficult to perform.
• Even when a relationship is nonlinear, we can often make a linear approximation to that nonlinear relationship.
Introduction to Linear Regression (Contd)
• Equation for a straight line
• 𝑌 = 𝑏𝑋 + 𝑎
• The value of b is called the slope constant and determines the direction and degree to which the line is tilted.
• The value of a is called the Y-intercept and determines the point where the line crosses the Y-axis.
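The straight-line equation can be evaluated directly; a minimal sketch with hypothetical slope and intercept values chosen for illustration:

```python
# Sketch: evaluating the regression line Y = bX + a.
# b and a here are hypothetical values for illustration.
b = 2.0  # slope: Y rises 2 units per unit increase in X
a = 1.0  # Y-intercept: value of Y when X = 0

def predict(x):
    """Predicted Y for a given X on the line Y = bX + a."""
    return b * x + a

print(predict(0))  # 1.0 -- the line crosses the Y-axis at a
print(predict(3))  # 7.0
```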
Introduction to Linear Regression (Contd)
• Distance between the data points and the
line
• The total error between the data points and
the line is obtained by squaring each
distance and then summing the squared
values.
• The regression equation is designed to
produce the minimum sum of squared
errors.
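The "square each distance, then sum" idea can be sketched directly; the small data set below is made up for illustration, and the least-squares slope and intercept are computed with the standard formulas:

```python
# Sketch: sum of squared errors (SSE) for a candidate line,
# using a small made-up data set for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

def sse(b, a):
    """Total error: square each vertical distance to the line, then sum."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y))

# Least-squares slope and intercept (standard formulas):
n = len(X)
mx, my = sum(X) / n, sum(Y) / n
b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx

# The regression line produces the minimum SSE; any other line does worse.
print(sse(b, a))      # 2.4 for this data set
print(sse(1.0, 1.0))  # 4.0 -- larger than the least-squares SSE
```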
Regression Line
• 𝒃 = 𝒓 (𝑺_𝒀 / 𝑺_𝑿)
• 𝑨 = 𝑴_𝒀 − 𝒃 𝑴_𝑿
• Standardized Variables
• 𝑏 = 𝑟
• 𝐴 = 0
R2 and Adj R2
• The R-squared statistic provides a measure of how well the model fits the actual data.
• It measures the strength of the linear relationship between the predictor variable and the response / target variable.
• It lies between 0 and 1.
• A value near 0 indicates a regression that does not explain the variance in the response variable well, while a value close to 1 indicates that the model explains most of the observed variance in the response variable.
• For Example: R2 = 0.65
• 65% of the variance found in the response variable
can be explained by the predictor variable
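The "proportion of explained variance" interpretation can be sketched as 1 minus the ratio of residual to total sum of squares; the data set below is made up for illustration:

```python
# Sketch: R-squared as the proportion of variance in Y explained by X,
# on a small made-up data set for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

# Least-squares slope and intercept (standard formulas):
n = len(X)
mx, my = sum(X) / n, sum(Y) / n
b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx

ss_res = sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y))  # unexplained variation
ss_tot = sum((y - my) ** 2 for y in Y)                      # total variation
r2 = 1 - ss_res / ss_tot

print(r2)  # 0.6 -> 60% of the variance in Y is explained by X
```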
Residuals vs Fitted Values
Question & Answer