Experiment-1: Exercises to solve real-world problems using the following machine
learning methods: Linear Regression and Logistic Regression.
Linear regression
Description: Linear regression models the relationship between two variables by assuming a
linear connection between the independent and dependent variables. It makes predictions for
continuous (real-valued, numeric) variables such as sales, salary, age, product price, etc.
It seeks the line that minimizes the sum of squared differences between predicted and actual
values. It is applied in domains such as economics and finance to analyse and forecast data
trends.
To calculate the best-fit line, linear regression uses the traditional slope-intercept form,
which is given below:
1. Yi = β0 + β1Xi
where Yi = dependent variable, β0 = constant/intercept, β1 = slope, Xi =
independent variable.
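The following is a minimal sketch of how β0 and β1 could be computed by hand with the usual least-squares formulas, assuming numpy is available. The data values are the same ones used later in this experiment; the variable names beta0, beta1, x_mean and y_mean are illustrative only.

```python
import numpy as np

# Sample data (the same values used in the program further down).
x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6], dtype=float)
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# Slope: sum of cross-deviations divided by sum of squared x-deviations.
beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# Intercept: the least-squares line passes through the point of means.
beta0 = y_mean - beta1 * x_mean

print(f"beta0 (intercept) = {beta0:.3f}")
print(f"beta1 (slope)     = {beta1:.3f}")
```

These two numbers should agree with the slope and intercept that a library routine such as stats.linregress reports for the same data.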
You can use simple linear regression to model the relationship between two variables, such as
these:
Rainfall and crop yield
Age and height in children
Temperature and expansion of the metal mercury in a thermometer
Import the modules you need.
import matplotlib.pyplot as plt
from scipy import stats
Create the arrays that represent the values of the x and y axis:
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
Execute a method that returns some important key values of Linear Regression:
slope, intercept, r, p, std_err = stats.linregress(x, y)
Create a function that uses the slope and intercept values to return a new value. This new
value represents where on the y-axis the corresponding x value will be placed:
def myfunc(x):
    return slope * x + intercept
Run each value of the x array through the function. This will result in a new array with new
values for the y-axis:
mymodel = list(map(myfunc, x))
Draw the original scatter plot:
plt.scatter(x, y)
Draw the line of linear regression:
plt.plot(x, mymodel)
Display the diagram:
plt.show()
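Once the line is fitted, the same slope and intercept can be used to predict y for an x value that is not in the data, and r indicates how well a straight line describes the data. The sketch below assumes the same x and y arrays as above; new_x = 10 is an arbitrary value chosen for illustration.

```python
from scipy import stats

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    # Point on the fitted line for a given x.
    return slope * x + intercept

# r lies between -1 and 1; values near 0 mean a weak linear relationship.
print("correlation coefficient r:", r)

# Predict y for an x value that is not in the data (illustrative choice).
new_x = 10
predicted_y = myfunc(new_x)
print("predicted y at x = 10:", predicted_y)
```

A value of r close to -1 or 1 suggests the straight line is a reasonable model; a value close to 0 suggests linear regression is not a good fit for the data.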
Program:
Start by drawing a scatter plot:
import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
plt.scatter(x, y)
plt.show()
Output: [scatter plot of the x and y values]
Import scipy and draw the line of Linear Regression:
import matplotlib.pyplot as plt
from scipy import stats
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
slope, intercept, r, p, std_err = stats.linregress(x, y)
def myfunc(x):
    return slope * x + intercept
mymodel = list(map(myfunc, x))
plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
Output: [scatter plot of the x and y values with the fitted regression line]
Logistic Regression
Description: Logistic regression aims to solve classification problems. It does this by
predicting categorical outcomes, unlike linear regression, which predicts a continuous outcome.
In the simplest case there are two outcomes, which is called binomial; an example is
predicting whether a tumor is malignant or benign. Other cases have more than two
outcomes to classify, which is called multinomial. A common example of multinomial
logistic regression is predicting the class of an iris flower among 3 different species.
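As a rough illustration of the multinomial case, the sketch below fits a logistic regression model on the iris dataset that ships with sklearn. It assumes sklearn is installed; the names X_iris, y_iris and clf are illustrative, and max_iter=1000 is simply a generous setting so the solver has enough iterations to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Load the bundled iris data: 4 measurements per flower, 3 species.
iris = load_iris()
X_iris, y_iris = iris.data, iris.target

# Fit a logistic regression model on all three classes.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_iris, y_iris)

# Predict the species of the first flower and show the class probabilities.
first_flower = X_iris[:1]
predicted_class = clf.predict(first_flower)
class_probabilities = clf.predict_proba(first_flower)

print("predicted species:", iris.target_names[predicted_class[0]])
print("probabilities per species:", class_probabilities[0])
```

The model returns one probability per species, and the predicted class is the one with the highest probability.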
Import the module you need:
import numpy
Store the independent variables in X.
Store the dependent variable in y.
Below is a sample dataset:
#X represents the size of a tumor in centimeters.
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69,
5.88]).reshape(-1,1)
#Note: X has to be reshaped into a column from a row for the LogisticRegression()
#function to work.
#y represents whether or not the tumor is cancerous (0 for "No", 1 for "Yes").
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
We will use a class from the sklearn module, so we will have to import that module as well:
from sklearn import linear_model
From the sklearn module we will use the LogisticRegression() class to create a logistic
regression object.
This object has a method called fit() that takes the independent and dependent values as
parameters and fills the regression object with data that describes the relationship:
logr = linear_model.LogisticRegression()
logr.fit(X,y)
Now we have a logistic regression object that is ready to predict whether a tumor is
cancerous based on the tumor size:
#predict if tumor is cancerous where the size is 3.46 cm:
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))
Program:
import numpy
from sklearn import linear_model
#Reshaped for Logistic function.
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69,
5.88]).reshape(-1,1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
logr = linear_model.LogisticRegression()
logr.fit(X,y)
#predict if tumor is cancerous where the size is 3.46 cm:
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))
print(predicted)
Output:
[0]
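The prediction above is a hard class label (0 or 1). As a minimal sketch, the fitted model can also report the estimated probability behind that label through predict_proba(). The sizes 1.0 and 5.0 below are arbitrary illustrative values added alongside the 3.46 cm case; the names sizes and probabilities are illustrative only.

```python
import numpy
from sklearn import linear_model

# Tumor sizes in centimeters, reshaped into a column for sklearn.
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65,
                 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
# 0 = not cancerous, 1 = cancerous.
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X, y)

# Illustrative sizes to evaluate, including the 3.46 cm case from the program.
sizes = numpy.array([1.0, 3.46, 5.0]).reshape(-1, 1)

# Column 1 of predict_proba holds the estimated probability of class 1.
probabilities = logr.predict_proba(sizes)[:, 1]

for size, prob in zip(sizes.ravel(), probabilities):
    print(f"size {size:.2f} cm -> estimated P(cancerous) = {prob:.2f}")
```

Because the cancerous tumors in this sample are the larger ones, the estimated probability rises with tumor size; predict() reports class 1 once that probability exceeds 0.5.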