Experiment No 2
Aim: To implement Logistic Regression.
Theory:
What is Logistic Regression?
Logistic Regression is a supervised learning algorithm used for
classification problems, where the goal is to predict a categorical
dependent variable (usually binary: 0 or 1, Yes or No, True or False).
Unlike linear regression, which predicts continuous values, logistic
regression predicts the probability that a data point belongs to a
particular class.
How Does Logistic Regression Work?
1. Input: Features X = (x1, x2, ..., xn)
2. Linear Combination: Calculate a weighted sum:
   z = w1*x1 + w2*x2 + ... + wn*xn + b
3. Apply Sigmoid Function:
   sigma(z) = 1 / (1 + e^(-z))
   This squashes the output to a value between 0 and 1, interpreted as the
   probability of belonging to class 1.
4. Decision: Predict class 1 if sigma(z) >= 0.5, otherwise predict class 0
   (0.5 is the default threshold).
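The steps above can be sketched directly in NumPy. The weights, bias, and
input rows below are made-up illustrative values (in practice they would be
learned from training data), but the pipeline — weighted sum, sigmoid,
threshold — is exactly the one described:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, w, b, threshold=0.5):
    # Step 2: linear combination z = X.w + b
    z = X @ w + b
    # Step 3: convert scores to probabilities of class 1
    probs = sigmoid(z)
    # Step 4: threshold the probabilities to get class labels
    return (probs >= threshold).astype(int), probs

# Illustrative (not learned) weights and inputs
X = np.array([[1.0, 2.0], [-1.5, 0.5]])
w = np.array([0.8, -0.4])
b = 0.1
labels, probs = predict(X, w, b)
print(labels)  # [1 0]
```

The first row gives z = 0.1, so sigma(z) is just above 0.5 and the model
predicts class 1; the second row gives z = -1.3, well below the threshold.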
Code and Output:
import numpy as np
import pandas as pd
data = pd.read_csv("framingham.csv") #importing the dataset
data.sample(5)
data.drop(['education'],axis=1,inplace=True) # removing the 'education' column
data.shape # checking the shape
data.isnull().sum() #checking if any null value present
data = data.dropna() # Remove the null values row
data.isnull().sum() # Check if any null value present
data.shape #Check the shape
data.dtypes #checking the data types
data['cigsPerDay'] = data['cigsPerDay'].astype(dtype='int64')
data['BPMeds'] = data['BPMeds'].astype(dtype='int64')
data['totChol'] = data['totChol'].astype(dtype='int64')
data['heartRate'] = data['heartRate'].astype(dtype='int64')
data['glucose'] = data['glucose'].astype(dtype='int64')
data.dtypes #checking the data types
X = data.iloc[:,0:-1] # All columns except last one as X
y = data.iloc[:,-1] # Only last column as y
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=.30,random_state=1) # splitting the data into train and test sets
X_train.shape
X_test.shape
from sklearn.linear_model import LogisticRegression
l_reg = LogisticRegression() # Making a logistic regression model
l_reg.fit(X_train,y_train) # Fitting the data
y_pred = l_reg.predict(X_test) # Predict the X_test data
from sklearn import metrics
metrics.accuracy_score(y_test,y_pred) # calculate the accuracy
We obtained an accuracy score of 0.8497777777777777, meaning the model
predicted almost 85% of the test samples correctly.
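Accuracy alone can be misleading when one class dominates, as it does for
the TenYearCHD target here, so it is worth inspecting a confusion matrix as
well. A minimal sketch using `sklearn.metrics` with small illustrative label
arrays (in the experiment, `y_test` and `y_pred` would be used instead):

```python
from sklearn import metrics

# Illustrative true and predicted labels (stand-ins for y_test, y_pred)
y_true = [0, 0, 1, 1, 0, 1]
y_hat  = [0, 0, 1, 0, 0, 1]

# Rows: actual class, columns: predicted class
cm = metrics.confusion_matrix(y_true, y_hat)
print(cm)
# Accuracy is (correct predictions) / (total predictions)
print(metrics.accuracy_score(y_true, y_hat))
```

Here the matrix shows one positive sample misclassified as negative, a kind
of error that a single accuracy number hides.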
Conclusion:
The logistic regression experiment successfully demonstrated how a
binary classification model can be used to predict categorical outcomes
based on input features. The model effectively estimated the probability
of class membership, allowing for clear decision boundaries. The results
indicated that logistic regression is a suitable and interpretable approach
for this classification problem, providing insights into the influence of
predictor variables.