Experiment No - 03
Aim: To implement Logistic Regression.
Theory:
What is Logistic Regression:
Logistic regression is a statistical method used for binary classification problems, where the
outcome or dependent variable has two possible values, often represented as 0 and 1. Unlike
linear regression, which predicts continuous values, logistic regression predicts the probability
that a given input point belongs to a specific class.
Key Concepts:
1. Sigmoid Function: Logistic regression uses the sigmoid function to map predicted
values to probabilities. The sigmoid function, σ(z), is defined as:
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
where z is the linear combination of the input features.
2. Binary Outcome: The dependent variable y is binary, taking values of 0 or 1. The logistic
function outputs probabilities between 0 and 1, which can then be mapped to these binary
outcomes using a threshold (typically 0.5).
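3. Log-Odds and Odds Ratio: Logistic regression models the log-odds of the probability of the
outcome as a linear function of the input features:
$$\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$$
where p is the probability that y = 1. Exponentiating a coefficient gives the odds ratio
associated with a one-unit increase in the corresponding feature.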
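The three concepts above can be checked numerically. The following is a minimal sketch using NumPy only; the helper names sigmoid and logit and the sample scores in z are illustrative assumptions and are not part of the experiment's program.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Log-odds of a probability p; the inverse of the sigmoid."""
    return np.log(p / (1.0 - p))

# Hypothetical linear scores (illustrative values only)
z = np.array([-2.0, 0.0, 3.0])

p = sigmoid(z)                   # probabilities of class 1
labels = (p >= 0.5).astype(int)  # apply a 0.5 threshold

print("probabilities:", p)
print("labels:", labels)
print("log-odds recovered from p:", logit(p))
```

Taking the log-odds of the sigmoid output returns the original scores, which is why a model that is linear in the log-odds produces probabilities through the sigmoid.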
Applications:
● Medical: Predicting the presence or absence of a disease.
● Finance: Predicting the likelihood of a customer defaulting on a loan.
● Marketing: Determining whether a customer will buy a product.
● Binary Classification Problems: Any problem where the outcome is one of two
categories.
Program for Logistic Regression:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt
# Generate some example data
np.random.seed(0)
X = np.random.rand(100, 2) # 100 samples with 2 features
y = (X[:, 0] + X[:, 1] > 1).astype(int) # Binary target variable
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
# Create the Logistic Regression model
model = LogisticRegression()
# Train the model using the training sets
model.fit(X_train, y_train)
# Make predictions using the testing set
y_pred = model.predict(X_test)
# Print evaluation metrics
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n",
      classification_report(y_test, y_pred))
# Optional: Visualize the decision boundary
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.01),
                     np.arange(y_min, y_max, 0.01))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.Spectral)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o',
            cmap=plt.cm.Spectral)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Logistic Regression Decision Boundary')
plt.show()
Explanation:
Data Generation: We generate some synthetic data for demonstration purposes. X is our matrix
of features, and y is the binary target variable.
Data Splitting: The data is split into training and testing sets using train_test_split.
Model Creation: A LogisticRegression model is created.
Model Training: The model is trained on the training data using the fit method.
Prediction: Predictions are made on the test set using the predict method.
Evaluation: The confusion matrix and classification report are printed to evaluate the model's
performance.
Visualization: An optional plot shows the decision boundary of the logistic regression model.
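To connect the prediction step back to the theory, the sketch below rebuilds the model's probabilities from its learned coefficients and intercept. It assumes the same synthetic data, split, and default LogisticRegression settings as the program above; the variable names proba, manual_proba, and manual_pred are introduced here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same synthetic data and split as the program above
np.random.seed(0)
X = np.random.rand(100, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LogisticRegression().fit(X_train, y_train)

# Column 1 of predict_proba holds the estimated probability of class 1
proba = model.predict_proba(X_test)[:, 1]

# Rebuild the same probabilities from the learned parameters:
# z = intercept + w1*x1 + w2*x2, then apply the sigmoid
z = X_test @ model.coef_.ravel() + model.intercept_[0]
manual_proba = 1.0 / (1.0 + np.exp(-z))

# Thresholding at 0.5 should reproduce predict() for this binary model
manual_pred = (manual_proba > 0.5).astype(int)

print("max probability difference:", np.abs(proba - manual_proba).max())
print("matches predict():", np.array_equal(manual_pred, model.predict(X_test)))
```

The decision boundary drawn in the plot is the line where z = 0, that is, where the predicted probability equals 0.5.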
Output:
Conclusion:
Hence, we have successfully implemented logistic regression using Python
libraries such as NumPy and scikit-learn, and predicted a binary dependent
variable from two independent variables.