Maths

Uploaded by

kushalabrijesh

Topic

PREDICTING POSSIBLE LOAN DEFAULTERS
OUR TEAM

KUSHALA B GOWDA   1KS23AI023   Coding and Data Analysis
MANISHA T P       1KS23AI028   Coding and Data collection
ANUSHA C          1KS23AI003   Presentation Layout and Design
LIKHITHA M        1KS23AI025   Report and editing
SOUJANYA                       Presentation, Coding, Typing

Objectives
The program aims to predict the likelihood of loan defaults among borrowers using statistical, probabilistic, and machine learning techniques.

This helps financial institutions make informed lending decisions and manage risk effectively.

Language
The programming language used is Python.

LOAN
In the last class we discussed:

1. PROBLEMS FACED BY BANKS
2. SOLUTIONS
3. MATHEMATICAL TOOLS
   * STATISTICS
   * GRAPHS
   * PROBABILITY
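
As an illustration of how the probability tool applies to the loan problem, the chance of default can be estimated as the share of defaulters among borrowers, overall or within a group. The sketch below is a minimal example on made-up records: the table `toy` and its values are hypothetical, and only the column names `House_Ownership` and `Risk_Flag` (1 = defaulter) follow the dataset used later.

```python
import pandas as pd

# Hypothetical toy records (not taken from the real dataset)
toy = pd.DataFrame({
    "House_Ownership": ["rented", "rented", "owned", "rented", "owned", "rented"],
    "Risk_Flag":       [1,        0,        0,       1,        0,       0],
})

# Empirical probability of default: defaulters / all borrowers
p_default = toy["Risk_Flag"].mean()

# Conditional probability of default given the type of house ownership
p_default_by_house = toy.groupby("House_Ownership")["Risk_Flag"].mean()

print("P(default) =", round(p_default, 4))
print(p_default_by_house)
```

Because `Risk_Flag` is coded 0/1, its mean is exactly the fraction of defaulters, which is why a plain average serves as the probability estimate here.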

SAMPLE DATA

ID  Income   Age  Experience  Married/Single  House_Ownership  Car_Ownership  Profession        CITY         STATE           CURRENT_JOB_YRS  CURRENT_HOUSE_YRS
1   7393090  59   19          single          rented           no             Geologist         Malda        West Bengal     4                13
2   1215004  25   5           single          rented           no             Firefighter       Jalna        Maharashtra     5                10
3   8901342  50   12          single          rented           no             Lawyer            Thane        Maharashtra     9                14
4   1944421  49   9           married         rented           yes            Analyst           Latur        Maharashtra     3                12
5   13429    25   18          single          rented           yes            Comedian          Berhampore   West Bengal     13               11
6   3437621  78   14          single          rented           no             Economist         Ramgarh      Jharkhand       3                10
7   5101498  55   0           married         rented           no             Artist            Pallavaram   Tamil Nadu      0                14
8   6716946  70   15          single          rented           yes            Flight attendant  Yamunanagar  Haryana         14               13
9   8369802  43   7           single          rented           no             Secretary         Anand        Gujarat         6                13
10  9565457  65   5           single          rented           yes            Engineer          Nandyal      Andhra Pradesh  3                12
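
To show how such records look once loaded, the sketch below puts the first four sample rows (a subset of the columns) into a pandas DataFrame and computes a few simple statistics. The variable name `sample` is only illustrative; the values are copied from the sample rows.

```python
import pandas as pd

# First four sample rows, restricted to a few columns
sample = pd.DataFrame({
    "Income": [7393090, 1215004, 8901342, 1944421],
    "Age": [59, 25, 50, 49],
    "Experience": [19, 5, 12, 9],
    "Married/Single": ["single", "single", "single", "married"],
    "Car_Ownership": ["no", "no", "no", "yes"],
})

# Numeric columns are stored as integers, categorical ones as text
print(sample.dtypes)

# Basic statistics: mean age and the count of each marital status
mean_age = sample["Age"].mean()
status_counts = sample["Married/Single"].value_counts()
print(mean_age)
print(status_counts)
```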
CODE
Language: Python
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
# List every file available under the Kaggle input directory
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

import matplotlib.pyplot as plt

import seaborn as sns
%matplotlib inline

sns.set_theme(style = "darkgrid")

data = pd.read_csv("/kaggle/input/loan-prediction-based-on-customer-behavior/Training Data.csv")

data.head()

rows, columns = data.shape  # understanding the data set

print('Rows:', rows)
print('Columns:', columns)

data.info()
print()
data.isnull().sum()

data.columns

data.describe() #Analysing Numerical columns

data.corr(numeric_only=True)  # correlate the numeric columns only; the text columns are skipped

data.hist( figsize = (22, 20) )

plt.show()

data["Risk_Flag"].value_counts()
fig, ax = plt.subplots( figsize = (12,8) )
corr_matrix = data.corr(numeric_only=True)
corr_heatmap = sns.heatmap( corr_matrix, cmap = "flare", annot=True, ax=ax, annot_kws={"size": 14})
plt.show()

def categorical_valcount_hist(feature):  # Analysing the categorical features
    print(data[feature].value_counts())
    fig, ax = plt.subplots(figsize=(6, 6))
    sns.countplot(x=feature, ax=ax, data=data)
    plt.show()


categorical_valcount_hist("Married/Single")
categorical_valcount_hist("House_Ownership")

print("Total categories in STATE:", len(data["STATE"].unique()))
print()
print(data["STATE"].value_counts())
print("Total categories in Profession:", len(data["Profession"].unique()))
print()
data["Profession"].value_counts()

data.info() #Data Analysis

sns.boxplot(x ="Risk_Flag",y="Income" ,data = data)

sns.boxplot(x ="Risk_Flag",y="Age" ,data = data)
sns.boxplot(x ="Risk_Flag",y="Experience" ,data = data)
sns.boxplot(x ="Risk_Flag",y="CURRENT_JOB_YRS" ,data = data)
sns.boxplot(x ="Risk_Flag",y="CURRENT_HOUSE_YRS" ,data = data)
fig, ax = plt.subplots( figsize = (8, 6) )
sns.countplot(x='House_Ownership', hue='Risk_Flag', ax=ax, data=data)

fig, ax = plt.subplots( figsize = (8,6) )

sns.countplot(x='Car_Ownership', hue='Risk_Flag', ax=ax, data=data)

fig, ax = plt.subplots( figsize = (8,6) )

sns.countplot( x='Married/Single', hue='Risk_Flag', data=data )

fig, ax = plt.subplots( figsize = (10,8) )

sns.boxplot(x = "Risk_Flag", y = "CURRENT_JOB_YRS", hue='House_Ownership', data = data)

#Feature Engineering
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
import category_encoders as ce

data.info()

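label_encoder = LabelEncoder()
for col in ['Married/Single', 'Car_Ownership']:
    data[col] = label_encoder.fit_transform(data[col])

onehot_encoder = OneHotEncoder()
house_encoded = onehot_encoder.fit_transform(data[['House_Ownership']]).toarray()
house_columns = onehot_encoder.get_feature_names_out(['House_Ownership'])
# One-hot encoding yields one column per category, so join them as new columns
data = data.drop(columns='House_Ownership').join(
    pd.DataFrame(house_encoded, columns=house_columns, index=data.index))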

high_card_features = ['Profession', 'CITY', 'STATE']

count_encoder = ce.CountEncoder()

# Transform the features, rename the columns with the _count suffix, and join to dataframe
count_encoded = count_encoder.fit_transform( data[high_card_features] )
data = data.join(count_encoded.add_suffix("_count"))
data.head()
data= data.drop(labels=['Profession', 'CITY', 'STATE'], axis=1)
data.head()
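
The count encoding applied above to 'Profession', 'CITY' and 'STATE' replaces each category with the number of times it occurs in the column. The sketch below illustrates that idea with plain pandas rather than the category_encoders library; the table `toy` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical toy column (values chosen for illustration only)
toy = pd.DataFrame({"STATE": ["Maharashtra", "West Bengal", "Maharashtra",
                              "Gujarat", "Maharashtra", "West Bengal"]})

# Count encoding: replace each category by how often it occurs in the column
state_counts = toy["STATE"].value_counts()
toy["STATE_count"] = toy["STATE"].map(state_counts)

print(toy)
```

This keeps a single numeric column however many distinct categories exist, which is why it suits high-cardinality features better than one-hot encoding.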

#Splitting the data into train and test splits

x = data.drop("Risk_Flag", axis=1)
y = data["Risk_Flag"]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, stratify=y, random_state=7)
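
The `stratify = y` argument matters because defaulters are the minority class: it keeps the share of defaulters the same in the training and test sets. A minimal sketch on hypothetical labels (all `toy_` names and values are made up for illustration):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 non-defaulters (0) and 20 defaulters (1)
toy_x = pd.DataFrame({"feature": range(100)})
toy_y = pd.Series([0] * 80 + [1] * 20)

toy_x_train, toy_x_test, toy_y_train, toy_y_test = train_test_split(
    toy_x, toy_y, test_size=0.2, stratify=toy_y, random_state=7
)

# Both splits keep the original 20% share of defaulters
print("Train default rate:", toy_y_train.mean())
print("Test default rate:", toy_y_test.mean())
```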
#Random Forest Classifier
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

rf_clf = RandomForestClassifier(criterion='gini', bootstrap=True, random_state=100)

smote_sampler = SMOTE(random_state=9)
pipeline = Pipeline(steps = [['smote', smote_sampler], ['classifier', rf_clf]])
pipeline.fit(x_train, y_train)
y_pred = pipeline.predict(x_test)
from sklearn.metrics import (confusion_matrix, precision_score, recall_score, f1_score,
                             accuracy_score, roc_auc_score)

print("-------------------------TEST SCORES-----------------------")
print(f"Recall: {round(recall_score(y_test, y_pred) * 100, 4)}")
print(f"Precision: {round(precision_score(y_test, y_pred) * 100, 4)}")
print(f"F1-Score: {round(f1_score(y_test, y_pred) * 100, 4)}")
print(f"Accuracy score: {round(accuracy_score(y_test, y_pred) * 100, 4)}")
print(f"AUC Score: {round(roc_auc_score(y_test, y_pred) * 100, 4)}")
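
The scores printed above all derive from the four counts in the confusion matrix. The sketch below recomputes them by hand on hypothetical labels and compares the results with scikit-learn's helpers; `toy_true` and `toy_pred` are made-up values, not results from the model.

```python
from sklearn.metrics import (confusion_matrix, precision_score, recall_score,
                             f1_score, accuracy_score)

# Hypothetical labels for ten borrowers (1 = defaulter), for illustration only
toy_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
toy_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(toy_true, toy_pred).ravel()

precision = tp / (tp + fp)      # how many flagged borrowers really defaulted
recall = tp / (tp + fn)         # how many real defaulters were flagged
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Print each hand-computed value next to scikit-learn's result
print("Precision:", precision, precision_score(toy_true, toy_pred))
print("Recall:", recall, recall_score(toy_true, toy_pred))
print("F1-Score:", f1, f1_score(toy_true, toy_pred))
print("Accuracy:", accuracy, accuracy_score(toy_true, toy_pred))
```

For an imbalanced problem such as loan default, recall and precision on the defaulter class are usually more telling than accuracy alone. The AUC score is also generally more informative when computed from predicted class probabilities than from hard 0/1 predictions.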
Reference

YOUTUBE CHANNELS
@NYCDataScienceAcademy
@PyDataTV
WEBSITES
global.pydata.org
numfocus.org
https://www.kaggle.com/

THANK YOU
