


Scientia Africana, Vol. 22 (No. 3), December, 2023. Pp 137-144

DESIGN AND IMPLEMENTATION OF A LOAN DEFAULT PREDICTION SYSTEM USING RANDOM FOREST ALGORITHM

1Oghenekaro, L. U., and 2Chimela, M. C.
1,2Computer Science Department, Faculty of Computing, University of Port Harcourt, Nigeria
Emails: [email protected], [email protected]

Received: 20-09-2023
Accepted: 01-11-2023

https://dx.doi.org/10.4314/sa.v22i3.12
This is an Open Access article distributed under the terms of the Creative Commons Licenses [CC BY-NC-ND 4.0]
http://creativecommons.org/licenses/by-nc-nd/4.0.
Journal Homepage: http://www.scientia-african.uniportjournal.info
Publisher: Faculty of Science, University of Port Harcourt.

ABSTRACT
Loan default prediction is a crucial task in the lending industry; it helps financial institutions make informed decisions about granting loans. It is usually a daunting task for a bank or financial institution to predict which customers will default on a loan, especially when there are thousands of applicants. The loan default prediction system presented here aimed to improve the Area Under the Curve (AUC) score. It used various data sources, such as demographic information, credit history, and financial performance, to predict the likelihood of a loan being defaulted. The system used a random forest (RF) machine learning algorithm to analyze the data and build predictive models. The model was then used to make predictions about new loan applicants and existing borrowers who may default in the future. The system can be customized to meet the specific requirements of different lending institutions. It enables lenders to make better decisions on loan approval, interest rate determination, and credit risk management. The system also provides insights into risk factors that contribute to loan default and helps lenders develop effective strategies to mitigate these risks, making it an indispensable tool for lenders. The resultant system achieved an improved AUC score of 98%.
Keywords: AUC score, Loan Default, Loan Processing, Predictive Model, Random Forest Algorithm

INTRODUCTION

Loan processing has been a crucial issue for banks in recent years. It is a way of checking whether a customer will default on a loan during repayment, and this knowledge determines whether a loan should be granted to the customer or not. Many financial institutions or banks approve and disburse loans following a long authentication and validation process, but there is no assurance that the selected candidate is the most eligible of all applicants (Purohit et al., 2011). Through this process, the bank minimizes the losses that could be incurred from defaults, hence increasing the profit generated from the interest on the loan. Loans produce the largest income but constitute a huge risk and exposure. In order to fund viable projects, banks mobilize deposits and create loans. When loans are of good quality, they generate revenue for the bank and at the same time help to stimulate economic growth (Hussain and Shorouq, 2014).

In finance, a loan is the lending of money by one or more individuals, organizations, or other entities to other individuals or entities. In many instances, the lenders add a charge called interest to the amount borrowed, which the debtor must pay while repaying the amount borrowed. Repayment by the debtor is usually due within a fixed time frame, which may be weeks or months. At times debtors do not repay their loans as and when due, resulting in a loan default. This leads to a loss of money for the lenders, because the debtors might end up not repaying part of the loan taken. Loan defaulting is a major financial risk for the finance industry, as it harms the interest of the financiers and destroys social trust (Twala, 2010). Because of these losses on the part of the lender (usually a financial institution), there have been efforts to forecast the outcome of a loan before approval, to curb instances of bad debts. In recent times, with the introduction of new technologies, data is being generated with every click, and data scientists have been researching and making progress in the finance and banking field (Hamid and Ahmad, 2011).

Research has been carried out to build systems that predict whether a customer will pay back a loan on time. Previously, when an applicant filled out a form to get a loan from the bank, the customer's credit score history was analyzed by the loan officers together with other things such as the amount to be loaned, the salary of the applicant, the reason for applying for the loan, the amount currently in the bank, and whether the customer had an outstanding loan at the time of applying for the new one. This process was usually time consuming and tasking, especially when the number of loan applicants was large. Currently, with a lot of data being generated on a daily basis and with the aid of machine learning algorithms, loan processing becomes faster and more efficient, saving losses as the incidence of bad debts is reduced. The traditional system is slow compared with the speed and accuracy obtainable with the help of machine learning.

LITERATURE REVIEW

Almamun et al. (2022) adopted six different machine learning (ML) algorithms to predict if a loan applicant is eligible. The ML algorithms include Random Forest, Adaboost, XGBoost, Decision Tree, K-Nearest Neighbor and Lightgbm. The algorithms were trained with secondary data obtained from the Kaggle website; the dataset contained 10,128 applicants, 23 attributes and 1 class attribute. The data was preprocessed using missing value handling, feature extraction and categorical variable transformation. The authors adopted the hold-out approach for validation, where 70% of the data was used for training the algorithms and 30% was used for testing. With this approach, the performance of all six machine learning algorithms adopted for the work was evaluated under the metrics of precision, accuracy, recall, F1-Score and Area Under Curve (AUC). Lightgbm recorded the highest accuracy with a score of 0.9189, and the decision tree had the lowest accuracy score of 0.8497. In addition to the evaluation of the models in terms of accuracy, the models were also evaluated using the AUC metric, and AUC graphs were produced for all six classification algorithms. Lightgbm outperformed the other ML algorithms with an AUC score of 75%. Based on the result from the test data, it was concluded that applicants with a low credit score should be denied access to a loan facility, as they have a high probability of defaulting. The results showed that applicants with high income requesting small loan amounts were ideal applicants to be granted a loan. Their study showed that data features such as gender and marital status were not determining factors for the prediction output.

Wu, W. J. (2022) applied the random forest algorithm and the XGBoost algorithm to build prediction models. The dataset was obtained from Imperial College London and contained a total of 105,471 records and 778 features. The work employed the variance threshold method at the feature engineering stage, where unimportant features were filtered out of the dataset. The variance inflation factor (VIF) was used to measure the multicollinearity of the data set. The pre-processed dataset was randomly separated in an 80-20 proportion, where 80% was the training dataset and 20% was the test dataset. The model demonstrated that though the random forest and the XGBoost algorithms
© Faculty of Science, University of Port Harcourt, Printed in Nigeria ISSN 1118 – 1931

are both decision tree algorithms, the random forest model recorded a prediction accuracy of 0.90657, while XGBoost recorded 0.90635. The result indicated an insignificant difference in accuracy between the two decision tree algorithms. The study was able to demonstrate that the random forest as well as the XGBoost algorithm are suitable algorithms for loan default prediction.

Uwais & Khaleghzadeh (2022) implemented the machine learning (ML) algorithms present on the Spark big data platform to build loan default prediction models. The work applied six different supervised ML classification algorithms to predict loan default: Decision Tree, Logistic Regression, Gradient Boosted Tree, Random Forest, Linear Support Vector Machine, and Factorization Machine. A secondary dataset was adopted from the Kaggle website; the dataset contained 640,000 instances and 14 features. The dataset was randomly separated. Income was plotted against education using a scatter plot to identify correlation between these two features of the dataset, using the pandas and matplotlib libraries of the Python language, available on Spark. A positive correlation was seen between an applicant's educational level and income, because as the level of education increases, the income increases. The work adopted several histograms to visualize the information from the dataset based on minority and gender status. Data was pre-processed by removal of null values and adjustment of attribute data types. The pre-processed data was further prepared using the steps of feature selection, addressing the class imbalance problem, converting categorical data to numerical data, and randomly splitting the data into 70% training and 30% test data. The six supervised ML algorithms present in Spark MLlib were applied to the training data to train the models, while the test data was used to evaluate the models. Of all six ML classifiers, the decision tree and random forest demonstrated the best performance, with a receiver operating characteristic (ROC) curve score of 99.56%, recall of 99.2%, F-Score of 99.5%, and precision of 99.8%. The work demonstrated success in assigning each loan to one of the two available classes.

Huang et al. (2023) attempted to increase the accuracy of predicting loan defaulters by adopting ensemble learning. The paper selected the Adaboost algorithm as the best performing model for loan default prediction. The secondary dataset was from the credit platform provided in a Tianchi competition. The dataset originally contained 1.2 million records and 47 data features. However, considering the time needed to process such a huge dataset, a total of 100,000 records were randomly selected for the purpose of model building. The data was cleaned of missing values and outliers, and a feature selection technique was adopted to separate relevant features from irrelevant ones. At the model construction stage, the initial values of the parameters and the tuned values were tabulated in the work. The proposed model recorded an accuracy of 88%.

Li et al. (2021) aimed to improve prediction accuracy by using the blending method to fuse three models: Random Forest (RF), CatBoost and Logistic Regression (LR). The blending method involved training a new learner, and the model of the blending method was a two-layer framework. Loan data was obtained from Lending Club for Q4 2019, as made publicly available on the Kaggle website. The data contained 128,262 records and 150 attributes; however, over 40% of the data was removed as it was insignificant to the study. The adaptive synthetic sampling approach (ADASYN) was adopted to address the class imbalance problem of the dataset and the performance degradation it causes. RF, CatBoost, and LR served as benchmarks for the proposed fused model. The validation metrics of accuracy, ROC curve, F1-score and recall demonstrated that the fused model outperformed the three individual models.

Odegua (2020) adopted Extreme Gradient Boosting (XGBoost) to build a predictive model to predict loan defaulters. The dataset was obtained from Data Science Nigeria, hosted on the Zindi platform. The dataset contained 26,897 records and 31 attributes, which underwent data pre-processing and wrangling stages before being used for training with the XGBoost classifier algorithm. The system was implemented with the Python programming language, and the classifier was trained on the cleaned dataset, using the good_bad_flag feature as target. Five metrics (Recall, Accuracy, F1-Score, ROC value, and Precision) were used to evaluate the model.

The literature review has shown several attempts made by researchers to improve the accuracy of predicting loan defaulters automatically.

MATERIALS AND METHOD

The dataset used in the loan default prediction system was compiled by M. Yasser and uploaded to the Kaggle Data Science community data repository. The data contained 148,670 records and 34 features.

The following are the processes used to build the loan default prediction system:
1) Data Loading;
2) Data Cleaning;
3) Data Processing;
4) Feature Extraction;
5) Model Training;
6) Model Evaluation.

1. Data Loading

The data was loaded into the Google Colab environment using the read_csv function from the pandas library, as seen in Figure 1.

Figure 1: Data loading using the read_csv function
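
The loading step might look like the sketch below. It is an illustration only, not the authors' notebook code: the column names and the small in-memory CSV text are hypothetical stand-ins, used so that the snippet runs without the Kaggle file. In Google Colab, the path of the downloaded CSV file would be passed to read_csv instead.

```python
import io

import pandas as pd

# Hypothetical three-row extract standing in for the Kaggle loan default CSV;
# the column names are illustrative assumptions, not the dataset's schema.
CSV_TEXT = """ID,loan_amount,income,Gender,Status
1,116500,1740.0,Male,1
2,206500,4980.0,Female,0
3,406500,,Male,0
"""


def load_loans(source):
    """Read a CSV file (a path or a file-like object) into a DataFrame."""
    return pd.read_csv(source)


loans = load_loans(io.StringIO(CSV_TEXT))
print(loans.shape)         # (number of rows, number of columns)
print(loans.isna().sum())  # missing values per column, useful before cleaning
```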

2. Data Cleaning

The data was cleaned of missing values and outliers to make it fit for training, using the SimpleImputer class in the scikit-learn library, as seen in Figure 2.

Figure 2: Data Cleaning
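
A minimal sketch of imputing missing values with scikit-learn's SimpleImputer is given below. The toy columns and the choice of strategies (median for numeric columns, most frequent value for categorical ones) are assumptions made for illustration; the text does not state which strategies were used, and outlier handling is not described there, so it is left out.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy frame with gaps; the column names are hypothetical.
df = pd.DataFrame({
    "income": [1740.0, np.nan, 9480.0, 4980.0],
    "loan_amount": [116500.0, 206500.0, np.nan, 406500.0],
    "Gender": ["Male", np.nan, "Female", "Male"],
})

numeric_cols = ["income", "loan_amount"]
categorical_cols = ["Gender"]

# Numeric gaps: replace with the column median (assumed strategy).
num_imputer = SimpleImputer(strategy="median")
df[numeric_cols] = num_imputer.fit_transform(df[numeric_cols])

# Categorical gaps: replace with the most frequent value (assumed strategy).
# The cast to object keeps the input in a form SimpleImputer accepts.
cat_imputer = SimpleImputer(strategy="most_frequent")
df[categorical_cols] = cat_imputer.fit_transform(
    df[categorical_cols].astype(object)
)

print(df)
```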


3. Data Processing
The data was processed to remove duplicate columns or features. One-hot encoding was done, as seen in Figure 3, to convert categorical columns into numerical columns filled with 0's and 1's, since the random forest classifier used to train on the data requires numerical inputs and cannot find patterns in raw categorical values.

Figure 3: Code for data processing
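
The two operations described in this step, removing duplicated columns and one-hot encoding, can be sketched with pandas as follows. The frame and its column names are hypothetical, and pandas.get_dummies is one common way of one-hot encoding; the text does not say which function was used.

```python
import pandas as pd

# Toy frame; the column names and categories are hypothetical.
df = pd.DataFrame({
    "loan_amount": [116500, 206500, 406500, 206500],
    "Gender": ["Male", "Female", "Joint", "Male"],
    "Gender_copy": ["Male", "Female", "Joint", "Male"],  # duplicated feature
})

# Keep a column only if no earlier column holds exactly the same values.
keep = [
    col
    for i, col in enumerate(df.columns)
    if not any(df[col].equals(df[prev]) for prev in df.columns[:i])
]
df = df[keep]

# One-hot encode the remaining categorical column as 0/1 indicator columns.
encoded = pd.get_dummies(df, columns=["Gender"], dtype=int)
print(encoded)
```

Each category becomes its own indicator column, so a three-valued feature such as the hypothetical Gender column above is replaced by three 0/1 columns.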

4. Feature Extraction

Some features were expunged in this phase since they had little or no effect on the target (label) or
they were duplicates. In this phase, a total of twenty-four features were dropped, and ten remained
as seen in figure 4.

Figure 4: Code for Feature Extraction
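
Dropping uninformative features can be sketched as below. The rule used here, removing an explicit list of columns plus any column that holds a single constant value, is an assumption chosen for illustration; the text does not state how the twenty-four dropped features were identified, and the column names are hypothetical.

```python
import pandas as pd

# Toy frame; the column names are hypothetical.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],              # identifier, no predictive value
    "year": [2019] * 6,                     # constant, carries no signal
    "income": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Status": [1, 1, 1, 0, 0, 0],           # target label
})

TARGET = "Status"


def drop_uninformative(frame, target, explicit_drop=()):
    """Drop the listed columns plus any constant (single-valued) feature."""
    constant = [
        col
        for col in frame.columns
        if col != target and frame[col].nunique(dropna=False) <= 1
    ]
    to_drop = sorted(set(explicit_drop) | set(constant))
    return frame.drop(columns=to_drop), to_drop


reduced, dropped = drop_uninformative(df, TARGET, explicit_drop=["ID"])
print(dropped)
print(list(reduced.columns))
```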

5. Model Training
In this phase the data was fed into the random forest classifier in the scikit-learn library in Python. The model was trained using 70% of the data set. The code snippet can be seen in Figure 5.

Figure 5: Code for model training
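
A minimal sketch of this training step with scikit-learn follows. The synthetic data, the number of trees, the stratified split and the random seeds are all assumptions made for illustration; only the 70% training share and the use of the random forest classifier come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the ten retained features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
noise = rng.normal(scale=0.5, size=1000)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

# Hold out 30% for testing, so the model is trained on 70% of the records.
# Stratifying on y keeps the class ratio the same in both parts (assumed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# n_estimators and random_state are illustrative settings, not reported ones.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print(len(X_train), len(X_test))
```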

6. Model Evaluation
The model was tested with the test dataset and evaluated using the area under the curve score, recall
and precision. The following evaluation scores were generated as demonstrated in figure 6.

Figure 6. Code for model evaluation
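
The evaluation step can be sketched as below, again on synthetic stand-in data rather than the loan dataset. It assumes that the AUC in question is the area under the ROC curve, computed from predicted class probabilities with scikit-learn's roc_auc_score; the scores printed by this sketch are unrelated to the results reported in this paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical), split 70/30 as in the training step.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))
noise = rng.normal(scale=0.5, size=1000)
y = (X[:, 0] - X[:, 2] + noise > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard class labels
y_score = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1

scores = {
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_score),   # uses probabilities, not labels
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Passing probabilities rather than hard labels to roc_auc_score matters, because the ROC curve is traced by sweeping the decision threshold over the predicted scores.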

RESULTS AND DISCUSSION

Figure 7 shows the confusion matrix of the loan default prediction system. The number of true positives was 32,664 observations, which means that 32,664 instances that will not default were predicted as such. The number of true negatives was 10,930 observations, meaning that 10,930 instances that will default were correctly predicted as default. Table 1 shows some performance metrics of the model, such as Precision, Recall, and F1_score. The F1_score is interpreted as the harmonic mean of precision and recall, where an F1 score reaches its best value at 1 and its worst at 0. The relative contributions of precision and recall to the F1 score are equal. The F1_score of 0.98 means that the model is close to being optimal. Recall is the ratio of true positives to the sum of true positives and false negatives. This is the ability of the classifier to classify all positive observations as positive; the recall of the proposed system is 0.9965, so the model correctly identified about 99.6% of the positive observations. Precision is the ratio of true positives to the sum of true positives and false positives. It represents the ability of this loan default classifier not to label as non-default a sample that is default. The precision of the trained random forest model is 0.9742, showing that about 97 out of every 100 instances predicted as non-default were indeed non-default. The AUC score represents the area under the curve. It reflects how well a model predicts the correct category a loan will fall into. The Area Under Curve score for the RF model was evaluated to be 0.9823. This represents the ability of the loan default system to make accurate predictions, and gives an additional indication of the quality of the predictions made by the model.

Figure 7: Confusion matrix of the model
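
The metric definitions used in this section can be written out as a short sketch. The confusion-matrix counts below are hypothetical toy values, not the entries of Figure 7; the final lines only cross-check the F1 score against the precision (0.9742) and recall (0.9965) reported in the text.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual positives that the model recovered."""
    return tp / (tp + fn)


def f1_from(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Hypothetical toy counts, NOT the paper's confusion matrix.
tp, fp, fn = 90, 10, 5
p = precision(tp, fp)
r = recall(tp, fn)
print(f"toy precision={p:.4f} recall={r:.4f} f1={f1_from(p, r):.4f}")

# Cross-check using the precision and recall reported in the text.
reported_f1 = f1_from(0.9742, 0.9965)
print(f"F1 implied by the reported precision and recall: {reported_f1:.4f}")
```

The implied value is approximately 0.985, which is in line with the reported F1 score of 0.98.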

CONCLUSION

The study was aimed at achieving a higher AUC score by adopting the random forest algorithm in building the predictive model for predicting loan defaulters. Secondary data was sourced for the research, preprocessed, and used to train the algorithm. The resultant model was evaluated using performance metrics and the area under curve score. The results revealed that the predictive system built with the random forest algorithm recorded high performance in both the accuracy metrics and the AUC score. Further work can be done on creating a graphical user interface for the application, to make the system more user-friendly.

REFERENCES

Almamun, M., Farjana, A., Mamun, M. (2022). Predicting Bank Loan Eligibility Using Machine Learning Models and Comparison Analysis, Proceedings of the 7th North American International Conference on Industrial Engineering and Operations Management, Florida. 1423 – 1432.
Hamid, E. N. and Ahmad, N. (2011). A New Approach for Labeling the Class of Bank Credit Customers via Classification Method in Data Mining, International Journal of Information and Education Technology, 1(2): 150-155.
Huang, Y., Shao, Y., Tang, D., Huang, J., and Chen, S. (2023). Loan Default Prediction Based on Ensemble Learning, International Journal of Innovation and Research in Educational Sciences, 10(3): 149 – 159.
Hussain, A. B. and Shorouq, F. K. E. (2014). Credit risk assessment model for Jordanian commercial banks: Neural scoring approach, Review of Development Finance, Elsevier, 4(10): 20–28.
Li, X., Ergu, D., Zhang, D., Qiu, D., Cai, Y. and Ma, B. (2021). Prediction of Loan Default Based on Multi-model Fusion, Procedia Computer Science.
Odegua, R. (2020). Predicting Bank Loan Default with Extreme Gradient Boosting, Preprint Cornell University.
Purohit, S. U., Mahadevan, V. and Kulkarni, A. N. (2011). Credit Evaluation Models of Loan Proposals for Indian Banks, International Journal of Modelling and Optimization. 2(4): 529 – 534.
Twala, B. (2010). Multiple classifier Application to Credit Risk Assessment, Expert Systems with Applications. 37(4): 3326–3336.
Uwais, A. M. and Khaleghzadeh, H. (2022). Loan Default Prediction using Spark Machine Learning Algorithms, AIAI 29th Irish Conference on Artificial Intelligence and Cognitive Science, Dublin, 118-129.
Wu, W. J. (2022). Machine Learning Approaches to Predict Loan Default, Intelligent Information Management. 14(3), 157-164.
