PROJECT - ASSIGNMENT
Financial Risk Analytics
Submitted by: Bibin Vadakkekara Bhaskaran (G1 - PGP BABI)
Great Lakes Institute of Management
Table of Contents
1. Project Objective, Background
2. Methodology
3. Solutions/Question Answer (Rubric Based)
3.1 Missing Value Treatment
3.2 Outlier Treatment
3.3 New Variables Creation (one ratio each for profitability, leverage, liquidity and company's size)
3.4 Check for multicollinearity
3.5 Univariate & bivariate analysis
3.6 Build Logistic Regression Model on most important variables
3.7 Analyze coefficients & their signs
3.8 Predict accuracy of model on dev and validation datasets
3.9 Sort the data in descending order based on probability of default and then divide into 10 deciles based on probability & check how well the model has performed
Appendix 1 – Source Code
1. Project Objective, Background
The objective is to create an India credit risk (default) model using the financial data provided in the spreadsheet raw-data.xlsx, and to validate it on validation_data.xlsx. The logistic regression framework is used to develop the credit default model.
Variable explanations as per the data dictionary are as follows:
Variable Name – Description
Networth Next Year – Net worth of the customer in the next year
Total assets – Total assets of the customer
Net worth – Net worth of the customer in the present year
Total income – Total income of the customer
Change in stock – Difference between the value of current stock and the value of stock on the last trading day
Total expenses – Total expenses of the customer
Profit after tax – Profit after tax deduction
PBDITA – Profit before depreciation, income tax and amortization
PBT – Profit before tax deduction
Cash profit – Total cash profit
PBDITA as % of total income – PBDITA / Total income
PBT as % of total income – PBT / Total income
PAT as % of total income – PAT / Total income
Cash profit as % of total income – Cash profit / Total income
PAT as % of net worth – PAT / Net worth
Sales – Sales done by the customer
Income from financial services – Income from financial services
Other income – Income from other sources
Total capital – Total capital of the customer
Reserves and funds – Total reserves and funds of the customer
Deposits (accepted by commercial banks) – All blank values
Borrowings – Total amount borrowed by the customer
Current liabilities & provisions – Current liabilities of the customer
Deferred tax liability – Future income tax the customer will pay on current transactions
Shareholders funds – Amount of equity in the company that belongs to shareholders
Cumulative retained profits – Total cumulative profit retained by the customer
Capital employed – Current assets minus current liabilities
TOL/TNW – Total liabilities of the customer divided by total net worth
Total term liabilities / tangible net worth – Short- plus long-term liabilities divided by tangible net worth
Contingent liabilities / Net worth (%) – Contingent liabilities / Net worth
Contingent liabilities – Liabilities arising from uncertain events
Net fixed assets – Purchase price of all fixed assets
Investments – Total invested amount
Current assets – Assets expected to be converted to cash within a year
Net working capital – Difference between current assets and current liabilities
Quick ratio (times) – Total cash divided by current liabilities
Current ratio (times) – Current assets divided by current liabilities
Debt to equity ratio (times) – Total liabilities divided by shareholder equity
Cash to current liabilities (times) – Total liquid cash divided by current liabilities
Cash to average cost of sales per day – Total cash divided by the average cost of sales
Creditors turnover – Net credit purchases divided by average trade creditors
Debtors turnover – Net credit sales divided by average accounts receivable
Finished goods turnover – Annual sales divided by average inventory
WIP turnover – Cost of goods sold for a period divided by the average inventory for that period
Raw material turnover – Cost of goods sold divided by the average inventory for the same period
Shares outstanding – Number of issued shares minus the number of shares held in the company
Equity face value – Cost of the equity at the time of issue
EPS – Net income divided by the total number of outstanding shares
Adjusted EPS – Adjusted net earnings divided by the weighted average number of common shares outstanding on a diluted basis during the plan year
Total liabilities – Sum of all types of liabilities
PE on BSE – Company's current stock price divided by its earnings per share
2. Methodology
EDA (Datatype corrections -> Missing value treatment -> Outlier corrections -> Univariate and Bivariate
analysis -> Creating new variables for Profitability, Leverage, Liquidity & Company’s size)
Modelling (Logistic Regression model to be built and analysed)
Model performance (Accuracy of the model on training and test dataset, 10 decile sorting)
3. Solutions/Question Answer (Rubric Based)
While importing the datasets using the import option in RStudio, variables read in as character type were converted to numeric.
A new column, Default, is derived from Networth Next Year: companies with a positive net worth next year are labelled non-defaulters ('0'), and those with a negative value are labelled defaulters ('1').
In the training dataset (named rawdata), 6.86% of records fall under the defaulter category.
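A minimal sketch of this labelling step (consistent with the full script in Appendix 1):
#flag defaulters: negative Networth Next Year -> 1, else 0
rawdata$Default = ifelse(rawdata$`Networth Next Year` > 0, 0, 1)
round(prop.table(table(rawdata$Default)) * 100, 2) #about 6.86% defaulters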
3.1 Missing Value Treatment
Missing values are first visualised using the plot_intro function from the DataExplorer package.
Each missing value in the training dataset is replaced with the median of its column, using a loop over all columns.
In both the training and test datasets, one column (Deposits (accepted by commercial banks)) is entirely blank, so it is removed from both datasets.
The test dataset is treated with the same procedure as the training set, as sketched below.
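A condensed sketch of the imputation loop (assuming, as in Appendix 1, that all remaining columns are numeric):
#replace every NA with the median of its column
for (a in 1:ncol(rawdata_new)) {
  rawdata_new[, a] = as.numeric(rawdata_new[, a])
  rawdata_new[is.na(rawdata_new[, a]), a] = median(rawdata_new[, a], na.rm = TRUE)
}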
3.2 Outlier Treatment
Outlier capping is done using the squish function from the scales library.
Observations below the 1st percentile are set to the value of the 1st percentile, and observations above the 95th percentile are set to the value of the 95th percentile (see the sketch after this list).
Redundant variables are removed: Number and Networth Next Year (now captured by Default) from the training set, and Number from the test set.
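The capping step, sketched with scales::squish() (the column range 2:51 covers the numeric predictors, as in Appendix 1):
library(scales)
for (c in 2:51) {
  caps = quantile(rawdata_new[, c], c(0.01, 0.95), na.rm = TRUE)
  rawdata_new[, c] = squish(rawdata_new[, c], caps) #values outside [P1, P95] are set to the boundary
}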
3.3 New Variables Creation (one ratio each for profitability, leverage, liquidity and company's size)
Profitability ratio (variable name: Profitability) = Profit after tax / Sales
Leverage ratio (variable name: leverage) = Total assets / (Total liabilities / Debt to equity ratio)
Liquidity ratio (variable name: liquidity) = Net working capital / Total assets
Company's size ratio (variable name: comp.size) = Total assets / Total liabilities
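The four ratios as created in Appendix 1 (rawdata_new1 is the cleaned training frame):
rawdata_new1$Profitability = rawdata_new1$`Profit after tax`/rawdata_new1$Sales
rawdata_new1$leverage = rawdata_new1$`Total assets`/(rawdata_new1$`Total liabilities`/rawdata_new1$`Debt to equity ratio (times)`)
rawdata_new1$liquidity = rawdata_new1$`Net working capital`/rawdata_new1$`Total assets`
rawdata_new1$comp.size = rawdata_new1$`Total assets`/rawdata_new1$`Total liabilities`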
3.4 Check for multicollinearity
Correlation is plotted between all variables in the training dataset.
Significant correlation exists between certain independent variables; clusters of red in the correlation plot indicate groups of correlated variables.
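The plot itself is a one-liner with DataExplorer (the same call appears in Appendix 1):
library(DataExplorer)
plot_correlation(rawdata_new1, title = "Correlation Plot between all variables")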
3.5 Univariate & bivariate analysis
Univariate analysis – histograms of the independent variables (plot_histogram output).
Univariate analysis – density plots of the independent variables (plot_density output).
Bivariate analysis – box plots of selected (log-transformed) financial variables split by Default.
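A minimal sketch of the bivariate box plots (log scales tame the heavy right skew of the financial variables; Appendix 1 holds the full set):
rawdata_new1$Default = as.factor(rawdata_new1$Default)
par(mfrow = c(1, 2))
plot(log(rawdata_new1$`Net worth`) ~ rawdata_new1$Default, main = "Net worth") #boxplot split by Default
plot(log(rawdata_new1$Sales) ~ rawdata_new1$Default, main = "Sales")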
3.6 Build Logistic Regression Model on most important variables
A logistic regression model is first built on the dataset with all variables.
Analysing the importance of variables by their z-scores, a secondary model is then generated on the most significant ones.
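In outline (object names as in Appendix 1):
#full model on all variables, then a reduced model on the significant ones
rawdata.logit.train = glm(Default ~ ., data = rawdata_new1, family = binomial)
summary(rawdata.logit.train) #z-values identify the variables worth keeping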
3.7 Analyze coefficients & their signs
The AIC value of the initial model is 1060.5.
The secondary model, fitted on the significant variables only, reduces the AIC to 975.16, indicating a better fit.
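The two fits can be compared directly; lower AIC indicates the better trade-off between fit and complexity (object names as in Appendix 1):
AIC(rawdata.logit.train, rawdata.logit.train.2) #initial ~1060.5, reduced ~975.16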
3.8 Predict accuracy of model on dev and validation datasets
Accuracy of the model on the training dataset is found to be about 94%; when the same model is applied to the testing/validation dataset, accuracy is about 93%.
Confusion matrices are generated for both sets and depicted below.
Training dataset
Testing dataset
ROC curves are calculated and plotted for both the training and test datasets, as sketched below.
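A condensed sketch of the evaluation (0.9 is the classification cutoff used in this project; object names as in Appendix 1):
library(pROC)
pred = predict(trainmodel, newdata = rawdata_new1, type = "response")
pred.class = ifelse(pred > 0.9, 1, 0)
cm = table(actual = rawdata_new1$Default, predicted = pred.class)
sum(diag(cm))/sum(cm) #accuracy, about 0.94 on training
plot(roc(rawdata_new1$Default, pred), main = "ROC curve - Training Dataset")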
3.9 Sort the data in descending order based on probability of default and then divide into 10 deciles based on probability & check how well the model has performed
The dataset is deciled based on the predicted probability of default.
Plots compare observed defaults against predictions for both the training and test datasets (see the sketch below).
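A compact alternative to the hand-rolled decile() function in Appendix 1, using quantile breaks (decile 10 holds the highest predicted probabilities of default):
breaks = unique(quantile(rawdata_new1$predictvalue, probs = seq(0, 1, 0.1), na.rm = TRUE))
rawdata_new1$decile = cut(rawdata_new1$predictvalue, breaks = breaks, include.lowest = TRUE, labels = FALSE)
table(decile = rawdata_new1$decile, default = rawdata_new1$Default) #defaulters should concentrate in the top deciles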
Appendix 1 – Source Code
####intro ####
getwd()
setwd("C:/Users/bibin/OneDrive/Great Lakes/FRA/Project Works/Project Works -RCode/ProjectV1.0")
#importing train dataset (rawdata) using inbuilt import dataset function
#correcting datatypes from character to numeric in the UI itself
rawdata=readxl::read_excel("raw-data.xlsx")
rawdata
str(rawdata)
rawdata=as.data.frame(rawdata)
for(i in 1:ncol(rawdata))
{
rawdata[,i]=as.numeric(rawdata[,i])
}
str(rawdata)
#importing test data (validate) set
validate=readxl::read_excel("validation_data.xlsx")
str(validate)
validate=as.data.frame(validate)
for(i in 1:ncol(validate))
{
validate[,i]=as.numeric(validate[,i])
}
str(validate)
#dataset while importing via -> import function was corrected for datatype from character to numeric
library(DataExplorer)
library(ggplot2) #attach ggplot2 for the theme_* functions used with the DataExplorer plots below
dim(rawdata)
names(rawdata)
####missing value & outlier treatment ####
rawdata_new=rawdata
validate_new=validate
str(validate_new)
rawdata_new$Default=ifelse(rawdata_new$`Networth Next Year`>0,0,1)
#drop rows with Total assets <= 3
rawdata_new=rawdata_new[!rawdata_new$`Total assets`<=3,]
summary(rawdata_new$Default)
summary(rawdata_new[,53])
#3298 are non defaulters and 243 are defaulters
(243/(3298+243))*100
#6.86 % of total are defaulters
summary(rawdata_new)
summary(rawdata_new$`Total assets`)
#Deposits (accepted by commercial banks) is blank for both rawdata_new and validate_new
str(rawdata_new)
#22nd column in both datasets
#removing the column from both datasets
rawdata_new=rawdata_new[,-22]
validate_new=validate_new[,-22]
plot_intro(rawdata_new)
plot_intro(validate_new)
summary(rawdata_new$Default)
sum(is.na(rawdata_new$Default))
#replace missing values of rawdata using median
rawdata_new=as.data.frame(rawdata_new)
#understanding the dataset under consideration
plot_intro(rawdata_new)
dim(rawdata_new)
#replace the missing NA with Median
for(a in 1:51)
{
rawdata_new[,a]=as.numeric(rawdata_new[,a])
rawdata_new[is.na(rawdata_new[,a]),a]=median(rawdata_new[,a],na.rm=TRUE)
}
str(validate_new)
#replacing the test dataset - validate_new with median
validate_new=as.data.frame(validate_new)
#removing the first two columns from replacing the omitted values
for(b in 3:ncol(validate_new))
{
validate_new[,b]=as.numeric(validate_new[,b])
validate_new[is.na(validate_new[,b]),b]=median(validate_new[,b],na.rm=TRUE)
}
summary(rawdata_new)
summary(validate_new)
plot_intro(rawdata_new)
plot_intro(validate_new)
#all observations are present / replaced for both the training and test datasets
####outlier identification ####
ncol(rawdata_new)
boxplot(rawdata_new)
library(scales)
ncol(rawdata_new)
str(rawdata_new)
summary(rawdata_new[,52])
#squish() caps each predictor at its 1st and 95th percentiles
for(c in 2:51)
{
quan=quantile(rawdata_new[,c],c(0.01,0.95))
rawdata_new[,c]=squish(rawdata_new[,c],quan)
}
summary(rawdata_new)
rawdata_new1=rawdata_new[,-c(1,2)]
validate_new1 = validate_new[,-1]
summary(rawdata_new1)
summary(validate_new1)
#rawdata_new1 -> does not contain the company id or Networth Next Year
#validate_new1 -> does not contain the company id; starts with the Default column
#### univariate & bi variate analysis #####
plot_str(rawdata_new)
plot_intro(rawdata_new)
plot_missing(rawdata_new)
dev.off()
plot_histogram(rawdata_new1,ggtheme = theme_light())
dev.off()
plot_density(rawdata_new1,ggtheme= theme_minimal())
?plot_correlation
plot_correlation(rawdata_new1,title = "Correlation Plot between all variables", ggtheme = theme_dark())
#bi variate
??plot
dev.off()
rawdata_new1$Default=as.factor(rawdata_new1$Default)
par(mfrow=c(3,2))
plot(log(rawdata_new1$`Net worth`)~rawdata_new1$Default, main = "Networth")
plot(log(rawdata_new1$`Total expenses`)~rawdata_new1$Default, main = "Tot. Exp")
plot(log(rawdata_new1$Sales)~rawdata_new1$Default, main = "Sales")
plot(log(rawdata_new1$`Total liabilities`)~rawdata_new1$Default, main = "liabilities")
plot(log(rawdata_new1$`Cash to average cost of sales per day`)~rawdata_new1$Default, main = "Cash to avg cost of sale")
plot(log(rawdata_new1$`Current assets`)~rawdata_new1$Default, main = "Curnt Assts")
plot(log(rawdata_new1$`Capital employed`)~rawdata_new1$Default, main = "Cap. employed")
plot(log(rawdata_new1$`Other income`)~rawdata_new1$Default, main = "Oth Income")
plot((rawdata_new1$`Reserves and funds`)~rawdata_new1$Default, main = "Rsrv and Fund")
plot((rawdata_new1$`Net fixed assets`)~rawdata_new1$Default)
#multi collinearity check
####new variable creation #####
#profitability, leverage, liquidity and company's size
#profitability -> new variable
# profit after tax / sales
rawdata_new1$Profitability=rawdata_new1$`Profit after tax`/rawdata_new1$Sales
boxplot(log(rawdata_new1$Profitability),main="Profitability")
#leverage variable
# total assets / (ratio of liabilities and debt to equity ratio)
rawdata_new1$leverage = rawdata_new1$`Total assets`/(rawdata_new1$`Total liabilities`/rawdata_new1$`Debt to equity ratio (times)`)
plot(density(rawdata_new1$leverage))
#liquidity variable
#net working capital / total assets
rawdata_new1$liquidity=rawdata_new1$`Net working capital`/rawdata_new1$`Total assets`
#company's size
#size = Total Assets / total liabilities
rawdata_new1$comp.size=rawdata_new1$`Total assets`/rawdata_new1$`Total liabilities`
#same variables generated for testing dataset - validate_new1
#profitability -> new variable
# profit after tax / sales
validate_new1$Profitability=validate_new1$`Profit after tax`/validate_new1$Sales
boxplot(log(validate_new1$Profitability),main="Profitability")
#leverage variable
# total assets / (ratio of liabilities and debt to equity ratio)
validate_new1$leverage = validate_new1$`Total assets`/(validate_new1$`Total liabilities`/validate_new1$`Debt to equity ratio (times)`)
plot(density(validate_new1$leverage))
#liquidity variable
#net working capital / total assets
validate_new1$liquidity=validate_new1$`Net working capital`/validate_new1$`Total assets`
#company's size
#size = Total Assets / total liabilities
validate_new1$comp.size=validate_new1$`Total assets`/validate_new1$`Total liabilities`
boxplot(log(validate_new1$comp.size))
####modeling #####
#log reg
summary(rawdata_new1$Default)
set.seed(1000)
rawdata.logit.train = glm(Default~ ., data=rawdata_new1,family=binomial) #bare column name on the LHS so "." excludes the response
summary(rawdata.logit.train)
attach(rawdata_new1)
rawdata.logit.train.1=glm(Default~ `PAT as % of net worth`+`Current liabilities & provisions`
+`Cumulative retained profits`+ `Cash to current liabilities (times)`+ `Current ratio (times)`
+`Debt to equity ratio (times)`+`Debtors turnover`+Profitability,
data = rawdata_new1,family=binomial)
summary(rawdata.logit.train.1)
rawdata.logit.train.2 =glm(Default~ `PAT as % of net worth`+`Current liabilities & provisions`
+ `Cash to current liabilities (times)`+ `Current ratio (times)`
+`Debt to equity ratio (times)`+`Debtors turnover`+Profitability,
data = rawdata_new1,family=binomial)
summary(rawdata.logit.train.2)
trainmodel=rawdata.logit.train.2
summary(trainmodel)
library(pROC)
library(caret)
library(e1071)
#library(SDMTools) #not needed for this script
?predict.glm
prediction = predict.glm(trainmodel, newdata=rawdata_new1,type='response')
## install.packages("caret", repos="http://cran.rstudio.com/", dependencies=TRUE)
rawdata_new1$predictvalue= predict.glm(trainmodel, newdata=rawdata_new1,type='response')
rawdata_new1$prediction=ifelse(rawdata_new1$predictvalue>0.9,1,0)
library(caTools)
conf.mat.table= table(rawdata_new1$Default,rawdata_new1$prediction)
print(conf.mat.table)
accuracy.prediction = sum(diag(conf.mat.table))/sum(conf.mat.table)
print(accuracy.prediction*100) #accuracy as a percentage
roc.train = roc(rawdata_new1$Default,prediction)
plot(roc.train, main="ROC curve - Training Dataset")
#94 percent accuracy in the training dataset
validate_new1$predictvalue= predict.glm(trainmodel, newdata=validate_new1,type='response')
prediction.test = predict.glm(trainmodel, newdata=validate_new1,type='response')
validate_new1$prediction=ifelse(prediction.test>0.9,1,0)
conf.mat.table.test= table(validate_new1$`Default - 1`,validate_new1$prediction)
print(conf.mat.table.test)
accuracy.prediction.test = sum(diag(conf.mat.table.test))/sum(conf.mat.table.test)
print(accuracy.prediction.test*100) #accuracy as a percentage
roc.test = roc(validate_new1$`Default - 1`,prediction.test)
plot(roc.test, main="ROC curve - Test Dataset")
#93 percent accurate in the test data set
####deciling#####
#for training
decile<-function(x)
{ decile = vector(length=10)
for(i in seq(0.1,1,.1))
{
decile[i*10]<-quantile(x,i,na.rm=T)
}
return(
ifelse(x<decile[1],1,
ifelse(x<decile[2],2,
ifelse(x<decile[3],3,
ifelse(x<decile[4],4,
ifelse(x<decile[5],5,
ifelse(x<decile[6],6,
ifelse(x<decile[7],7,
ifelse(x<decile[8],8,
ifelse(x<decile[9],9,10 ))))))))))
}
rawdata_new1$decile= decile(rawdata_new1$predictvalue)
temporary_table=data.table::data.table(rawdata_new1)
#refer to columns by bare name inside data.table so counts are computed per decile group
ranking = temporary_table[,list(cnt=length(prediction),
cnt_resp=sum(prediction==1),
cnt_non_resp=sum(prediction==0)
)
, by =decile][order(-decile)]
View(ranking)
#for testing
decile2<-function(x)
{ decile = vector(length=10)
for(i in seq(0.1,1,.1))
{
decile[i*10]<-quantile(x,i,na.rm=T)
}
return(
ifelse(x<decile[1],1,
ifelse(x<decile[2],2,
ifelse(x<decile[3],3,
ifelse(x<decile[4],4,
ifelse(x<decile[5],5,
ifelse(x<decile[6],6,
ifelse(x<decile[7],7,
ifelse(x<decile[8],8,
ifelse(x<decile[9],9,10 ))))))))))
}
validate_new1$decile= decile2(validate_new1$predictvalue)
temporary_table.test=data.table::data.table(validate_new1)
ranking.test = temporary_table.test[,list(cnt=length(prediction),
cnt_resp=sum(prediction==1),
cnt_non_resp=sum(prediction==0)
)
, by =decile][order(-decile)]
View(ranking.test)
#training
#Default is a factor here, so convert it back to numeric before averaging
mean.observations.training = aggregate(as.numeric(as.character(Default))~decile, data=rawdata_new1, mean)
mean.predictions.training = aggregate(predictvalue~decile, data=rawdata_new1, mean)
par(mfrow=c(1,2))
plot(mean.observations.training, main = "training dataset , default ")
lines(mean.observations.training)
plot(mean.predictions.training, main = "training dataset , prediction")
lines(mean.predictions.training)
#test
mean.observations.test = aggregate(validate_new1$`Default - 1`~validate_new1$decile, data=validate_new1, mean)
mean.predictions.test = aggregate(validate_new1$predictvalue~validate_new1$decile, data=validate_new1, mean)
par(mfrow=c(1,2))
plot(mean.observations.test, main = "test dataset , default ")
lines(mean.observations.test)
plot(mean.predictions.test, main = "test dataset , prediction")
lines(mean.predictions.test)