Predictive model plan report
1 Model logic (generated with GenAI)
1. Data Preparation
• Impute missing values (Income, Loan_Balance, Credit_Score)
• Cap Credit_Utilization at 1.0
• Encode categorical variables (e.g., Employment_Status, Month_1–Month_6)
• Normalize or scale numerical features if needed (see the preprocessing sketch below)
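A minimal preprocessing sketch for the steps above, assuming the raw data sits in a pandas DataFrame named df with the columns listed; median imputation and one-hot encoding are illustrative choices here, not requirements of the plan.

import pandas as pd

df_cleaned = df.copy()

# Impute missing numeric values with the column median (one reasonable default)
for col in ['Income', 'Loan_Balance', 'Credit_Score']:
    df_cleaned[col] = df_cleaned[col].fillna(df_cleaned[col].median())

# Cap Credit_Utilization at 1.0
df_cleaned['Credit_Utilization'] = df_cleaned['Credit_Utilization'].clip(upper=1.0)

# One-hot encode the categorical variables
categorical_cols = ['Employment_Status'] + [f'Month_{i}' for i in range(1, 7)]
df_cleaned = pd.get_dummies(df_cleaned, columns=categorical_cols, drop_first=True)

# Scaling is optional here because tree-based models do not require it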
2. Feature Selection
Select key predictors (an illustrative selected_features definition follows this list):
• Credit_Utilization
• Debt_to_Income_Ratio
• Missed_Payments
• Account_Tenure
• Recent Payment Status (Month_6)
• Employment_Status
• Credit_Score, Income
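An illustrative definition of the selected_features list referenced in the pseudocode below, assuming the one-hot encoded df_cleaned frame from the preprocessing sketch; the exact encoded column names depend on the categories present in the data.

selected_features = [
    'Credit_Utilization',
    'Debt_to_Income_Ratio',
    'Missed_Payments',
    'Account_Tenure',
    'Credit_Score',
    'Income',
]
# Add the one-hot encoded Employment_Status and Month_6 (recent payment status) columns
selected_features += [c for c in df_cleaned.columns
                      if c.startswith(('Employment_Status_', 'Month_6_'))]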
3. Model Training
• Choose model: Logistic Regression (baseline), or Random Forest (for feature importance)
• Train on labeled data (Delinquent_Account as target)
4. Model Evaluation
• Use train-test split or cross-validation
• Evaluate with accuracy, precision, recall, F1-score
• Review confusion matrix for false positives/negatives
5. Prediction and Risk Scoring
• Output a binary label (0 = Non-delinquent, 1 = Delinquent)
• Optional: Output a probability score for risk ranking (see the extension after the pseudocode below).
Pseudocode
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
# Step 1: Define features and target
X = df_cleaned[selected_features] # e.g., Credit_Utilization, etc.
y = df_cleaned['Delinquent_Account']
# Step 2: Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Step 3: Model training
model = RandomForestClassifier(random_state=42)  # fixed seed so results are reproducible
model.fit(X_train, y_train)
# Step 4: Prediction
y_pred = model.predict(X_test)
# Step 5: Evaluation
print(classification_report(y_test, y_pred))
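An optional extension of the pseudocode covering steps 4 and 5 in more detail: cross-validation, the confusion matrix, a probability-based risk score, and AUC. It reuses the model, X, y, X_test, y_test, and y_pred defined above; the choice of 5 folds and recall as the cross-validation metric is illustrative.

from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix, roc_auc_score

# Cross-validated recall as a complement to the single train-test split
cv_recall = cross_val_score(model, X, y, cv=5, scoring='recall')
print('Mean cross-validated recall:', cv_recall.mean())

# Confusion matrix to review false positives and false negatives
print(confusion_matrix(y_test, y_pred))

# Step 5: probability score for risk ranking, plus AUC across thresholds
y_proba = model.predict_proba(X_test)[:, 1]
print('ROC AUC:', roc_auc_score(y_test, y_proba))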
2 Justification for model choice
For the task of predicting customer delinquency, the Random Forest model was selected based on a
balance of accuracy, interpretability, and operational fit for Geldium’s business environment.
Factor and justification:
• Accuracy: Random Forests are ensemble models that combine multiple decision trees to improve prediction performance and reduce overfitting. They consistently outperform simpler models on structured financial data.
• Transparency: While not as transparent as logistic regression, Random Forests provide feature importance scores, allowing business analysts to understand which variables most influence risk predictions (illustrated in the short sketch after this list).
• Ease of Use: Random Forests are easy to implement using libraries like scikit-learn, require minimal feature scaling, and are robust to outliers and missing data.
• Financial Relevance: Tree-based models like Random Forests have a proven track record in credit scoring and fraud detection, making them well-aligned with financial use cases.
• Business Suitability (Geldium): Geldium needs fast, interpretable, and deployable solutions to identify high-risk customers. Random Forests offer a strong trade-off between performance and explainability, making them ideal for risk-based decision support systems.
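A short sketch of the feature-importance output mentioned under Transparency, assuming the fitted model and feature matrix X from the section 1 pseudocode.

import pandas as pd

# Rank features by their contribution to the Random Forest's predictions
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))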
Alternative considerations
• Logistic Regression was considered for its simplicity and transparency, but it lacks the ability
to capture complex, non-linear relationships in behavioral data.
• Neural Networks were ruled out for now due to their "black-box" nature, which is not ideal
for regulated industries like finance where model interpretability is crucial.
3 Evaluation strategy
To ensure the model is both effective and responsible, we use a combination of performance
metrics, bias detection techniques, and ethical safeguards.
Metric, what it measures, and why it matters for delinquency prediction:
• Accuracy: overall correctness of the model's predictions. Useful for balanced datasets, but can be misleading if classes are imbalanced.
• Precision: percentage of predicted delinquents that were actually delinquent. Important for minimizing false positives (wrongly labeling someone as high-risk).
• Recall: percentage of actual delinquents correctly identified. Critical for catching at-risk customers and preventing financial losses.
• F1 score: harmonic mean of precision and recall. Balances false positives and false negatives, ideal for uneven class distributions.
• Area under the ROC curve (AUC): ability of the model to distinguish between delinquent and non-delinquent cases. A robust summary of performance across thresholds; an AUC closer to 1 is ideal.
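How the metrics in the list above could be computed with scikit-learn, assuming the y_test, y_pred, and y_proba arrays from the section 1 pseudocode and its extension.

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

print('Accuracy :', accuracy_score(y_test, y_pred))
print('Precision:', precision_score(y_test, y_pred))
print('Recall   :', recall_score(y_test, y_pred))
print('F1 score :', f1_score(y_test, y_pred))
print('ROC AUC  :', roc_auc_score(y_test, y_proba))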
Metric interpretation
• High Recall + Moderate Precision: Acceptable in early warning systems to flag potential risk,
followed by manual review.
• Low Recall: Risk of missing truly delinquent customers — unacceptable for financial
applications.
• High F1 Score: Signals strong balance — key indicator of a model ready for production.
Bias detection & mitigation
Technique and purpose:
• Stratified sampling: ensures balanced representation of delinquent and non-delinquent cases during training.
• Fairness audits: evaluate model performance across subgroups (e.g., gender, location, income level).
• Feature sensitivity analysis: detects whether non-relevant features (e.g., ZIP code, ethnicity) are unduly influencing outcomes.
• Re-weighting: adjusts the class distribution to prevent the model from favoring the majority class.
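A sketch of how two of these techniques could be applied with scikit-learn, reusing X, y, and df_cleaned from section 1: stratified sampling via the stratify argument and re-weighting via class_weight, followed by a simple fairness audit comparing recall across a hypothetical Gender column (an assumption for illustration; such attributes would not be model inputs).

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score

# Stratified sampling: keep the delinquent/non-delinquent ratio in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Re-weighting: give the minority (delinquent) class more weight during training
model = RandomForestClassifier(class_weight='balanced', random_state=42)
model.fit(X_train, y_train)
y_pred = pd.Series(model.predict(X_test), index=X_test.index)

# Fairness audit: compare recall across subgroups of a protected attribute
groups = df_cleaned.loc[X_test.index, 'Gender']  # hypothetical column, for illustration only
for group in groups.unique():
    mask = groups == group
    print(group, 'recall:', recall_score(y_test[mask], y_pred[mask]))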
Ethical considerations
• Transparency: Customers have a right to know if and why they were flagged as high-risk. The
model must support explainability.
• Fairness: Avoid discrimination against protected groups. Model inputs should be behavior-
and performance-based, not demographic.
• Human Oversight: High-risk predictions should trigger manual review, not automatic
rejections or penalties.
• Data Privacy: All customer data used must be anonymized, securely stored, and aligned with
data protection regulations (e.g., GDPR).