Geldium Task2 Model Plan

The document outlines a predictive model plan for credit delinquency using a binary classification approach with logistic regression. It details the model logic, justification for model choice, and an evaluation strategy that includes key metrics, bias detection, and ethical considerations. The model aims to classify customers at risk of delinquency to support proactive decision-making for reducing default rates.

Uploaded by

omhire2007

Predictive Model Plan – Student Template
1. Model Logic (Generated with GenAI)
The predictive model for credit delinquency will be a binary classification model. The goal is
to predict the `Delinquent_Account` (1 for delinquent, 0 for not delinquent) based on
various customer features.

Pseudo-code/Step-by-Step Process:

- Load Data:
Read the customer dataset containing features like Age, Income, Credit_Score,
Credit_Utilization, Missed_Payments, Loan_Balance, Debt_to_Income_Ratio,
Employment_Status, Account_Tenure, Credit_Card_Type, Location, and Month_1 to Month_6
payment statuses.

- Data Preprocessing:
- Handle missing values using median or mean for numerical features.
- Encode categorical variables:
- One-Hot Encoding for Employment_Status, Credit_Card_Type, and Location.
- Map ordinal values for Month_1 to Month_6 statuses (On-time=0, Late=1, Missed=2).

- Feature Engineering:
- Derive new features such as the total number of missed payments over 6 months and the
proportion of late payments.

- Feature Scaling:
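- Use StandardScaler to standardize numerical features (zero mean, unit variance). Fit the
scaler on the training split only so that test-set statistics do not leak into training.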

- Train-Test Split:
- Divide the dataset into 80% training and 20% testing with stratified sampling.

- Model Selection:
- Logistic Regression.

- Model Training:
- Train the logistic regression model to learn optimal coefficients.

- Prediction:
- Predict delinquency probability; classify customers as delinquent if probability > 0.5.

- Evaluation:
- Use metrics such as Precision, Recall, F1 Score, and AUC-ROC.
- Conduct fairness analysis and bias detection across customer groups.
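
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the plan's actual implementation: the data is randomly generated, the category labels and the engineered column names `Missed_6m` and `Late_Share_6m` are hypothetical, and the split is done before the imputer and scaler are fitted so that their statistics come from training rows only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

MONTH_COLS = [f"Month_{i}" for i in range(1, 7)]
STATUS_MAP = {"On-time": 0, "Late": 1, "Missed": 2}
NUMERIC = ["Age", "Income", "Credit_Score", "Credit_Utilization", "Missed_Payments",
           "Loan_Balance", "Debt_to_Income_Ratio", "Account_Tenure"]
CATEGORICAL = ["Employment_Status", "Credit_Card_Type", "Location"]


def make_synthetic_customers(n=400, seed=0):
    """Random stand-in for the customer dataset; every value is made up."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({col: rng.normal(0.0, 1.0, n) for col in NUMERIC})
    df.loc[rng.random(n) < 0.05, "Income"] = np.nan  # simulate missing values
    # Category labels below are hypothetical placeholders.
    df["Employment_Status"] = rng.choice(["Employed", "Unemployed", "Self-employed"], n)
    df["Credit_Card_Type"] = rng.choice(["Standard", "Gold"], n)
    df["Location"] = rng.choice(["City_A", "City_B", "City_C"], n)
    for col in MONTH_COLS:
        df[col] = rng.choice(list(STATUS_MAP), n, p=[0.7, 0.2, 0.1])
    # Make the synthetic target depend on the number of missed months.
    missed = (df[MONTH_COLS] == "Missed").sum(axis=1)
    p = 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * missed)))
    df["Delinquent_Account"] = (rng.random(n) < p).astype(int)
    return df


def add_payment_features(df):
    """Ordinal-encode Month_1..Month_6 and derive the two engineered features."""
    out = df.assign(**{col: df[col].map(STATUS_MAP) for col in MONTH_COLS})
    out["Missed_6m"] = (out[MONTH_COLS] == 2).sum(axis=1)       # missed months
    out["Late_Share_6m"] = (out[MONTH_COLS] == 1).mean(axis=1)  # share of late months
    return out


data = add_payment_features(make_synthetic_customers())
X = data.drop(columns="Delinquent_Account")
y = data["Delinquent_Account"]

# 80/20 stratified split, done before any statistics are estimated.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

numeric_cols = NUMERIC + MONTH_COLS + ["Missed_6m", "Late_Share_6m"]
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]  # estimated P(Delinquent_Account = 1)
pred = (proba > 0.5).astype(int)           # 0.5 cut-off from the plan
```

Wrapping preprocessing and the classifier in a single `Pipeline` means the same fitted transformations are applied at prediction time, which keeps the train and test paths consistent.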

What the model is designed to do:

This model classifies customers into two groups: those at risk of delinquency and those not
at risk. It supports early identification and helps Geldium make informed, proactive
decisions to reduce default rates.

2. Justification for Model Choice

I selected Logistic Regression as the preferred model for predicting credit delinquency for
the following reasons:

- Accuracy: Logistic Regression is well-suited for binary classification tasks and performs
well when the log-odds of delinquency are roughly linear in the features, or can be made so
through feature transformations.

- Transparency: Each coefficient gives the direction and size of a feature's effect on the
log-odds of delinquency, so feature influence can be read directly from the fitted model. This
provides clear explanations for why a customer is classified as delinquent, which is critical for:
- Regulatory compliance,
- Business stakeholder trust, and
- Delivering actionable insights.

- Ease of Use: It requires minimal computational power, simple implementation, and less
tuning compared to complex models.

- Financial Relevance: Logistic regression is widely used in credit scoring due to its
interpretability and the ability to estimate risk probabilities.

- Fit for Geldium: For a financial institution like Geldium, where clarity, fairness, and
explainability are vital, logistic regression balances predictive capability with business
requirements. Alternatives like decision trees may suffer from overfitting, and neural
networks, while powerful, act as black boxes, which limits interpretability and raises fairness
concerns.
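
To make the transparency argument concrete, the sketch below shows how logistic regression coefficients translate into odds ratios and a per-customer explanation. The intercept and coefficients are invented for illustration (they are not fitted on Geldium data) and are assumed to apply to standardized features.

```python
import math

# Hypothetical values, NOT estimated from any real dataset.
intercept = -2.0
coefficients = {"Missed_Payments": 0.7, "Credit_Utilization": 0.4, "Credit_Score": -0.5}


def odds_ratios(coefs):
    """exp(coefficient): multiplicative change in odds per one-unit feature increase."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


def contributions(coefs, x):
    """Per-feature contribution to the log-odds for one customer."""
    return {name: coefs[name] * x[name] for name in coefs}


def predict_probability(intercept, coefs, x):
    """Sigmoid of the linear score gives the estimated delinquency probability."""
    log_odds = intercept + sum(contributions(coefs, x).values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# A customer two standard deviations above average on missed payments.
customer = {"Missed_Payments": 2.0, "Credit_Utilization": 0.0, "Credit_Score": 0.0}
print(odds_ratios(coefficients))
print(contributions(coefficients, customer))
print(predict_probability(intercept, coefficients, customer))
```

A positive coefficient raises the odds of delinquency and a negative one lowers them, so the per-feature contributions can be reported alongside each prediction as its explanation.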

3. Evaluation Strategy

To ensure robust and ethical performance of the model, the following evaluation strategy
will be implemented:

Key Metrics:

- Precision: Measures the proportion of correctly predicted delinquents out of all
delinquency predictions. High precision reduces unnecessary customer interventions.

- Recall (Sensitivity): Measures the proportion of actual delinquents correctly identified.
High recall helps avoid missing high-risk customers.

- F1 Score: The harmonic mean of precision and recall. Especially useful in imbalanced
datasets where both false positives and false negatives are costly.

- AUC-ROC Curve: Assesses the model's ability to distinguish between delinquent and
non-delinquent customers across thresholds.
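
The four metrics can be computed by hand on a small example, which makes their definitions explicit. The labels and scores below are invented; in practice scikit-learn's `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` compute the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) with 1 = delinquent, 0 = not delinquent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def auc_roc(y_true, scores):
    """Probability that a random delinquent scores above a random non-delinquent."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities for eight customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s > 0.5 else 0 for s in scores]

print(confusion_counts(y_true, y_pred))     # (2, 1, 1, 4)
print(precision_recall_f1(y_true, y_pred))  # each value is 2/3
print(auc_roc(y_true, scores))              # 12 of 15 pairs ranked correctly = 0.8
```

Precision, recall, and F1 depend on the 0.5 cut-off, while AUC-ROC depends only on how the scores rank customers.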

Bias Detection and Fairness Checks:

- Data Bias Review: Check dataset for demographic representation imbalances (e.g., age,
employment status, location).

- Disparate Impact Analysis: Compare how often each subgroup is flagged as delinquent (for
example, the ratio of flag rates between groups), and check whether false positive and false
negative rates differ across subgroups.

- Equal Opportunity Checks: Confirm whether the model performs equally across all
demographic groups in terms of true positive rates.
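
These checks reduce to comparing a few rates per subgroup. The following is a minimal sketch with invented data and two hypothetical groups labeled "A" and "B"; which attribute defines the groups, and what size of gap counts as a problem, are decisions the plan leaves open.

```python
from collections import defaultdict


def group_rates(groups, y_true, y_pred):
    """Per-group flag rate, true positive rate, and false positive rate."""
    counts = defaultdict(lambda: {"n": 0, "flagged": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for g, t, p in zip(groups, y_true, y_pred):
        c = counts[g]
        c["n"] += 1
        c["flagged"] += p
        if t == 1:
            c["pos"] += 1
            c["tp"] += p
        else:
            c["neg"] += 1
            c["fp"] += p
    return {
        g: {
            "flag_rate": c["flagged"] / c["n"],
            "tpr": c["tp"] / c["pos"] if c["pos"] else float("nan"),
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
        for g, c in counts.items()
    }


def disparate_impact_ratio(rates):
    """Lowest group flag rate divided by the highest (1.0 means identical rates)."""
    flag_rates = [r["flag_rate"] for r in rates.values()]
    return min(flag_rates) / max(flag_rates) if max(flag_rates) else float("nan")


def equal_opportunity_gap(rates):
    """Largest difference in true positive rate between any two groups."""
    tprs = [r["tpr"] for r in rates.values()]
    return max(tprs) - min(tprs)


# Made-up example: five customers in each hypothetical group.
groups = ["A"] * 5 + ["B"] * 5
y_true = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]

rates = group_rates(groups, y_true, y_pred)
print(rates)
print(disparate_impact_ratio(rates))
print(equal_opportunity_gap(rates))
```

In this toy data group A is flagged twice as often as group B and has a lower true positive rate, which is the kind of pattern the fairness review is meant to surface.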

Bias Mitigation Techniques (if needed):

- Pre-processing: Apply sampling or re-weighting to improve representation.

- In-training Adjustments: Use fairness-aware objectives if bias is detected.

- Post-processing: Adjust classification thresholds to equalize outcomes across sensitive
groups.
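
As one possible form of the pre-processing option, the sketch below computes reweighing-style sample weights, where each (group, label) combination receives the weight P(group) x P(label) / P(group, label). This is an assumed choice of technique for illustration, not something the plan prescribes, and the groups and labels are invented.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each row by P(group) * P(label) / P(group, label).

    Combinations that are under-represented relative to independence of
    group and label get weights above 1; over-represented ones get below 1.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    cell_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * cell_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Made-up example: three customers in hypothetical group "A", one in "B".
groups = ["A", "A", "A", "B"]
labels = [1, 0, 0, 1]
weights = reweighing_weights(groups, labels)
print(weights)  # [1.5, 0.75, 0.75, 0.5]
```

Weights like these can be supplied through the `sample_weight` argument that scikit-learn's `LogisticRegression.fit` accepts, so the model itself does not need to change.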

Ethical Considerations:

- Transparency: Maintain clear justifications for all predictions.

- Fairness: Avoid proxy discrimination through careful feature selection and fairness audits.

- Data Privacy: Ensure compliance with GDPR/local regulations.

- Human Oversight: Model decisions should be reviewed by analysts to avoid sole reliance
on AI.

- Customer Impact: Monitor for and minimize harm from false predictions. Establish
feedback channels.

- Ongoing Monitoring: Regularly check for data drift and model performance degradation
over time. Retrain or recalibrate when needed.
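
One common way to check for data drift is the Population Stability Index (PSI), which compares how a feature is distributed across bins at training time and in recent data. The sketch below is an illustration only: the plan does not name a drift metric, the bin edges and values are invented, and any alert threshold would need to be agreed separately.

```python
import math


def bin_shares(values, edges):
    """Share of values in each half-open bin [edges[i], edges[i+1]).

    Values outside the outer edges are ignored in this simple sketch.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def population_stability_index(expected_shares, actual_shares, eps=1e-6):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    psi = 0.0
    for e, a in zip(expected_shares, actual_shares):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        psi += (a - e) * math.log(a / e)
    return psi


# Made-up Credit_Utilization values at training time and in a recent month.
edges = [0.0, 0.5, 1.0]
training = [0.1, 0.2, 0.6, 0.7]
recent = [0.1, 0.3, 0.6, 0.7, 0.8]

expected = bin_shares(training, edges)  # [0.5, 0.5]
actual = bin_shares(recent, edges)      # [0.4, 0.6]
print(population_stability_index(expected, actual))
```

A PSI of zero means the two distributions match bin for bin; larger values indicate a bigger shift and can serve as a trigger for the retraining or recalibration step.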
