Fraud Detection

This document outlines a fraud detection model implemented in Python on a dataset of 206,154 transactions. Preprocessing covers a missing-value check, IQR-based outlier removal on the transaction amount, label encoding of the transaction type, and scaling of the amount, followed by training a Random Forest classifier. Performance is evaluated with a classification report, a confusion matrix, and ROC AUC: the model reaches a ROC AUC of about 0.91, with precision of 0.89 but recall of only 0.35 on the fraud class.

Uploaded by

karan.17475


7/27/25, 6:47 PM Fraud_Detection_Model.ipynb - Colab

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

import warnings
warnings.filterwarnings('ignore')

df = pd.read_csv('/Fraud.csv')
df.shape, df.head()

((206154, 11),
    step      type    amount     nameOrig  oldbalanceOrg  newbalanceOrig  \
 0     1   PAYMENT   9839.64  C1231006815       170136.0       160296.36
 1     1   PAYMENT   1864.28  C1666544295        21249.0        19384.72
 2     1  TRANSFER    181.00  C1305486145          181.0            0.00
 3     1  CASH_OUT    181.00   C840083671          181.0            0.00
 4     1   PAYMENT  11668.14  C2048537720        41554.0        29885.86

       nameDest  oldbalanceDest  newbalanceDest  isFraud  isFlaggedFraud
 0  M1979787155             0.0             0.0      0.0             0.0
 1  M2044282225             0.0             0.0      0.0             0.0
 2   C553264065             0.0             0.0      1.0             0.0
 3    C38997010         21182.0             0.0      1.0             0.0
 4  M1230701703             0.0             0.0      0.0             0.0  )

df.info()
df.describe(include='all')
df.isnull().sum()

df['isFraud'].value_counts(normalize=True) * 100

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 206154 entries, 0 to 206153
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 step 206154 non-null int64
1 type 206154 non-null object
2 amount 206154 non-null float64
3 nameOrig 206154 non-null object
4 oldbalanceOrg 206154 non-null float64
5 newbalanceOrig 206154 non-null float64
6 nameDest 206154 non-null object
7 oldbalanceDest 206153 non-null float64
8 newbalanceDest 206153 non-null float64
9 isFraud 206153 non-null float64
10 isFlaggedFraud 206153 non-null float64
dtypes: float64(7), int64(1), object(3)
memory usage: 17.3+ MB

         proportion
isFraud
0.0       99.926268
1.0        0.073732
dtype: float64
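
With fraud at about 0.07% of rows, the classes are heavily imbalanced, which is why the model later in this notebook is built with class_weight='balanced'. As a rough illustration of what that setting does, the sketch below reproduces in plain Python the weighting formula documented for scikit-learn, n_samples / (n_classes * count_per_class). The function name balanced_class_weights and the label counts are hypothetical, chosen only to resemble the imbalance seen here.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical labels: 9,990 legitimate rows and 10 fraud rows (0.1% fraud).
labels = [0] * 9990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

Under this formula the rare class receives a much larger weight, so each fraud row counts for far more than a legitimate row when the trees are grown.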

print("Missing Values:\n", df.isnull().sum())

Missing Values:
step 0
type 0
amount 0
nameOrig 0
oldbalanceOrg 0
newbalanceOrig 0
nameDest 0
oldbalanceDest 1
newbalanceDest 1
isFraud 1
isFlaggedFraud 1
dtype: int64

sns.boxplot(x=df['amount'])
plt.title("Outliers in Transaction Amount")
plt.show()

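# Keep only rows whose amount lies within the 1.5 * IQR fences.
Q1 = df['amount'].quantile(0.25)
Q3 = df['amount'].quantile(0.75)
IQR = Q3 - Q1
# .copy() makes the filtered frame independent, so later column assignments
# do not write to a slice of the original DataFrame.
df = df[(df['amount'] >= Q1 - 1.5 * IQR) & (df['amount'] <= Q3 + 1.5 * IQR)].copy()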
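
One caveat with filtering on amount is that unusually large transactions may be exactly the fraudulent ones. The notebook's own outputs hint at this, as an estimate from rounded figures: about 152 fraud rows before filtering (0.0737% of 206,153) versus roughly 115 implied by the 23 fraud cases in the 20% test split, while total rows fall by only about 6%. The sketch below shows how such a check could be done per class. It uses the standard library instead of pandas, and the rows and the helper iqr_bounds are hypothetical, not taken from Fraud.csv.

```python
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences; 'inclusive' interpolates linearly between data points."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical (amount, is_fraud) pairs for illustration only.
rows = [(100, 0), (120, 0), (150, 0), (180, 1), (200, 0),
        (220, 0), (250, 0), (300, 0), (5000, 1), (9000, 1)]

lower, upper = iqr_bounds([amount for amount, _ in rows])
kept = [(a, f) for a, f in rows if lower <= a <= upper]

# Compare the fraud count before and after the filter.
fraud_before = sum(f for _, f in rows)
fraud_after = sum(f for _, f in kept)
print(f"bounds=({lower}, {upper}) rows {len(rows)}->{len(kept)} fraud {fraud_before}->{fraud_after}")
```

In this toy data the filter removes two rows, both fraudulent. Running the same comparison on the real frame, for example with value_counts on isFraud before and after the filter, would show whether the outlier step is discarding the signal the model needs.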

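# Encode the transaction type as integers and drop identifier columns.
df['type_encoded'] = LabelEncoder().fit_transform(df['type'])
df_model = df.drop(['nameOrig', 'nameDest', 'isFlaggedFraud', 'type'], axis=1)
# Standardize the amount, then replace the raw column with the scaled one.
scaler = StandardScaler()
df_model['amount_scaled'] = scaler.fit_transform(df_model[['amount']])
df_model.drop('amount', axis=1, inplace=True)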

df_model.head()

   step  oldbalanceOrg  newbalanceOrig  oldbalanceDest  newbalanceDest  isFraud  type_encoded  amount_scaled
0     1       170136.0       160296.36             0.0             0.0      0.0             3      -0.800597
1     1        21249.0        19384.72             0.0             0.0      0.0             3      -0.859746
2     1          181.0            0.00             0.0             0.0      1.0             4      -0.872230
3     1          181.0            0.00         21182.0             0.0      1.0             1      -0.872230
4     1        41554.0        29885.86             0.0             0.0      0.0             3      -0.787036
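
The amount is scaled here on the full dataset, before the train/test split, so the test rows contribute to the mean and standard deviation. A common alternative is to learn those statistics on the training rows only and reuse them for the test rows. The sketch below illustrates the idea in plain Python under stated assumptions: the helper names and amounts are hypothetical, and pstdev is used because StandardScaler also divides by the population standard deviation.

```python
from statistics import fmean, pstdev

def fit_standardizer(train_values):
    """Learn the mean and population std (ddof=0) from training data only."""
    return fmean(train_values), pstdev(train_values)

def standardize(values, mean, std):
    """Apply previously learned statistics without refitting."""
    return [(v - mean) / std for v in values]

# Hypothetical transaction amounts, split before any statistics are computed.
train_amounts = [100.0, 200.0, 300.0, 400.0]
test_amounts = [250.0, 1000.0]

mean, std = fit_standardizer(train_amounts)
train_scaled = standardize(train_amounts, mean, std)
test_scaled = standardize(test_amounts, mean, std)
print(mean, std, test_scaled)
```

For a random forest the effect is likely small, since tree splits depend on the ordering of values and not on their scale. The distinction matters more if the same preprocessing is reused with a scale-sensitive model.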

df_model.dropna(subset=['isFraud'], inplace=True)

X = df_model.drop('isFraud', axis=1)
y = df_model['isFraud']

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, class_weight='balanced', random_state=42)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Classification Report:\n", classification_report(y_test, y_pred))

print("ROC AUC Score:", roc_auc_score(y_test, model.predict_proba(X_test)[:,1]))

sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt='d', cmap='Blues')

plt.title("Confusion Matrix")
plt.show()

Classification Report:
               precision    recall  f1-score   support

         0.0       1.00      1.00      1.00     38761
         1.0       0.89      0.35      0.50        23

    accuracy                           1.00     38784
   macro avg       0.94      0.67      0.75     38784
weighted avg       1.00      1.00      1.00     38784

ROC AUC Score: 0.9115863883800728
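
The accuracy of 1.00 says little on data this imbalanced, so it helps to work through the fraud-class numbers by hand. The confusion matrix values are not reproduced in this printout, but the rounded report is consistent with only one set of counts for the 23 fraud cases: 8 true positives, 15 false negatives, and 1 false positive. The sketch below, which treats those inferred counts as an assumption, recomputes the metrics and compares accuracy against a baseline that always predicts "not fraud". The function name precision_recall_f1 is illustrative.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive class from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts inferred from the rounded report (support 23, recall 0.35, precision 0.89);
# they are an assumption, not values read from the heatmap.
tp, fp, fn = 8, 1, 15
tn = 38761 - fp

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / (tp + tn + fp + fn)
baseline_accuracy = 38761 / 38784  # always predicting "not fraud"

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.4f} baseline={baseline_accuracy:.4f}")
```

Under these counts the model misses 15 of 23 fraud cases, and its accuracy exceeds the trivial baseline only in the fourth decimal place. Recall on the fraud class, or a precision-recall curve, is therefore a more informative target than accuracy here.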

feat_importances = pd.Series(model.feature_importances_, index=X.columns)

feat_importances.nlargest(10).plot(kind='barh')
plt.title("Top 10 Important Features")
plt.show()

https://colab.research.google.com/drive/1rQs8J3QDlxAZqZCVthXdMevFCaTLfXU0#scrollTo=et0tIoemzlqu