
Final Project Report

Diabetes Prediction Using Classification Method

Project Supervisor

Warda Fiaz

Submitted By

Group ID: F23020A5E2

Mohsin Ali BC160402464


Muhammad Sajid BC200415580

Software Projects & Research Section,


Department of Computer Sciences,
Virtual University of Pakistan

DIABETES PREDICTION USING CLASSIFICATION METHOD 1


CERTIFICATE
This is to certify that Mohsin Ali Abdul Razzaq (BC160402464) and Muhammad
Sajid (BC200415580) have worked on and completed their Software Project at
Software & Research Projects Section, Department of Computer Sciences,
Virtual University of Pakistan in partial fulfillment of the requirement for the
degree of BS in Computer Sciences under my guidance and supervision.

In our opinion, it is satisfactory and up to the mark and therefore fulfills the
requirements of BS in Computer Sciences.

Supervisor / Internal Examiner

Warda Fiaz
Supervisor,
Software Projects & Research Section,
Department of Computer Sciences
Virtual University of Pakistan

___________________
(Signature)

External Examiner/Subject Specialist


<<External Supervisor Name>>

___________________
(Signature)

Accepted By:

_____________
(For office use)

EXORDIUM
In the name of Allah, the Compassionate, the
Merciful.

Praise be to Allah, Lord of Creation,


The Compassionate, the Merciful,
King of Judgment-day!

You alone we worship, and to You alone we pray for help,

Guide us to the straight path

The path of those who You have favored,

Not of those who have incurred Your wrath,

Nor of those who have gone astray.



DEDICATION

For my unwavering support system, my family, whose encouragement and
belief in my dreams have been a constant source of strength throughout this
journey. To my mentors, whose guidance and wisdom have shaped my
understanding and fueled my passion for this project. And to all those whose
unwavering belief in me has inspired me to push boundaries and strive for
excellence. This work is dedicated to you, with heartfelt gratitude for your
invaluable presence in my life.

Thanks, and Regards



ACKNOWLEDGEMENT

I would like to express my sincere gratitude to my project supervisor,
Miss Warda Fiaz, for her invaluable guidance, support, and
encouragement throughout this project. Her expertise and constructive
criticism have been instrumental in shaping this work.

I would also like to thank my teammate for his collaboration and
teamwork during the development process. His dedication and
positive attitude made this project a rewarding experience.

Thanks for Support and Guidance.



PREFACE

Diabetes is a chronic health condition affecting millions globally. Early diagnosis is
crucial for effective management and for preventing complications. This project explores
the potential of machine learning in predicting diabetes using patient data.
This report details the development of a diabetes prediction model. The model
leverages a publicly available dataset from Kaggle [Dataset reference: [Link]].

The report outlines the following key functionalities:

● Data Acquisition and Exploration: The model imports the dataset, performs
exploratory data analysis (EDA) to understand data characteristics, and
visualizes trends and patterns.
● Data Preprocessing: The data undergoes preprocessing steps, including
splitting into training and testing sets for model evaluation.
● Model Development and Training: Different machine learning algorithms,
including Neural Networks, Support Vector Machines (SVM), Decision Trees,
and Logistic Regression, are applied to the training data. This comparative
approach allows for identifying the algorithm with the highest predictive
accuracy.
● Model Evaluation: The trained models are evaluated on the testing data set.
Metrics like accuracy, precision, recall, and F1-score are used to assess each
model's performance.
● Prediction and Results: The most accurate model is used to predict diabetes in
new patients based on their data. A confusion matrix visualizes the
performance, providing insights into true positives, true negatives, and
misclassifications.
● Model Persistence: The final, high-performing model is saved for future use in
predicting diabetes.

This project aims to demonstrate the feasibility of utilizing machine learning for
diabetes prediction. While not intended as a diagnostic tool, the model can act as a
preliminary screening mechanism, potentially aiding patients and healthcare
professionals in early detection.



TABLE OF CONTENTS

CHAPTER NO. 1
GATHERING & ANALYZING INFO

1.1 INTRODUCTION

1.2 PURPOSE

1.3 SCOPE

1.4 DEFINITIONS, ACRONYMS AND ABBREVIATIONS

1.5 PROJECT REQUIREMENTS

1.5.1 Functional Requirements

1.5.2 Non-Functional Requirements

1.6 USE CASES AND USAGE SCENARIOS

1.6.1 Use Case Diagrams

1.6.2 Usage Scenarios

1.7 DEVELOPMENT METHODOLOGY

1.7.1 Chosen Methodology

1.7.2 Reasons for Chosen Methodology

1.7.2 Work Plan (Gantt Chart)

1.7.2 Project Schedule (Submission Calendar)

CHAPTER NO. 2
DESIGNING THE PROJECT

2.1 INTRODUCTION



2.2 PURPOSE

2.3 SCOPE

2.4 DEFINITIONS, ACRONYMS AND ABBREVIATIONS

2.5 ARCHITECTURAL REPRESENTATION (ARCHITECTURE DIAGRAM)

2.6 DYNAMIC MODEL: SEQUENCE DIAGRAMS

2.7 RESEARCH METHODOLOGY

2.8 DATASET MODEL (DATASET DIAGRAM)

2.9 GRAPHICAL USER INTERFACES

2.10 RESULTS AND DISCUSSION

CHAPTER NO. 3
DEVELOPMENT
3.1 DEVELOPMENT PLAN (ARCHITECTURE DIAGRAM)



CHAPTER 1
Gathering & Analyzing Info



Gathering & Analyzing Info

1.1. Introduction:

This chapter lays the foundation for the entire project by outlining the data
acquisition and analysis processes. A strong understanding of the data is
essential for building an effective machine learning model for diabetes
prediction.

Importance of Data Gathering and Analysis:

In machine learning, data is the life force of any project. It is the raw
material from which models learn and derive their predictive power. For our
diabetes prediction model, data plays a particularly critical role:

● Understanding Disease Patterns:
The model will be trained on a dataset containing patient information
possibly linked to diabetes. By analyzing this data, the model can identify
patterns and relationships between specific factors and the presence or
absence of diabetes.
● Building Predictive Ability:
Through exposure to a large and diverse dataset, the model can learn to
recognize these patterns and apply them to new, unseen patient data. This
enables the model to predict whether a patient is likely to have diabetes
based on their specific characteristics.
● Refining Model Performance:
The quality and quantity of data directly affect the model's accuracy. An
informative dataset allows the model to learn more efficiently, leading to
more reliable predictions.

1.2. Purpose:

The combined purpose of data gathering and analysis is to establish a strong
foundation for the machine learning model. By collecting relevant and
informative data, and then carefully analyzing it, we provide the model with the
essential "knowledge" to learn from and make accurate predictions.

Think of it this way: the data serves as the training material for the model. Just as a
student wouldn't perform well on an exam without proper learning materials, a
machine learning model wouldn't achieve ideal performance without high-quality
data and a clear understanding of its characteristics.

In machine learning projects, data gathering and analysis are fundamental steps
that lay the foundation for a successful model. Here's a breakdown of their purpose:

Data Gathering:
The initial step involves collecting data that relates to the problem being
solved. For the diabetes prediction model, we gather patient data containing
factors potentially linked to diabetes. This data could include demographics,
lifestyle habits, blood test results, and other relevant medical information.
The goal is to acquire a comprehensive dataset that encompasses a wide range of
patient profiles. This variety exposes the model to varied scenarios so that it
can generalize what it learns effectively.

Data Analysis:
Once gathered, the data needs to be thoroughly examined to understand its
structure, content, and possible limitations. This consists of identifying data
types, checking for missing values, and analyzing statistical properties. Through
techniques like Exploratory Data Analysis (EDA), we can reveal patterns, trends,
and relationships within the data. This analysis helps us understand how different
factors might be linked to the presence or absence of diabetes. The analysis often
reveals the need for data preprocessing steps such as cleaning (handling missing
values or inconsistencies), normalization (scaling data to a common range), or
feature engineering (creating new features from existing ones). This ensures the
data is suitable for the chosen machine learning algorithms.
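
The inspection steps described above (data types, missing values, statistical properties, relationships) can be sketched with pandas. This is a minimal illustration, not the project's notebook code: the miniature DataFrame and its values are hypothetical, and only the column names are borrowed from the dataset description in Chapter 2.

```python
import pandas as pd

# Hypothetical miniature dataset; the values are illustrative only.
df = pd.DataFrame({
    "plas": [148, 85, 183, None, 137],    # plasma glucose, one value missing
    "mass": [33.6, 26.6, 23.3, 28.1, 43.1],
    "age": [50, 31, 32, 21, 33],
    "class": [1, 0, 1, 0, 1],             # 1 = diabetic, 0 = not diabetic
})

# Structure: the data type of each column.
print(df.dtypes)

# Missing values per column.
missing = df.isna().sum()
print(missing)

# Statistical properties (count, mean, std, quartiles) of each column.
print(df.describe())

# Pairwise correlations, here read off against the class label.
corr = df.corr()
print(corr["class"])
```

On the real dataset the same calls would be applied to the DataFrame returned by the import step.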

1.3. Scope:

This project aims to develop a machine learning model capable of predicting
the likelihood of diabetes in patients. This model will serve as a valuable
decision-support tool for both doctors and patients. The model will predict the
likelihood of diabetes based on relevant patient features such as:

● Glucose level
● Body Mass Index (BMI)
● Blood Pressure (BP)
● Age
● Family history, etc.

1.3.1. Key Functionalities:

● Data Acquisition: The model will leverage a comprehensive dataset
containing distinct health parameters of patients, obtained from a
reputable source such as Kaggle.
● Data Preprocessing and Exploration: The gathered data will undergo
thorough cleaning and preprocessing steps to ensure consistency and
quality. Exploratory Data Analysis (EDA) techniques will be employed to
understand data characteristics, identify patterns, and potentially reveal
correlations between specific features and diabetes.
● Model Selection and Training: Different machine learning algorithms,
such as Neural Networks, Support Vector Machines (SVM), Decision Trees,
and Logistic Regression, will be evaluated for their suitability in predicting
diabetes. The model achieving the highest accuracy on the training data will
be selected for further evaluation.
● Model Evaluation: The chosen model's performance will be assessed using
a separate testing dataset. Key metrics like accuracy, precision, recall, and
F1-score will be used to assess the model's effectiveness in accurately
predicting diabetes.
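
The four evaluation metrics named above all derive from the counts in a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts passed to it are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    recall = tp / (tp + fn)         # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for 100 test patients.
metrics = classification_metrics(tp=45, fp=15, fn=5, tn=35)
print(metrics)
```

Precision and recall are reported per class in this report, so the same function would be applied once with "diabetic" as the positive class and once with "not diabetic".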

1.4. Definitions, Acronyms, and Abbreviations:

1.5. Project Requirements:

1.5.1. Functional Requirements:


i. Import Data:
● Enable the system to import dataset files from external
sources, such as the provided Kaggle dataset link.
ii. Display Data:
● Develop a module to generate and display summary
statistics, trends, patterns, and insights through
Exploratory Data Analysis (EDA).
● Ensure the visual representation of the data is user-friendly
and provides meaningful insights to facilitate
decision-making.

iii. Pre-process the Data:
● Implement the data pre-processing functionalities,
including:
⮚ Data wrangling to clean the raw data and convert
it into a usable format by fixing missing values,
duplicate data, invalid data, and noise.
⮚ Splitting the dataset into training (70%) and
testing (30%) sets.
⮚ Training the model using a neural network
machine learning algorithm.

iv. Data Analysis:
● The cleaned and prepared data is passed on to the
analysis step, which involves:
⮚ Selecting analytical techniques
⮚ Building models
⮚ Reviewing the results
● The method to be used is already defined: the
classification method.

v. Training:
● Develop a module to train the model on the dataset using
a Neural Network machine learning algorithm,
ensuring efficient and effective model learning.
vi. Testing:
● Create test data and apply it to the trained
models for evaluation.
● Display the results of the trained model on the test
data.

vii. Apply Models:
● Implement functionality to apply different
classification machine learning models, ensuring
seamless integration and execution of the various
models for comparative analysis, such as:
⮚ Support Vector Machine (SVM)
⮚ Logistic Regression
⮚ Decision Tree

viii. Results:
● Generate a confusion matrix to assess the model's
performance metrics, including:
⮚ Accuracy
⮚ Precision
⮚ Recall
⮚ F1 Score
● Display the results in a clear and understandable format.

ix. Accuracy:
● Develop a feature to determine which machine learning
model achieves the highest accuracy for predicting
diabetes.
● Provide a comparative analysis of model accuracies to
assist users in model selection.
x. Model Saving:
● Implement functionality to save the trained model for
future use, allowing users to reuse the model without
retraining.
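
The model-saving requirement can be illustrated with Python's standard `pickle` module. This is only a sketch under stated assumptions: the report does not say which serialization library the project uses, and the tiny training set, the file name `diabetes_model.pkl`, and the choice of Logistic Regression as the saved model are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: two features (e.g. glucose, BMI) and a label.
X = np.array([[85.0, 26.6], [89.0, 28.1], [148.0, 33.6], [183.0, 43.1]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression(max_iter=1000).fit(X, y)

# Save the trained model to disk (hypothetical file name).
model_path = Path(tempfile.gettempdir()) / "diabetes_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later: load the model and reuse it without retraining.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```

A pickled model should only be loaded from a trusted location and with a compatible library version, since unpickling executes code from the file.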

1.5.2. Non-Functional Requirements:


i. Performance:
● Ensure the system performs efficiently, with minimal
latency during the data processing, training, and testing
phases.
ii. Scalability:
● Design the system so that it can handle datasets of
varying sizes.
iii. User Interface:
● There is no GUI for this project; it will run in Jupyter
Notebook or any other supported software.

iv. Security:
● Ensure the security of the user's data from the interface
through to the model.
v. Reliability:
● Design the system to be reliable, with few errors, and to
provide accurate predictions consistently.
vi. Documentation:
● Provide comprehensive user guide documentation.
vii. Cross Platform:
● Jupyter (or another IDE) runs on different operating
systems, so there is no OS dependence.



1.6. Use Cases and Usage Scenarios:

1.6.1. Use Case Diagrams:



1.6.2. Usage Scenarios:



ID: 01
Title: Import Data
Author Actions: Admin selects "Import Data" action.
Description: Admin imports the dataset files from the provided Kaggle link into the system.
Alternative Paths: Admin cancels the import process.
Pre-Conditions: Dataset files are available and are accessible.
Post-Conditions: Dataset is successfully imported into the system.
Exceptions: Dataset files are not available or corrupted.

ID: 02
Title: Pre-process Data
Author Actions: Admin selects "Pre-process Data" action.
Description: Admin pre-processes the dataset by splitting it into training (70%) and testing (30%) sets. The system then initiates training using a Neural Network machine learning algorithm.
Alternative Paths: Admin skips the pre-processing step.
Pre-Conditions: Dataset is successfully imported and available for pre-processing.
Post-Conditions: Data is split, and the model is trained using Neural Network algorithm.
Exceptions: Dataset is not imported, or insufficient data for training is present.

ID: 03
Title: EDA Analysis
Author Actions: Admin selects "EDA Analysis" action.
Description: Admin explores and visualizes the dataset, observing summary statistics, trends, patterns, and insights through Exploratory Data Analysis (EDA).
Alternative Paths: Admin skips the visualization step.
Pre-Conditions: Dataset is successfully imported and available for analysis. Data must be wrangled (clean).
Post-Conditions: Data is visually displayed with relevant insights.
Exceptions: Dataset is empty or not preprocessed.

ID: 04
Title: Testing
Author Actions: Admin selects "Testing" action.
Description: Admin applies the test data on the previously trained model for evaluation, observing and interpreting the results of the model's predictions on the test dataset.
Alternative Paths: Admin skips the testing step.
Pre-Conditions: Model has been successfully trained using the training data.
Post-Conditions: Test results are displayed and interpreted.
Exceptions: Model is not trained, or insufficient test data is available.

ID: 05
Title: Training
Author Actions: Admin selects "Training" action.
Description: Admin initiates the training process, allowing the system to train the dataset using a Neural Network machine learning algorithm.
Alternative Paths: Admin cancels the training process.
Pre-Conditions: Dataset is successfully pre-processed and available for training.
Post-Conditions: Model is successfully trained using Neural Network algorithm.
Exceptions: Dataset is not pre-processed, or insufficient data for training is present.

ID: 06
Title: Apply Models
Author Actions: Admin selects "Apply Models" action.
Description: Admin applies different machine learning models (SVM, Decision Tree, Logistic Regression) on the training data for comparative analysis.
Alternative Paths: Admin skips applying models.
Pre-Conditions: Model has been successfully trained using the training data.
Post-Conditions: Results of different models are displayed for comparison.
Exceptions: Model is not trained, or insufficient data for applying models is present.

ID: 07
Title: Results
Author Actions: Admin selects "Results" action.
Description: Admin generates a confusion matrix to assess the Accuracy, Precision, Recall, and F1 Score in the trained model.
Alternative Paths: Admin skips viewing results.
Pre-Conditions: Model has been successfully trained and tested.
Post-Conditions: Confusion matrix and performance metrics are displayed.
Exceptions: Model is not tested, or insufficient data for generating results is present.

ID: 08
Title: Accuracy
Author Actions: Admin selects "Accuracy" action.
Description: Admin identifies and showcases which machine learning model achieves the highest accuracy for predicting diabetes.
Alternative Paths: Admin skips checking accuracy.
Pre-Conditions: Results of different models are successfully generated.
Post-Conditions: Model with highest accuracy is highlighted.
Exceptions: Results of different models are not available or inconclusive.

ID: 09
Title: Save the Model
Author Actions: Admin selects "Save the Model" action.
Description: Admin saves the trained model for future use, allowing for reusability without the need for retraining.
Alternative Paths: Admin skips saving the model.
Pre-Conditions: Model has been successfully trained and results are available.
Post-Conditions: Trained model is saved and ready for future use.
Exceptions: Model is not trained, or saving process fails.

1.7. Development Methodology:

1.7.1. Chosen Methodology:



1.7.2. Work Plan:

1.7.3. Project Schedule:



CHAPTER 2
DESIGNING THE PROJECT



Designing the Project

This chapter dives into the technical aspects of designing the diabetes prediction
model. It outlines the methodologies, tools, and representations used to bring the model
to life.

2.1 Introduction:
Early detection of diabetes is critical for effective management and avoiding
complications. This chapter focuses on designing a machine learning model
capable of predicting the likelihood of diabetes in patients. Think of it as a
roadmap outlining the technical choices and steps involved in building this
model. Throughout this chapter, we'll explore:

● The model's core objective: predicting diabetes based on patient data.
● The significance of this model in facilitating early detection and
possibly improving patient outcomes.
● The design choices and methodologies used, including the chosen
algorithms and data considerations.
● Visual representations like architecture diagrams to illustrate the model's
workflow.
● The research methodology guiding the design process.

2.2 Purpose:
This chapter serves as the testing ground for our diabetes prediction model.
Here, we'll assess the model's effectiveness in achieving its core objective:
predicting the likelihood of diabetes in patients. By carefully evaluating its
performance, we can gauge its potential value as a tool for early detection.

The primary purposes of the design document include:

i. Roadmap for Development:

The design document serves as a blueprint or roadmap for developers,
outlining the overall structure and organization of the system. It provides
a clear plan for how different components will relate and work together
to achieve the project targets.

ii. Communication Tool:

It acts as a communication tool among various stakeholders, including
developers, designers, project managers, and clients. The document
helps convey the vision and technical details to all team members,
ensuring a shared understanding of the project.

iii. Client Understanding:

For client-facing projects, the design document serves as a tool to
explain technical aspects to non-technical stakeholders. It helps clients
understand the system's structure, features, and functionality.

iv. Maintenance and Updates:

As the project changes over time or as requirements evolve, the
design document becomes a valuable resource for maintaining and
updating the system. Developers can refer to the document to understand
the existing architecture and make informed decisions when introducing
new features or fixing issues.

v. Basis for Testing:

In the testing phase, the design document helps testers verify the
functionalities of the system. Testers use the design document to
understand the expected behavior and functionality of the system. It
helps in creating test cases, ensuring comprehensive test coverage, and
validating that the implemented features align with the project
requirements.

2.3 Scope
The design of the diabetes prediction model will focus on creating a robust and
informative system for early diabetes detection, with the following key
considerations:

Functionalities:

● Prediction: The primary function of the model is to predict the
likelihood of diabetes in a patient based on their input data (e.g., blood
sugar levels, BMI, age).
● Data Preprocessing: The model will incorporate functionalities for data
preprocessing, including cleaning, handling missing values, and
potentially scaling data to ensure efficient model training.
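
The scaling step mentioned above is often implemented as standardization: each feature is shifted to zero mean and scaled to unit variance, using statistics computed on the training data only. The sketch below assumes standardization is the scaling meant here; the feature values are hypothetical.

```python
import numpy as np

# Hypothetical training features: columns are glucose level and BMI.
X_train = np.array([[148.0, 33.6], [85.0, 26.6], [183.0, 23.3], [89.0, 28.1]])
# A new patient to be scored later.
X_new = np.array([[137.0, 43.1]])

# Per-feature statistics, computed on the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# z = (x - mean) / std, applied to training and new data alike.
X_train_scaled = (X_train - mean) / std
X_new_scaled = (X_new - mean) / std

print(X_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(X_train_scaled.std(axis=0))   # approximately 1 for each feature
```

Reusing the training mean and standard deviation for new data keeps the model's inputs on the same scale it was trained on.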

Algorithms:

● Exploration: We will explore different machine learning algorithms
suitable for classification tasks, such as Neural Networks, Support
Vector Machines (SVM), Decision Trees, and Logistic Regression.
● Selection: Based on evaluation metrics, we will select the algorithm
demonstrating the highest accuracy in predicting diabetes on the training
data.

Data Considerations:

● Source: The model will leverage a comprehensive diabetes dataset from
a reputable source like Kaggle.
● Features: We will focus on utilizing relevant features within the dataset
that have established links to diabetes, such as blood sugar levels, BMI,
and family history (if available).
● Quality: Data quality checks will be implemented to identify and
address any inconsistencies or missing values within the dataset.
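
A data quality check of the kind described above could look like the following pandas sketch. The DataFrame and its values are hypothetical, and filling missing values with the column median is one common choice rather than the project's documented approach.

```python
import pandas as pd

# Hypothetical raw data with one duplicated row and one missing value.
df = pd.DataFrame({
    "plas": [148.0, 85.0, 85.0, None],
    "mass": [33.6, 26.6, 26.6, 28.1],
    "class": [1, 0, 0, 0],
})

# Identify problems.
n_duplicates = int(df.duplicated().sum())   # rows identical to an earlier row
n_missing = int(df.isna().sum().sum())      # missing cells across all columns

# Address them: drop duplicate rows, then fill gaps with the column median.
clean = df.drop_duplicates().copy()
clean["plas"] = clean["plas"].fillna(clean["plas"].median())

print(n_duplicates, n_missing)
print(clean)
```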

Limitations:

● Diagnostic Tool: It's crucial to emphasize that this model is not
intended as a definitive diagnostic tool for diabetes. A doctor's diagnosis
and clinical expertise remain essential for patient care. The model's role
is to serve as a preliminary screening mechanism to potentially flag
patients at higher risk who may require further investigation.
● Data Dependence: The model's accuracy is directly tied to the quality
and comprehensiveness of the training data. Biases or limitations within
the data could potentially affect the model's generalizability.

2.4 Definitions, Acronyms, and Abbreviations:

2.5 Architectural Representation:

Architectural Diagram:



2.6 Dynamic Model:

Sequence Diagrams:

2.7 Research Methodology:



i. Gathering Data:
In this initial step, relevant data related to diabetes prediction is
collected. This data could include patient records, lab results, lifestyle
information, and other relevant factors.
ii. Data Preparation:
Once the data is gathered, it needs to be prepared for analysis. This
involves cleaning, organizing, and transforming the raw data into a
suitable format for further processing.
iii. Data Wrangling:
Data wrangling refers to the process of transforming and mapping the
data. It includes tasks like handling missing values, feature
engineering, and ensuring data quality.
iv. Data Analysis:
In this step, the wrangled data is analyzed to discover patterns,
correlations, and trends. Exploratory data analysis (EDA) techniques are
commonly used here.
v. Train Model:
The machine learning model is trained using the analyzed data. Various
algorithms (such as decision trees, neural networks, or support vector
machines) are applied to learn from the data and make predictions.
vi. Test Model:
After training, the model's performance is evaluated using a separate
test dataset. Metrics like accuracy, precision, recall, and F1-score are
used to assess how well the model predicts diabetes outcomes.
vii. Deployment:
If the model performs well during testing, it can be deployed for
practical use. This means integrating it into a healthcare system or
application where it can assist in predicting diabetes risk for new
patients.
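
The train and test steps of this methodology can be sketched end to end with scikit-learn. This is an illustrative outline, not the project's notebook: the data is a synthetic stand-in generated by `make_classification`, and the hyperparameters (`max_iter`, `random_state`) are assumptions. The 70/30 split and the four classifier families follow the requirements in Chapter 1.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the patient data: 8 numeric features, binary label.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# 70% training / 30% testing, keeping class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y
)

# Standardize features using statistics from the training set only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# The four classifier families compared in this report.
models = {
    "Neural Network": MLPClassifier(max_iter=2000, random_state=0),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Train each model and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_s))

for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")
```

The accuracies printed for synthetic data say nothing about the diabetes dataset; the sketch only shows how the steps connect.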

2.8 Dataset Model



Dataset in Tabular Form:

1. Dataset Description:
● The dataset contains information related to diabetes patients.
● Each row represents a specific patient, and each column represents a
different attribute or feature associated with that patient.
2. Column Descriptions:
● preg: Number of times pregnant.
● plas: Plasma glucose concentration (measured in mg/dL).
● pres: Blood pressure (measured in mm Hg).
● skin: Skin thickness (measured in mm).
● test: Insulin test value (measured in μU/mL).
● mass: Body mass index (BMI).
● pedi: Diabetes pedigree function (a genetic risk score).
● age: Age of the patient.
● class: Indicates whether the patient has diabetes (1) or not (0).
3. Relationships:
● Each row corresponds to a specific patient, and the values in the
columns represent the patient's characteristics.
● For example, the first row might represent a patient who has been
pregnant 6 times (preg=6), has a plasma glucose concentration of 148
mg/dL (plas=148), blood pressure of 72 mm Hg (pres=72), etc.
● The "class" column indicates whether the patient has diabetes (1) or not
(0).
4. Summary Statistics:
● The summary statistics (count, mean, standard deviation, etc.) provide
an overview of the distribution of each attribute across all patients in the
dataset.
● These statistics help us understand the central tendency and variability
of the data.
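
The column layout and summary statistics described above can be reproduced on a small scale with pandas. In this sketch the column names come from the list above and the first row reuses the example values preg=6, plas=148, pres=72; every other value is illustrative and should not be read as actual dataset content.

```python
import pandas as pd

columns = ["preg", "plas", "pres", "skin", "test", "mass", "pedi", "age", "class"]

# Three illustrative patient rows, one value per column.
rows = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32, 1],
]
df = pd.DataFrame(rows, columns=columns)

# count, mean, std, min, quartiles, and max for every attribute.
summary = df.describe()
print(summary)

# Number of patients per class label (1 = diabetic, 0 = not diabetic).
class_counts = df["class"].value_counts()
print(class_counts)
```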

2.9 Graphical User Interfaces (GUI):



I. Loading and Summary of Dataset

II. Trends, Pattern, and Insights



Heat Map Correlation:



III. Class Balancing

IV. Data Standardization



V. Data Splitting

VI. Training And Evaluation Function



VII. Result of Models

i. Neural Network Model

ii. SVM Model



iii. Decision Tree Model

iv. Logistic Regression Model



VIII. Showing the best accuracy model



IX. Saving the best model

2.10 Results and Discussion:

Model/Classifier            Accuracy  Precision           Recall              F1 Score
                                      Positive  Negative  Positive  Negative  Positive  Negative
Neural Network Model        0.75      0.82      0.79      0.67      0.84      0.73      0.76
SVM Model                   0.81      0.82      0.79      0.80      0.81      0.81      0.80
Decision Tree Model         0.77      0.78      0.75      0.77      0.76      0.77      0.76
Logistic Regression Model   0.79      0.80      0.78      0.80      0.78      0.80      0.78
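
The F1 scores in the table are the harmonic mean of the corresponding precision and recall, so any row can be cross-checked by hand. The sketch below does this for the positive class of the SVM row (precision 0.82, recall 0.80); the helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# SVM row, positive class: precision 0.82, recall 0.80.
svm_f1_positive = f1_from(0.82, 0.80)
print(round(svm_f1_positive, 2))  # 0.81, matching the table
```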

1. Neural Network Model’s Results:

Working:
The Neural Network model was trained on the training data and tested on
the test data, and it predicts diabetes reasonably well on the dataset.
Results:
The Neural Network model achieves 75% accuracy on the dataset. Its
precision, recall, and F1 scores are listed in the table above.

2. Support Vector Machine Model (SVM)

Working:
The SVM model performs well and provides accurate results on the
dataset.
Results:
The SVM model achieves 81% accuracy on the dataset, the highest of the
four models. Its precision, recall, and F1 scores are listed in the table above.

3. Decision Tree Model



Working:
The Decision Tree model was trained successfully on the dataset and
provides reasonably accurate results.
Results:
The Decision Tree model achieves 77% accuracy on the dataset. Its
precision, recall, and F1 scores are listed in the table above.

4. Logistic Regression Model

Working:
The Logistic Regression model was trained successfully on the dataset
and provides accurate results.
Results:
The Logistic Regression model achieves 79% accuracy on the dataset. Its
precision, recall, and F1 scores are listed in the table above.
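
Choosing the best model from these results reduces to taking the maximum over the reported accuracies. The sketch below uses the accuracy column of the results table; the dictionary and variable names are hypothetical.

```python
# Accuracy values as listed in the results table.
reported_accuracy = {
    "Neural Network": 0.75,
    "SVM": 0.81,
    "Decision Tree": 0.77,
    "Logistic Regression": 0.79,
}

# Pick the model name whose accuracy is largest.
best_model = max(reported_accuracy, key=reported_accuracy.get)
print(best_model, reported_accuracy[best_model])  # SVM 0.81
```

Accuracy alone can hide differences between classes, so the per-class precision and recall in the table are worth checking before settling on a model for screening.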



CHAPTER 3
Development



Development Plan:



1. Project Overview and Scope Definition:

● Understand the purpose of the project, its goals, and the scope.
● Define the problem statement and the desired outcomes.
● Identify any constraints or limitations.

2. Requirements Gathering and Analysis:

● Collect detailed requirements from stakeholders.
● Analyse the architectural design to ensure it aligns with the
requirements.
● Identify any gaps or missing features.

3. Project Planning and Scheduling:

● Create a project schedule with milestones, deadlines, and deliverables.
● Allocate resources (team members, tools, infrastructure).
● Define the project phases (e.g., planning, development, testing,
deployment).

4. Design and Architecture Refinement:

● Review the existing architectural design.
● Refine the design based on requirements and best practices.
● Document the updated architecture.

5. Development Phase:

● Set up the development environment (IDE, version control, databases).
● Implement the features/modules based on the architectural components.
● Follow coding standards and guidelines.
● Regularly review and test the code.

6. Testing and Quality Assurance:

● Develop test cases and scenarios.
● Conduct unit testing, integration testing, and system testing.
● Address any defects or issues found during testing.
● Ensure code quality and adherence to standards.

7. Deployment and Release:

● Prepare for deployment (server setup, database migration, etc.).
● Deploy the application to the production environment.
● Monitor performance and stability.
● Release the project to end-users.

8. Documentation and Training:

● Create user manuals, technical documentation, and API documentation.
● Train end-users and support teams.
● Document any troubleshooting procedures.

9. Maintenance and Support:

● Provide ongoing maintenance and support.
● Address bug fixes, enhancements, and updates.
● Monitor system health and security.

10. Project Closure and Evaluation:

● Review project success against initial goals.
● Conduct a post-implementation review.
● Archive project artifacts and lessons learned.
