Feasibility Study
6G6Z0019: Research Methods
Name: Abdulhafiz Altamimi
Student ID: 21432057
Degree Programme: AI and Data Science
Project Title: Project DR. AI-Powered Detection of Diabetic Retinopathy
Project Outline
Diabetes affects 537 million individuals globally, and its symptoms and complications range from mild
ones such as frequent urination to more debilitating ones such as amputation. One of the most common
and severe complications is diabetic retinopathy (DR) (Sun et al., 2022).
This project aims to provide an AI-powered application that helps healthcare
professionals detect, classify and predict DR across its various stages. The goal is
to assist professionals who may lack either the know-how or the apparatus to perform
these life-changing early diagnoses.
The project focuses on training and testing a machine learning model to aid in the detection,
classification and prediction of DR throughout its stages.
Project DR. Mission: Our Why
Rise of Diabetes and Diabetic Retinopathy
• DR is a leading cause of vision impairment and blindness and poses a significant global health concern.
A comprehensive study published in Diabetes Care estimated that in 2020, 22.27% of individuals with
diabetes globally had DR, translating to approximately 103.12 million people (Kropp et al., 2023). The
progression of DR is influenced by many factors such as the duration of diabetes, blood sugar control, and
the presence of hypertension (Klein and Klein, 2002). Early detection and management are crucial in
preventing vision loss. Regular eye examinations and maintaining optimal blood glucose levels are
recommended to mitigate the risk of developing DR.
Subjectivity in Current Diagnostic Methods
• Fundus photography and optical coherence tomography (OCT) are the current techniques used to perform eye
exams. Both rely heavily on the expertise of ophthalmologists to interpret results. This introduces subjectivity:
results can vary considerably depending on the practitioner’s competency.
Project Fit - AI & Data Science
• This project aligns with an overall human-centered approach to computing in healthcare. It
focuses on an accessible web application implementing a machine learning model that assists
healthcare professionals in screening for diabetic retinopathy.
• This project deals directly with patient eye images and potential diagnoses, so it encompasses
medical image analysis, object recognition, and careful consideration of real-world clinical
workflows. The aim is to integrate AI-driven detection techniques into everyday practice, improving
early identification of diabetic retinopathy while making the technology intuitive for healthcare
providers.
• This project uses Convolutional Neural Networks (CNNs) because they excel at learning directly from the
images provided and make it easy to establish a baseline accuracy. Other models such
as Logistic Regression (LR) and K-Nearest Neighbours (KNN) will be compared against the CNN’s
baseline accuracy to see which is most suitable for use in the
web application.
• Data augmentation will be used throughout this project to help deal with the different formats and
types of images provided. This is to make the model as robust as possible by training it to handle
real world variations in light, camera focus, and blur.
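The augmentation idea above can be sketched in a few lines. This is a minimal illustration only, assuming a greyscale image held as a nested list of 0-255 pixel values; the function names and the random policy in `augment` are hypothetical, and a real pipeline would more likely rely on an image library.

```python
import random

Image = list[list[int]]  # rows of 0-255 greyscale pixel values (assumed format)

def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_brightness(img: Image, factor: float) -> Image:
    """Scale every pixel by `factor`, clamping to the valid 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]

def augment(img: Image, rng: random.Random) -> Image:
    """Apply a random combination of the transforms above (hypothetical policy)."""
    if rng.random() < 0.5:
        img = flip_horizontal(img)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        img = rotate_90_clockwise(img)
    return adjust_brightness(img, rng.uniform(0.8, 1.2))

# Tiny 2x2 example image, purely for demonstration.
sample = [[0, 100], [200, 250]]
flipped = flip_horizontal(sample)
rotated = rotate_90_clockwise(sample)
brighter = adjust_brightness(sample, 1.2)
```

Passing a seeded `random.Random` to `augment` keeps the augmented training set reproducible between runs.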
Machine Learning is one of the core fundamentals of Artificial
Intelligence. This project not only utilises Machine Learning
as a theoretical concept but applies it to a real-world
scenario to create real impact.
This project combines the fundamentals of both AI and
data science because it focuses on extracting insights and
predictive patterns from data.
This project in essence uses a trained model to identify
signs of diabetic retinopathy in retinal images. Rather than
relying on static thresholds, the model is trained and tested
on the data given, and it bases each prediction
on what it has learned.
The data science modules have helped in understanding
how to gather, pre-process, tune and evaluate data (Alzubi et al., 2018). This
has helped in giving the model the best-quality data
to learn from.
Aims & Objectives (Web App)
Overall Aim for the Web App
• Build an easy-to-use diabetic retinopathy detection application where healthcare practitioners can upload
images of a patient's retina and receive a straightforward analysis of potential diabetic
retinopathy indications with recommendations (Figure 1).
Objective #1: Simple Result Interpretation
• Offer a simple classification of the retina based on a five-category breakdown: No DR Signs, Mild
or Early NPDR (Nonproliferative Diabetic Retinopathy), Moderate NPDR, Severe NPDR, and PDR
(Proliferative Diabetic Retinopathy) (Figure 6). A short explanation of why the model arrived at
this result can further aid non-specialists in understanding the risk level (Figure 2).
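As a rough sketch of how the five-category result could be presented, the snippet below maps a list of class probabilities to a stage name and a one-line summary. It assumes the model outputs one probability per stage in the order shown; the `interpret` function and the summary wording are illustrative, not the project's actual implementation.

```python
# Stage names in the order assumed for the model's five outputs.
DR_STAGES = ["No DR Signs", "Mild NPDR", "Moderate NPDR", "Severe NPDR", "PDR"]

def interpret(probabilities: list[float]) -> dict:
    """Turn five class probabilities into a readable result."""
    if len(probabilities) != len(DR_STAGES):
        raise ValueError("expected one probability per DR stage")
    # Index of the most probable stage.
    best = max(range(len(DR_STAGES)), key=lambda i: probabilities[i])
    return {
        "stage": DR_STAGES[best],
        "confidence": probabilities[best],
        "summary": f"{DR_STAGES[best]} (model confidence {probabilities[best]:.0%})",
    }

# Made-up probabilities for demonstration only.
result = interpret([0.05, 0.10, 0.70, 0.10, 0.05])
```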
Objective #2: Annotation
• Provide visual annotations or highlighted sections on the retina picture to help identify potential
haemorrhages and aneurysms. This helps in identifying suspicious areas without extensive
clinical training.
Objective #3: Quick Tagging and Sorting
• Provide customizable tags (e.g., “Urgent Review,” “Routine Check,” “Needs Further Analysis”)
that can be appended to each image.
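One possible shape for tagging and sorting is sketched below. The `RetinaImage` class and the priority order given to the example tags are assumptions made for illustration; the real application may store tags differently.

```python
from dataclasses import dataclass, field

# Lower number = higher priority; this ordering is an assumption for illustration.
TAG_PRIORITY = {"Urgent Review": 0, "Needs Further Analysis": 1, "Routine Check": 2}

@dataclass
class RetinaImage:
    filename: str
    tags: set[str] = field(default_factory=set)

    def priority(self) -> int:
        """Priority of the most urgent tag; untagged or unknown tags sort last."""
        return min(
            (TAG_PRIORITY.get(t, len(TAG_PRIORITY)) for t in self.tags),
            default=len(TAG_PRIORITY),
        )

def sort_by_priority(images: list[RetinaImage]) -> list[RetinaImage]:
    """Most urgent first, with filename as a stable tie-breaker."""
    return sorted(images, key=lambda img: (img.priority(), img.filename))

queue = sort_by_priority([
    RetinaImage("a.png", {"Routine Check"}),
    RetinaImage("b.png", {"Urgent Review"}),
    RetinaImage("c.png"),
    RetinaImage("d.png", {"Needs Further Analysis", "Routine Check"}),
])
```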
Objective #4: Image Upload and Handling
• The platform must accept single or multiple retina images, ensuring each is processed
consistently.
Aims & Objectives (ML Model)
Overall Aim for the ML Model
• Develop a competent machine-learning-based classifier capable of categorising five DR stages
(No DR, Mild NPDR, Moderate NPDR, Severe NPDR, and PDR) with clinically
acceptable accuracy.
Objective #1: Data Collection
• Gather a sizeable number of retina images evenly spread throughout all categories. Currently, the
database holds 1005 images; the goal is to reach 10,000 images to better train and test the
ML model. Data are collected and compiled from publicly available repositories (Diabetic Retinopathy
Detection | Kaggle, n.d.).
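Whether the images are evenly spread across categories can be checked with a quick count. The sketch below assumes labels are stored as integers 0-4 for the five DR stages; the demo labels are made up and do not describe the project's dataset.

```python
from collections import Counter

def class_balance(labels: list[int], num_classes: int = 5) -> dict[int, float]:
    """Return each class's share of the dataset (classes 0-4 = the five DR stages)."""
    if not labels:
        raise ValueError("no labels given")
    counts = Counter(labels)
    return {c: counts.get(c, 0) / len(labels) for c in range(num_classes)}

def imbalance_ratio(labels: list[int], num_classes: int = 5) -> float:
    """Largest class count divided by smallest; 1.0 means perfectly even."""
    counts = Counter(labels)
    sizes = [counts.get(c, 0) for c in range(num_classes)]
    if min(sizes) == 0:
        return float("inf")  # at least one class has no images at all
    return max(sizes) / min(sizes)

# Made-up labels purely for demonstration, not the project's dataset.
demo_labels = [0] * 6 + [1] * 2 + [2] * 4 + [3] * 2 + [4] * 2
shares = class_balance(demo_labels)
ratio = imbalance_ratio(demo_labels)
```

A ratio well above 1.0 would suggest collecting more images for the under-represented stages or augmenting them more heavily.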
Objective #2: Data Preparation & Pre-Processing
• Cleaning: Eliminate any low-quality or corrupted images.
• Augmentation: Apply rotations, flips, and brightness changes to increase training variety.
• Standardisation: Automated reframing and resizing of pictures to 100x100 pixels.
Objective #3: Selecting appropriate algorithm
• A CNN is the chosen neural network for establishing a baseline accuracy; other ML
models (KNN, Logistic Regression, and Random Forest) are incorporated for accuracy comparisons.
• Training & Testing on Google Colab
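To show what one of the simpler comparison models involves, here is a minimal K-Nearest Neighbours sketch in plain Python. It assumes each image has already been reduced to a numeric feature vector (for example flattened pixels); the two-feature points below are toy data, and the project itself may use a library implementation instead.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

def knn_accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return correct / len(test_y)

# Toy two-feature data standing in for image feature vectors.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(0.5, 0.5), (5.5, 5.5)]
test_y = [0, 1]
baseline = knn_accuracy(train_x, train_y, test_x, test_y)
```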
Objective #4: Hyperparameter Tuning
• Experimentation stage: establish a baseline accuracy, then tune hyperparameters to enhance performance.
• Initial accuracy: 50.83% (Figure 3).
Objective #5: Accuracy of Detection
• High detection rate for potential DR cases, with a target of no false-positive DR cases.
Objective #6: ML Accuracy
• Aim for 90+% model accuracy
• Current accuracy after training: 85.65% (Figure 4).
Project DR. MVP Walkthrough
Research Context
• Because the project is human-centered and focuses specifically on the eye and its pathology in
diabetes, I mainly used two resources to understand the science behind diabetes and, more
importantly, diabetic retinopathy (Banday et al., 2020; Wong et al., 2016).
• I placed heavy emphasis on peer-reviewed research and academic databases such as IEEE
Xplore, PubMed and Google Scholar. I used Boolean operators in my search queries to cut through the noise and
filter out unrelated articles (Figure 5).
• An effective strategy when starting was the method of “Back-Tracking”. I began with one well-researched
review or journal article and examined the references it cited. Once I had compiled a list of all of the citations, I
began to cross-examine and research those papers further. As medicine and AI are both continuously
evolving fields, I made sure to review the newer papers to keep the information as updated and relevant as
possible.
• For my data, Kaggle Datasets hosted a large collection of DR datasets and annotations, which helped in
laying the foundations for my ML work.
• It was also important to understand the purpose of my project while comparing and contrasting my work with others
in the field.
• Communicating with doctors and experts also helped in understanding and grasping the medical aspects of
the project.
Research Context – Search Terms
Boolean Operators: The strategy performed to ascertain the articles that are relevant to the project. The Boolean operators used are shown under the relevant step.
• Total number of articles relevant to AI and diabetic retinopathy: n=695
("artificial intelligence"[Title/Abstract] OR "AI"[Title/Abstract]) AND ("diabetic retinopathy"[Title/Abstract] OR "diabetic eye disease"[Title/Abstract])
• Total number of articles excluded (any studies performed on non-humans or on participants under 18 years old): n=594
• Total number of articles included: n=101
AND "adult"[MeSH Terms] AND "humans"[MeSH Terms]
• Total number of articles excluded (any studies older than 5 years): n=27
• Total number of articles included: n=74
AND ("2019/01/01"[Date - Publication] : "3000"[Date - Publication])
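The search strategy above can be reproduced as a short script that assembles the Boolean query from its parts and checks that the reported article counts add up. The helper `any_of` and the variable names are illustrative; the query text and the counts are the ones given above.

```python
def any_of(terms: list[str], field: str) -> str:
    """OR together quoted terms restricted to one PubMed search field."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"

# Step 1: articles relevant to AI and diabetic retinopathy.
base_query = " AND ".join([
    any_of(["artificial intelligence", "AI"], "Title/Abstract"),
    any_of(["diabetic retinopathy", "diabetic eye disease"], "Title/Abstract"),
])
# Step 2: restrict to adult human studies.
population_filter = ' AND "adult"[MeSH Terms] AND "humans"[MeSH Terms]'
# Step 3: restrict to publications from 2019 onwards.
date_filter = ' AND ("2019/01/01"[Date - Publication] : "3000"[Date - Publication])'
full_query = base_query + population_filter + date_filter

# Article counts reported in the search flow.
identified = 695
excluded_population = 594
excluded_older_than_5_years = 27
after_population_filter = identified - excluded_population                 # 101
final_included = after_population_filter - excluded_older_than_5_years    # 74
```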
Research Methodology
• I decided to conduct a feasibility study to understand if my proposed concept was workable in the timeframe
I have and what the appropriate resources needed were.
• Pilot testing covers both qualitative and quantitative findings. As of now, we have more quantitative findings
(ML accuracy, error rates, specificity etc.). The aim is to focus on the qualitative aspect (user impressions,
usability data, expert feedback) once we’ve increased our dataset.
• Comparative Study: An effective research methodology was to bring together opposing peer-reviewed
work and highlight the points of disagreement to further understand the rationale. This also
helped me draw my own conclusions and bring my own thoughts from an AI background.
• Previously, I mentioned the significance of understanding the “Known” and “Unknown” variables in this
project, which is also known as the Literature Survey. It helped me establish the foundations of the project,
sift through the vast amount of information, situate new research in its correct context and
select suitable techniques and models.
• I aimed to be factually coherent in all aspects and to deploy a context-aware strategy, whether it concerned data
availability or medical reporting.
• I began to realise that these methodologies were chosen based on the questions asked, the resources
available, and the kinds of evidence or data most relevant for validating a solution.
Implementation Methodology
Data Collection
• Systematic search in publicly available repositories (Kaggle, etc.).
• Data cleaning: remove poor-quality images, label images by DR severity.
• Data augmentation: rotations, flips, and brightness adjustments to improve model robustness.
Model Development
• Baseline CNN architecture: compile train/test pipeline on Google Colab.
• Hyperparameter tuning: optimize learning rate, epoch count, batch size.
• Comparative approach: optionally test simpler ML algorithms (KNN, Random Forest) for
performance benchmarking.
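The hyperparameter tuning step could be organised as a simple grid search, sketched below. This is an assumption about how the tuning might be structured, not the project's actual procedure: `toy_score` is a stand-in for training the CNN and returning validation accuracy, and the candidate values are hypothetical.

```python
from itertools import product
from typing import Callable

def grid_search(score: Callable[[float, int, int], float],
                learning_rates: list[float],
                epoch_counts: list[int],
                batch_sizes: list[int]) -> tuple[dict, float]:
    """Try every combination and return the best settings and their score."""
    best_params, best_score = {}, float("-inf")
    for lr, epochs, batch in product(learning_rates, epoch_counts, batch_sizes):
        s = score(lr, epochs, batch)
        if s > best_score:
            best_params = {"learning_rate": lr, "epochs": epochs, "batch_size": batch}
            best_score = s
    return best_params, best_score

# Stand-in scoring function: a real run would train the model and return
# validation accuracy. This toy version simply peaks at lr=0.001, batch=32.
def toy_score(lr: float, epochs: int, batch: int) -> float:
    return -abs(lr - 0.001) * 100 - abs(batch - 32) / 100 + epochs / 1000

params, score_value = grid_search(
    toy_score, [0.01, 0.001, 0.0001], [10, 20], [16, 32, 64]
)
```

Because every combination triggers a full training run in practice, the grid is best kept small on limited hardware.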
Testing & Iteration
• Quantitative analysis: measure overall accuracy, confusion matrix, sensitivity/specificity for each
DR stage.
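The quantitative measures named above can be computed from a confusion matrix as sketched here. Sensitivity and specificity are calculated one-vs-rest for each of the five stages; the example labels are invented for demonstration and are not project results.

```python
def confusion_matrix(y_true: list[int], y_pred: list[int], n: int = 5) -> list[list[int]]:
    """matrix[t][p] counts images whose true class is t and predicted class is p."""
    matrix = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def overall_accuracy(matrix: list[list[int]]) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(map(sum, matrix))
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def per_class_metrics(matrix: list[list[int]]) -> list[dict]:
    """One-vs-rest sensitivity and specificity for each class."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    results = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # class c missed
        fp = sum(matrix[r][c] for r in range(n)) - tp    # other classes called c
        tn = total - tp - fn - fp
        results.append({
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "specificity": tn / (tn + fp) if tn + fp else None,
        })
    return results

# Invented labels for demonstration only (0 = No DR ... 4 = PDR).
y_true = [0, 0, 1, 1, 2, 2, 3, 4]
y_pred = [0, 1, 1, 1, 2, 0, 3, 4]
cm = confusion_matrix(y_true, y_pred)
acc = overall_accuracy(cm)
metrics = per_class_metrics(cm)
```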
• Qualitative feedback: Test run with healthcare professionals to examine if the classification
scheme is clear and clinically coherent.
Deployment Approach
• Integrate final ML model into a web-based interface.
• Maintain logs of predictions for additional auditing or analysis.
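Prediction logs for auditing could be kept as one JSON record per line, as in the sketch below. The field names and the `log_prediction` helper are assumptions for illustration; in the deployed app the stream would be a file or database rather than an in-memory buffer.

```python
import io
import json
from datetime import datetime, timezone

def log_prediction(stream, image_id: str, stage: str, confidence: float) -> dict:
    """Append one prediction as a JSON line so it can be audited later."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "image_id": image_id,  # assumed to be an anonymised identifier
        "predicted_stage": stage,
        "confidence": round(confidence, 4),
    }
    stream.write(json.dumps(record) + "\n")
    return record

# StringIO keeps the demo self-contained; a real log would be a file opened in append mode.
log = io.StringIO()
log_prediction(log, "img-0001", "Moderate NPDR", 0.7)
log_prediction(log, "img-0002", "No DR", 0.93)
entries = [json.loads(line) for line in log.getvalue().splitlines()]
```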
Project Plan - Proposed Timeline (12 WEEKS)
Weeks 1-2: Initial Planning & Data Preparation
• Establish project goals and confirm dataset sources.
• Clean and organise retina images, removing duplicates or low-quality scans.
• Outline the baseline directory structure for your training and testing sets.
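When outlining the training and testing sets, a stratified split keeps each DR stage represented in both. The sketch below is one way to do this, assuming images are tracked as (filename, label) pairs; the filenames, the 20% test fraction and the seed are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(items: list[tuple[str, int]],
                     test_fraction: float = 0.2,
                     seed: int = 42):
    """Split (filename, label) pairs so each class keeps roughly the same train/test ratio."""
    by_class = defaultdict(list)
    for filename, label in items:
        by_class[label].append(filename)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label, files in sorted(by_class.items()):
        files = sorted(files)
        rng.shuffle(files)
        # Keep at least one test image per class.
        n_test = max(1, round(len(files) * test_fraction))
        test += [(f, label) for f in files[:n_test]]
        train += [(f, label) for f in files[n_test:]]
    return train, test

# Hypothetical filenames: 10 images for each of the five DR stages.
dataset = [(f"stage{label}_{i:02d}.png", label) for label in range(5) for i in range(10)]
train_set, test_set = stratified_split(dataset)
```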
Weeks 3-4: Model Setup & Baseline Training
• Build a baseline CNN architecture.
• Train on a small, representative subset of the data.
• Document initial results (accuracy, loss curves) to refine hyperparameters (Figure
3).
Weeks 5-6: Model Refinement & Feature Enhancements
• Conduct hyperparameter tuning (learning rate, batch size).
• Implement data augmentation (rotations, brightness changes).
• Improve the classifier’s performance using advanced techniques like transfer
learning if feasible.
Weeks 7-8: Web Application Development
• Develop the web interface for image upload and processing.
• Integrate the trained model into the backend so that uploaded images receive
near-real-time classification.
• Create a basic annotation tool to highlight any suspected DR areas.
Weeks 9-10: Testing & Iterations
• Conduct user-acceptance testing on a small group of peers or designated testers.
• Gather feedback on utility, interface clarity, and speed.
• Fix any identified bugs (e.g., slow response, misaligned annotations).
Week 11: Final Review & Fine-Tuning
• Finalize your classification thresholds or labelling logic if certain DR stages look
misclassified.
• Validate system performance on a held-out set of images.
• Compile documentation and walk-throughs (e.g., how GPs should interpret results).
Week 12: Presentation & Handover
• Prepare final presentation materials or reports summarising the project.
• Demonstrate the system’s workflow (upload, classification, annotation) in a simple
manner, covering both the UI and the ML representation (Project DR. MVP Walkthrough).
• Mention the successes, challenges, and recommended next steps for future
improvements.
• Mention all sources and citations
Week 13: Synoptic Project Preparation
• Review progress and begin planning for Synoptic Project
• Create a new proposed timeline
• Define new Objectives and Aims
• Review preliminary research
Project Impact Considerations
Ethical Considerations
• The UK GDPR and the Data Protection Act 2018 outline key principles for data
protection, including lawfulness, fairness, transparency, purpose limitation, data
minimisation, accuracy, storage limitation, integrity, confidentiality (security), and
accountability. Violation of these principles can result in data privacy problems.
• If research participants are not adequately informed about how their data will be
used and stored, or if they do not give their explicit consent, this can lead to data
privacy issues.
• If data is not properly secured, it can be vulnerable to breaches, which can lead
to significant data privacy problems.
Legal & Societal Considerations
• Researchers may face liability issues if a study causes harm to participants. This can lead to
malpractice claims, particularly if the research is deemed negligent or if proper safety protocols
were not followed.
• Medical research must adhere to various ethical guidelines and regulations, which can vary by
country. Non-compliance with these regulations can result in legal action and damage to the
reputation of the researchers and their institutions.
• Some communities, particularly those in poorer neighbourhoods or countries, might have limited
access to the necessary medical equipment or reliable internet to use Project DR’s AI-powered
detection procedures. This presents a societal issue and makes it difficult to bridge the gap
between our application and poorer demographics. To mitigate this gap, we can consider
developing an offline application for families with no or poor internet connection. Furthermore,
creating portable retinopathy kits and holding local screening days could serve as an
alternative for less privileged families.
Potential Risks and Solutions
Data Confidentiality & Anonymity
• Due to the sensitivity of the data we are handling, it is essential that we
apply data privacy and protection measures to all patient information. We will integrate
end-to-end encryption to prevent any interference with medical images or
information.
• All patient examinations will remain anonymous to further protect patient
confidentiality.
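One building block for keeping examinations anonymous is to replace patient identifiers with keyed hashes before anything is stored. The sketch below illustrates that idea only: strictly speaking it is pseudonymisation rather than full anonymisation, it is not end-to-end encryption, and the key shown is a placeholder that would need to be generated and stored securely.

```python
import hashlib
import hmac

def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient identifier with a keyed hash so records can be linked without exposing the ID."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened token for readability

# Placeholder key for demonstration; never hard-code a real key.
key = b"replace-with-a-secret-key"
token_a = pseudonymise("patient-123", key)
token_b = pseudonymise("patient-123", key)
token_c = pseudonymise("patient-456", key)
```

The same patient always maps to the same token, so repeat examinations can still be grouped, while the original identifier cannot be read from the stored value.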
Project Evaluation Plan
• Objective #2 Annotation: Evaluate whether the highlighted suspicious areas on the retina
correlate with known haemorrhages or aneurysms, as double-checked by a qualified
ophthalmologist. Existing literature suggests side-by-side comparisons with experts’ manual
annotations for validation.
• Objective #3 Quick Tagging and Sorting: Monitor how efficiently healthcare staff can apply and
sort images using custom tags (e.g., “Urgent Review”), measured by average tagging time and
correctness of priority categorization. This type of workflow study aligns with standard software
usability assessments, as recommended by health IT guidelines.
• Objective #4 Classification Consistency with Five Categories: Since mild, moderate, severe
NPDR, and PDR are clinically recognized categories, we can cross-check the system’s
classification with an ophthalmologist’s official diagnosis to confirm correctness. Research on DR
screening notes the importance of domain expert cross-checking to ensure real-world reliability.
(Figure 6)
• Objective #5 Image Upload Handling: Track the number of upload errors, average upload time
per image, and user satisfaction with the upload process. References on DR software emphasize
the need for robust, user-friendly data handling to minimize screening time and reduce
frustration.
In Conclusion
• Project DR. presents a feasible and significant solution that uses AI and Data Science to detect
diabetic retinopathy, blending medical practice with machine-learning methods.
• The approach involves a detailed CNN model, a simple-to-use web app for healthcare
practitioners and a customised suggestions section.
Next steps:
• Increase the dataset's size and diversity (aim for approximately 10,000 retinal pictures) to
enhance classification accuracy.
• Perform more pilot testing with medical professionals, incorporating methods of feedback for
interface and model enhancement.
• Explore additional ML algorithms such as Logistic Regression and KNN to achieve a higher
accuracy rate.
• Run a public trial to validate real-world performance.
• Backend ML integration with Supabase https://supabase.com/
Table & Figures
Figure 1 (P, 2018)
Figure 2 (Web App MVP)
Figure 3 (Google Colab)
Figure 4 (Google Colab)
Figure 6 (Koetting, n.d.)
Resources
• Google Colab Hardware: RAM = 12.7 GB, disk = 112.6 GB, GPU RAM = 15 GB, runtime = T4 GPU
• Google Colab Machine Learning Training & Testing File:
https://colab.research.google.com/drive/1Y99CN4v3ODaRSpZjFaxbxv-plB-D_78z?usp=sharing
• Diabetic Retinopathy Detection | Kaggle (n.d.). [Online] [Accessed on 1st January 2025]
https://www.kaggle.com/competitions/diabetic-retinopathy-detection/data?select=test.zip.001.
• Diabetic Retinopathy Detection (n.d.). [Online] [Accessed on 1st January 2025]
https://kaggle.com/diabetic-retinopathy-detection.
• React - Next.js (App router)
• Frontend: Shadcn Ui https://ui.shadcn.com/
References
• Tan, T.-E. and Wong, T. Y. (2022) ‘Diabetic retinopathy: Looking forward to 2030.’ Frontiers in Endocrinology, 13 p. 1077669.
• Leasher, J. L., Bourne, R. R. A., Flaxman, S. R., Jonas, J. B., Keeffe, J., Naidoo, K., Pesudovs, K., Price, H., White, R. A., Wong, T. Y., Resnikoff, S., Taylor, H. R., and Vision Loss Expert Group of the Global Burden of Disease Study
(2016) ‘Global Estimates on the Number of People Blind or Visually Impaired by Diabetic Retinopathy: A Meta-analysis From 1990 to 2010.’ Diabetes Care, 39(9) pp. 1643–1649.
• Kropp, M., Golubnitschaja, O., Mazurakova, A., Koklesova, L., Sargheini, N., Vo, T.-T. K. S., de Clerck, E., Polivka, J., Potuznik, P., Polivka, J., Stetkarova, I., Kubatka, P. and Thumann, G. (2023) ‘Diabetic retinopathy as the leading
cause of blindness and early predictor of cascading complications—risks and mitigation.’ The EPMA Journal, 14(1) pp. 21–42.
• Alzubi, J., Nayyar, A. and Kumar, A. (2018) ‘Machine Learning from Theory to Algorithms: An Overview.’ Journal of Physics: Conference Series. IOP Publishing, 1142(1) p. 012012.
• Diabetic Retinopathy Detection | Kaggle (n.d.). [Online] [Accessed on 5th January 2025] https://www.kaggle.com/competitions/diabetic-retinopathy-detection/data?select=test.zip.001.
• P, D. R. R. R. (2018) ‘Understanding Diabetic Retinopathy and how to reverse it.’ Neoretina Blog. 11th December. [Online] [Accessed on 5th January 2025] https://neoretina.com/blog/diabetic-retinopathy-can-it-be-reversed/.
• Dong, L., He, W., Zhang, R., Ge, Z., Wang, Y. X., Zhou, J., Xu, J., Shao, L., Wang, Q., Yan, Y., Xie, Y., Fang, L., Wang, Haiwei, Wang, Yenan, Zhu, X., Wang, J., Zhang, C., Wang, Heng, Wang, Yining, Chen, R., Wan, Q., Yang, J., Zhou,
W., Li, H., Yao, X., Yang, Z., Xiong, J., Wang, X., Huang, Y., Chen, Y., Wang, Z., Rong, C., Gao, J., Zhang, H., Wu, S., Jonas, J. B. and Wei, W. B. (2022) ‘Artificial Intelligence for Screening of Multiple Retinal and Optic Nerve Diseases.’
JAMA Network Open, 5(5) p. e229960.
• Ipp, E., Liljenquist, D., Bode, B., Shah, V. N., Silverstein, S., Regillo, C. D., Lim, J. I., Sadda, S., Domalpally, A., Gray, G., Bhaskaranand, M., Ramachandra, C., Solanki, K., and EyeArt Study Group (2021) ‘Pivotal Evaluation of an
Artificial Intelligence System for Autonomous Detection of Referrable and Vision-Threatening Diabetic Retinopathy.’ JAMA Network Open, 4(11) p. e2134254.
• Sun, H., Saeedi, P., Karuranga, S., Pinkepank, M., Ogurtsova, K., Duncan, B. B., Stein, C., Basit, A., Chan, J. C. N., Mbanya, J. C., Pavkov, M. E., Ramachandaran, A., Wild, S. H., James, S., Herman, W. H., Zhang, P., Bommer, C., Kuo,
S., Boyko, E. J. and Magliano, D. J. (2022) ‘IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045.’ Diabetes Research and Clinical Practice, 183, January, p. 109119.
• Klein, R. and Klein, B. E. K. (2002) ‘Blood pressure control and diabetic retinopathy.’ The British Journal of Ophthalmology, 86(4) pp. 365–367.
• Banday, M. Z., Sameer, A. S. and Nissar, S. (2020) ‘Pathophysiology of diabetes: An overview.’ Avicenna Journal of Medicine, 10(4) pp. 174–188.
• Wong, T. Y., Cheung, C. M. G., Larsen, M., Sharma, S. and Simó, R. (2016) ‘Diabetic retinopathy.’ Nature Reviews. Disease Primers, 2, March, p. 16012.
• Koetting, C. (n.d.) ‘What you need to know to optimize patient care.’