DEEP LEARNING PROJECT (UEC642)
Skin Cancer Classification Using Deep Learning
submitted in partial fulfillment of the requirements
for the degree of
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMPUTER ENGINEERING
by
Harshit Sharma
(102115249) (Group No. 4O24)
To
Dr. Sandeep Mandia (Assistant Professor, ECED)
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
THAPAR INSTITUTE OF ENGINEERING & TECHNOLOGY
PATIALA
December 2024
1. Introduction
Skin cancer represents one of the most prevalent forms of cancer worldwide, with incidence
rates rising alarmingly in recent decades. The World Health Organization reports that one in every
three cancers diagnosed is skin cancer, with approximately 2-3 million non-melanoma skin
cancers and 132,000 melanoma skin cancers occurring globally each year. Early detection
and accurate diagnosis are paramount for successful treatment outcomes, particularly for
melanoma, which accounts for approximately 75% of skin cancer deaths despite representing
only a small percentage of cases.
This project investigates the application of deep learning techniques for
automated skin cancer classification using the HAM10000 (Human Against Machine with
10000 training images) dataset. The study implements an advanced Convolutional Neural
Network (CNN) architecture to classify seven distinct types of skin lesions: Melanocytic
nevi, Melanoma, Benign keratosis-like lesions, Basal cell carcinoma, Actinic keratoses,
Vascular lesions, and Dermatofibroma. The proposed model shows potential for
assisting dermatologists with early detection and diagnosis, achieving
promising accuracy on the more common lesion classes. This research contributes to
the growing field of computer-aided diagnosis in dermatology and presents a scalable
solution for clinical implementation.
2. Problem Formulation
Skin cancer is one of the most prevalent and deadly forms of cancer, with early detection
being crucial for successful treatment and patient outcomes [1]. Traditional diagnosis of skin
lesions relies heavily on visual examination and assessment by experienced dermatologists.
However, this approach can be subjective, time-consuming, and prone to inconsistencies,
especially when dealing with rare or atypical lesion types [2].
The increased availability of digital imaging technologies, such as dermoscopy, has enabled
the collection of large datasets of high-quality skin lesion images. This, in turn, has opened
up opportunities to leverage advanced machine learning and deep learning techniques for
automated skin cancer classification [3]. By developing accurate and reliable computational
models, we aim to assist dermatologists in the early detection and diagnosis of various skin
lesion types, ultimately improving patient care and outcomes.
The specific problem addressed in this project is the classification of seven distinct skin
lesion categories: Actinic Keratoses (akiec), Basal Cell Carcinoma (bcc), Benign Keratosis
(bkl), Dermatofibroma (df), Melanoma (mel), Melanocytic Nevi (nv), and Vascular Lesions
(vasc) [4]. The goal is to create a deep learning-based model that can accurately differentiate
between these skin lesion types, thereby enabling more efficient and consistent skin cancer
screening and diagnosis.
3. Novelty Statement
The novelty of this study lies in the comprehensive and innovative approach to skin cancer
classification using deep learning techniques. While previous research has explored the
application of deep learning for skin lesion analysis [5], this project goes beyond the state-of-
the-art by incorporating several key advancements:
1. Comprehensive Skin Lesion Classification: Unlike many existing studies that focus on
binary classification (e.g., melanoma vs. benign lesions), this project aims to develop a multi-
class classification model capable of distinguishing between seven distinct skin lesion
categories [4]. This expanded scope provides a more holistic and clinically relevant solution
for skin cancer diagnosis.
2. Robust Data Augmentation Strategies: To address the common challenge of limited
training data, especially for rarer skin lesion types, this project explores advanced data
augmentation techniques, such as geometric transformations, color jittering, and synthetic
data generation [6]. By enhancing the diversity and variability of the training data, we aim to
improve the model's generalization capabilities and robustness.
3. Addressing Class Imbalance: Skin cancer datasets often exhibit severe class imbalance,
with certain lesion types being significantly underrepresented. This project investigates
specialized techniques to mitigate the adverse effects of class imbalance, including
oversampling, class weighting, and the use of more sophisticated loss functions [7]. This
approach aims to enhance the model's ability to accurately detect rare and critical skin lesion
types, such as melanoma.
4. Interpretability and Explainability: In addition to model performance, this project places a
strong emphasis on the interpretability and explainability of the deep learning-based
classification system. By incorporating techniques such as attention mechanisms and feature
visualization, we aim to provide dermatologists with insights into the model's decision-
making process, fostering trust and facilitating the integration of the system into clinical
workflows [8].
By addressing these novel aspects, this project aims to push the boundaries of deep learning-
based skin cancer classification and provide a more comprehensive, robust, and clinically
relevant solution for improving early detection and diagnosis of various skin lesion types.
4. Objectives
The key objectives of this project are:
1. Develop a Deep Learning-based Skin Lesion Classification Model:
1. Design and implement a robust convolutional neural network (CNN) architecture
capable of accurately classifying the seven distinct skin lesion types: Actinic
Keratoses (akiec), Basal Cell Carcinoma (bcc), Benign Keratosis (bkl),
Dermatofibroma (df), Melanoma (mel), Melanocytic Nevi (nv), and Vascular Lesions
(vasc) [4].
2. Optimize the model's performance by exploring various architectural components,
such as the number and configuration of convolutional blocks, the use of pooling
layers, and the design of the fully connected layers.
2. Investigate the Impact of Data Augmentation Techniques:
1. Evaluate the effectiveness of different data augmentation strategies, including
geometric transformations (e.g., rotation, flipping, scaling), color jittering, and the use
of synthetic data generation techniques, such as Generative Adversarial Networks
(GANs) [6].
2. Analyze the model's performance improvements when trained on the augmented
dataset compared to the original dataset, with a focus on the model's ability to
generalize and handle diverse skin lesion variations.
3. Analyze the Model's Capability in Handling Class Imbalance:
1. Assess the impact of the severe class imbalance present in the skin lesion dataset,
where certain lesion types (e.g., nv) have significantly more samples compared to
others (e.g., df) [4].
2. Explore and implement class balancing techniques, such as oversampling,
undersampling, and class-weighted loss functions, to mitigate the adverse effects of
imbalanced data on the model's performance [7].
3. Evaluate the model's ability to accurately detect critical skin lesion types, such as
melanoma, which are often underrepresented in the dataset.
4. Provide Recommendations for Improving Clinical Applicability:
1. Analyze the model's overall performance, including accuracy, F1-score, and per-class
metrics, to identify areas for improvement.
2. Investigate the model's interpretability and explainability, aiming to provide
dermatologists with insights into the decision-making process and build trust in the
model's predictions [8].
3. Suggest future research directions and technological advancements that could enhance
the model's clinical applicability, such as the incorporation of attention mechanisms,
transfer learning, or the use of multi-modal data (e.g., combining dermoscopic and
clinical images).
5. Methodology
The methodology for this skin cancer classification project consists of the following key steps:
1. Data Preparation:
1. Organize the HAM10000 dataset into training, validation, and test sets, ensuring a
balanced distribution of the seven skin lesion classes across the splits [4].
2. Implement data augmentation techniques, such as random rotation, width/height
shifts, horizontal flipping, and color jittering, to increase the diversity and variability
of the training data [6].
3. Explore the use of synthetic data generation methods, like Generative Adversarial
Networks (GANs), to address the class imbalance and improve the representation of
underrepresented lesion types [6].
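As a minimal, framework-agnostic illustration of two of the geometric augmentations listed above (a real pipeline would typically use a library utility such as Keras's ImageDataGenerator, which also handles color jittering), the numpy sketch below applies a random horizontal flip and random width/height shifts to an image array. The image size and shift range are placeholder values, not the project's actual settings:

```python
import numpy as np

def augment(image, rng, max_shift=10):
    """Apply a random horizontal flip and random width/height shift
    to an H x W x C image array (a minimal sketch of two geometric
    augmentations; library implementations pad or crop instead of
    the wrap-around behaviour of np.roll used here)."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                # horizontal flip
    dy = rng.integers(-max_shift, max_shift + 1)
    dx = rng.integers(-max_shift, max_shift + 1)
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))  # height/width shift
    return out

rng = np.random.default_rng(0)
img = np.zeros((64, 64, 3), dtype=np.uint8)  # placeholder 64x64 RGB image
aug = augment(img, rng)
print(aug.shape)  # shape and dtype are preserved: (64, 64, 3)
```

Applying several such randomized transforms per epoch effectively multiplies the variability of the training set without collecting new images.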
2. Model Architecture Design:
1. Develop a deep learning-based convolutional neural network (CNN) model for the
multi-class skin lesion classification task.
2. Experiment with different architectural components, such as the number and
configuration of convolutional blocks, the use of pooling layers, and the design of the
fully connected layers.
3. Investigate the incorporation of attention mechanisms to enhance the model's ability
to focus on the most relevant features for classification [8].
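The architectural experiments above involve simple but easy-to-get-wrong shape arithmetic. The sketch below (hypothetical layer sizes, not the project's final architecture) traces how 'same'-padded 3x3 convolutions followed by 2x2 max pooling shrink a 64x64 input across three convolutional blocks, which determines the size of the flattened vector feeding the fully connected layers:

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output spatial size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output spatial size of max pooling (no padding)."""
    return (size - kernel) // stride + 1

size = 64                   # hypothetical input resolution (assumption)
for block in range(3):      # three conv blocks, each conv -> pool
    size = conv_out(size)   # 3x3 conv with padding 1: size unchanged
    size = pool_out(size)   # 2x2 pool with stride 2: size halved
print(size)                 # 8: an 8x8 feature map feeds the dense layers
```

Tracking this arithmetic when adding or removing blocks avoids shape mismatches at the flatten layer.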
3. Model Training and Evaluation:
1. Train the developed CNN model using the prepared dataset, monitoring the training
and validation performance throughout the learning process.
2. Implement class balancing techniques, such as class weighting or focal loss, to
address the imbalanced dataset and improve the model's performance on
underrepresented classes [7].
3. Evaluate the model's performance on the held-out test set, calculating metrics like
overall accuracy, F1-score, and per-class precision, recall, and F1-score.
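The two balancing techniques named above can be sketched in a few lines of numpy. The class weights follow the common inverse-frequency convention (n_samples / (n_classes * count), as used by scikit-learn's 'balanced' heuristic), applied here to the HAM10000 class counts [4]; the focal loss follows Lin et al. [7]. This is an illustrative computation, not the project's training code:

```python
import numpy as np

def inverse_frequency_weights(counts):
    """Per-class weights n_samples / (n_classes * count): rare
    classes receive proportionally larger weights in the loss."""
    counts = np.asarray(counts, dtype=float)
    return counts.sum() / (len(counts) * counts)

def focal_loss(probs, labels, gamma=2.0):
    """Multi-class focal loss [7]: mean of -(1 - p_t)^gamma * log(p_t),
    which down-weights well-classified (high p_t) examples."""
    p_t = probs[np.arange(len(labels)), labels]
    return np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t))

# HAM10000 class counts: akiec, bcc, bkl, df, mel, nv, vasc [4]
counts = [327, 514, 1099, 115, 1113, 6705, 142]
w = inverse_frequency_weights(counts)
print(w.round(2))  # df (rarest) gets the largest weight, nv the smallest
```

Passing such weights to the loss (e.g. via Keras's class_weight argument) penalises mistakes on df roughly 58 times more heavily than mistakes on nv, directly counteracting the imbalance described above.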
4. Performance Analysis and Interpretation:
1. Analyze the model's overall performance, identifying areas of strong and weak
performance across the different skin lesion classes.
2. Investigate the model's interpretability and explainability, using techniques like
feature visualization and attention map analysis to understand the model's decision-
making process [8].
3. Assess the model's ability to accurately detect critical skin lesion types, such as
melanoma, and provide insights into potential improvements.
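Gradient-based attention maps such as Grad-CAM [8] require access to the trained network's internals; a simpler, model-agnostic probe of which image regions drive a prediction is occlusion sensitivity, sketched below. The scoring function here is a toy stand-in; in practice it would return the trained model's predicted probability for the target class:

```python
import numpy as np

def occlusion_map(image, score_fn, patch=8, stride=8):
    """Slide an occluding patch over the image and record how much
    the model's confidence drops at each position. Large drops mark
    regions the model relies on for its prediction."""
    h, w = image.shape[:2]
    base = score_fn(image)
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = image.mean()  # grey patch
            heat[i, j] = base - score_fn(occluded)  # drop in confidence
    return heat

# Toy score: mean brightness of the centre region (stand-in for a model)
score = lambda img: img[24:40, 24:40].mean()
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0  # a bright "lesion" in the centre
heat = occlusion_map(img, score)
print(heat.shape)  # (8, 8); only cells over the bright centre are hot
```

Overlaying such a heat map on the dermoscopic image lets a dermatologist check whether the model attends to the lesion itself rather than artefacts such as rulers or hair.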
5. Recommendations for Improvement:
1. Based on the performance analysis and interpretation, provide recommendations for
enhancing the model's classification accuracy, especially for underrepresented and
critical skin lesion types.
2. Suggest future research directions and technological advancements that could further
improve the model's clinical applicability, such as the integration of multi-modal data
sources or the use of ensemble techniques.
3. Discuss the potential challenges and limitations of the current approach, and outline
strategies to address them in future iterations of the project.
By following this comprehensive methodology, the project aims to develop a robust and
clinically relevant deep learning-based skin cancer classification system that can assist
dermatologists in early detection and diagnosis, ultimately improving patient outcomes.
6. Results
The deep learning-based skin cancer classification model achieved an overall accuracy of
71% on the held-out test set. The weighted average F1-score for the model's predictions was
0.65, indicating a moderate level of overall performance.
When analyzing the per-class metrics, the model performed best on the most common skin
lesion type, Melanocytic Nevi (nv). For this class, the
model achieved a precision of 0.76, a recall of 0.97, and an F1-score of 0.85, showcasing its
ability to accurately identify this prevalent skin lesion [4].
The model also performed reasonably well on the Vascular Lesions (vasc) class, with a
precision of 0.83, a recall of 0.68, and an F1-score of 0.75. This suggests that the model was
able to capture the distinguishing features of this lesion type, despite the limited number of
training samples [4].
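The per-class F1-scores quoted above follow directly from precision and recall as their harmonic mean; the quick check below reproduces the reported figures for the nv and vasc classes:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.76, 0.97), 2))  # nv:   0.85
print(round(f1(0.83, 0.68), 2))  # vasc: 0.75
```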
However, the model struggled with certain critical skin lesion types, such as
Actinic Keratoses (akiec) and Dermatofibroma (df), failing to correctly classify any test
samples of these classes (precision and recall of 0). This underperformance highlights the challenges posed
by the severe class imbalance present in the dataset, where certain lesion types are vastly
underrepresented compared to others [4].
The training and validation performance curves revealed a steady improvement in accuracy
and a consistent decrease in loss over the training epochs, indicating that the model was
able to learn relevant features from the data without signs of significant overfitting. These
aggregate curves, however, mask the poor performance on the minority classes noted above.
7. Conclusion and Future Work
The developed deep learning-based skin cancer classification model demonstrates promising
results in accurately identifying common skin lesion types, such as Melanocytic Nevi.
However, the model's performance is limited by its inability to effectively handle the severe
class imbalance present in the dataset, particularly in detecting critical and rare skin lesion
types like Melanoma and Actinic Keratoses [4].
To enhance the clinical applicability of this system, several key areas for future improvement
have been identified:
1. Addressing Class Imbalance: Implementing advanced data augmentation techniques, such
as oversampling of underrepresented classes and the use of generative models to synthesize
additional training samples, could help mitigate the adverse effects of class imbalance [6].
Additionally, exploring class-weighted loss functions and other specialized training strategies
to prioritize the learning of critical skin lesion types would be essential [7].
2. Improving Model Architecture: Investigating the integration of attention mechanisms
within the CNN architecture could enable the model to focus on the most relevant visual
features for accurate classification, particularly for rare and complex skin lesion types [8].
Experimenting with transfer learning from pre-trained models, as well as exploring ensemble
techniques, may also lead to performance improvements.
3. Enhancing Interpretability and Explainability: Incorporating interpretability and
explainability methods, such as feature visualization and saliency map analysis, would
provide dermatologists with valuable insights into the model's decision-making process [8].
This, in turn, would foster greater trust and facilitate the integration of the system into
clinical workflows.
4. Expanding Dataset and Modalities: Exploring the use of multi-modal data, such as
combining dermoscopic and clinical images, could enrich the feature representations and lead
to more robust and accurate skin lesion classification [3]. Additionally, expanding the dataset
with more diverse and balanced samples would further strengthen the model's generalization
capabilities.
By addressing these key areas for improvement, future research efforts can enhance the
overall performance and clinical viability of the deep learning-based skin cancer
classification system, ultimately aiding dermatologists in the early detection and diagnosis of
various skin lesion types, and improving patient outcomes.
8. Images from Project
Fig 1: The screenshot of actual training model running in the PC
Fig 2: The screenshot of prepare dataset script
Fig 3: Confusion Matrix generated by running the script.
Fig 4: Model Accuracy and Model Loss plot generated by running the script.
Fig 5: Sample figure from HAM10000 Dataset
References
[1] N. C. Codella et al., "Skin lesion analysis toward melanoma detection: A challenge at the
2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin
Imaging Collaboration (ISIC)," in 2018 IEEE 15th International Symposium on Biomedical
Imaging (ISBI 2018), 2018, pp. 168-172.
[2] A. Esteva et al., "Dermatologist-level classification of skin cancer with deep neural
networks," Nature, vol. 542, no. 7639, pp. 115-118, 2017.
[3] J. Kawahara, A. BenTaieb, and G. Hamarneh, "Deep features to classify skin lesions," in
2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), 2016, pp. 1397-
1400.
[4] P. Tschandl, C. Rosendahl, and H. Kittler, "The HAM10000 dataset, a large collection of
multi-source dermatoscopic images of common pigmented skin lesions," Scientific Data, vol.
5, no. 1, pp. 1-9, 2018.
[5] E. Nasr-Esfahani et al., "Melanoma detection by analysis of clinical images using
convolutional neural network," in 2016 38th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (EMBC), 2016, pp. 1373-1376.
[6] I. Goodfellow et al., "Generative adversarial nets," in Advances in neural information
processing systems, 2014, pp. 2672-2680.
[7] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object
detection," in Proceedings of the IEEE international conference on computer vision, 2017,
pp. 2980-2988.
[8] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-cam:
Visual explanations from deep networks via gradient-based localization," in Proceedings of
the IEEE international conference on computer vision, 2017, pp. 618-626.
Link for the project: [Link]