Journal of Automation, Mobile Robotics and Intelligent Systems, VOLUME 16, N° 3, 2022

Skin Lesion Detection Using Deep Learning

Submitted: 12th May 2022; accepted: 28th July 2022

Rajit Chandra, Mohammadreza Hajiarbabi

DOI: 10.14313/JAMRIS/3-2022/24

© 2022 Chandra and Hajiarbabi. This is an open access article licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)

Abstract:
Skin lesions can be deadly if not detected early, and early detection can save many lives. Artificial intelligence and machine learning are helping healthcare in many ways, including the diagnosis of skin lesions: computer-aided diagnosis helps clinicians detect cancer. This study classifies seven classes of skin lesion using powerful convolutional neural networks. Two pre-trained models, DenseNet and Inception-v3, were employed to train the model, and accuracy, precision, recall, F1 score and ROC-AUC were calculated for every class prediction. Moreover, gradient class activation maps were used to show clinicians which regions of an image influence the model's decision; these visualizations support the explainability of the model. Experiments showed that DenseNet performed better than Inception V3. It was also noted that the gradient class activation maps highlighted different regions when predicting the same class. The main contribution is the introduction of medically aided visualizations into a lesion classification model, which will help clinicians understand the model's decisions and enhance its reliability. Different optimizers were also employed with both models to compare accuracies.

Keywords: Skin lesion, DenseNet, Inception V3

1. Introduction
Dermatologists use technological approaches to facilitate the early detection of skin cancer. Such lesions are produced by aberrant melanocyte cell formation, which usually happens when skin is exposed to more sun than necessary. Melanocyte cells generate "melanin", the substance responsible for producing pigmentation in the skin. Moreover, the number of skin cancer cases has risen dramatically, resulting in a growing mortality rate, notably from melanoma. That is why skin lesions are a major concern all over the world. Skin lesions come in many kinds, and some, if not detected early, can become skin cancer, so it is important to detect this disease at an early stage. Like every other field, technology is also used in this area to facilitate clinicians and to contribute to human health. Machine learning is a subfield of artificial intelligence that has proved to perform strongly in various fields. With the growth of computational power and the availability of huge datasets, it became possible to use deep learning models, which have the power to take in the complex structure of images and learn patterns from them. The process of building a deep learning model includes collecting and pre-processing the data; the image data is then segmented and features are extracted. These features are fed into the model and probabilities are calculated; the class label with the highest probability is predicted. Data is the most important factor for machine learning algorithms, and experts use various strategies to collect it. Two types of images are used in medical AI: dermoscopic images and macroscopic images. For this study, the dataset provided by the International Skin Imaging Collaboration (ISIC) is used. The ISIC has provided various versions of the dataset; the ISIC-2018 dataset is used for building the model. The 2018 archive contains seven different classes of skin lesion, so this is a multiclass classification problem. The images provided by ISIC are dermoscopic images of lesions. Convolutional Neural Networks (CNNs) are neural networks primarily used for computer vision tasks, because they are able to understand the complex structure of images.

Dermoscopy is the state-of-the-art procedure for skin cancer screening, with a diagnostic accuracy higher than that of the naked eye [2]. In that paper, the researchers offered a method for improving the accuracy of automated skin lesion identification by combining different imaging modalities with patient metadata. Only cases that had patient metadata, a macroscopic image, a dermatoscopic image, and histological diagnosis details were kept. Moreover, only instances where the input images were of adequate quality and untainted by any identifying traits (i.e., eyes, facial landmarks, jewellery or garments) were picked, by repeated hand scanning of all images. ResNet-50 was used to extract the features of the images. Three kinds of experiments were conducted.

1.1. Full Multimodality Classification
When all three mentioned modes (macroscopic images of lesions, dermatoscopic images, and patient metadata) were provided, the researchers built a network

with two image feature extractors, one for dermatoscopic input images and the other for macroscopic input images.

1.2. Partial Multimodality Classification
The researchers excluded the other two branches from the complete network when only one image modality (macroscopic or dermatoscopic images) plus metadata was supplied for classifying the images. Before passing it through the embedding network, the researchers generated a single image feature vector and combined it with the feature vector of the metadata.

1.3. Single Image Classification
When there was only one image type for classification and no metadata, the image was sent through the image feature extraction network, and the extracted features were then transmitted through the rest of the network. In the testing phase, it turned out that patient metadata variables like age, sex and location did not appreciably enhance precision for pigmented skin lesions. As a result, it was concluded that the available models rely substantially on tight image criteria and may be unstable in clinical practice. Furthermore, selected datasets may contain unintended biases toward specific input patterns.

Using image representations produced from Google's Inception-v3 model, another proposed automated approach intends to detect the kind and cause of cancer directly [3]. The researchers used a feed-forward neural network with two layers and a softmax activation function in the output layer to perform two-phase classification based on the representation vector. Two separate neural networks sharing the same representation vector were used to perform the two-phase classification: in phase one, the researchers determined the type of cancer, whether malignant or benign, and in phase two, they determined whether the cancer was caused by melanocytic or nonmelanocytic cells. The training dataset includes 2000 JPEG dermoscopic images of skin lesions, as well as ground truth values. The validation set had 150 photos, whereas the testing set contained 600. The method identifies the images automatically using Google's Inception model and the image representation produced from the dermoscopic images.

Another paper had two major contributions: first, the researchers offered a classification model that used a deep convolutional neural network and data augmentation to evaluate the classification of skin lesion images [4]. Second, the researchers showed how data augmentation could be used to overcome data scarcity, and they looked at how varying numbers of augmented data samples affect the performance of different models. The researchers used three methods of data augmentation in melanoma classification.

1.4. Geometric Augmentation
The semantic interpretation of a skin lesion is preserved under changes to the position and scale of the lesion mark within the image; therefore, its ultimate classification is unaffected. As a result, input images were randomly cropped, and horizontal and vertical flips were used to produce new samples under the same label as the original.

1.5. Color Augmentation
The images of skin lesions were gathered from various sources and made using various devices. As a result, when using photographs for training and testing any system, it is critical to scale the colors of the images to increase the classification system's performance.

1.6. Data Warping Based on Specialist Knowledge
Clinicians diagnose melanoma by examining the patterns that surround the lesion. So, affine transformations, including distorting, shearing and scaling the data, can be helpful in classifying the images. As a result, warping is an excellent way to supplement data in order to improve performance and reduce overfitting in melanoma classification.

In [5], three classifiers, SVM, Random Forests and Neural Networks, were used to classify the image dataset. The results showed that different augmentations performed differently in this case; the neural networks performed best for the classification task.

In image recognition nowadays, two basic types of feature sets are routinely used [5]. The traditional kind is based on what are known as "hand-crafted features", which are created by researchers with the goal of capturing visual aspects of a picture, such as texture or color. A newer sort of feature set, motivated by how the brain decodes images and derived from powerful Convolutional Neural Networks, was recently presented. These new features beat "hand-crafted" features when combined with deep learning and, as a result, are increasingly popular in computer vision. The researchers proposed in this study to utilise a mix of both sorts of features to classify skin lesions. "RSurf features" were extracted by the researchers for image description. This feature set's concept is to divide the input image into "parallel sequences of intensity values from the upper-left corner to the bottom-right corner". The concept behind this extraction technique is based on the texture unit model, in which an input image's texture spectrum is defined. A support vector machine with a Gaussian kernel and standardized predictors was used in the first categorization; it estimated the class for a given input image using RSurf features and LBP with R = 1, 3, 5. CNN characteristics were used in the second SVM classifier, which also had a Gaussian kernel and standardized predictors; the researchers used AlexNet to extract the features. The researchers chose the label with the greatest absolute score value for each tested image. As a result, the final classifier incorporated both approaches, including hand-crafted characteristics as well as features acquired from the deep learning method.

It is critical to distinguish malignant forms of skin lesions from benign forms such as "seborrheic


keratosis" or "benign nevi", and good computerized classification of skin lesion images can help with diagnosis [6]. In that study, the researchers offer a completely automated method for classifying skin lesions from dermoscopic pictures, based on a novel ensemble scheme for convolutional neural networks (CNNs). For tasks like object detection and natural picture categorization, deep neural network algorithms, particularly convolutional neural networks, outperformed alternative methods, so well-established CNN architectures were used to attain great accuracy; transfer learning had been applied in the medical field for other tasks too. The pipeline of the model includes data pre-processing and fine-tuning of neural networks; features were then extracted and fed into an SVM model, and the outputs of the models were assembled together. To facilitate improved generalization when tested on additional datasets, the researchers kept the data pre-processing in the suggested pipeline to a minimum. Only one task-specific pre-processing step (related to skin lesion categorization) was included in the technique, while the rest were typical pre-processing stages to prepare the pictures before feeding them to the model: normalization, resizing, and color standardization were employed. VGG16, which includes 16 weight layers (13 convolutional layers and 3 fully connected layers), was employed. In addition to VGG16, the powerful ResNet-18 and ResNet-101, which have varying depths, were used for extracting the features. To solve the three-class classification problem (malignant melanoma / seborrheic keratosis / benign nevi), the final fully connected layers and the output layer of all pre-trained networks were eliminated and replaced by two new fully connected layers of 64 nodes and 3 nodes. The new fully connected layers' weights were chosen at random using a normal distribution with a mean of zero and a standard deviation of 0.01. The researchers froze the weight values of the earliest layers of the deep models; freezing the weights addressed the issue of overfitting and can also help decrease training time. The researchers froze the early layers up to the 4th and 10th layers for AlexNet and VGG16, respectively, and up to the 4th and 30th residual blocks for ResNet-18 and ResNet-101, respectively. To avoid overfitting on the small training dataset, the researchers used data augmentation to boost the training size artificially. As key data augmentation approaches, the researchers used rotations of 90, 180 and 270 degrees, and they also employed horizontal flipping. A ternary SVM classifier was trained using the collected deep features and the related labels defining the lesion kinds. The researchers examined a linear kernel as well as radial basis function (RBF) kernels and found that the RBF kernel performed marginally better. In the final models, the researchers used one-vs-all multiclass SVM classifiers with radial basis function kernels. The major contribution of the method is that it proposed a hybrid deep neural network approach for classifying skin lesions that extracted deep features from the images using multiple DNNs and assembled the features in a support vector machine classifier, producing very accurate results without needing exhaustive pre-processing or lesion area segmentation. The results demonstrated that combining information in this way improves discrimination and is complementary to the individual networks.

The "attention residual learning convolutional neural network (ARL-CNN)" model for skin lesion categorization is proposed in [7]. The researchers combined a residual learning framework, for training a deep convolutional neural network with a small number of data images, with an attention learning mechanism that improves the DCNN's representation capacity by allowing it to focus more on "semantically" important regions of dermoscopy images (i.e. lesions). The suggested attention learning mechanism made full use of classification-trained DCNNs' innate and impressive self-attention capacity, and it could work within any deep convolutional neural network framework without appending any additional "attention" layers, which was important for learning problems with small datasets, as in the problem at hand. In terms of implementation, each so-called ARL block may include both "residual learning" and "attention learning". By stacking numerous ARL blocks and training the model end-to-end, an ARL-CNN model of any depth could be created. The researchers tested the suggested ARL-CNN model on the ISIC-skin 2017 dataset, and it outperformed the competition. The research contributed in many aspects: the researchers proposed a novel ARL-CNN model for accurate skin lesion categorization, which incorporates both residual learning and attention learning methods; they created an effective attention framework that took full advantage of DCNNs' inherent "self-attention" ability, i.e., instead of learning the attention mask with extra layers, they used the feature maps acquired by an upper layer as the attention mask of a lower layer; and they achieved "state-of-the-art" lesion classification accuracy on the ISIC-skin 2017 dataset using only one model with 50 layers, which was foremost for CAD of skin cancer.

Researchers addressed two problems in [1]. The first task entailed classifying skin lesions using dermoscopic pictures; "dermoscopic" images plus patient metadata were used for the second task. For the first job, the researchers use a variety of CNNs to classify dermoscopic images. The deep learning models for task 2 are divided into two sections:


a convolutional neural network for dermoscopy images and a "dense neural network" for processing the patients' metadata. In the beginning, the researchers trained the convolutional neural network on image data only (task 1). The weight values of the CNN are then frozen, and the metadata neural network is attached; only the weights of the metadata neural network and the classification layer are trained in the second step. The researchers rely heavily on EfficientNets (EN), which were pre-trained on a very large dataset called ImageNet. These consist of eight separate models that are architecturally similar and follow particular principles for scaling up the input image size. Version B0, the smallest of all, uses 224 × 224 as the input size; in bigger versions, up to B7, the input size is raised while the network breadth and depth are scaled up. The researchers use EfficientNet versions B0 to B6. They also trained SENet154 and two versions of the powerful ResNet.

In developing the model, three optimizers were used to compare the results:

1. Stochastic gradient descent
2. RMSprop
3. Adam

1.6.1. Stochastic Gradient Descent
It is an 'iterative method' that optimizes a differentiable loss function. The goal of machine learning is to optimize the loss (objective) function. Mathematically,

Q(w) = (1/n) Σ_{i=1}^{n} Q_i(w)

Here "w" is estimated so that it minimizes Q. Because it is an iterative method, it performs the following iterations to minimize the objective function:

w := w − η ∇Q(w) = w − (η/n) Σ_{i=1}^{n} ∇Q_i(w)

where η is the learning rate.

1.6.2. RMSProp
Root mean square propagation is also an optimization algorithm, in which the learning rate is adjusted per parameter. The 'running average' of squared gradients is calculated as follows:

v(w, t) := γ v(w, t−1) + (1 − γ) (∇Q_i(w))²

The learning parameters are updated as follows:

w := w − (η / √v(w, t)) ∇Q_i(w)

1.6.3. Adam
It is an optimization algorithm used in place of the standard stochastic gradient descent procedure to iteratively update the weights of a neural network using training data. Diederik Kingma of OpenAI and Jimmy Ba of the University of Toronto presented Adam in their 2015 ICLR paper (poster) titled "Adam: A Method for Stochastic Optimization." As the authors explain, Adam integrates the benefits of two stochastic gradient descent enhancements. More precisely, an "Adaptive Gradient Algorithm" (AdaGrad) component manages the per-parameter learning rate and hence increases efficiency on problems with sparse gradients (e.g. computer vision and natural language processing problems).

For the skin lesion classification experiments, Python 3.6 was used as the programming language; TensorFlow and Keras were used as frameworks.

2. Methods
2.1. Method 1
The model was trained from scratch; the framework was trained for a number of epochs after being initialised with random weights. The algorithm learnt attributes from the input and calculated weights by backpropagation after every epoch. If the dataset is not very large, this strategy is unlikely to yield the most accurate results; however, it can still be used as a comparison point for the two other methods.

2.2. Method 2
For the second experiment, a ConvNet was used as a feature extractor. Because most dermatological datasets have a small number of photos of skin lesions, this method used the weights of the available pre-trained VGG16 model, which was trained on a bigger dataset (i.e. ImageNet); this practice is titled "transfer learning". The pre-trained model had previously learnt features that could be relevant for classifying the skin lesion images, which is the core idea underpinning transfer learning.

2.3. Method 3
Another frequent transfer learning strategy entails not only initialising the model with pre-trained weights, but also fine-tuning it by training only the upper layers of the convolutional network using backpropagation. The researchers recommended freezing the lower layers of the network, since they contain more generic dataset properties; because of their ability to extract more particular features, they were mainly interested in training the model's top layers. In this method, the parameters from the ImageNet dataset were used to initialise the first four layers of the convolutional neural network in the final framework. The saved model weights loaded from the matching convolutional layer in Method 1 were used to initialise the fifth and final convolutional block. The evaluation metrics showed that the third method performed better than Method 1 and Method 2.
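As an illustration of the update rules in Sections 1.6.1 and 1.6.2, the following is a minimal NumPy sketch (not the paper's training code, which relies on the Keras optimizers) that minimizes the toy objective Q(w) = w², whose gradient is 2w:

```python
import numpy as np

def sgd_step(w, grad, lr=0.1):
    """One SGD update: w := w - eta * grad Q(w)."""
    return w - lr * grad

def rmsprop_step(w, grad, v, lr=0.1, gamma=0.9, eps=1e-8):
    """One RMSProp update: keep a running average v of squared
    gradients and scale the step by 1 / sqrt(v)."""
    v = gamma * v + (1 - gamma) * grad ** 2
    w = w - lr * grad / (np.sqrt(v) + eps)
    return w, v

# Minimize the toy objective Q(w) = w**2 (gradient 2w) from w = 5.
w_sgd, w_rms, v = 5.0, 5.0, 0.0
for _ in range(50):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    w_rms, v = rmsprop_step(w_rms, 2 * w_rms, v)
# Both optimizers drive w toward the minimum at 0.
```

In the actual experiments these rules are applied per mini-batch to all network weights by the framework's optimizers rather than hand-coded as above.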


3. Results
The data was divided into train, validation and test splits:

Train set images: 9714    Validation set images: 100    Test set images: 201

The training set was augmented with images generated by introducing changes into the original dataset: the images were horizontally flipped, the rotation range was 90 degrees, and the zoom range was kept at 0.2. The images were also rescaled before being fed into the model.

3.1. Evaluation Metrics
The following evaluation metrics were used to evaluate the models.

The Receiver Operating Characteristic (ROC) curve is a metric used to evaluate machine learning classification models. It presents a probability curve that plots the true positive rate against the false positive rate at many threshold values; it basically separates the 'signal' from the 'noise'. The formulas for the true positive rate and false positive rate are as follows:

True positive rate = true positives / (true positives + false negatives)

False positive rate = false positives / (false positives + true negatives)

The Area Under the Curve (AUC) measures the performance of the classifier by evaluating its ability to differentiate between classes. It is utilized as a summary of the ROC curve; a higher AUC value means the classification model differentiates the negative and positive classes more accurately.

Accuracy is also an evaluation metric used for classification models. The accuracy value represents the fraction of predictions the model makes correctly:

Accuracy = total number of correct predictions / total predictions

Precision indicates the fraction of positive predictions that were actually correct:

Precision = true positives / (true positives + false positives)

Recall indicates the fraction of actual positives that were predicted correctly:

Recall = true positives / (true positives + false negatives)

The F1 score shows the balance between recall and precision:

F1 Score = (2 * precision * recall) / (precision + recall)

3.2. L2 Regularization
L2 regularization is applied to models to combat overfitting. Overfitting describes a situation where the training loss decreases but the validation loss increases; in other words, the model fits the training data well but does not predict accurately on validation data, so it is not able to generalize. This is serious, because a model that does not generalize will not produce accurate results when implemented in a real-world scenario. Different techniques can be used to control overfitting. Regularization is used to control the complexity of the model: when regularization is added, the model not only minimizes the loss but also minimizes its own complexity. So, the goal of the machine learning model after adding regularization is:

minimize(Loss(Data|Model) + complexity(Model))

The complexity of the models used in this paper was minimized by using L2 regularization. The L2 regularization term is the sum of the squares of all the weights:

L2 regularization term = ||w||² = w1² + w2² + … + wn²

In the models, two layers of L2 regularization were used before the final softmax layer.

A total of 12 experiments were conducted using different optimizers. The three optimizers, Adam, RMSprop and Stochastic Gradient Descent, were used with DenseNet and Inception V3. Moreover, experiments were conducted with and without augmentation to see whether augmentation is useful in our case. The details of the experiments are given below.

3.2.1. With Augmentation
Different augmentations were applied to the dataset to increase the image data and avoid overfitting. If the model is trained on too little data, it will learn the pattern but will not generalize it; in other words, the training accuracy is higher than the testing accuracy, and the model does not generalize to unseen data. Different augmentations, i.e. rotation range, horizontal flip and zoom range, were applied to the dataset. Six experiments were performed with augmentation:

1. DenseNet [RMSPROP]
2. DenseNet [ADAM]
3. DenseNet [SGD]
4. Inception V3 [RMSPROP]
5. Inception V3 [ADAM]
6. Inception V3 [SGD]

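The augmentations listed above (horizontal flip, 90-degree rotation, and zoom via cropping) can be sketched with NumPy. This is an illustrative example only, not the experiment code, which would normally use the framework's built-in augmentation utilities; the `zoom_range` cropping here is one simple way to approximate a 0.2 zoom:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, zoom_range=0.2):
    """Return flipped, rotated and zoomed variants of an image,
    all of which keep the original class label."""
    h, w = image.shape[:2]
    variants = [
        np.fliplr(image),   # horizontal flip
        np.rot90(image),    # 90-degree rotation
    ]
    # "Zoom" by randomly cropping up to zoom_range of each dimension.
    ch = int(h * (1 - zoom_range))
    cw = int(w * (1 - zoom_range))
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    variants.append(image[y:y + ch, x:x + cw])
    return variants

img = np.arange(100).reshape(10, 10)   # stand-in for a lesion image
flipped, rotated, zoomed = augment(img)
```

Each variant would then be rescaled to the model's input size and added to the training set under the original label.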

3.2.2. Without Augmentation
These experiments were also conducted without augmentation, to see whether the model can generalize well without it:

1. DenseNet [RMSPROP]
2. DenseNet [ADAM]
3. DenseNet [SGD]
4. Inception V3 [RMSPROP]
5. Inception V3 [ADAM]
6. Inception V3 [SGD]

4. Discussion
Early detection of skin lesions can save many lives, and Artificial Intelligence is helping medical science serve this purpose; Convolutional Neural Networks in particular are useful in medical imaging. Two state-of-the-art convolutional neural network architectures were experimented with in this paper, and both showed good results overall. It turned out that DenseNet performed better than Inception V3 in classifying the images into the different classes. To evaluate model performance, AUC-ROC curves, precision, recall, F1 score and accuracy were employed. The reason for choosing multiple metrics was that the data was highly imbalanced, so the accuracy metric alone might be deceiving; the data imbalance issue itself was addressed by using focal loss. The per-class ROC curves of the DenseNet model are better than those of the Inception V3 model, and the overall accuracy, precision, recall and F1 score figures are also better for DenseNet. The models were run for 60 epochs and an early stopping criterion was applied, to ensure that the model does not overfit: if a model is trained for too many epochs it may overlearn the pattern, and if it is run for too few epochs it can underfit, i.e. it won't learn the pattern completely. Since the number of epochs is a hyperparameter, it has to be tuned; normally the model is run for a large number of epochs and stopped when it stops learning. Keras provides an early stopping callback, and that was used in the experiments. In the result tables, the termination epoch is also provided; the purpose of mentioning it was to see which optimizer converges at which epoch, i.e. which optimizer converges relatively fast. In the DenseNet model, Adam converged at the 39th epoch and gave an accuracy of 79%, but stochastic gradient descent converged at the 35th epoch and was 81% accurate. This means stochastic gradient descent performed better in both respects: it gave higher accuracy in fewer epochs. In the experiments where augmentations were not applied, the accuracies were comparatively better than in the experiments with augmentations, even though augmentation is applied to increase the data because deep learning models require huge data to learn. The training accuracies of the experiments without augmentation were more than 90%, although L2 regularization was also applied to overcome the issue of overfitting. In the case of Inception V3, very interesting figures were produced: the Adam optimizer achieved 75% test accuracy in 22 epochs, while stochastic gradient descent produced the same accuracy in 60 epochs. Moreover, the RMSprop optimizer produced 76% accuracy in 30 epochs. So for the given problem, the stochastic gradient descent optimizer with Inception V3 is not a suitable choice. The experiments without augmentation showed that RMSprop is a better choice: it gave 81% accuracy in 38 epochs, while Adam and SGD ran for the same number of epochs and gave 80% and 79% accuracies, respectively. Another interesting observation was the per-class AUC-ROC of the Dermatofibroma class: it was around 60% in the experiments without augmentation and around 70% in the experiments with augmentation. This was not the pattern in the DenseNet experiments, where all the AUC-ROC scores are around 90%. It shows that the Inception V3 architecture did not learn the pattern of the Dermatofibroma class very efficiently.

The loss function used for the experiments was focal loss, which performed well; it was used to overcome the class imbalance issue. In deep learning, it is important to have an equal distribution of the classes: if the data entries of one class outnumber the others, the model will learn the majority class more efficiently, and when the model is deployed, it predicts that class for every image. The data here was highly imbalanced. There are multiple ways to solve this issue; one method is to use a weighted loss, but recently another loss function, called focal loss, was introduced. It focuses on the class with few examples more than on the class with many examples, and it showed good performance overall. In the given problem, the Vascular class had very few examples in the training dataset; focal loss focused on this class, and on the test dataset almost all experiments accurately classified the Vascular class.

The accuracies are better for DenseNet than for Inception V3. Moreover, the grad activation maps show that the two models looked at different places to classify the same image: the focus region of Inception V3 is different from the focus region of DenseNet. The Inception V3 model misclassified the Vascular class, as shown in the figure. While we cannot know from grad activation maps the reason for focusing on a certain region (this remains a black box), these visualizations can help medical staff understand why the model predicts that a certain image belongs to a certain class, because the explainability of machine learning models is important, especially in the sensitive area of medical science. It will
faced overfitting problem. this is because the data help medical staff to understand the model predic-
was very less and the model learnt the training data tion without knowing much about artificial intelli-
but did not generalize well on testing data. The pur- gence, machine learning and convolutional neural
pose of applying augmentations in deep learning is networks.
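The point about accuracy being a deceiving metric on imbalanced data can be illustrated with a toy example. The sketch below is not the paper's evaluation code (the two-class setup and counts are invented for illustration); it shows how a degenerate classifier that always predicts the majority class reaches 90% accuracy while its macro-averaged F1 stays below 0.5.

```python
import numpy as np

def macro_scores(y_true, y_pred, n_classes):
    """Per-class precision/recall/F1 averaged equally over classes (macro),
    so a minority class counts as much as the majority class."""
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p); recalls.append(r); f1s.append(f)
    return np.mean(precisions), np.mean(recalls), np.mean(f1s)

# 90 majority-class samples, 10 minority; predict the majority class always.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)
accuracy = np.mean(y_true == y_pred)        # 0.90, despite learning nothing
p, r, f1 = macro_scores(y_true, y_pred, 2)  # macro F1 ~ 0.47
```

Macro averaging is what exposes the failure: the minority class contributes a per-class F1 of zero, pulling the average down even though accuracy looks high.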

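The focal loss referred to above follows the standard formulation FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the probability the model assigns to the true class. The paper does not reproduce its implementation, so the following NumPy sketch uses the commonly cited defaults (gamma = 2, alpha = 0.25), which may differ from the values used in the experiments.

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Multi-class focal loss: FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t).
    y_true: one-hot labels (N, C); y_pred: softmax probabilities (N, C)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    p_t = np.sum(y_true * y_pred, axis=1)  # probability assigned to the true class
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))

# An easy example (p_t = 0.95) is down-weighted far more than a hard one (p_t = 0.2),
# which is how the loss keeps rare classes like Vascular from being ignored.
easy = focal_loss(np.array([[1.0, 0.0]]), np.array([[0.95, 0.05]]))
hard = focal_loss(np.array([[1.0, 0.0]]), np.array([[0.20, 0.80]]))

# With gamma = 0 and alpha = 1 the loss reduces to ordinary cross-entropy.
ce = focal_loss(np.array([[1.0, 0.0]]), np.array([[0.95, 0.05]]), gamma=0.0, alpha=1.0)
```

The (1 - p_t)^gamma factor is the whole trick: well-classified majority-class examples contribute almost nothing to the gradient, so the few minority-class examples dominate training.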
Articles 61
Journal of Automation, Mobile Robotics and Intelligent Systems VOLUME 16, N° 3 2022

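The early stopping rule described above (run many epochs, halt once learning stalls) can be sketched in a few lines. The patience value here is illustrative; the paper does not state the one used.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training halts: stop once the
    validation loss has failed to improve for `patience` consecutive epochs."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch      # the termination epoch reported in the tables
    return len(val_losses)        # ran to the final epoch without triggering

# Loss improves until epoch 3, then stalls; with patience=3 training stops at epoch 6.
stop = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71], patience=3)
```

In the experiments this role was played by the Keras callback, configured along the lines of `tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` passed to `model.fit(...)`.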
5. Future Work
In the future, the focus would be to improve model accuracy by experimenting with other models such as AlexNet and VGG-16; the accuracies of the models will be compared and the most accurate model will be chosen. Also, skin lesions follow a certain hierarchy that can be incorporated in future research; the hierarchy of skin lesions is shown in Fig. 1. In this paper, the seven classes from the third level are incorporated. A total of eight classes belongs to the third level, but in the skin lesion 2018 dataset seven classes are given. In the future, the focus would be to consider the complete hierarchy: in the first stage the first level will be classified, in the second phase the second level, and in the third phase all seven classes will be classified by the model.

6. Figures and Tables

Fig. 1. Skin Lesion Hierarchy

DenseNet model:

Fig. 2. ROC Curve for RMSProp

The per-class AUC-ROC is highly accurate. The results of the other experiments are as follows.

Tab. 1. DenseNet Comparison Table

Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.79       0.82        0.79     0.79       39
RMSProp     0.80       0.80        0.80     0.79       35
SGD         0.81       0.82        0.81     0.81       34

Tab. 2. Per class AUC-ROC [DenseNet, RMSProp, focal loss, with augmentations]

Class            AUC-ROC
Actinic          0.957
Carcinoma        0.98
Dermatofibroma   0.985
Melanoma         0.921
Nevs             0.962
Seborrheic       0.958
Vascular         1.0

Tab. 3. Per class AUC-ROC [DenseNet, SGD, focal loss, with augmentations]

Class            AUC-ROC
Actinic          0.918
Carcinoma        0.981
Dermatofibroma   0.93
Melanoma         0.879
Nevs             0.965
Seborrheic       0.973
Vascular         1.0

Tab. 4. Per class AUC-ROC, focal loss, without augmentation [DenseNet]

Class            AUC-ROC
Actinic          0.971
Carcinoma        0.977
Dermatofibroma   0.915
Melanoma         0.864
Nevs             0.959
Seborrheic       0.945
Vascular         1.0

Tab. 5. DenseNet without Augmentation

Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.81       0.79        0.81     0.80       38
RMSProp     0.82       0.82        0.82     0.82       38
SGD         0.81       0.80        0.81     0.80       29


Tab. 6. Per class AUC-ROC [DenseNet, Adam, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.965
Carcinoma        0.979
Dermatofibroma   0.869
Melanoma         0.924
Nevs             0.94
Seborrheic       0.957
Vascular         1.0

Tab. 7. Per class AUC-ROC [DenseNet, RMSProp, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.946
Carcinoma        0.986
Dermatofibroma   0.982
Melanoma         0.905
Nevs             0.96
Seborrheic       0.956
Vascular         1.0

Tab. 8. Per class AUC-ROC [DenseNet, SGD, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.944
Carcinoma        0.975
Dermatofibroma   0.975
Melanoma         0.928
Nevs             0.958
Seborrheic       0.964
Vascular         1.0

Fig. 3. Grad-CAM of DenseNet model

Tab. 9. Inception V3 Comparison Table

Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.75       0.78        0.75     0.75       22
RMSProp     0.76       0.71        0.76     0.73       30
SGD         0.75       0.74        0.75     0.74       60

Tab. 10. Per class AUC-ROC [Inception, Adam, focal loss, with augmentations]

Class            AUC-ROC
Actinic          0.887
Carcinoma        0.959
Dermatofibroma   0.859
Melanoma         0.791
Nevs             0.92
Seborrheic       0.911
Vascular         0.99

Tab. 11. Per class AUC-ROC [Inception, RMSProp, focal loss, with augmentations]

Class            AUC-ROC
Actinic          0.912
Carcinoma        0.953
Dermatofibroma   0.719
Melanoma         0.751
Nevs             0.935
Seborrheic       0.914
Vascular         0.985

Tab. 12. Per class AUC-ROC [Inception, SGD, focal loss, with augmentations]

Class            AUC-ROC
Actinic          0.929
Carcinoma        0.953
Dermatofibroma   0.786
Melanoma         0.826
Nevs             0.94
Seborrheic       0.905
Vascular         0.998

Tab. 13. Focal loss, without augmentations [Inception V3]

Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.80       0.80        0.80     0.80       43
RMSProp     0.81       0.81        0.81     0.80       38
SGD         0.79       0.79        0.79     0.79       43

Fig. 4. Focal loss, with augmentations [Inception V3]


Tab. 14. Per class AUC-ROC [Inception, Adam, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.921
Carcinoma        0.937
Dermatofibroma   0.613
Melanoma         0.868
Nevs             0.947
Seborrheic       0.928
Vascular         0.998

Tab. 15. Per class AUC-ROC [Inception, RMSProp, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.903
Carcinoma        0.933
Dermatofibroma   0.673
Melanoma         0.864
Nevs             0.946
Seborrheic       0.906
Vascular         0.997

Tab. 16. Per class AUC-ROC [Inception, SGD, focal loss, without augmentations]

Class            AUC-ROC
Actinic          0.909
Carcinoma        0.946
Dermatofibroma   0.671
Melanoma         0.863
Nevs             0.954
Seborrheic       0.932
Vascular         0.997

Fig. 5. Grad-CAM of Inception V3
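Grad-CAM maps like those in Fig. 3 and Fig. 5 weight the last convolutional feature maps by the gradients of the class score, pooled per channel. Computing those gradients requires the full framework, but the combining step that produces the heatmap can be sketched on its own (shapes and values below are illustrative, not taken from either model):

```python
import numpy as np

def grad_cam(feature_maps, pooled_grads):
    """Grad-CAM combining step: heatmap = ReLU(sum_k alpha_k * A_k),
    rescaled to [0, 1]. feature_maps: (H, W, K); pooled_grads: (K,)."""
    cam = np.maximum(feature_maps @ pooled_grads, 0.0)  # weighted sum over channels
    peak = cam.max()
    return cam / peak if peak > 0 else cam              # normalize for overlaying

# Two 2x2 feature maps; the second channel gets a negative (suppressing) weight,
# so only locations supported by the first channel survive the ReLU.
maps = np.stack([np.array([[1.0, 0.0], [0.0, 0.0]]),
                 np.array([[0.0, 1.0], [0.0, 0.0]])], axis=-1)  # shape (2, 2, 2)
heat = grad_cam(maps, np.array([2.0, -1.0]))
```

The resulting heatmap is upsampled to the input resolution and overlaid on the lesion image; the ReLU is what restricts the visualization to regions that positively support the predicted class.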

AUTHORS
Rajit Chandra – Computer Science Department,
Purdue Fort Wayne, Fort Wayne, 46805, USA,
E-mail: [email protected].

Mohammadreza Hajiarbabi* – Computer Science Department, Purdue Fort Wayne, Fort Wayne, 46805, USA, E-mail: [email protected].

*Corresponding author

