Unet For Brain Image Segmentation
Healthcare Analytics
journal homepage: [Link]/locate/health
Using U-Net network for efficient brain tumor segmentation in MRI images
Jason Walsh a , Alice Othmani b , Mayank Jain a,c , Soumyabrata Dev a,c ,∗
a School of Computer Science, University College Dublin, Ireland
b Université Paris-Est Créteil Val de Marne - Université Paris 12, France
c ADAPT SFI Research Centre, Dublin, Ireland
∗ Corresponding author at: School of Computer Science, University College Dublin, Ireland.
E-mail addresses: jason.walsh3@[Link] (J. Walsh), [Link]@[Link] (A. Othmani), mayank.jain1@[Link] (M. Jain),
[Link]@[Link] (S. Dev).
1 To facilitate the reproducibility of this research, the code of this paper is made available at: [Link]
[Link]
Received 27 June 2022; Received in revised form 15 August 2022; Accepted 17 August 2022
2772-4425/© 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license
([Link]
J. Walsh, A. Othmani, M. Jain et al. Healthcare Analytics 2 (2022) 100098
2. Brain tumor segmentation techniques

Image segmentation is a crucial area of research in the broad domains of image processing and computer vision, with applications in varied fields [15,16]. The challenge is to classify each pixel as a part of different objects in the image. Over the years, many algorithms have been proposed for this task [17,18]. The process of segmentation has seen widespread use in the field of medical imaging as well [11]. Particularly for MRI scans, most previous studies modify existing image segmentation techniques to handle three-dimensional volumetric images [19,20]. These modified networks are further fine-tuned to improve performance on the given task.

Based on the complexity of the algorithm they use to extract the ROI from input images, segmentation techniques can be classified into four broad categories. In the subsequent sub-sections, we explain these categories and some of the corresponding algorithms that fall in each category.

2.1. Thresholding

Thresholding is a simple segmentation technique that is used to convert a grayscale or red–green–blue (RGB) image into a binary image [21]. Although simple, it is very effective in extracting the ROI from an image. Prior research on this technique explores the use of Otsu’s method [22] and global thresholding for medical image segmentation. Otsu’s method is a popular segmentation algorithm used in pattern recognition, where features of an ROI are extracted from an image for further processing.

However, more recent success in this domain was obtained using simple binary thresholding coupled with the watershed algorithm to segment brain tumors from MRI brain scans [23]. The proposed method first uses median filtering to remove noise that may degrade the quality of each image. Binary thresholding is then applied to the image, followed by the watershed algorithm to extract the ROI from the brain scan. Finally, small batches of morphological operations are used to refine the result, which yields a more accurate segmentation.

2.2. Classifiers

Classifiers are common techniques used in segmentation which share a similar approach with methods used in supervised learning

The operation of clustering is very similar to the operation of classification algorithms. Classifiers use a target dataset to perform image segmentation, whereas clustering is an unsupervised learning algorithm [28]. In other words, there is no labeled target dataset from which to learn the features. Hence, unsupervised algorithms essentially train themselves by identifying the possible underlying patterns [29]. Commonly used clustering algorithms include k-means and the fuzzy c-means algorithm [30]. K-means, in the domain of image segmentation, is used to segment an area of interest from the background of an input image. This method partitions the features of the dataset (pixels/voxels) into several clusters, where each feature belongs to the cluster with the closest mean. Fuzzy c-means, or soft k-means, is a variation of k-means where each feature can belong to more than one cluster based on its degree of membership [31].

Prior research has explored the ability of both algorithms to segment tumors from an MR image. In one study, Kabade and Gaikwad used advanced k-means and fuzzy c-means algorithms for segmentation [32]. Along with this, they used thresholding for feature extraction and edge detection for approximate reasoning to recognize the characteristics of the tumor. A crucial extension of this procedure was the addition of ‘skull stripping’, where the outer cranium is extracted using the watershed algorithm to improve the segmentation accuracy of the model [33]. The results obtained using such methods were adequate for the dataset collected. However, they did not use three-dimensional imagery. Instead, the MRI slices were converted to a two-dimensional format and the algorithms were then applied to this new set of images.

2.4. Deep learning

A Convolutional Neural Network (CNN) is a class of deep neural network which is prominently used for the analysis of two-dimensional imagery [34]. Deep learning has seen great success in medical image analysis, where researchers have focused primarily on using deep learning to create systems which can accurately detect whether an image contains an apparent health-related issue. MRNet is one such system [19]. It is a convolutional neural network used to detect knee-related abnormalities.

MRNet had excellent accuracy in classifying the three knee abnormalities which were presented as inputs to the network. This network is an example of how deep learning can be used to produce systems which can accurately identify health issues from medical images. MRNet is an image classification system which is primarily used to classify whether a particular MRI image contains one of the three abnormalities. The network computes this via an output probability and uses logistic regression to determine which class the probability may belong to.

MRNet was a particular architecture designed for image classification. Applying similar principles to a convolutional neural network, other network architectures, like U-Net and V-Net, were designed which could segment an ROI from an input two or three-dimensional image. In a convolutional neural network, there are multiple convolutional layers which gradually extract the features of the image through a variety of kernels and pooling layers. These networks became more actively developed when researchers explored the opportunity of using deep learning architectures for advanced image segmentation.

LinkNet proposes a deep neural network architecture similar to U-Net. The network allows for learning without any significant increase in the number of trainable/non-trainable parameters [35]. LinkNet’s success was based on its speed, which was due to its lightweight architecture. The architecture which LinkNet utilizes holds at least 11.5 million parameters. It is similar to U-Net, in that several encoder and decoder blocks slowly break down an image and rebuild the outcome through a series of final convolutional layers. LinkNet’s structure was purposely designed to minimize the total number of parameters the network contains. This allowed segmentation to be performed in real-time. This is why the network has produced a state-of-the-art performance on the Cambridge-driving Labeled Video Database (CamVid) [36].

3. Magnetic Resonance Image (MRI) data

MR images can come in several different file formats, each of which has its own characteristics regarding the type of data stored within the three-dimensional image file. Digital Imaging and Communications in Medicine (DICOM) and Neuroimaging Informatics Technology Initiative (Nifti) files are some of the most common medical imaging formats widely available for research purposes. These medical images are not two-dimensional, so they do not share the same characteristics as a typical two-dimensional image format such as Portable Network Graphics (PNG) or Joint Photographic Experts Group (JPEG). Medical images are three-dimensional and can be thought of as a single file which contains multiple slices of the brain over three perspective planes, where each individual slice is a collection of voxels, each of which defines a point in three-dimensional space.

For this analysis, only datasets which contain ‘ground truth’ were considered for this particular image segmentation task. The cancer imaging archive [37] is a massive repository containing thousands of images all related to cases where the cancer was found from a pre-operative MRI scan. These datasets are normally accompanied by a post-operative scan of the affected region to show that the abnormal growth was successfully removed during surgery. Without manually extracted ground truth images it is difficult to perform image segmentation on a dataset containing brain tumors. There are alternatives to this issue which can be used to generate artificial pathological ground truth from a simulated dataset such as BrainWeb².

One of the most common datasets widely available for brain tumor segmentation is the BRATS dataset for multimodal brain tumor segmentation. This dataset is widely used for biomedical image analysis and significant amounts of research have been performed using this dataset. As extensive research has been performed using this dataset, we decided to look elsewhere and use the BITE dataset [27]. This dataset contains pre-operative, post-operative and ground truth images for 14 patients acquired by the Montreal Neurological Institute in 2010.

Each pre-operative and post-operative scan is a T1-weighted MRI accompanied by B-mode (Ultrasound) images. The images of the BITE database are split into four groups based on the specific analysis performed by the neurosurgeons. Our analysis is centered on group number three, which includes 14 MR images taken before and after the patient’s surgery. The dataset is stored in Medical Imaging NetCDF (MINC) format, the standard format at the Montreal Neurological Institute for image processing. MINC data can contain signed and unsigned integer, float and complex data types accompanied by a prepackaged extensible binary format header containing all relevant information regarding the corresponding patient, tumor region and cancer type. Group three of the BITE dataset contains images related to 14 patients, all of whom were identified to have brain cancer, see Fig. 2. Unfortunately, no pre-operative scan was performed on patient number 13 of the dataset, meaning that no binary map of the ROI was manually extracted for this patient. Hence, patient 13 is excluded from the analysis in this paper.

To extract meaningful information from this dataset we primarily used the programming language Python³ coupled with neuroimaging libraries such as Nibabel⁴ and Nilearn⁵, which allowed for easy manipulation of the images. Nibabel was used to load the images as Nibabel objects, which could then be easily converted into numpy arrays to manipulate the image’s affine matrix. Nilearn, a popular Python neuroimaging library, comes pre-packaged with plots suitable for the visualization of MR images. Fig. 2 shows a basic anatomical plot of the three perspective planes of a specific MR image. This plot can be altered to visualize different slices of each image plane by manipulating the coordinates of the MR image. In this way, plots can be designed around the image’s shape, where functions take a slice number as a parameter and step through the numpy array, displaying each slice in turn.

The ground truth accompanying this dataset has several issues that can impact the segmentation performance of an algorithm. Most notably, the resolution of each patient’s binary mask does not match the resolution of the corresponding pre- or post-operative image. Why does this difference in resolution matter? U-Net takes a two-dimensional MR image and outputs a segmentation of the ROI supplied to the network. The ROI and the output of the network must have the same resolution as the input image. With the difference in resolution, we are unable to use these networks until the resolution of the tumor masks is resolved. The difference in image resolution is due to the images being stored in MINC format. This format enables analysts to store cropped regions without losing any coordinate information, making it possible to overlay the full pre-operative MR image with its corresponding small tumor mask without suffering any error in alignment. To overlay a tumor mask with a pre-operative MRI scan we can use one of Nilearn’s built-in plotting functions to create an ROI plot.

Fig. 3 represents an ROI plot, where the tumor mask has been overlaid onto the corresponding pre-operative scan. This figure provides us with crucial information regarding the size, shape and location of the patient’s tumor. Using Nilearn we can resample the mask to a target dimension, a useful feature which expands our original binary mask to match the image dimensions of each pre-operative scan. The re-sampled mask now matches its corresponding pre-operative scan in terms of height and width. However, this function also reshapes the slices of the MINC file. As patient 1 has 29 slices in the mask compared to 180 in the scan, the function adds the extra slices to the binary mask so that the two are equal. Thus, with this new re-sampled dataset we can begin to explore tumor segmentation using relevant algorithms. Most notably, U-Net works with two-dimensional data. To reformat the MINC data we can extract these 180 slices as two-dimensional images, where each slice represents one image. These slices are then grouped by the three perspective planes of the brain, i.e., coronal, sagittal, and transversal. Repeating this procedure on 13 patients creates a large dataset for our analysis.

2 McGill University, 1997. BrainWeb. [Link]/brainweb/.
3 van Rossum, G., 1990. Python. [Link].
4 Markiewicz, C., 2006 Nibabel. [Link]/nibabelNiPy.
5 2019. Nilearn. [Link] NiPy.
Fig. 2. Anatomical plot of patient 2’s pre-operative MRI scan showing all three perspective planes.
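The extraction of two-dimensional slices along the three perspective planes shown in Fig. 2 can be sketched as follows. This is a minimal illustration on a synthetic NumPy volume rather than a real MINC file. The helper name `extract_slices` and the axis-to-plane mapping (axis 0 sagittal, axis 1 coronal, axis 2 transversal) are assumptions that depend on how a volume is oriented after loading.

```python
import numpy as np

# Assumed axis order for an (x, y, z) volume; real scans may be oriented differently.
PLANE_AXES = {"sagittal": 0, "coronal": 1, "transversal": 2}

def extract_slices(volume, plane):
    """Return a list of 2-D arrays, one per slice along the chosen plane."""
    axis = PLANE_AXES[plane]
    # Bring the slicing axis to the front so that iterating yields 2-D slices.
    return [np.asarray(s) for s in np.moveaxis(volume, axis, 0)]

# Synthetic stand-in for a loaded MR volume (hypothetical shape).
volume = np.arange(4 * 5 * 6, dtype=np.float32).reshape(4, 5, 6)
slices = {plane: extract_slices(volume, plane) for plane in PLANE_AXES}
```

With Nibabel, the array would typically come from `nibabel.load(path).get_fdata()`, and each returned slice could then be rescaled and written out as an image file.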
Fig. 3. ROI overlay plot, showing patient 2’s binary mask overlaid onto their pre-operative scan showing the size, shape and location of their tumor.
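Why a mask must be brought to the same dimensions as its scan can be illustrated with a small NumPy sketch. It is not the Nilearn resampling applied to the dataset, which also takes the coordinate information of each image into account. It only shows nearest-neighbour resizing of a two-dimensional mask, which keeps the mask binary. The function name `resize_mask_nearest` is hypothetical.

```python
import numpy as np

def resize_mask_nearest(mask, target_shape):
    """Resize a 2-D binary mask to target_shape using nearest-neighbour sampling."""
    # For every output row/column, pick the index of the source row/column it maps to.
    rows = (np.arange(target_shape[0]) * mask.shape[0]) // target_shape[0]
    cols = (np.arange(target_shape[1]) * mask.shape[1]) // target_shape[1]
    return mask[np.ix_(rows, cols)]

small = np.array([[0, 1],
                  [1, 0]], dtype=np.uint8)
resized = resize_mask_nearest(small, (4, 4))
```

Because no interpolation between neighbouring values takes place, the resized mask contains only the labels of the original mask.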
4. Methodology

4.1. Method

U-Net is a fully convolutional network used for efficient semantic segmentation of images. Such a deep neural network suits a wide range of analytical tasks, particularly where the input data is in the form of images. This architecture has several applications ranging from consumer videos [25,38] and earth observations [39] to medical imaging [14,40,41]. The U-Net architecture is based on an autoencoder network where the network will copy its inputs to its outputs [12]. An autoencoder network functions by compressing the input image into a latent-space representation, which is simply a compressed representation of the images indicating which data points are closest together. The compressed data is later reconstructed to produce an output. An autoencoder network contains two paths, an encoder and a decoder. The encoder compresses the data into a latent-space representation while the decoder is used for the reconstruction of the input data from its latent-space representation. U-Net uses a convolutional autoencoder architecture where the convolutional layers are used to encode and decode the input images.

Similarly to an autoencoder network, U-Net contains two paths, a contraction path (encoder) and a symmetric expanding path (decoder). The encoder path of U-Net captures the context of the input image. This path is simply a pipeline of convolutional and pooling layers. The decoder path uses transposed convolutions, enabling precise localization. There is no fully connected feedforward layer (or dense layer) in U-Net. It only contains stacks of convolutional layers and max-pooling layers. Although U-Net was originally designed for 572 × 572 images, it can be easily modified to work with any image dimension [12]. Several stacked convolutional layers can enable the network to learn more precise features from the compressed input images, see Fig. 4 [42].

U-Net operates on the assumption that the input image and the corresponding binary map are equal in image dimensions. In other words, if the input image is of shape (512, 512), the corresponding binary mask must match this shape. The input image is compressed to fit into a latent-space representation. The decoder will reconstruct this compressed image back into its original shape of (512, 512). U-Net’s performance depends solely on the quality of its input image. U-Net’s segmentation performance can be evaluated by monitoring its warping, rand and pixel error. Originally, U-Net outperformed a sliding-window convolutional network and produced the best warping error in the EM segmentation challenge. Later iterations of U-Net expanded on U-Net’s performance in image segmentation [12].

A very recent addition was U-Net++, which modified the U-Net architecture to include a series of nested dense skip pathways, which are used to reduce the semantic gap between the feature maps of the encoder and decoder pathways of the network [20]. U-Net++ was also proposed to work with deep supervision. Basically, the loss is now calculated at interim levels alongside the traditional output layers. This has been noted to help with the problem of vanishing gradients during loss backpropagation [43].

As U-Net works with two-dimensional data, the acquired dataset was converted from three-dimensional volumes to a two-dimensional dataset using a ‘slice extractor’. The extractor extracts each slice of an MRI scan (MNC file) and saves the slice as a PNG file. Using the extractor we created four new datasets from the original 14 patients’ MR images. Three of these datasets correspond to the perspective planes of the brain (Coronal, Sagittal, Transversal) while the fourth dataset contains all the available images, which adds up to 846 images in total.

Our proposed implementation of U-Net was achieved using TensorFlow⁶. The network has 4 convolutional blocks. Each convolutional block of the network contains 2 convolutional layers with a kernel size of 3 × 3 and zero padding at each layer to control the shrinkage of the object dimension after applying filters. The filter size (number of filters) per convolutional block varies after each layer, incrementing in steps of 16. Each layer of a convolutional block is activated by a Rectified Linear Unit (ReLU), while in between these layers a batch normalization step is applied. At the encoder layer of the network, we apply a 2 × 2 max pooling layer after a function call to add a convolutional block to further reduce the spatial dimensions of the input image. While max pooling is also applied at the decoder layer, its application here is to up-sample the feature map using the memorized max-pooling indices [12].

6 Google Brain, 2015. Tensorflow. [Link]/.

4.2. Training & Optimization

During the training process, we decided to stick with a simple cost function for this network. We chose binary cross-entropy loss (say, $L_{BCE}$) for our cost function as the network is simply being trained to segment one particular region, the cancerous section of an input MR image. This loss function expects a sigmoid outcome ($\hat{y}_i$) as it is a binary predictor with target values ($y_i$) of 1 or 0. Given the output size of $N$, $L_{BCE}$ is defined as per Eq. (1). Prior consideration was taken to decide on a fixed image size for our inputs. As the image size varies across the three perspective planes, we decided on an image size of 128 × 128, as the images extracted from both the coronal and sagittal planes are substantially smaller than those extracted from the transversal plane. With this image size in mind, the network’s architecture was designed to be lightweight and swift so a prediction image could be reproduced. This image size was chosen based on the variation of dimensions of the original patient’s image data.

$$L_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \cdot \log\left(\hat{y}_i\right) + \left(1 - y_i\right) \cdot \log\left(1 - \hat{y}_i\right) \right] \tag{1}$$

Before the training of each network could start, metric callbacks were introduced to control the performance of the network. Early Stopping and Model Checkpointing were used to ensure the performance of the network did not degrade if extreme values were introduced into the network’s epoch range. To monitor the performance of the network, precision, recall and intersection-over-union (IoU) were logged using CSV Logger. This log helped us to analyze the stepwise results that were produced by the network during the training process. The model was trained with the Adam optimizer with a learning rate of 0.0001, which was chosen after rigorous experimentation.

The epoch range used for training varied based on the results recorded from prior experiments. We decided that running small experiments with only 10 epochs would allow us to quickly grasp how our network performs per perspective plane. After a small number of iterations were performed we gradually increased this parameter range to 50 epochs to monitor the network’s convergence rate. One particular parameter we also experimented with was the network’s filter size. By increasing the network’s number of filters we are essentially increasing the number of trainable parameters in the model. This particular experiment was undertaken to observe the performance increase when the total trainable parameters of the network are increased via the convolution block’s filter range. Several iterations were undertaken using a variety of filter values for the network’s convolutional block. The optimization was done by reflecting on the stepwise IoU values. During the experiments, filter values were incremented by 16 per convolutional block after each iteration.

5. Results

5.1. Subjective evaluation

Training the network began in small iterations where we monitored the segmentation performance based on iterations alone. Early experiments using 10 epochs produced poor results across two of the image planes, notably the sagittal and coronal planes. Poor segmentation results were expected based on the small epoch range used during training. However, adequate results were recorded on the transversal plane. This is likely due to the size of each dataset, as the transversal dataset had the most images because the patient’s tumor is most prominent from this particular perspective. To further improve these results we increased the network’s epoch range to 50 and monitored the results to see if the segmentation performance had improved. Using 50 epochs significantly improved the network’s segmentation performance across all three perspective planes. Fig. 5 shows the results across all three perspectives using only 50 epochs and the proposed network architecture from Section 4.1. Here, model 1 (or first model) is the standard U-Net architecture, whereas model 2 (or the final model) is the proposed U-Net with optimized filter values.

To further improve the network’s segmentation performance we began undertaking small experiments using the entire dataset of images. This dataset contains all images extracted from all three perspective planes. The purpose of this study is to experiment on U-Net’s ability to extract features from images from different perspectives. Running the network using this new dataset for only 50 epochs produced very promising results. Increasing the number of epochs to an extreme size does increase the network’s overall segmentation accuracy, but only by a very slight amount.

Overall, we observed that with a very small number of trainable parameters and a very small epoch size, our network can provide accurate segmentation of the cancerous region of a T1-weighted MR image. Using a small filter range while training on the entire dataset, the network produced a final stepwise IoU of 43%. To improve the network’s segmentation performance we performed several experiments on the convolutional block’s filter range. As previously mentioned, we incremented via steps of 16 and monitored the stepwise IoU after each epoch until each training iteration was complete. The results from this experiment, which can be observed in Fig. 6, drastically improved the network’s IoU and also allowed for a faster convergence rate of our model’s loss.

Fig. 6. Direct comparison of the stepwise IoU and stepwise loss recorded by the first and final model during the training phase.

By increasing the filter values three times and by training the network with 50 epochs using the entire dataset, we recorded a direct increase in the network’s segmentation performance and stepwise IoU. The network’s stepwise IoU had drastically increased to 71%, which is significantly higher than the previous results recorded using the standard network. By observing Fig. 7 we can see the effects of an increased filter range on our proposed network. We observed that an increased filter value increased the network’s segmentation performance by providing a more accurate structure to the network’s segmentation results. Thus, using a larger filter range allows our implementation of U-Net to extract more meaningful features from the input images on any of the four proposed datasets.
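The binary cross-entropy of Eq. (1), written with the conventional leading minus sign so that the loss is non-negative, and the IoU monitored during training can be worked through on a small example. This is a plain-Python sketch for illustration, not the TensorFlow code used in the experiments. The function names, the clipping constant `eps` and the 0.5 decision threshold are assumptions.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over N pixels; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

def binary_iou(y_true, y_pred, threshold=0.5):
    """Intersection-over-union of the foreground class after thresholding the predictions."""
    pred = [1 if p >= threshold else 0 for p in y_pred]
    inter = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    union = sum(1 for y, p in zip(y_true, pred) if y == 1 or p == 1)
    return inter / union if union else 1.0

y_true = [1, 0, 1, 0]            # target mask values
y_pred = [0.9, 0.2, 0.4, 0.6]    # sigmoid outputs
loss = binary_cross_entropy(y_true, y_pred)   # about 0.540
iou = binary_iou(y_true, y_pred)              # 1/3: one true positive, one miss, one false alarm
```

The example shows that the two quantities respond differently: the loss is driven by how confident each pixel prediction is, while the IoU only changes when a pixel crosses the decision threshold.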
Fig. 7. Comparison of results from U-Net by increasing the number of trainable parameters the network contains.
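How the number of trainable parameters grows with the filter values can be estimated with a short calculation. The sketch below counts only the weights and biases of the 3 × 3 convolutions in four consecutive blocks of two layers each and ignores batch normalization and the decoder. The single-channel input, the starting value of 16 filters and the two schedules compared are assumptions for illustration, not the configurations reported in the paper.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one 2-D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels

def encoder_params(filters, in_channels=1):
    """Total parameters of consecutive blocks, each holding two 3x3 convolutions."""
    total = 0
    for f in filters:
        total += conv_params(in_channels, f) + conv_params(f, f)
        in_channels = f
    return total

small = encoder_params([16, 32, 48, 64])     # filters incremented in steps of 16
large = encoder_params([32, 64, 96, 128])    # every block doubled, for comparison
```

Because each layer's weight count scales with the product of its input and output channels, doubling the filters in every block roughly quadruples the parameter count in this sketch.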
Fig. 8. The ground truth images used to evaluate our chosen benchmarking algorithms.
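For reference, the thresholding baseline can be illustrated with Otsu's method, which picks the gray level that maximizes the between-class variance of background and foreground. The plain-Python sketch below operates on a flat list of 8-bit intensities. It is an illustrative implementation with hypothetical names, not the benchmarking code used in this study.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (classes: <= t and > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

pixels = [10, 12, 11, 13, 200, 205, 210, 198]   # dark background, bright region
t = otsu_threshold(pixels)
mask = [1 if p > t else 0 for p in pixels]
```

On this toy input the threshold falls at the top of the dark group, so the four bright values form the foreground of the binary mask.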
Fig. 9. Comparison of the results obtained during the benchmarking process to those predicted via U-Net.
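The four measures used in the benchmarking comparison (pixel accuracy, mean accuracy, mean IoU and frequency-weighted IoU) can all be derived from a confusion matrix. The sketch below follows the definitions that are common in the semantic segmentation literature. Whether the evaluation code of this study matches them in every detail is an assumption, and the function name is hypothetical.

```python
def segmentation_metrics(y_true, y_pred, num_classes=2):
    """Compute pixel accuracy, mean accuracy, mean IoU and FWIoU from flat label lists."""
    # conf[i][j] counts pixels of true class i that were predicted as class j.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1
    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(num_classes))
    accs, ious, fw_iou = [], [], 0.0
    for c in range(num_classes):
        true_c = sum(conf[c])
        pred_c = sum(conf[r][c] for r in range(num_classes))
        union = true_c + pred_c - conf[c][c]
        if true_c:
            accs.append(conf[c][c] / true_c)
        if union:
            iou = conf[c][c] / union
            ious.append(iou)
            fw_iou += (true_c / total) * iou
    return {
        "pixel_acc": correct / total,
        "mean_acc": sum(accs) / len(accs),
        "mean_iou": sum(ious) / len(ious),
        "fw_iou": fw_iou,
    }

# Toy example: six background pixels (0) and two tumor pixels (1).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]
metrics = segmentation_metrics(y_true, y_pred)
```

In this toy case the pixel accuracy (0.75) and the frequency-weighted IoU (about 0.62) are dominated by the background class, while the mean IoU (about 0.52) is pulled down by the small foreground class. This is one reason why class-averaged measures are more informative when the tumor occupies only a small part of each image.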
fluctuates as the perspective of the image changes, for instance in Fig. 9 5.4. Discussion
on the coronal and sagittal plane we observed mixed results as the
network successfully captures the location of the tumor but cannot The purpose of these experiments was to explore how a lightweight
directly generate its size or structure. Whereas on the transversal plane, network could perform real-time automatic image segmentation on
comparing the generated prediction with its corresponding ground- MRI brain scans. Image segmentation on medical images can provide
truth from Fig. 8, we observe an increase in segmentation performance accurate properties such as the size, shape and location of an appar-
as the prediction is very accurate. This performance increase is likely ent mass discovered by medical physicians. As previously discussed
due to the size of the dataset as the transversal plane contains more in earlier sections of this paper, our implementation of U-Net is a
images than both the coronal and sagittal planes combined. lightweight version trained specifically to segment brain tumors from
MRI brains scans. Performing several training iterations allowed us to
LinkNet was trained for 50 epochs to match the training phase of
gradually fine-tune the network so we could monitor and improve U-
the proposed implementation of U-Net. The proposed method yielded
Net’s performance on each perspective plane and the entire dataset
a mean IoU which was 4% higher than that recorded with LinkNet
as a whole. By undertaking such a cautious approach to training it
when trained on the entire dataset. Improvements were also recorded
allowed us to quickly observe the convergence and overall performance
on all the three perspective planes. We can visualize this difference
our network produced, thus there was an unessential need for data
using our evaluation metrics by comparing the predictions generated by
augmentation.
the two networks. For instance, in Fig. 10 we observe the segmentation The use of systematic benchmarking allowed us to compare our
of U-Net when compared to the prediction generated by LinkNet. The model’s performance to four other widely used methods in the do-
predictions are both very accurate, however, there is a 10% mean accu- main of image processing. These methods although not entirely suited
racy difference between these two models. This difference in accuracy to medical image segmentation were previously used in applications
shows how LinkNet produces a correct prediction but cannot define the where either a single or multi-class segmentation occurred. By includ-
structure of its prediction on the small epoch range which it was trained ing both manual and automatic segmentation techniques we could
on. Our proposed model outperformed each of the four widely used directly compare the performance obtained using our proposed method
algorithms on all four evaluation metrics. This means that the model to that obtained by one of the four widely used benchmarking methods.
we have proposed is a lightweight network which is also very accurate These experiments revealed that our proposed method outperformed
at segmenting brain anomalies from MRI two-dimensional scans. each of the four methods used for benchmarking on each of the
8
J. Walsh, A. Othmani, M. Jain et al. Healthcare Analytics 2 (2022) 100098
Fig. 10. Comparison of the performance difference obtained via LinkNet and our proposed model trained on 50 epochs.
Table 3
Comparison of the results obtained during the benchmarking process.

Method          Pixel Acc (%)   Mean Acc (%)   Mean IoU (%)   FWIoU (%)
Coronal
  Thresholding  91              70             47             90
  K-Means       79              72             41             78
  Fuzzy C       84              73             44             83
  LinkNet       99              78             76             99
  U-Net         99              88             84             99
Sagittal
  Thresholding  93              58             48             92
  K-Means       82              60             42             81
  Fuzzy C       86              56             44             85
  LinkNet       99              60             59             98
  U-Net         99              76             75             99
Transversal
  Thresholding  97              59             51             96
  K-Means       91              67             48             90
  Fuzzy C       91              70             48             91
  LinkNet       99              83             81             99
  U-Net         99              85             84             99
Full
  Thresholding  95              61             49             93
  K-Means       86              65             45             85
  Fuzzy C       88              65             45             87
  LinkNet       99              87             84             99
  U-Net         99              91             89             99

perspective planes and the entirety of the dataset, indicating that the network we propose is highly accurate at segmenting brain tumors from two-dimensional MRI brain scans.

6. Conclusion

In this study, we have shown how U-Net, an existing deep learning architecture for biomedical image segmentation, can be altered and fine-tuned for brain tumor segmentation. By using the BITE dataset and converting its three-dimensional MRI brain scans to two-dimensional images, we have been able to evaluate the performance of a very lightweight implementation of U-Net that can accurately segment anomalies from two-dimensional images. The network outperforms all of the standard benchmarking algorithms against which it was evaluated, yielding an average mean IoU of 84% when trained on the entire dataset. Interestingly, the mean IoU does not stagnate when the network is trained on only one of the three perspective planes, a study undertaken to observe U-Net's approach to segmenting anomalies on small datasets containing fewer than one hundred images.

Our implementation of U-Net is lightweight and can perform accurate segmentations without the need for aggressive data augmentation. The proposed network could be used in a medical setting to give trained physicians a second evaluation of a patient's MR image.

Research on brain tumor segmentation has advanced rapidly with the application of deep learning. However, more studies are needed to further improve the performance of any proposed network, as the ratio between false negatives and false positives in the predicted images is crucial in biomedical image analysis. We intend to benchmark our proposed lightweight U-Net against the original U-Net structure and other statistical [46] and deep learning [18] networks. Future work could improve on this performance by investigating the use of data augmentation, via an augmentor pipeline, to artificially increase the size of the dataset. Nevertheless, this study has shown how deep learning and computer vision can be applied in a medical domain to accurately segment brain tumors from two-dimensional MR brain images using a lightweight variant of a well-known architecture.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was conducted with the financial support of Science Foundation Ireland under Grant Agreement No. 13/RC/2106_P2 at the ADAPT SFI Research Centre at University College Dublin. ADAPT, the SFI Research Centre for AI-Driven Digital Content Technology, is funded by Science Foundation Ireland through the SFI Research Centres Programme.

References

[1] A. Patel, Benign vs malignant tumors, JAMA Oncol. 6 (9) (2020) 1488.
[2] A. Işın, C. Direkoğlu, M. Şah, Review of MRI-based brain tumor image segmentation using deep learning methods, Procedia Comput. Sci. 102 (2016) 317–324.
[3] A. Md. Sattar, M. Kr. Ranjan, Automatic cancer detection using probabilistic convergence theory, in: Computational Intelligence in Oncology: Applications in Diagnosis, Prognosis and Therapeutics of Cancers, Springer, 2022, pp. 111–122.
[4] M.S. Pathan, A. Nag, M.M. Pathan, S. Dev, Analyzing the impact of feature selection on the accuracy of heart disease prediction, Healthc. Anal. 2 (2022) 100060.
[5] C.S. Nwosu, S. Dev, P. Bhardwaj, B. Veeravalli, D. John, Predicting stroke from electronic health records, in: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2019, pp. 5704–5707.
[6] S. Dev, H. Wang, C.S. Nwosu, N. Jain, B. Veeravalli, D. John, A predictive analytics approach for stroke prediction using machine learning and neural networks, Healthc. Anal. 2 (2022) 100032.
[7] M.S. Pathan, Z. Jianbiao, D. John, A. Nag, S. Dev, Identifying stroke indicators using rough sets, IEEE Access 8 (2020) 210318–210327.
[8] G. Sivapalan, K. Nundy, S. Dev, B. Cardiff, D. John, ANNet: a lightweight neural network for ECG anomaly detection in IoT edge sensors, IEEE Transactions on Biomedical Circuits and Systems 16 (1) (2022) 24–35.
[9] M.I. Modalities, The three perspective planes of the brain, 2013, URL https://[Link]/37Ci6R7.
[10] J. Hu, X. Gu, X. Gu, Mutual ensemble learning for brain tumor segmentation, Neurocomputing 504 (2022) 68–81.
[11] X. Liu, L. Song, S. Liu, Y. Zhang, A review of deep-learning-based medical image segmentation methods, Sustainability 13 (3) (2021) 1224.
[12] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234–241.
[13] Y. Li, J. Liu, L. Wang, Lightweight network research based on deep learning: A review, in: 2018 37th Chinese Control Conference (CCC), IEEE, 2018, pp. 9021–9026.
[14] O. Ali, H. Ali, S.A.A. Shah, A. Shahzad, Implementation of a modified U-net for medical image segmentation on edge devices, IEEE Trans. Circuits Syst. II: Exp. Briefs (2022).
[15] S. Dev, A. Nautiyal, Y.H. Lee, S. Winkler, CloudSegNet: A deep network for nychthemeron cloud image segmentation, IEEE Geosci. Remote Sens. Lett. 16 (12) (2019) 1814–1818.
[16] S. Dev, Y.H. Lee, S. Winkler, Color-based segmentation of sky/cloud images from ground-based cameras, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 10 (1) (2016) 231–242.
[17] M. Jain, C. Meegan, S. Dev, Using GANs to augment data for cloud image segmentation task, in: 2021 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), IEEE, 2021, pp. 3452–3455.
[18] S. Minaee, Y.Y. Boykov, F. Porikli, A.J. Plaza, N. Kehtarnavaz, D. Terzopoulos, Image segmentation using deep learning: A survey, IEEE Trans. Pattern Anal. Mach. Intell. (2021).
[19] N. Bien, P. Rajpurkar, R.L. Ball, J. Irvin, A. Park, E. Jones, M. Bereket, B.N. Patel, K.W. Yeom, K. Shpanskaya, et al., Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet, PLoS Med. 15 (11) (2018) e1002699.
[20] Z. Zhou, M.M.R. Siddiquee, N. Tajbakhsh, J. Liang, Unet++: A nested u-net architecture for medical image segmentation, in: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer, 2018, pp. 3–11.
[21] D.L. Pham, C. Xu, J.L. Prince, Current methods in medical image segmentation, Annu. Rev. Biomed. Eng. 2 (1) (2000) 315–337.
[22] N. Otsu, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybern. 9 (1) (1979) 62–66.
[23] A. Mustaqeem, A. Javed, T. Fatima, An efficient brain tumor detection algorithm using watershed & thresholding based segmentation, Int. J. Image Graph. Signal Process. 4 (10) (2012) 34.
[24] Z.Y. Ho, M. Jain, S. Dev, Multivariate convolutional LSTMs for relative humidity forecasting, in: Proc. Photonics & Electromagnetics Research Symposium (PIERS), IEEE, 2021, pp. 2317–2323.
[25] S. Dev, H. Javidnia, M. Hossari, M. Nicholson, K. McCabe, A. Nautiyal, C. Conran, J. Tang, W. Xu, F. Pitié, Identifying candidate spaces for advert implantation, in: Proc. IEEE 7th International Conference on Computer Science and Network Technology (ICCSNT), IEEE, 2019, pp. 503–507.
[26] S. Sharma, M. Rattan, An improved segmentation and classifier approach based on HMM for brain cancer detection, Open Biomed. Eng. J. 13 (1) (2019).
[27] L. Mercier, R.F. Del Maestro, K. Petrecca, D. Araujo, C. Haegelen, D.L. Collins, Online database of clinical MR and ultrasound images of brain tumors, Med. Phys. 39 (6-Part1) (2012) 3253–3261.
[28] M. Jain, M. Jain, T. AlSkaif, S. Dev, Which internal validation indices to use while clustering electric load demand profiles?, Sustainable Energy, Grids and Networks 32 (2022) 100849.
[29] M. Jain, T. AlSkaif, S. Dev, A clustering framework for residential electric demand profiles, in: Proc. International Conference on Smart Energy Systems and Technologies (SEST), IEEE, 2020, pp. 1–6.
[30] M. Jain, T. AlSkaif, S. Dev, Validating clustering frameworks for electric load demand profiles, IEEE Trans. Ind. Inform. 17 (12) (2021) 8057–8065, http://[Link]/10.1109/TII.2021.3061470.
[31] S. Dev, Y.H. Lee, S. Winkler, Systematic study of color spaces and components for the segmentation of sky/cloud images, in: Proc. IEEE International Conference on Image Processing (ICIP), IEEE, 2014, pp. 5102–5106.
[32] R.S. Kabade, M. Gaikwad, Segmentation of brain tumour and its area calculation in brain MR images using K-mean clustering and fuzzy C-mean algorithm, International Journal of Computer Science & Engineering Technology 4 (05) (2013) 524–531.
[33] E. Abdel-Maksoud, M. Elmogy, R. Al-Awadi, Brain tumor segmentation based on a hybrid clustering technique, Egypt. Inf. J. 16 (1) (2015) 71–81.
[34] H. Wang, Y. Li, S. Xi, S. Wang, M.S. Pathan, S. Dev, AMDCNet: An attentional multi-directional convolutional network for stereo matching, Displays (2022) 102243.
[35] A. Chaurasia, E. Culurciello, Linknet: Exploiting encoder representations for efficient semantic segmentation, in: 2017 IEEE Visual Communications and Image Processing (VCIP), IEEE, 2017, pp. 1–4.
[36] G.J. Brostow, J. Fauqueur, R. Cipolla, Semantic object classes in video: A high-definition ground truth database, Pattern Recognit. Lett. xx (x) (2008) xx.
[37] K. Clark, B. Vendt, K. Smith, J. Freymann, J. Kirby, P. Koppel, S. Moore, S. Phillips, D. Maffitt, M. Pringle, et al., The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository, J. Digit. Imaging 26 (6) (2013) 1045–1057.
[38] S. Dev, M. Hossari, M. Nicholson, K. McCabe, A. Nautiyal, C. Conran, J. Tang, W. Xu, F. Pitié, Localizing adverts in outdoor scenes, in: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), IEEE, 2019, pp. 591–594.
[39] S. Dev, S. Manandhar, Y.H. Lee, S. Winkler, Multi-label cloud segmentation using a deep network, in: 2019 USNC-URSI Radio Science Meeting (Joint with AP-S Symposium), IEEE, 2019, pp. 113–114.
[40] X.-X. Yin, L. Sun, Y. Fu, R. Lu, Y. Zhang, U-net-based medical image segmentation, J. Healthc. Eng. 2022 (2022).
[41] Y. Deng, Y. Hou, J. Yan, D. Zeng, ELU-net: An efficient and lightweight U-net for medical image segmentation, IEEE Access 10 (2022) 35932–35941.
[42] O. Ronneberger, P. Fischer, T. Brox, U-net architecture, 2015, URL [Link]org/abs/1505.04597.
[43] C.-Y. Lee, S. Xie, P. Gallagher, Z. Zhang, Z. Tu, Deeply-supervised nets, in: Artificial Intelligence and Statistics, PMLR, 2015, pp. 562–570.
[44] S. Batra, H. Wang, A. Nag, P. Brodeur, M. Checkley, A. Klinkert, S. Dev, DMCNet: Diversified model combination network for understanding engagement from video screengrabs, Syst. Soft Comput. 4 (2022) 200039.
[45] X. Liu, Z. Deng, Y. Yang, Recent progress in semantic image segmentation, Artif. Intell. Rev. 52 (2) (2019) 1089–1106.
[46] S. Dev, F.M. Savoy, Y.H. Lee, S. Winkler, High-dynamic-range imaging for cloud segmentation, Atmos. Meas. Tech. 11 (4) (2018) 2041–2049.