CHAPTER 14

Deep learning in medical image analysis

Tarun Jaiswal¹ and Sujata Dash²

¹Department of Computer Applications, NIT, Raipur, Chhattisgarh, India; ²Department of Information Technology, School of Engineering and Technology, Nagaland University, Meriema Campus, Meriema, Nagaland, India
1. Introduction
Medical imaging plays a crucial role in modern health care by offering clinicians important
information about the body’s structure, function, and pathology. It is essential for diagnosing
diseases and guiding surgical interventions, and it underpins healthcare decision-making
and patient management. Yet, the analysis of medical images frequently
demands specialized knowledge and can be influenced by differences in interpretation,
posing obstacles in terms of precision, speed, and uniformity. Deep learning, a subfield of
artificial intelligence (AI) that draws inspiration from the structure and operation of the
human brain, has transformed the field of medical image interpretation [1]. Convolutional neural
networks (CNNs) are proficient in automatically extracting features and interpreting images,
allowing for tasks like image classification, segmentation, and detection to be performed
without the use of manually designed features or predetermined rules [2,3]. This ability
has greatly improved the effectiveness and precision of analyzing medical images, providing
physicians with sophisticated tools for diagnosing, planning treatment, and monitoring dis-
eases [4]. Recently, there has been a significant increase in research and development of deep
learning methods for analyzing medical images. This growth has been driven by access to
extensive medical imaging datasets, improvements in computing hardware, and innovations
in deep learning algorithms [5]. Much research has shown that deep learning models are
effective in various medical imaging techniques such as MRI, CT, X-ray, ultrasound, and
histopathological images. For instance, CNNs have demonstrated exceptional performance
in tasks like tumor detection and segmentation in MRI and CT scans [6]. Recurrent neural
networks (RNNs) have been used for analyzing sequential data, such as interpreting
electrocardiogram (ECG) signals and generating medical reports [7]. Furthermore, generative
Mining Biomedical Text, Images and Visual Features for Information Retrieval © 2025 Elsevier Inc. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
adversarial networks (GANs) have been utilized to create artificial medical images for the
purpose of data augmentation and anomaly detection. Although significant advancements
have been made, obstacles persist in the extensive implementation and utilization of deep
learning models in clinical settings. These obstacles encompass data scarcity, model
interpretability, generalization to varied patient populations, and ethical concerns such as patient
privacy and algorithmic bias [8]. To tackle these issues, physicians, academicians, and
technologists must work together to create strong, dependable, and ethically sound deep
learning solutions for analyzing medical images. This study presents an overview of deep
learning methods used in medical image processing, such as CNNs, RNNs, and GANs.
We examine how deep learning is used in different medical imaging techniques and address
the obstacles, potential developments, and ethical issues related to incorporating deep
learning in health care.
2. Related work
Deep learning in medical image processing has experienced significant expansion in the
past few years, resulting in a surge of research investigations and applications in several med-
ical imaging modalities. In this section, we review pertinent research and highlight significant
advancements in the field of deep learning-based medical image processing. Cui et al. [9]
carried out an extensive survey on deep learning in medical image analysis, offering valuable
insights into the development of deep learning architectures and their utilization in different
clinical tasks. The survey encompassed various medical imaging modalities like X-ray, MRI,
CT, ultrasound, and histopathology, showcasing the effectiveness of deep learning
techniques in tasks like image segmentation, classification, registration, and reconstruction.
Winkler et al. [10] illustrated the capabilities of deep neural networks in achieving
dermatologist-level identification of skin cancer, with performance on par with expert
dermatologists. Their research highlighted the capacity of deep learning models to examine
dermatoscopic images and effectively distinguish between malignant and benign skin
lesions, opening up possibilities for automated skin cancer diagnosis and screening. In the
domain of neurological imaging, Alirr et al. [11] introduced a 3D convolutional neural
network designed for automated segmentation of brain tumors in MRI scans. Their deep
learning method excelled at tumor segmentation, surpassing conventional techniques and
remaining consistent across various tumor types and imaging protocols.
Furthermore, recent progress in deep learning has expanded to encompass not only
traditional medical imaging methods but also new approaches like digital pathology and
molecular imaging. For example, Kundrotas et al. [12] created advanced deep learning
models to automatically analyze histopathological images. This allowed for precise
identification and categorization of cancerous tissues in biopsy samples.
In addition, Ahmad et al. [13] performed a thorough analysis of deep learning in the
context of population health management. They emphasized the capability of deep learning
methods to process extensive medical image datasets and derive valuable insights for disease
prevention, early detection, and personalized treatment approaches.
II. Computational intelligence for medical image processing
Moreover, deep learning techniques have been utilized to deal with particular issues in
medical imaging, including image denoising, super-resolution, and image registration. For
instance, Peng et al. [14] introduced a deep learning method to improve the quality of
low-dose CT images by reducing noise and radiation exposure. In a study by Huang et al.
[15] a sophisticated deep learning model was created to enhance super-resolution micro-
scopy, allowing for detailed imaging of cellular structures that were previously impossible
to observe. The studies mentioned show significant advancements in deep learning for
analyzing medical images. However, challenges remain such as the requirement for exten-
sively annotated datasets, model interpretability, and the ability to generalize across various
patient groups and imaging conditions. Ongoing research efforts in the field of deep learning
in medical image processing continue to address these problems.
3. Deep learning architectures for medical image analysis
Medical image analysis has been transformed by deep learning architectures, which use
neural networks to automatically learn hierarchical features from raw image data. This
section explores different deep learning architectures used in medical imaging tasks,
emphasizing their merits and practical uses.
3.1 Convolutional neural networks
CNNs, renowned for their capacity to capture spatial hierarchies of features through
convolution operations, are the cornerstone of deep learning in medical image analysis [16].
A CNN is composed of convolutional, pooling, and fully connected layers. The convolutional
layers extract local features such as edges, textures, and shapes by sliding filters across the
input image. The convolution operation can be stated mathematically as follows:
$$f(x; w) = \sum_{i=1}^{n} x_i \, w_i + b \tag{14.1}$$

where $f(x; w)$ represents the output feature map, $x_i$ denotes the input pixel values, $w_i$ denotes the filter weights, and $b$ denotes the bias term.
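Eq. (14.1) can be made concrete with a minimal NumPy sketch; the toy image, filter values, and the valid-padding, stride-1 layout below are illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def conv2d(image, kernel, bias=0.0):
    """Valid cross-correlation of a 2D image with a 2D filter plus a bias,
    mirroring Eq. (14.1): each output pixel is sum_i x_i * w_i + b."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Elementwise product of the local patch and the filter, then sum
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel) + bias
    return out

# Toy example: a vertical-edge-style filter on a 5x5 gradient image
img = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1.0, 0.0, -1.0]] * 3)
fmap = conv2d(img, kernel)  # 3x3 feature map
```

In practice, frameworks apply many such filters per layer and learn the weights $w_i$ and bias $b$ from data rather than fixing them by hand.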
3.2 Recurrent neural networks (RNNs)
RNNs are architectures specialized for analyzing sequential data, capturing temporal
relationships in time-series data or medical image sequences. Unlike feedforward networks,
RNNs maintain hidden states that summarize past inputs. This makes them well suited to
tasks such as ECG analysis, where understanding temporal dynamics is essential for making
a correct diagnosis [7]. Mathematically, the hidden state $h_t$ of an RNN at time step $t$ is computed as:

$$h_t = \sigma(W_{ih} x_t + W_{hh} h_{t-1} + b_h) \tag{14.2}$$

where $x_t$ represents the input at time step $t$; $W_{ih}$ and $W_{hh}$ denote the input-to-hidden and hidden-to-hidden weight matrices, respectively; $b_h$ is the bias vector; and $\sigma$ denotes the activation function.
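A single recurrent update of Eq. (14.2) can be sketched in a few lines of NumPy; the tanh activation, the layer sizes, and the random toy sequence are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_ih, W_hh, b_h):
    """One recurrent update, Eq. (14.2): h_t = tanh(W_ih x_t + W_hh h_{t-1} + b_h)."""
    return np.tanh(W_ih @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
W_ih = 0.1 * rng.standard_normal((4, 3))  # input-to-hidden weights
W_hh = 0.1 * rng.standard_normal((4, 4))  # hidden-to-hidden weights
b_h = np.zeros(4)

h = np.zeros(4)  # initial hidden state
for x_t in rng.standard_normal((5, 3)):  # a toy 5-step, 3-feature sequence
    h = rnn_step(x_t, h, W_ih, W_hh, b_h)
```

The same hidden state `h` is threaded through every step, which is what lets the network carry information forward in time.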
3.3 Attention mechanisms
Attention mechanisms have become central to deep learning because they allow models to
focus on the most relevant parts of the input while suppressing less informative regions.
In medical image analysis, attention-based architectures help identify salient features and
improve model interpretability, supporting more accurate diagnosis and clinical decision-making.
This section details how attention works in deep learning, its applications in medical
image analysis [17], its mathematical foundations, and its contributions to the field.
Attention mechanisms enable models to assess the significance of various components in the
input sequence. Mathematically, the attention weight $a_{ij}$ for the $i$th element attending
to the $j$th element can be computed as:

$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})} \tag{14.3}$$

where $e_{ij}$ represents the attention score between the $i$th and $j$th elements, computed using a compatibility function such as the dot product or scaled dot product.
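Eq. (14.3) is the softmax step of attention. A minimal NumPy sketch of scaled dot-product attention, with toy query/key/value matrices as assumptions, might look like this:

```python
import numpy as np

def scaled_dot_attention(Q, K, V):
    """Scores e_ij = (q_i . k_j) / sqrt(d); weights a_ij are the softmax of
    Eq. (14.3) over j; the output is the attention-weighted sum of values."""
    d = Q.shape[-1]
    e = Q @ K.T / np.sqrt(d)                 # attention scores e_ij
    e = e - e.max(axis=-1, keepdims=True)    # subtract row max for stability
    a = np.exp(e) / np.exp(e).sum(axis=-1, keepdims=True)  # Eq. (14.3)
    return a @ V, a

rng = np.random.default_rng(1)
Q = rng.standard_normal((2, 8))  # 2 query elements
K = rng.standard_normal((4, 8))  # 4 key elements
V = rng.standard_normal((4, 8))  # 4 value elements
out, weights = scaled_dot_attention(Q, K, V)
```

Each row of `weights` is nonnegative and sums to one, so every query distributes a unit of attention across the input elements.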
Various architectural approaches have been utilized in a range of medical imaging appli-
cations, such as identifying lesions, segmenting organs, and classifying diseases. For instance,
in their study, Chen and colleagues (2020) introduced an attention mechanism to detect
lesions in mammograms; by pinpointing suspicious areas for closer examination, the approach
achieved state-of-the-art performance. Furthermore, attention mechanisms have been applied
to organ segmentation in MRI and CT scans, allowing accurate delineation of anatomical
structures for treatment planning and surgical guidance. As an example, Kuang et al. [18]
introduced an attention-based network for liver segmentation in CT images, showing better
performance than conventional segmentation techniques.
3.4 Capsule networks
Capsule networks are a newer class of architectures designed to capture hierarchical
relationships between features within an image. They use vectorized feature representations
to encode spatial hierarchies more effectively than CNNs, which rely on scalar activations [19].
Capsule networks have shown promise in tasks including object recognition, pose estimation,
and image reconstruction, offering improved generalization and resilience to spatial
transformations.
3.5 Hybrid architectures
Combining various deep learning components like CNNs, RNNs, and attention mecha-
nisms in hybrid architectures helps to utilize the unique strengths of each component and
tackle specific challenges in medical image analysis. These architectures combine spatial and
temporal information to enhance the analysis of medical images with increased accuracy
and reliability [20]. Within this section, we explore the mathematical foundations and practical
uses of hybrid architectures in medical image analysis, showcasing their impact on the field.
3.5.1 Convolutional-recurrent networks
Convolutional-recurrent networks (CRNs) integrate CNNs for spatial feature extraction
with RNNs for sequential data analysis, enabling models to grasp spatial patterns and tem-
poral dynamics in medical images. From a mathematical perspective, the result of a CRN can
be calculated as follows:
$$h_t = \mathrm{RNN}(\mathrm{CNN}(x_t), h_{t-1}) \tag{14.4}$$

where $x_t$ represents the input image at time step $t$, $\mathrm{CNN}$ denotes the convolutional neural network, $\mathrm{RNN}$ denotes the recurrent neural network, and $h_t$ denotes the hidden state at time step $t$.
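Eq. (14.4) can be sketched end to end in NumPy; the pooled three-number "CNN" descriptor and the tanh recurrence below are deliberately toy stand-ins for real networks.

```python
import numpy as np

def cnn_features(frame, kernel):
    """Toy spatial feature extractor: one valid cross-correlation followed by
    global pooling into a fixed 3-dimensional descriptor (mean, max, min)."""
    kh, kw = kernel.shape
    responses = []
    for r in range(frame.shape[0] - kh + 1):
        for c in range(frame.shape[1] - kw + 1):
            responses.append(np.sum(frame[r:r + kh, c:c + kw] * kernel))
    v = np.array(responses)
    return np.array([v.mean(), v.max(), v.min()])

def crn(frames, kernel, W_ih, W_hh, b_h):
    """Eq. (14.4): h_t = RNN(CNN(x_t), h_{t-1}), with a tanh recurrence."""
    h = np.zeros(W_hh.shape[0])
    for frame in frames:
        h = np.tanh(W_ih @ cnn_features(frame, kernel) + W_hh @ h + b_h)
    return h

rng = np.random.default_rng(2)
frames = rng.standard_normal((6, 8, 8))  # a toy 6-frame image sequence
kernel = rng.standard_normal((3, 3))
h_final = crn(frames, kernel,
              0.1 * rng.standard_normal((5, 3)),
              0.1 * rng.standard_normal((5, 5)),
              np.zeros(5))
```

The final hidden state summarizes both the spatial content of each frame and the order in which the frames arrived.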
In summary, deep learning architectures provide adaptable tools for a range of medical image
analysis tasks and modalities. Researchers and clinicians can improve diagnosis,
treatment planning, and patient care by utilizing CNNs, RNNs, attention mechanisms, capsule
networks, and hybrid architectures to push the boundaries of medical image processing.
3.6 Challenges and limitations of deep learning in medical image analysis
Despite the impressive advancements and positive outcomes seen in deep learning for med-
ical image analysis, there are still obstacles and restrictions that continue to impede its broad
acceptance and use in clinical environments. This section will delve into these issues and
put forward possible remedies, integrating mathematical concepts and equations as needed.
3.7 Data scarcity and quality
One of the main obstacles in deep learning-based medical image analysis is the limited
availability of annotated data, particularly for rare diseases or specific imaging modalities.
Having a small amount of training data may result in overfitting, causing the model to struggle
with new, unseen data. From a mathematical perspective, overfitting is commonly mitigated
by adding a regularization penalty to the training objective:

$$\mathrm{Loss}_{\mathrm{total}} = \mathrm{Loss}_{\mathrm{training}} + \lambda \cdot \mathrm{Penalty}(\theta) \tag{14.5}$$

where $\mathrm{Loss}_{\mathrm{total}}$ denotes the total loss function, $\mathrm{Loss}_{\mathrm{training}}$ denotes the loss on the training set, $\lambda$ is the regularization parameter, and $\mathrm{Penalty}(\theta)$ is the penalty term on the model parameters $\theta$.
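Eq. (14.5) with an L2 penalty can be computed directly; the mean-squared training loss and the toy numbers below are illustrative choices, not values from the chapter.

```python
import numpy as np

def total_loss(pred, target, theta, lam):
    """Eq. (14.5): training loss (here MSE) plus lam * L2 penalty on theta."""
    training = np.mean((pred - target) ** 2)   # Loss_training
    penalty = np.sum(theta ** 2)               # Penalty(theta): squared L2 norm
    return training + lam * penalty

theta = np.array([0.5, -1.2, 2.0])
loss = total_loss(np.array([1.0, 2.0]), np.array([1.5, 1.0]), theta, lam=0.01)
```

Increasing `lam` pulls the optimum toward smaller parameter values, trading a little training accuracy for better behavior on unseen data.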
3.8 Interpretability and explainability
Deep learning models are frequently opaque, which complicates interpreting their outputs
and understanding the rationale behind their predictions [21]. Insufficient interpretability
can erode confidence in a model’s results and impede its use in clinical settings.
Explainability approaches such as saliency maps aim to reveal the specific areas of an image
that most influence the model’s prediction.
3.8.1 Saliency maps
Saliency maps visually explain model predictions by emphasizing the most influential
areas of the input image [22]. Saliency maps can be calculated mathematically by gradient-
based techniques like guided backpropagation or gradient-weighted class activation
mapping (Grad-CAM).
$$\mathrm{Saliency}(x) = \frac{\partial\, \mathrm{Score}_c}{\partial x} \tag{14.6}$$

where $x$ represents the input image, $\mathrm{Score}_c$ is the model's output score for class $c$, and $\frac{\partial\, \mathrm{Score}_c}{\partial x}$ denotes the gradient of the output score with respect to the input image. Saliency maps highlight the image regions that have the greatest impact on the model's prediction for a particular class.
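Under the (illustrative) assumption of a tiny differentiable scoring function, the gradient in Eq. (14.6) can be approximated numerically with central finite differences:

```python
import numpy as np

def model_score(x, w):
    """Toy differentiable 'classifier': a w-weighted pixel sum squashed by tanh."""
    return np.tanh(np.sum(w * x))

def saliency(x, w, eps=1e-5):
    """Eq. (14.6) via central finite differences: d Score / d x_i per pixel."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += eps
        xm[idx] -= eps
        grad[idx] = (model_score(xp, w) - model_score(xm, w)) / (2 * eps)
    return grad

rng = np.random.default_rng(3)
x = rng.standard_normal((4, 4))   # a toy 4x4 "image"
w = rng.standard_normal((4, 4))
s = saliency(x, w)
```

For this toy model the analytic gradient is $w \, (1 - \tanh^2(\sum w x))$, so the numeric map can be checked exactly; real frameworks obtain the same quantity far more efficiently by backpropagation.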
3.8.2 Layer-wise relevance propagation
LRP decomposes the model’s prediction into contributions from individual neurons or
layers, offering a detailed understanding of how each component of the network influences
the final output [23]. Mathematically, LRP can be defined as the backward propagation
of relevance scores from the output layer to the input layer:

$$R_i = \sum_j \frac{z_{ij} \, R_j}{\sum_k z_{kj}} \tag{14.7}$$

where $R_i$ represents the relevance score of neuron $i$, $z_{ij}$ denotes the contribution of neuron $i$ to neuron $j$ (typically the weighted activation), and $R_j$ is the relevance score of neuron $j$. LRP assigns relevance scores to each neuron based on its contribution to the model's prediction.
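For a single linear layer, Eq. (14.7) reduces to redistributing output relevance in proportion to each input's contribution; the weights and relevance values below are toy assumptions.

```python
import numpy as np

def lrp_linear(x, W, R_out, eps=1e-9):
    """Eq. (14.7) for one linear layer: contributions z_ij = x_i * W_ij are
    normalized per output neuron j, then used to pull relevance R_j back to i."""
    z = x[:, None] * W                    # z_ij, shape (inputs, outputs)
    denom = z.sum(axis=0) + eps           # sum_k z_kj for each output neuron j
    return (z * (R_out / denom)[None, :]).sum(axis=1)

x = np.array([1.0, 2.0, 0.5])                        # input activations
W = np.array([[0.2, -0.1], [0.4, 0.3], [0.1, 0.2]])  # layer weights
R_out = np.array([1.0, 0.5])                         # relevance at the output
R_in = lrp_linear(x, W, R_out)
```

A useful sanity check is conservation: the relevance arriving at the inputs sums (up to the small stabilizer `eps`) to the relevance that left the output layer.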
4. Future directions and opportunities
Deep learning in medical image analysis is set to go further, offering many chances for
creativity and influence. This section explores potential future advancements and prospects
to enhance the capabilities of deep learning in medical imaging, based on insights from recent
research and technology trends.
4.1 Multimodal integration
Future research will concentrate on combining data from several imaging techniques,
including MRI, CT, PET, and histology, to offer thorough and supplementary perspectives
on disease pathology. Utilizing multimodal integration techniques such as fusion networks
and attention mechanisms might enhance the precision and effectiveness of diagnostic and
treatment planning [24].
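One simple multimodal strategy consistent with this idea is late fusion of per-modality feature vectors; the fixed convex weighting and the toy MRI/CT features below are hypothetical stand-ins for learned fusion networks.

```python
import numpy as np

def late_fusion(feature_vectors, weights):
    """Combine per-modality feature vectors with a convex (normalized)
    weighting before a shared downstream classifier head."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    return sum(wi * f for wi, f in zip(w, feature_vectors))

fused = late_fusion(
    [np.array([0.2, 0.8]),   # e.g., features from an MRI branch
     np.array([0.6, 0.4])],  # e.g., features from a CT branch
    weights=[1.0, 1.0],
)
```

In a trained system the weights would typically be learned, or replaced entirely by an attention-based fusion module.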
4.2 Federated learning and privacy-preserving AI
Federated learning methods provide joint model training among various institutions while
maintaining patient privacy and data security. Federated learning enables decentralized
healthcare systems to enhance models by combining information from many datasets
without disclosing sensitive data, thus utilizing the collective intelligence of distributed net-
works [25].
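The core aggregation step of federated learning can be sketched as federated averaging (FedAvg): each institution trains locally, and only parameter vectors, never raw images, reach the server. The client counts and weights below are toy assumptions.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Server-side federated averaging: combine locally trained parameter
    vectors, weighted by each institution's local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)               # (clients, params)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

# Three hospitals with different amounts of local data
w_global = fedavg(
    [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])],
    client_sizes=[100, 100, 200],
)
```

Because the hospital with twice the data contributes twice the weight, the global model is pulled toward the larger dataset without that dataset ever being shared.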
4.3 Continual learning and adaptive models
Continuous learning frameworks will allow deep learning models to adjust and develop
over time in reaction to shifting data distributions and clinical situations. Dynamic adaptive
models can improve the scalability, adaptability, and robustness of AI systems in real-world
clinical situations, as stated by Zheng et al. [26].
In summary, the future of deep learning in medical image analysis has great potential to
revolutionize health care, enhance patient results, and progress our comprehension of disease
pathology. By embracing interdisciplinary collaboration, technological innovation, and
ethical considerations, researchers and clinicians may leverage the power of deep learning
to solve urgent difficulties in medical imaging and pave the way for a more customized
and accurate method for health care.
5. Conclusion
Deep learning has become a crucial tool in medical image analysis, transforming health care
by supporting the interpretation of images and the diagnosis and treatment of a wide range
of diseases. This work has examined the present state of deep learning in medical image
analysis, demonstrating its adaptability and efficiency across various tasks. Deep learning
models such as CNNs, RNNs, and GANs have facilitated the automated interpretation of
medical images from many sources, but obstacles
remain. Challenges such as limited data availability, difficulties in interpreting results, pri-
vacy issues, and high computational requirements require collaboration across many fields
and the establishment of ethical guidelines for responsible use. Promising potential in multi-
modal integration, explainable AI, federated learning, continual learning, domain-specific ar-
chitectures, real-time point-of-care applications, and collaborative research initiatives are
anticipated in the future. Deep learning has the ability to greatly improve healthcare delivery,
boost patient outcomes, and increase our knowledge of human health and disease by making
use of these opportunities and overcoming current restrictions. Deep learning is a powerful
tool in medical image analysis that provides healthcare professionals, researchers, and
patients with new and advanced insights and skills. Through ongoing innovation and
collaborative endeavors, deep learning will persist in influencing the future of health care,
advancing toward more efficient, effective, and fair medical practices.
References
[1] G.A. Cheikh, A.B. Mbacke, S. Ndiaye, Deep learning in medical imaging survey, CEUR Workshop Proceedings 2647 (2020) 111–127 [Online]. Available: [Link].
[2] C. Jiang, G. Goldsztein, Convolutional neural network approach to classifying the CIFAR-10 dataset, Journal of Student Research 12 (2) (May 2023). [Link].
[3] P. Shruti, R. Rekha, A review of convolutional neural networks, its variants and applications, in: 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), February 2023, pp. 31–36. [Link].
[4] D. Chicco, R. Shiradkar, Ten quick tips for computational analysis of medical images, PLoS Computational Biology 19 (1) (January 2023) e1010778. [Link].
[5] S. Morrison, C. Gatsonis, A. Eloyan, J.A. Steingrimsson, Survival analysis using deep learning with medical imaging, International Journal of Biostatistics (June 2023). [Link].
[6] X. Jiang, Z. Hu, S. Wang, Y. Zhang, Deep learning for medical image-based cancer diagnosis, Cancers 15 (14) (July 2023) 3608. [Link].
[7] D.Y. Alyunov, Recurrent neural network for controlling the spectrum width of a non-stationary random signal, Vestnik Chuvashskogo Universiteta (2) (June 2023) 5–17. [Link].
[8] F. Zola, J.L. Bruse, X.E. Barrio, M. Galar, R.O. Urrutia, Generative adversarial networks for bitcoin data augmentation, in: 2nd Conference on Blockchain Research and Applications for Innovative Networks and Services (BRAINS), September 2020, pp. 136–143. [Link].
[9] H. Cui, L. Hu, L. Chi, Advances in computer-aided medical image processing, Applied Sciences 13 (12) (2023). [Link].
[10] J.K. Winkler, et al., Assessment of diagnostic performance of dermatologists cooperating with a convolutional neural network in a prospective clinical study: human with machine, JAMA Dermatology 159 (6) (2023) 621–627. [Link].
[11] O. Alirr, R. Alshatti, S. Altemeemi, S. Alsaad, A. Alshatti, Automatic brain tumor segmentation from MRI scans using U-net deep learning, in: BioSMART 2023 - Proceedings of the 5th International Conference on Bio-Engineering for Smart Technologies, 2023, pp. 1–5. [Link].
[12] M. Kundrotas, E. Mazonienė, D. Šešok, Automatic tumor identification from scans of histopathological tissues, Applied Sciences 13 (7) (2023). [Link].
[13] A. Ahmad, A. Tariq, H.K. Hussain, A. Yousaf Gill, Revolutionizing healthcare: how deep learning is poised to change the landscape of medical diagnosis and treatment, Journal of Computer Networks, Architecture and High Performance Computing 5 (2) (2023) 458–471. [Link].
[14] S. Peng, et al., Noise-conscious explicit weighting network for robust low-dose CT imaging, in: Medical Imaging, 2023, p. 127. [Link].
[15] B. Huang, et al., Correction: enhancing image resolution of confocal fluorescence microscopy with deep learning, PhotoniX 4 (2) (2023), [Link]; original article: PhotoniX 4 (1) (2023) 1–22, [Link].
[16] S. Nazir, M. Kaleem, Federated learning for medical image analysis with deep neural networks, Diagnostics 13 (9) (2023). [Link].
[17] J. Du, K. Guan, Y. Zhou, Y. Li, T. Wang, Parameter-free similarity-aware attention module for medical image classification and segmentation, IEEE Transactions on Emerging Topics in Computational Intelligence 7 (3) (2023) 845–857. [Link].
[18] H. Kuang, D. Yang, S. Wang, X. Wang, L. Zhang, Towards simultaneous segmentation of liver tumors and intrahepatic vessels via cross-attention mechanism, in: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2023, pp. 1–5. [Link].
[19] J. Hollósi, Á. Ballagi, C.R. Pozna, Simplified routing mechanism for capsule networks, Algorithms 16 (7) (2023) 336. [Link].
[20] Y. Liu, et al., An improved hybrid network with a transformer module for medical image fusion, IEEE Journal of Biomedical and Health Informatics 27 (7) (2023) 3489–3500. [Link].
[21] P. Barbiero, et al., Interpretable neural-symbolic concept reasoning, ArXiv abs/2304.14068 (2023) [Online]. Available: [Link].
[22] Q. Li, Saliency prediction based on multi-channel models of visual processing, Machine Vision and Applications 34 (2020) [Online]. Available: [Link].
[23] X. Wu, Z. Fan, T. Liu, W. Li, X. Ye, D. Fan, LRP: predictive output activation based on SVD approach for CNNs acceleration, in: 2022 Design, Automation & Test in Europe Conference and Exhibition, 2022, pp. 831–836 [Online]. Available: [Link].
[24] H. Martin-Leo, et al., Imaging bridges pathology and radiology, Journal of Pathology Informatics 14 (2023). [Link].
[25] X. Wang, J. Hu, H. Lin, W. Liu, H. Moon, M.J. Piran, Federated learning-empowered disease diagnosis mechanism in the internet of medical things: from the privacy-preservation perspective, IEEE Transactions on Industrial Informatics 19 (2023) 7905–7913 [Online]. Available: [Link].
[26] G. Zheng, S. Lai, V. Braverman, M.A. Jacobs, V.S. Parekh, A framework for dynamically training and adapting deep reinforcement learning models to different, low-compute, and continuously changing radiology deployment environments, ArXiv abs/2306.05310 (2023) [Online]. Available: [Link].