ANN and Its Applications

This paper discusses Artificial Neural Networks (ANNs), their core concepts, training processes, and applications in Computer Science, Security, and Health Care. It highlights the challenges in designing optimal ANNs and explores their impact across various fields, including image recognition, natural language processing, and healthcare diagnostics. The paper also outlines future directions for advancements in ANN architecture and applications.
Artificial Neural Network and Its Applications

Alexandros Vasileiadis, Eirini Alexandrou, Lydia Paschalidou, Maria Chrysanthou, Maria Hadjichristoforou

Abstract—This paper focuses on Artificial Neural Networks (ANNs) and their applications. Initially, it explores the core concepts of a neural network (NN), including its inspiration, basic structure, and training process, along with an overview of the most commonly used models. Additionally, the paper delves into three fields in which ANNs play an important role: (1) Computer Science, (2) Security, and (3) Health Care. These fields are singled out because of their great impact on various aspects of society. For each field, the paper discusses the ways in which NNs have been utilised to solve problems, the architectures employed, notable applications of NNs within the domain, and the challenges that arise from implementing NNs. Lastly, it discusses the future directions of ANNs, exploring potential advancements in architecture, models, and applications across diverse domains.

Index Terms—Artificial Neural Networks, Neural Networks, Core Concepts, Training, Models, Applications, Computer Science, Security, Health Care, Challenges, Architecture, Future Direction

I. INTRODUCTION

An Artificial Neural Network (ANN) is a machine learning model designed to emulate human decision-making processes by simulating how biological neurons work. It consists of interconnected layers of units through which data flows in an orderly sequence. Specifically, the units can be categorised into three kinds of layer: (1) an input layer, (2) a hidden layer, and (3) an output layer. Even though ANNs are a simplified version of how our brain works, they are adept at learning to solve difficult problems through training, using experiments and observations. In other words, they are proficient at capturing intricate patterns and connections. [1]

In particular, ANNs function by manipulating inputs and adjusting the connections between neurons. They carry out a range of pattern recognition and mapping tasks. They can rebuild stored patterns from partial or noisy inputs, associate a given pattern with another pattern in a temporal sequence, create new patterns for complex problems, and group similar patterns into clusters by creating new pattern representatives for them. [2]

There are various types of neural networks, including Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). ANNs find applications across various domains, the greater part of which use feedforward architectures and the backpropagation (BP) training algorithm. Despite the numerous training techniques, establishing an optimal ANN for a particular application remains a notable challenge. This challenge is underlined by compelling evidence from both biological and technical perspectives, which suggests that the effectiveness of an ANN in manipulating knowledge is affected by its design. [1]

This research focuses on neural network applications in computer science, security, and healthcare. It explores how ANNs can be used in these fields, delves into their impact and challenges, and discusses their potential future.

Neural networks are used in computer science for problem-solving across various disciplines. They can execute various tasks, including image recognition, natural language processing (NLP), machine translation, and speech recognition, and they help with developing language translation systems. Building on their success in Computer Science, ANNs extended their applications into the Security sector. Various types of ANNs, such as Convolutional Neural Networks (CNNs), Graph Neural Networks (GNNs), and Recurrent Neural Networks (RNNs), play an important role in addressing security matters such as fraud detection, cybersecurity threats, and facial recognition. Moreover, ANN architectures have expanded into the realm of healthcare. They can help analyse medical images such as MRIs, CT scans, X-rays and ultrasounds, which supports early clinical diagnosis. ANNs can also help predict epidemic outbreaks, organise patients' health records and personalise their medicine.

II. CORE CONCEPTS

The concept of the ANN takes its inspiration from the human brain, particularly its building blocks, the neurons. The human brain is a powerful tool that can perform tasks such as thinking, recognising, and solving hard and complex problems, and the neuron, an electrically excitable cell, plays a big part in all of them. Our brain consists of an estimated 90 to 100 billion neurons, each connected with 1 to 10 thousand others, which makes up to about 10^15 interconnections. Neurons communicate by sending electrical and chemical signals to their neighbours. Signals are sent through the neuron's branch, the axon, which further extends into smaller segments called collaterals. At the end of these collaterals, junctions known as synapses form connections with neighbouring neurons, allowing the neuron to transfer a signal. Meanwhile, on the receiving end, the neuron's dendrites receive those signals via the synapses and merge
them within the soma, the neuron body. Neurons with stronger synaptic connections have a greater impact on each other. This massive web of billions of interconnected neurons working together allows the brain to achieve its amazing abilities. [3]

A. Structure of a neural network
In 1943, McCulloch and Pitts, often counted among the founding fathers of AI, developed the first mathematical model of a neuron. It rests on an analogy between a nerve cell and an artificial neuron, in which dendrites and axons correspond to the inputs and output of a neuron, synapses correspond to its weights, and the activity in the soma corresponds to the threshold.

Building on the work of McCulloch and Pitts, Rosenblatt introduced the concept of the perceptron in 1958. It was a breakthrough that marked the early artificial neural network structures, which can monitor, learn, and operate by imitating human-like learning through example. It is an algorithm that gives neurons the ability to learn and, at the same time, to process information efficiently, which helps them learn independently.

The perceptron is an algorithm that takes numerical inputs together with weights and a bias. Each input is multiplied by its weight and the products are summed; the bias is then added to this weighted sum. An activation function is applied to the result to compute and deliver the final value.

The output y of the perceptron indicates whether the weighted sum of the inputs plus the bias exceeds a certain value. If the output is y = 1, the model predicts that the input belongs to class 1; if the output is y = 0, it predicts that the input belongs to class 0.

Even though the perceptron represented real progress in the development of artificial neural networks, it had its drawbacks. Perceptrons can learn only linearly separable data, where one class of objects lies on one side of a plane and the other class on the opposite side, as shown in Figure 1, while data in the real world is usually not linearly separable. As a consequence, perceptrons struggled to solve many of the problems that matter to society.

Figure 1. Linear vs. nonlinear separation.

For problems that are not linearly separable, additional layers of neurons are placed after the input layer and before the output neuron, two of which are also found in the multilayer perceptron (MLP) architecture. These intermediary layers, known as hidden layers, contain the hidden nodes, inside which the information is transformed without direct access to the external environment. Following the same neuron dynamics, the hidden neurons examine the information transmitted by the input nodes and send it on to the output layer. The learning behaviour of an MLP is considerably more complex than that of a single perceptron. Despite the increased complexity, the learning process is built on the simple perceptron algorithm, and the additional layers allow the MLP to handle non-linearities. [3]

Figure 2. MLP showing input, hidden and output layers and nodes with feedforward links.

B. Training process of neural networks

To achieve a network that produces accurate outputs, it must first go through a training process. Like a human, an ANN learns from examples, so it is important to provide a large amount of data. There are three main approaches to training an ANN: supervised, unsupervised, and reinforcement.

Supervised training requires well-defined data with corresponding labels, and it is used to build networks capable of making predictions, image classification, market forecasting, and more. Linear regression and k-nearest neighbours are two algorithms that represent supervised learning. Linear regression models the relationship between one or more independent variables and a dependent variable by fitting a straight line to observed data. k-Nearest Neighbours (kNN) is a non-parametric method that is useful for both classification and regression tasks; for classification, it predicts the output of an instance from the majority class of its k nearest neighbours in the feature space. [4] [5]

Unsupervised training involves unlabelled data, on which the network trains itself to find the class structure within the data. Such an approach is especially handy for tasks such as targeted marketing and anomaly detection. Commonly used algorithms for this type of learning include Hierarchical Clustering and t-distributed Stochastic Neighbour Embedding (t-SNE). Hierarchical Clustering considers the level of similarity between the data points and creates a nested
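The perceptron computation described above can be illustrated with a minimal sketch. It assumes a step activation that outputs 1 when the weighted sum plus bias is positive, and it uses the classic perceptron learning rule on the logical AND function; the function names, the learning rate and the toy data are illustrative choices, not taken from the paper.

```python
def perceptron_output(inputs, weights, bias):
    """Return 1 if the weighted sum plus bias is positive, else 0 (step activation)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Fit weights and bias with the classic perceptron learning rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            # error is +1, 0 or -1; weights move only when the prediction is wrong
            error = target - perceptron_output(x, weights, bias)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]
weights, bias = train_perceptron(AND_INPUTS, AND_LABELS)
predictions = [perceptron_output(x, weights, bias) for x in AND_INPUTS]
```

Here `predictions` holds the predicted class (0 or 1) for each of the four inputs after training.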
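As an illustration of the kNN classification idea described above, the following sketch predicts a label by majority vote among the k closest training points. It assumes Euclidean distance and k = 3; the toy points and labels are hypothetical and serve only to show the mechanics.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Two small, well-separated groups of 2-D points (hypothetical data).
TRAIN_POINTS = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]
TRAIN_LABELS = ["a", "a", "a", "b", "b", "b"]

label_near_a = knn_predict(TRAIN_POINTS, TRAIN_LABELS, (1.2, 1.4))
label_near_b = knn_predict(TRAIN_POINTS, TRAIN_LABELS, (5.2, 5.8))
```

Because kNN is non-parametric, there is no training step: the stored examples themselves act as the model.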
hierarchy of clusters, which can be drawn as a dendrogram. t-SNE, unlike Principal Component Analysis (PCA), is a nonlinear dimensionality reduction technique that is well suited to visualising high-dimensional data while preserving the neighbourhood structure of the data in the lower-dimensional space. [5] [6]

Reinforcement learning is a process in which the network learns from interactions with its environment and modifies its behaviour in response to rewards or punishments. This allows the network to adapt to complicated tasks such as guiding robots, playing games, and making real-time decisions. Algorithms used in this area include Q-Learning and Policy Gradient Methods. Q-learning is a model-free reinforcement learning algorithm that estimates the value of carrying out a given action in a given state. Policy Gradient methods, by contrast, directly optimise the policy function, which determines how the agent acts, by following the gradient of the reward with respect to the policy parameters. [7]

C. Basic types of neural networks
In this section, we provide an overview of four commonly used types of neural network architectures: Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Radial Basis Function Networks (RBFN).

1. Feedforward Neural Networks (FNN):
Feedforward Neural Networks, or multilayer perceptrons, are the simplest type of neural network. An FNN consists of interconnected layers of neurons in which information flows in one direction only, from the input layer, through one or more hidden layers, to the output layer. FNNs perform well in regression and classification across various fields, owing to their capacity to capture complex relationships between input and output data. [8]

2. Convolutional Neural Networks (CNN):
Convolutional Neural Networks are well suited to image, audio, and video processing. A CNN uses convolutional layers that apply filters to the input data to learn different patterns, such as edges, textures, or more complex structures. Owing to this ability to recognise patterns, CNNs revolutionised computer vision tasks, including image classification, object detection, and image segmentation. CNNs are widely used in healthcare, robotics, and, more recently, self-driving cars. [9]

3. Recurrent Neural Networks (RNN):
Recurrent Neural Networks are well suited to sequential data, where the order of the information is important. Unlike FNNs, RNNs have cyclic connections that allow them to keep an internal state, so they can deal with temporal dependencies and context. Variants of RNNs, such as LSTM (Long Short-Term Memory), have addressed challenges such as improved speech recognition and language translation. [10]

4. Radial Basis Function Networks (RBFN):
Radial Basis Function Networks are used mostly in numerical analysis. The hidden layer of an RBF network employs radial basis functions as activation functions. Networks of this type are good at representing functional forms in multidimensional spaces and are very often used as the basis for regression and interpolation tasks. RBFNs have applications in different areas, including function approximation, time series prediction, and financial projections. [11]

III. APPLICATIONS IN VARIOUS SECTORS

A. In Computer Science

This section covers the usage of artificial neural networks (ANNs) in the field of computer science, highlighting their roles in key areas such as image recognition, natural language processing (NLP), machine translation, automatic speech recognition, and language translation systems. We also go through the algorithms that power them, including the Backpropagation (BP) algorithm, the Recurrent Neural Network-Based Optimization Algorithm (RNN-OA), and advanced algorithms designed for optimising computer network routing. These discussions highlight the capacity of ANNs to convert complicated data into useful insights and decisions, demonstrating their significance for a variety of computational challenges.

A1. Image Recognition
The development of ANNs has recently taken on a crucial role in computer science, especially in the creation of image recognition systems. Because ANNs can capture and process large volumes of visual data with high precision, visual inspection can now be applied in multiple fields. The implementation of ANNs for image recognition within computer science is primarily based on the ability of the algorithms to detect, classify, and interpret images very quickly.

The creation of sophisticated image recognition systems is one of the main advances made by ANNs in computer science. These systems employ multi-layered neural networks, such as Convolutional Neural Networks (CNN), that independently extract the image's features at various levels of abstraction. For instance, the first layers of the network may identify edges and colours, and as the network goes deeper it can start identifying more complex shapes and objects within the image.

The strength of ANNs is their capability to generalise from examples. This is a real advantage, since they can quantify characteristics that are not ordinarily perceived by the human eye in applications such as imaging, where they can improve the quality of captured images, identify objects, and detect patterns. For example, the speed and accuracy of robots is a feature that is key in manufacturing
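To make the idea of a convolutional filter detecting edges concrete, here is a minimal sketch of the sliding-window operation a convolutional layer performs (no padding, stride 1, and without flipping the kernel, as is usual in CNN implementations). The 4x4 image and the 2x2 vertical-edge filter are hypothetical values chosen for illustration; in a real CNN the filter values are learned during training.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Dark left half, bright right half: a single vertical edge in the middle.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds where brightness increases from left to right.
VERTICAL_EDGE = [
    [-1, 1],
    [-1, 1],
]
edges = convolve2d(IMAGE, VERTICAL_EDGE)
```

The resulting feature map is non-zero only in the middle column, where the window straddles the boundary between the dark and bright halves.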
development, in which automated visual inspection systems require high-speed operation and precision.

ANNs are also one of the most important reasons why the field of computer graphics and vision has developed so fast. One illustration is their use in enhancing advanced video games and virtual reality environments. ANNs can interpret the surrounding scene at every instant and then generate the graphical elements accordingly. Users experience graphics that are vivid, dynamic, and responsive to their actions.

Moreover, ANNs not only address these technical issues but also help optimise existing computer systems to improve their performance. For example, they can make searches over large image databases more effective by displaying only the images that satisfy certain conditions, so that the results are manageable to deal with manually.

In the area of software development, ANNs can be used to build intelligent user interfaces in which visual inputs are processed and turned into usable information. This is especially significant where gesture recognition is required, in which the system interprets physical movements as commands.

As research progresses, ANNs in computer science can become stronger and more innovative by exploiting the fact that artificial systems can be designed to gather information from, and interact with, the visual world. Such progress will not only raise computational power but also generate new methods of making machines intelligent and responsive, with a human-like perception of situations. [12] [13]

A2. Natural Language Processing
The journey of NLP started with the early theoretical models of pioneers such as Warren McCulloch, Walter Pitts, and Frank Rosenblatt. These early researchers laid the theoretical foundation for recognising and classifying patterns in text data by computation. Rosenblatt's invention of the 'Perceptron' in 1958 was revolutionary because it showed that a neural network could be trained to classify patterns, an ability that became indispensable for tasks such as text classification and sentiment analysis.
In 1961, a further development in the field was Frank Rosenblatt's proposal of the 'back-propagating error correction algorithm'. It underscored that increasing the accuracy of pattern recognition requires a more sophisticated training procedure for the neural network. This early work opened the door to the use of Artificial Neural Networks for complex NLP tasks, such as named entity recognition or sentiment analysis.

i. Applications of NLP
Over the past decade, the field of NLP has evolved tremendously, driven by advances in the use of ANNs to process linguistic data.

The combination of ANNs with NLP has transformed how machines understand and generate human language, with practical applicability across a whole range of domains. It has led to the design of advanced language understanding systems that can carry out a very wide range of language tasks. These systems use advanced algorithms and neural network architectures to process, scrutinise, and synthesise human language precisely and efficiently.

A prominent example is the Boltzmann machine, created by Terrence Sejnowski and Geoffrey Hinton, which introduced new learning mechanisms that can handle the complex linguistic patterns of the domain.

With symmetric connections inspired by physical phenomena such as spin glasses, the Boltzmann machine offered a mechanism for unsupervised learning that could be applied to recognising and replicating relationships between linguistic elements. This development was followed by the Back-Propagation algorithm of David Rumelhart, Geoffrey Hinton, and R. J. Williams, which made it possible for multi-layer perceptrons to solve complex linguistic problems. Through repeated presentation of stimuli and responses, these networks could also capture the subtle distinctions posed by problems such as exclusive OR and the T/C problem.

Practical impacts of these capabilities can be seen in several real-world applications of NLP. For instance, Sejnowski and Charles Rosenberg trained a network to pronounce English words correctly, showing some of the potential that neural networks have for speech-related tasks. This opens the promising prospect of using ANNs to develop automatic speech recognition systems for virtual assistants, voice-controlled home devices, and other products.

Moreover, Teuvo Kohonen contributed topographic map networks, which offered further ways of analysing and interpreting linguistic data. Systems built on these improvements include intelligent chatbots, language translation systems, sentiment analysis tools, and named entity recognition systems, amongst others. Essentially, the practical application of artificial neural networks to NLP has revolutionised the way machines interact with human language, allowing them to make sense of and produce text data in ways that would otherwise have been considered impossible. As ANNs continue to grow and evolve, they create the potential to bring advances in language understanding and interaction to almost any industry.

ii. NN Architectures in NLP
The advancement of Natural Language Processing (NLP) systems, which allow machines to comprehend and interpret human language more efficiently, depends greatly on the development of neural network architectures. These architectures differ considerably from conventional computers in terms of time, manner, and place of processing, reflecting the unusual computational needs of neural networks.
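The exclusive OR problem mentioned above is a standard example of a task that a single perceptron cannot represent but a network with one hidden layer can. The sketch below uses hand-picked weights (an assumption made for illustration, not weights learned by back-propagation) to show that two hidden step units, acting as OR and AND detectors, are enough.

```python
def step(value):
    """Threshold activation: 1 for positive input, otherwise 0."""
    return 1 if value > 0 else 0


def xor_network(x1, x2):
    """Two-layer network with hand-picked weights that computes XOR."""
    hidden_or = step(x1 + x2 - 0.5)   # fires if at least one input is 1
    hidden_and = step(x1 + x2 - 1.5)  # fires only if both inputs are 1
    # Output fires when OR is on and AND is off.
    return step(hidden_or - hidden_and - 0.5)


xor_outputs = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

A training algorithm such as back-propagation would have to find weights with an equivalent effect on its own; the point of the sketch is only that such weights exist once a hidden layer is available.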
In the realm of time, neural network architectures have to move from conventional serial processing to parallel processing. Neural net computers can process many pieces of information at once, whereas traditional computers process one piece at a time. This capability enables interactions and corrections in which the "future affects the past." It is necessary for handling the simultaneous changes in information that come with language processing activities.

Furthermore, in terms of manner, computation in neural network architectures differs in the distinction between digital (binary) and analogue processing. The most significant difference is that, while conventional computers produce binary answers (true or false), neural net machines take in inputs and give outputs along a continuous range, which accommodates the full variation in linguistic stimuli. Analogue processing thus captures the subtleties and nuances of human language, making learning natural and flexible.

In addition, in terms of place, neural network architectures rely on distributed connectivity. Unlike traditional computers, where information processing is isolated at unique addresses, neural net machines distribute information across many connected addresses, both wholly and partly. This distributed architecture allows complex patterns in linguistic data to be represented, improving the machine's ability to learn to understand and interpret language.

Such architectural considerations follow three important characteristics set out by Hopfield for neural network computers: large connectivity, analogue response, and reciprocal or re-entrant connections. These characteristics give rise to computations qualitatively different from those performed by Boolean logic.

In practical applications, several neural network architectures have addressed a wide range of natural language problems, from low-level phonology to high-level syntax. For example, Rumelhart carried out experiments on predicting English verb morphology, and Sejnowski and Rosenberg developed a model of phonology. Similarly, companies such as Nestor Corporation have manufactured tablets that recognise handwritten input, while Neural Tech has introduced products that can recognise teaching and learning input in more than one natural language.
While these results indicate what neural network architectures may promise within NLP, it has to be understood that they remain experimental and are years away from wide usage. However, the ongoing experimentation and innovation in neural network research signals a significant advance in the understanding of natural language and interaction that may revolutionise teaching, translation, and communication. [14]

A3. Potential of RNN-OA
The introduction of recurrent neural networks in recent years has had a transformative impact on computer science, improving computational efficiency and problem-solving within this dynamic field. RNNs are particularly effective in processing sequential data, providing substantial improvements in areas such as optimization, recommendation systems, image processing, and natural language processing (NLP).

The success of RNNs in enhancing algorithmic efficiency and model performance when dealing with sequential data requires careful consideration of several factors. In particular, the architecture of the model, the tuning of hyperparameters, and the quality of training data are extremely important for achieving optimal results.

A significant advancement in RNN-based optimization is the development of the Recurrent Neural Network-Based Optimization Algorithm (RNN-OA). This method extends the capabilities of processing algorithms through the application of attention mechanisms, regularisation techniques, and improvements in interpretability. By processing input selectively and maintaining model stability, RNN-OA significantly boosts the efficiency and adaptability of algorithms to various problem-solving scenarios.

This algorithmic approach also benefits the overall computational process by incorporating fine-tuning, transfer learning, and other techniques that reduce the computational load and expedite algorithm development. The efficiency, scalability, and robustness of RNN-OA have been tested, showing that it offers significant benefits and has potential for further improvements.

In practical terms, RNN-OA is applicable to a broad range of computer science functions, including voice recognition, machine translation, and time series forecasting. The evaluation of its efficiency and scalability involves the use of specially developed frameworks and mathematical models. These models take into account dynamic learning, model stability, adaptability to various data sources, and sensitivity to input fluctuations.

The integration of RNNs into computer science marks a significant leap forward, and innovative approaches like RNN-OA have considerable potential to expand their application further. Continuous improvements in RNN-based methods are setting high expectations for computational advancements that promise extensive benefits for academia and industry. [15]

A4. The Back-Propagation Algorithm in Computer Science and Applications
Back-Propagation (BP) is an algorithm used to train artificial neural networks through a method of error correction based on previously computed errors. From a computer science perspective, BP modifies the weights of a multilayer network according to the error computed during the previous iteration. The role of BP in computer science applications is mainly to minimise the error
rate when predicting the outputs, and it is largely used in solving more complex problems such as image recognition, autonomous vehicles, and natural language processing.

i. How BP works
The BP algorithm consists of two main phases: the forward pass and the backward pass. During the forward pass, the input data passes through the network one layer after another, from the input layer to the output layer, using initialised weights stored in matrix and vector form. The values produced by each layer are passed to the next layer, and the output layer produces a prediction of the actual output. The backward pass follows directly from the forward pass: the prediction is compared with the actual output to obtain an error, and this error is propagated back through the network. The weights are then fine-tuned so that they minimise the error value. The size of each adjustment depends on how the error changes with respect to that weight, which is calculated using partial derivatives.

Figure 3. Back-Propagation algorithm workflow for neural network training.

ii. Key Formulas in BP
The main computation in BP involves adjusting the weights, which is done using the gradient descent optimization algorithm. The error term for each neuron is calculated during the backward pass, starting from the output layer and moving backward through the network. This calculation for each neuron depends on its role (output layer vs. hidden layer):

Output Layer Neurons: For each output neuron, the error term (δ) is calculated by:

$$\delta_j = (y_j - \hat{y}_j)\, f'(z_j)$$

Where $y_j$ is the actual output for neuron $j$, $\hat{y}_j$ is the predicted output for neuron $j$, and $f'(z_j)$ is the derivative of the activation function evaluated at the weighted input $z_j$ of neuron $j$.

Hidden Layer Neurons: For neurons in the hidden layers, the error is propagated back from the output layer, and the error term is calculated using:

$$\delta_j = \Big(\sum_k w_{jk}\, \delta_k\Big)\, f'(z_j)$$

Where $w_{jk}$ are the weights connecting neuron $j$ in a hidden layer to neuron $k$ in the subsequent layer, $\delta_k$ is the error term for neuron $k$ in the layer above, and $f'(z_j)$ is the derivative of the activation function evaluated at the weighted input $z_j$ of neuron $j$.

Weight Update Rule: The weights are updated by moving against the gradient of the error function, which is computed for each weight as follows:

$$\Delta w_{ij} = \eta\, \delta_j\, o_i$$

Where $\eta$ is the learning rate, $\delta_j$ is the error term for the neuron $j$, computed as shown above, and $o_i$ is the output of the previous layer's neuron $i$, which is connected to the neuron $j$ by the weight $w_{ij}$.

These formulas are essential because they direct the iterative weight modifications that enable neural networks to learn from their mistakes and gradually increase their accuracy.

iii. Applications of Back-Propagation Neural Networks in Computer Science
a. Image and Speech Recognition:
One of the most promising applications of BP networks in computer science is image and speech recognition. Since BP networks can handle large amounts of data and patterns, they can be used for image recognition and interpretation technologies and for spoken word recognition. BP networks are often used in classifying images into categories, recognising people's faces, or interpreting scenes. In spoken language recognition, the network helps develop service assistants or real-time translators that learn from a large number of phonemes and intonation patterns and achieve high accuracy.

b. Natural Language Processing
BP neural networks are essential for processing natural language. They assist computers in comprehending and interpreting human language so that, when the computers produce human language, it is meaningful and appropriate for the context. Sentiment analysis, in which networks examine text from social media or reviews to assess the sentiment expressed, is one application. Furthermore, BP networks facilitate machine translation by enabling translation without the need for rule-based programming, as they leverage extensive datasets of pre-existing translations.

c. Game Development and Strategy Planning
BP networks are used to build stronger, more adaptable AIs in the gaming industry. Because the network learns from huge databases of human play, it can predict human behaviour and offer a challenge or counteraction in the game without the
AI “cheating”. BP networks also assist real-time decisions within the game, helping optimise engine performance via predictive modelling.

d. Autonomous vehicles
Autonomous driving is one of the most critical areas in which BP neural networks have led to significant technological advancement. BP neural networks take input from sensors and cameras fitted on a vehicle and make real-time decisions about how to navigate and, at a higher level, recognise road signs and avoid obstacles such as other vehicles, facilitating safe driving. Because BP networks can learn from different circumstances and scenarios, they have been central to the innovation and improvement of autonomous driving.

e. Robotics
Another application in which BP neural networks have expanded the frontier of computational science is robotics, particularly for robots performing complex tasks in continuously changing environments. Assembly, mission operations, hazard sensing, space exploration and even surgery are among these tasks: the robots interact with their surroundings in real time and, through a neural network model, gain experience that lets them execute tasks with improved precision and efficiency.

The above scenarios are just a few examples of how BP algorithms can be used to model intricate patterns and make informed predictions. Given their history of innovation and the efficiency of their models, BP models are likely to remain a cornerstone of modern computational sciences. [16]

A5. Computer Network Routing Optimization Algorithm
The development of Internet technologies has not only changed the way we live, but also led to the emergence of a particularly urgent problem: the need for high-quality network infrastructure. Due to the steady increase in network requirements, one of the most pressing concerns is the optimization of the network routing process. Traditional technologies rarely cope with network complexities, which forces researchers to seek new approaches such as Artificial Neural Networks that optimise routing.

i. The Challenge of Network Routing
Present-day computer networks are complex systems in which packets pass through several nodes before reaching their final destination. As network traffic grows, good routing becomes a necessity, because otherwise congestion and data loss can occur. At the same time, the difficulty of the routing problem, which depends on factors such as network topology and load, suggests that a dynamic solution will be optimal.

ii. ANN-Based Optimization Approach
This section describes a new method using Artificial Neural Networks to optimise network routing. It uses the ANN learning algorithm, which, in turn, operates on the principle of artificially reproducing the brain's structural units. Such networks make it possible to simulate the joint work of many neurons and to choose between several options based on historical data.

iii. Methodology
The designed ANN model has a layered network structure in which each node represents a decision point during the routing of data packets along the node paths. The network is trained on samples of network conditions and the corresponding routing selections, which enables it to discover the most effective routing patterns. Figure 4 below portrays the neural network's architecture, which consists of input, hidden, and output layers. The neural network implements the decision-making process based on these layers.

Figure 4. Architecture of the Neural Network for Routing Optimization

The ANN employs a probabilistic model to dynamically form network connections, as described by the following equation:

$$\Pi_i = \frac{k_i}{\sum_j k_j}$$

where $\Pi_i$ is the probability that a new node will connect to an existing node $i$, $k_i$ is the degree of node $i$, and $k_j$ represents the degree of node $j$. This formula helps the ANN predict the most efficient pathways by optimising the network topology based on the likelihood of node connections. Additionally, the system delay model used to minimise latency and optimise routing is given by:

[Equation: total system delay TC expressed in terms of T, tb, P and t]

where 𝑇𝐶 is the total system delay, 𝑇 represents the transmission delay, and 𝑡𝑏 represents the delay experienced due to inadequate bandwidth availability, signifying the waiting time for data transmission. 𝑃 symbolises the delay induced by queuing at the Mobile Edge Computing (MEC) infrastructure, and 𝑡 signifies the delay attributed to task
execution by the MEC server. These components are crucial for evaluating the efficiency of different routing paths and are integral to the ANN's decision-making process.

iv. Simulation Results
A comparative assessment against traditional network routing demonstrates the much higher efficacy of the proposed model. According to the results, the ANN model reduces packet loss and delay practically to zero and does not require human intervention, which can significantly increase its effectiveness in routing.

v. Applications and Future Work
The ANN-based routing optimization model has extensive prospects, since it can be applied both to small corporate networks and to international Internet backbones. Its built-in scalability allows the algorithm to be used in modern dynamic telecommunications. Future work will involve reducing the dependency on human data and speeding up the network's response to changing conditions. [17]

B. In Security
The adaptability and efficacy of ANNs have rendered them indispensable in safeguarding critical systems, combating fraudulent activities, and enhancing security measures across diverse domains.
This section delves into the multifaceted applications of neural networks in security, focusing on three pivotal areas: fraud detection, anomaly detection in cybersecurity, and facial recognition for security purposes. The integration of neural networks in these realms not only augments traditional security measures but also empowers organisations to proactively mitigate risks and fortify their defences against evolving threats.

Neural networks have much to offer the security field. They model input-output relationships through a learning process inspired by the human brain, acquiring knowledge and storing it in the connection strengths between neurons, known as synaptic weights. Unlike traditional purpose-built linear models, neural networks can capture both linear and non-linear correlations without using intermediate variables to model reality. This capability is vital in cybersecurity, where risks develop at great speed and display non-linear characteristics. By using networks that simulate biological systems, security infrastructures can adapt to new threats, recognising patterns and anomalies in real time to strengthen defences and hedge against risks that might occur. [18]

B1. Fraud Detection
Financial fraud remains an enduring threat in the financial sector, and it harms individuals, institutions, and economies. Deep neural networks are known for autonomously learning complex patterns and representations from raw data, so this technique could be very effective in addressing the issue. The performance of neural networks in fraud detection reflects their aptitude for detecting sophisticated patterns in large data sets. Neural networks built on transactional data, user behaviour, and historical patterns are capable of spotting anomalous activities that could be fraudulent. They can also adjust to new cases of fraud and learn from these new instances. [18] As a result, the anti-fraud mechanism constantly improves its detection algorithms, and the efficacy of fraud prevention technologies goes up. This also removes the need for labour-intensive manual feature engineering, which can be very time-consuming and domain specific. In addition, deep learning approaches are good at handling multidimensional data and finding complex, hidden relationships, which lets the system identify the subtle and covert signs that characterise fraudulent behaviour. Different deep learning architectures, such as convolutional neural networks (CNN) and graph neural networks (GNN), have recently been used to detect financial fraud. These models are applied in different types of financial systems, for example to detect credit card fraud, insurance fraud, and money laundering. Most notably, deep learning models have consistently outperformed classic approaches, with a success rate of around 99%. [19] [20]

i. CNN in Fraud detection
The Convolutional Neural Network (CNN) is a popular deep learning algorithm that shows good results in finding unobservable features of dubious transactions and helps to avoid overfitting of the model. A CNN has three main types of layer: the convolution layer, the pooling layer, and the fully connected layer. Normally, the role of the convolution and pooling layers is to perform feature extraction. The fully connected layer maps the extracted features to the final output, such as a classification. [19] [21]

Figure 5. Overall Network Structure

The network structure is designed so that the analytical tool can be applied to network transaction data and can identify criminal financial activities in a short time. In essence, it has an input feature sequencing layer, a group of four convolutional layers interlaced with pooling layers, and a fully connected layer (Fig. 5). The feature sequencing layer comes first; it arranges the input features in a chosen order before they are processed. Different orderings of the input features affect the model differently once they are convolved.
Filtering in the convolutional layer detects local features of the input data; in this context, the model benefits from new features computed from the input features. These derived attributes have no direct physical definition, but they are useful for modelling the data. Pooling combines the features from adjacent areas into a single higher-level feature, which is more efficient and uses less data. The final layer, which is fully connected, is responsible for classification. The number of nodes in each layer of the network varies from one input to another. Training yields the optimised model parameters from the training data, and these parameters can then be applied directly to the detection of real trading data in real time. [22]

ii. GNNs applied for financial fraud detection
Graph neural networks (GNN) are attracting a growing pool of users as their utility for learning on graphs becomes apparent. The graph structure naturally supports problem-solving and the modelling of complex relationships between nodes through message passing and aggregation. [23]
The graph used for financial fraud detection is usually made up of nodes that represent accounts and edges that represent transactions. Every node is a financial account, for example a bank account, a credit card account, or any financial institution involved in a transaction. Nodes can carry attributes such as the type of account, transaction history, current balance, account owner information, and other data relevant to fraud detection. Every edge corresponds to a financial exchange between two accounts. The edge label gives the amount transferred from account A to account B. Edges may also carry weighted attributes or annotations describing the transaction (e.g. the transaction mechanism used or the sums transferred).

Graph neural networks work by message passing: information is disseminated across the edges of the network, so that it is processed in a way that captures the graph topology and the relationships between nodes. The GNN assigns fraud scores to nodes or transactions by performing graph embedding operations on the financial transaction graph and learning its features. These suspicion scores indicate how likely an account or transaction is to be involved in fraud. Once the GNN has predicted the fraud scores, a threshold is applied to separate ordinary transactions from suspicious ones. Transactions whose fraud score exceeds the threshold are flagged and investigated more deeply. The threshold can be computed from the data distribution to balance false positives against false negatives, within limits set by domain knowledge. Cooperation between automated detection by the GNN and the expertise of professional human analysts gives a financial institution the means to tackle financial fraud efficiently, effectively, and more proactively.

B2. Anomaly Detection in Cybersecurity
The cybersecurity domain is now challenged by non-trivial attacks that are growing in sophistication, which is why research into defence mechanisms is booming. Traditional detection systems that are designed to work only with known attack templates are not effective enough against new threats or changing attack strategies, and this has prompted a search for more dynamic and intelligent solutions. Machine learning techniques, including neural networks, are a good option for strengthening intrusion detection systems, because such systems can learn from and react to new threats in real time. By applying data science and analytics, cybersecurity experts can extract more and more useful information from vast data sets, which makes defence mechanisms more effective as cybersecurity threats continue to evolve. Neural networks provide an alternative solution through their ability to detect the smallest deviations from well-established norms. Shifting from reactive to proactive detection, neural networks automatically process historical information and datasets containing malicious behaviour patterns, and are thus more capable of identifying and mitigating cyber threats in real time.

i. RNN in cybersecurity
A recurrent neural network (RNN) is a type of neural network that contains loops among its nodes, forming a directed graph. This structure allows the network to recognise the dynamic behaviour exhibited across a sequence. An internal memory holds the sequence of activations as it is processed, so that information can travel both forward and backward through the feedback loops in the network. Gradients are more complicated to deal with when training RNNs, however. Nevertheless, progress in architecture and training has yielded several RNN variants that are somewhat easier to train. LSTM (long short-term memory), an improved form of RNN, was proposed in 1997 by Hochreiter and Schmidhuber. LSTM marked the start of major advances in speech recognition and has outperformed some traditional models in niche applications. It addresses the main drawback of RNNs, their short-term memory. In an LSTM, units are connected to the previous time step, and the configuration of units responsible for accumulating information is called a memory cell [24] [25].

In Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security [26], the authors recommend the training and detection procedures given in Algorithm 1 and Algorithm 2.
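The memory cell mentioned above can be illustrated with a single LSTM time step. This is a minimal NumPy sketch of the standard LSTM formulation with input, forget and output gates; the dimensions, the random weights and the function name `lstm_step` are illustrative assumptions and are not taken from [24], [25] or [26].

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D), U: (4H, H), b: (4H,), with the four blocks stacked in the
    order [input gate, forget gate, output gate, candidate values].
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])        # forget gate: how much of the old cell to keep
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate values for the memory cell
    c = f * c_prev + i * g         # updated memory cell
    h = o * np.tanh(c)             # updated hidden state
    return h, c

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h = np.zeros(H)
c = np.zeros(H)
sequence = rng.normal(size=(5, D))  # five time steps of made-up input
for x in sequence:
    h, c = lstm_step(x, h, c, W, U, b)
```

Because the cell state `c` is updated additively and scaled by the forget gate, information can persist over many time steps, which is the property that a plain RNN lacks.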
Algorithm 1: Training Neural Network
-----------------------------------------------------------
Input: Features X extracted from the training
dataset with labelled information

Initialization:
1. for channel = 1 to N do
2. Train LSTM-RNN model
3. Save the LSTM-RNN model as a classifier c
4. end for

Return: c

The detection algorithm is described by pseudocode, given as Algorithm 2.

Algorithm 2: Attack Detection
-----------------------------------------------------------
Input: Feature X extracted from test dataset with
labelled information

Initialization:
1. for channel = 1 to N do
2. Load LSTM-RNN model as a classifier
3. Get the result vector R of the classifier
4. end for

Vote to get the majority element v:
1. for r in R do
2. Vote to get the majority element v
3. end for

Return: v

Algorithm 1 presents the training process for the Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) models. It takes as input the features X extracted from the labelled training dataset. The algorithm loops over all channels in the dataset, trains an LSTM-RNN model for each channel, and saves the trained model as a classifier. It then returns the classifier c, which can make predictions. Algorithm 2 describes the detection scheme, which uses the LSTM-RNN classifiers learned in Algorithm 1. It takes as input the features X extracted from the test dataset, together with the label information. For each channel, the algorithm loads the corresponding LSTM-RNN model as a classifier and obtains the result vector R by evaluating the classifier on the test dataset. It then goes through all elements of R and applies a voting method to determine the majority element v. It finishes by returning v as the result of the attack detection process. [26]

B3. Facial Recognition for Security Purposes
Facial recognition is one of the most critical functions of video surveillance systems. It makes it possible to determine whether an image is that of a particular person in a scene, which is usually monitored through a network of cameras. Such applications have widespread use in border security, access control systems, monitoring, and law enforcement. This helps address security issues while keeping privacy and accuracy a top priority. Recognising faces in photographs has attracted growing interest among researchers, both because of its practical applications and because of the challenge it poses to artificial vision algorithms. Specialists have to deal with the extremely high diversity of facial features, as well as the many different parameters of the image (angle, lighting, hairstyle, facial expression, background, etc.). Currently, the most widely recognised face recognition methods use Convolutional Neural Networks. This section describes the architecture of a deep learning model that improves on the best existing programs in terms of accuracy and processing time.

i. CNN in Facial Recognition
The network is composed of two convolutional layers, followed by a fully connected layer and finally a classification layer. Every convolutional layer is followed by an activation layer and a pooling operation. In addition, two regularisation techniques are applied after each convolutional layer: batch normalisation and dropout. The fully connected layer is also followed by dropout, which reduces overfitting and improves the performance of the proposed neural network model. [27]

For image processing, or any kind of prediction associated with images, a convolutional neural network is usually the first choice. A standard convolutional neural network consists of a number of simple layers, which may be repeated n times in the network depending on what is to be predicted [28] [29]. The first layer is a convolutional layer containing filters that are applied to the pixels of the image.

Usually, the image should be larger than the filter applied to it. The filter moves across the image in the horizontal and vertical directions, one step at a time, and the values of the convolutional layer are calculated as dot products. The results of the convolutional layer are then passed to the next layer, called the pooling layer. The values taken from the previous layer are the features that have been extracted to better describe the image. Pooling works in the same way, with a pooling filter that scans the output of the previous layer. Depending on what is to be predicted, a convolutional layer and the pooling layers that follow it are applied repeatedly to produce the desired output. Subsequently, the pooled output is flattened into a single dimension. This output goes to the next layer, which is fully connected, where the prediction is made; the predicted output can be seen at the last layer. In the present study, an exhaustive search of the image data produces around 68 key points, which is the main asset of the study. The overall structure of the CNN model can be seen in Fig. 6. The image will be pre-trained in the proposed CNN
architecture, which was not done in the previous stage [30] [31]. The RGB input image, whose colour values lie in [0,255], is converted to grayscale and rescaled to [0,1]. To stay consistent with the original information, which has a resolution of 224*224 pixels, the grayscale data is resampled to the standard pixel size [32] [33] [34]. Once these formatting steps have been applied, the convolutional model accepts the image. Key point extraction was achieved with the CNN model whose architecture is shown in Fig. 6.

Figure 6. CNN architecture for Facial Key point Prediction

B4. Challenges
Despite its effectiveness, the application of neural networks to security is fraught with challenges. One prominent difficulty lies in choosing the network architecture. Researchers have observed that the number of layers in the model can affect accuracy negatively. [20] This highlights the importance of the model architecture for accuracy; hence, an appropriate architecture and tuning are required. It is also critical that organisations prone to financial abuse keep up with the latest neural network algorithms and solutions. [35] The constantly changing nature of fraud schemes will continue to pose a challenge for financial institutions, since criminals are always devising new means to carry out their scams. In other words, although neural networks offer very attractive tools for fraud detection, anomaly detection and similar tasks, adopting them requires an in-depth understanding of their capabilities, weaknesses, and latest developments in order to make them an effective weapon against crime.

B5. Conclusion
As we navigate an increasingly interconnected and digitised world, the integration of neural networks in security systems promises to fortify defences, thwart malicious activities, and safeguard critical assets. Through an exploration of their applications in fraud detection, anomaly detection in cybersecurity, and facial recognition for security purposes, this section illuminates the transformative potential of neural networks in shaping the future of security paradigms.

C. In Health Care
In recent years, technological advancements in health systems, and especially the integration of neural networks in healthcare, have revolutionized the world of medicine.
In this section, we will focus on the influence neural networks have had in healthcare, emphasizing the various neural network architectures that are commonly used in medicine, discussing the diverse range of their applications across various medical fields, and analysing the challenges of applying deep learning in healthcare.

C1. Architectures
This section describes the various neural network architectures adapted for healthcare applications. While Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) are extensively used in healthcare, this section will focus on Autoencoders (AE), Restricted Boltzmann Machines (RBM) and Long Short-Term Memory (LSTM).

i. Autoencoders (AE)
Autoencoders are deep learning models that illustrate the idea of unsupervised representation learning. They were initially introduced as a tool for pre-training supervised deep learning models when labeled data was scarce. They have nevertheless remained useful for unsupervised procedures such as phenotype discovery [36]. An autoencoder is divided into two main parts, the encoder and the decoder. The encoder includes the input layer, while the decoder includes the output layer [37]. The input and output layers have the same number of nodes, and the number of hidden units is smaller than that of the input or output layers, which is what achieves the purpose of the AE. Autoencoders are designed to encode the input data into a lower dimensional space [38]. After training on a dataset, an AE can transform the input data into a format that stores only the most important derived dimensions. In this way, autoencoders resemble standard dimensionality reduction techniques, for instance singular value decomposition (SVD) and principal component analysis (PCA). However, autoencoders have an important advantage for complicated problems, because the activation functions of each hidden layer apply nonlinear transformations. A single hidden layer could still be insufficient to represent all the data if the input is of high dimensionality.

Additionally, autoencoders can be stacked on top of each other to construct a Deep Autoencoder (DAE) architecture.

Numerous variants of the AE have been proposed to make the learned representations more robust and stable under small changes in the input pattern. One of these variants is the Sparse Autoencoder (SAE), which specializes in learning sparse representations of the input data. Sparse Autoencoders achieve sparsity by activating only a small subset of neurons during encoding, making the classes more separable. Vincent et al. [39] proposed another variant known as the denoising autoencoder. This method reconstructs the input from a version of the pattern to which noise has been added, forcing the model to focus solely on capturing the structure of the input. A similar concept was introduced by Rifai et al. [40] in their proposal of contractive autoencoders. However, instead of corrupting the training set with noise, this variant adds an analytical contractive penalty to the error function. Lastly, in Convolutional Autoencoders (CAE) [41] the weights are shared among all locations in the input to maintain spatial
locality and accurately process two-dimensional (2-D) patterns.

ii. Restricted Boltzmann Machine (RBM)
The Restricted Boltzmann Machine is another unsupervised deep learning architecture for learning representations of input data. Its aim is similar to that of autoencoders, but RBMs take a stochastic perspective by estimating the probability distribution of the input data. Because of this, they are frequently considered generative models, which aim to model the underlying process responsible for generating the data. Training an RBM usually involves stochastic optimization methods, such as Gibbs sampling, which gradually adjust the weights to minimize the reconstruction error. In an RBM, the visible and hidden units form a bipartite graph, which allows more effective and thorough training algorithms to be implemented. Restricted Boltzmann Machines serve as learning models in two main deep learning configurations that have been proposed in the literature: the Deep Belief Network (DBN) and the Deep Boltzmann Machine (DBM).

a. Deep Belief Network (DBN)
A DBN can be regarded as a combination of RBMs. In this structure, each subnetwork's hidden layer is connected to the visible layer of the succeeding RBM. In DBNs, the top two layers have undirected links, while the lower layers have directed links. Initially, a DBN goes through an efficient layer-by-layer greedy learning approach. This strategy is later adjusted based on the anticipated outputs. [44]

b. Deep Boltzmann Machines (DBM)
A DBM is a variant of the Deep Neural Network (DNN) within the Boltzmann class. The main distinction from Deep Belief Networks (DBN) lies in the presence of undirected links, which are conditionally independent, between all layers of the network. In a DBM, the posterior distribution given the visible units cannot be computed by directly maximising the likelihood, because it involves interactions among the hidden units.

Consequently, training a Deep Boltzmann Machine typically requires an algorithm based on stochastic maximum likelihood to improve the lower bound of the likelihood. Similarly to DBNs, DBMs utilize a greedy layer-wise training

forget gates [42]. These gates regulate the flow of information within the network. They control how much information is stored in or discarded from the memory cell at each time step, enabling the model to learn long-term dependencies more effectively. One of the main motivations behind LSTM's design is to address the vanishing gradient problem encountered in traditional RNNs. By introducing the memory cell and gating mechanism, LSTM can reduce the issue of vanishing gradients, allowing it to carry errors forward over extended sequences without the gradients diminishing to zero.

C2. Applications
This section explores the applications of neural networks in healthcare, focusing on three important areas: Medical Imaging, Medical Informatics, and Disease Diagnosis Prediction.

i. Medical Imaging
In modern medicine, automatic medical imaging analysis holds significant importance, since diagnosis based on the interpretation of images can be extremely subjective. Computer-aided diagnosis (CAD) offers an objective assessment of the underlying disease processes. Modelling disease progression is common in various neurological conditions like Alzheimer's and multiple sclerosis. It requires a detailed examination of brain scans based on multimodal data and precise mapping of brain regions. [51] Recently, CNNs have been rapidly gaining traction within the medical imaging research community due to their outstanding performance in computer vision and their ability to be parallelized with Graphics Processing Units (GPUs). [52]

One of the biggest challenges in Computer-Aided Diagnosis is the inconsistency in the intensity and shape of tumors, as well as the differences in imaging protocols even within the same imaging modality. In many cases, the intensity of pathological tissue may overlap with that of medically healthy samples. Additionally, non-isotropic resolution, Rician noise, and bias field effects in magnetic resonance images (MRI) cannot be handled automatically using simpler machine learning approaches. To tackle this complexity in the data, hand-designed features are extracted, and conventional machine learning methods are trained to classify them in an entirely
method during pretraining. The primary challenge they face different step. [48]
lies within their inference time complexity, which is
significantly higher than that of DBN, making the argument Deep learning provides the possibility to optimize and merge
optimization impractical for large training sets [44]. the extraction of relevant features with the classification
procedure. CNNs can learn a hierarchy of continuously more
iii. Long Short-Term Memory (LSTM) complex features, allowing them to directly operate on image
LSTM is a specialized recurrent neural network (RNN) patches centred on the abnormal tissue. Their versatility is
architecture that was designed to model their long-range displayed in various medical imaging applications, including
dependencies and their temporal sequences, more accurately the classification of interstitial lung diseases based on CT
than conventional RNNs [41]. In the typical architecture of images, tuberculosis manifestation from X-ray images, and the
LSTM networks, there is an input layer, a recurrent LSTM identification of neural progenitor cells. These models can also
layer, and an output layer, with the input layer being directly be tailored for specific tasks, such as body-part recognition.
connected to the LSTM layer. The recurrent connections Additionally, CNNs have been proposed for the segmentation
within the LSTM layer extend directly from the cell output of isointense brain tissues and brain extraction from
units to the cell input units, input gates, output gates, and multimodality MR images. [48]
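
The idea of a CNN operating directly on image patches and building a hierarchy of features, as described above, can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited studies: the random image, the 17 x 17 patch size, the 3 x 3 filters, and the helper names `extract_patch` and `conv2d_valid` are assumptions made purely for illustration, and the filters are untrained.

```python
import numpy as np


def extract_patch(image, center, size):
    """Return a size x size patch centred on `center` (row, col).

    The image is zero-padded so patches near the border keep their shape.
    `size` is assumed to be odd.
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    r, c = center
    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the patch starts at (r, c) in the padded image.
    return padded[r:r + size, c:c + size]


def conv2d_valid(x, kernel):
    """Plain 'valid' 2-D cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Rectified linear activation."""
    return np.maximum(x, 0.0)


rng = np.random.default_rng(0)
image = rng.random((64, 64))              # stand-in for one image slice
patch = extract_patch(image, center=(10, 50), size=17)

k1 = rng.standard_normal((3, 3))          # hypothetical, untrained filters
k2 = rng.standard_normal((3, 3))
features_1 = relu(conv2d_valid(patch, k1))        # first-level features
features_2 = relu(conv2d_valid(features_1, k2))   # features built on level 1

print(patch.shape, features_1.shape, features_2.shape)
```

In a trained network the filters would be learned from labelled patches and each layer would hold many filters; the sketch only shows how the second layer's features are computed from the first layer's output rather than from the raw pixels.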

While CNNs have dominated medical image analysis, other deep learning techniques have also been applied successfully. In a recent study, researchers proposed a stacked denoising autoencoder to diagnose malignant breast lesions in ultrasound images and pulmonary nodules in CT scans [53]. This approach surpassed traditional CAD methods, largely due to its automatic feature extraction and noise resilience. Moreover, it eliminated the need for image segmentation to obtain lesion boundaries. In another study, Shan et al. [54] introduced a stacked sparse autoencoder that detects microaneurysms in fundus images as part of a diabetic retinopathy strategy. This method learns distinctive features from pixel intensities alone, demonstrating how flexible autoencoder-based approaches are in medical image analysis.

In general, deep learning in medical imaging provides automatic discovery of object features and automatic exploration of the feature hierarchy. In this way, a relatively simple training process and systematic performance tuning can be applied, allowing deep learning approaches to improve on the state of the art.

Figure 7. MRI scans, CT scans and X-rays

ii. Medical Informatics
Medical Informatics focuses on analysing large-scale data within the healthcare context, aiming to improve clinical decision support systems and to simplify the assessment of medical data. Both aims serve quality assurance and improved access to healthcare services. Electronic health records (EHRs) are a rich source of patient information, including medical history, allergies, test results, laboratory and diagnostic exams, radiology images, medications, treatment plans, and diagnoses. Thorough extraction of this vast array of data could provide valuable insights into disease management [48].

Deep learning methods have been tailored to handle large and distributed datasets properly. The huge success of Deep Neural Networks (DNNs) lies in their ability to learn features and data representations hierarchically in both supervised and unsupervised modes. DNNs are also effective in processing multimodal information, simply by combining several components of their architecture. Consequently, it is not surprising that deep learning has rapidly been adopted in medical informatics research.

Various applications demonstrate the adaptability of deep learning in medical informatics. For example, one study highlighted its system's capability to predict the probability of patients developing certain conditions such as schizophrenia, cancer, and diabetes. Additionally, Futoma et al. [55] compared the performance of different models in predicting hospital readmissions based on an extensive EHR database. Despite the complexity involved in training DNN models, they consistently outperformed conventional methods in terms of prediction accuracy.
To handle temporal dependencies in EHR data, especially in multivariate time series obtained from intensive care monitoring systems, Lipton et al. [56] used a Long Short-Term Memory (LSTM) recurrent neural network. RNNs are preferred for their ability to capture sequential events, which improves the modelling of the time delays between the onset of emergency clinical events and the appearance of symptoms.

Deep learning offers extraordinary power and efficiency in extracting valuable insights from large-scale datasets, laying the foundations for personalized healthcare. However, appropriate initialization and tuning are important for preventing overfitting, especially given the challenges caused by noisy and sparse datasets. Addressing these challenges remains a priority in advancing deep learning algorithms in medical informatics. [48]

iii. Disease Diagnosis Prediction
Despite the increasing integration of machine learning in healthcare, research focuses primarily on the nervous system, cancer, and heart diseases, given their significant impact on mortality and quality of life. However, there is a noteworthy increase in research concerning chronic and infectious diseases, such as type 2 diabetes and inflammatory bowel diseases. Advancements in understanding clinical data and diseases through data-driven models have enabled early diagnosis of several conditions, turning these models into diagnostic systems [43].

• Chronic Kidney Disease (CKD) is one of the most significant health challenges globally. Recent statistics indicate that over 10% of the general population worldwide is affected by CKD [46]. Research on detecting CKD with machine learning algorithms has improved both the diagnostic procedure and the accuracy of its results. A hybrid model has demonstrated an impressive 99% accuracy in predicting CKD [47].

• In eye diseases, neural networks are invaluable in diagnosing conditions like diabetic retinopathy, as seen in the IDx-DR system from IDx Technologies. These models use medical imaging data, particularly retinal images, for accurate diagnosis. Additionally, supervised algorithms such as the random forest algorithm are used for predicting myopia by drawing insights from electronic health records. This algorithm predicted the development of adult myopia in children up to eight years in advance, with an accuracy ranging from 85% to 99% [43] [48].

• For cardiac irregularities, cloud-based artificial neural network algorithms like Cardio DL are used for diagnosing such conditions. These algorithms use medical image data, in this case Magnetic Resonance Imaging (MRI) scans of heart ventricles. They have displayed efficacy in
studying the functioning of heart ventricles and blood flow, contributing insights comparable to those of radiologists [43].

• In fractures, a machine learning model known as OsteoDetect is used for detecting radius fractures located away from the joint. This model uses wrist image data, specifically X-rays, for detection. It has enhanced the efficiency of orthopedic clinicians in fracture diagnosis and management [43].

C3. Challenges
Applying deep learning in healthcare shows promising results. However, these applications also face many challenges, which are summarized in the following subsections.

i. Volume of Data
Deep learning models are often considered computationally intensive because of the large number of parameters they require. Training these models effectively requires access to a wide range of clinical data. However, due to confidentiality and ethical concerns, many researchers face difficulties in obtaining medical records. Moreover, in underdeveloped countries, the lack of healthcare records and the insufficient training of healthcare workers further complicate the understanding of the relationships between diseases and symptoms. [38] [44]

ii. Temporality
Infections evolve continuously and in a non-deterministic manner. However, many existing deep learning models depend on static, vector-based inputs, which cannot capture temporal aspects. Developing deep learning approaches capable of handling temporal healthcare data is imperative and will require innovative solutions [38] [44].

iii. Data Quality
Data quality in healthcare differs from that of the well-structured datasets found in computing and information security. Electronic healthcare data often suffers from quality issues, which frequently lead to a paucity of usable data and to inconsistencies in the assessment of disease conditions. [38] [44]

iv. Privacy
One of the most crucial challenges in applying deep learning in healthcare is to understand whether neural network models are vulnerable to privacy or security threats. Artificial intelligence models and privacy-preserving data mining are subjects of extensive research. False positive classifications could cause patients unnecessary concern. Moreover, if poisoning attacks are detected, dataset clients may take appropriate actions, such as dismissing the results of the machine learning algorithm or attempting to identify and eliminate any malicious data from the dataset. [43] [44]

C4. Conclusion
Over the last decade, machine learning and pattern recognition have grown significantly. In this section, we explored the wide variety of neural network architectures that are commonly used in healthcare, together with their applications. These architectures have demonstrated great efficacy, but despite all the advancements, they still encounter many challenges, such as managing large volumes of data and ensuring privacy, that require ongoing research efforts to overcome. Nevertheless, the potential of neural networks to transform healthcare remains very promising, due to the constant innovation in these technologies.

IV. FUTURE DIRECTIONS

ANNs hold great promise for the future. They have the potential to evolve in numerous fields, improving our lives and achieving remarkable feats beyond our current imagination.

Pulsed or Spiking Neural Networks (SNNs) are considered to be the next generation of neural networks. SNN research started after data from neurobiological experiments made clear that biological neural networks communicate through pulses, using their timing to send information and perform calculations. SNNs model the spiking behaviour of neurons and how their membrane potential changes when influenced by external factors. They are a prime factor in the evolution of Computer Vision, where they are used for image classification, object detection, object tracking, object segmentation, and optical-flow estimation. SNNs are also employed in Robotic Control. They are used as a ‘brain’ for robots, which allows them to observe their surroundings and mimic the actions noted in their environment. For a robot to perform a task, such as operating a movement system inspired by a biological system, the network can be customised and adjusted by hand. [57] [58]

Multi/Infinite Dimensional Neural Networks (MDNNs) are a new model of ANNs that generalises One-Dimensional Neural Networks (RNNs, CNNs, etc.). Their theory is still under development but is based on the generalisation of gates from one-dimensional logic to multidimensional logic. The MDNN architecture is described by a Tensor State Space Representation, which is used to compute the output of each neuron. MDNNs use a BP algorithm tailored to neural networks with complex values, which relies on complex signum and sigmoid functions. MDNNs have found applications in the foundations of the three core concepts of cybernetics: (1) development of the unified theory of control, (2) communication, and (3) coding. They also have applications in the field of binary filters and the complex hypercube, which is a foundation of complex-valued neural associative systems. [57]

Forecasting methods in the context of NNs have both current limitations and room for future innovation. There is a growing interest in the research community in exploring the application of probabilistic forecasting to reduce the uncertainty of NN predictions. Furthermore, multivariate forecasting will be necessary for complex scenarios since products are becoming more diverse, emphasising the need to explore multiple seasonality models, especially for high-frequency big data
contexts. Since RNNs were inefficient in modelling seasonality, researchers explored alternative approaches, such as combining CNN filters with customised attention algorithms. Temporal convolutional networks (TCNs), an advanced type of CNN architecture, provide an efficient training process by combining convolutions with residual connections, resulting in improved efficiency for forecasting tasks. [59]

Future projects that employ ANNs aim to address challenges and advance capabilities in programmable network devices, such as hardware offloading, data plane virtualization, NN orchestration, incremental and online learning, as well as distributed and federated learning. [60]

To enhance the accuracy and efficiency of ANNs in the future, we can increase the number of hidden layers and vary the training and learning rules applied within them. ANN technology will advance over time, with most applications that use it becoming more advanced, while researchers invent new training methods and network architectures. [57]

V. CONCLUSION

ANNs are one of the greatest inventions to emerge from the combination of Computer Science and Neuroscience. Numerous fields, including Computer Science, Security, and Health Care, have benefited from their contributions, resolving many challenges in the process. Their ability to learn from data and adapt to new information makes them capable of solving complex problems, which is beneficial for most fields. Although they offer solutions to many problems, challenges arising from the complexities inherent in their implementation still await resolution. As technology and science advance, we acquire a new understanding of the human brain, which originally inspired ANNs. This leads to the creation of architectures and training methods for ANNs that are more efficient and may solve problems encountered by previous models. [61]

REFERENCES
[1] R. Qamar, B. A. Zardari, “Artificial Neural Networks: An Overview”, ResearchGate, Mesopotamian Journal of Computer Science, Aug 2023.
[2] P. J. Denning, “The Science of Computing: Neural Networks”, American Scientist, Sigma Xi, The Scientific Research Honor Society, Sep-Oct 1992
[3] I. A. Basheer, M. Hajmeer, “Artificial neural networks: fundamentals, computing, design, and application”, sciencedirect.com, vol. 43 no. 1, Dec. 2000
[4] Q. Liu, Y. Wu, “Supervised Learning”, researchgate.net, Jan. 2012
[5] Y. Tishan, “Understanding the Difference Between Supervised and Unsupervised Learning Techniques”, Sep. 2023
[6] B. M. Devassy, S. George, P. Nussbaum, “Unsupervised Clustering of Hyperspectral Paper Data Using t-SNE”, researchgate.net, Journal of Imaging 6(5):29, May 2020
[7] K. Sivamayil, E. Rajaseker, B. Aljafari, S. Nikolovski, S. Vairavasundaram, I. Vairavasundaram, “A systematic study on Reinforcement Learning Based Applications”, mdpi.com, Feb. 2023
[8] M. H. Sazli, “A brief review of feed-forward neural networks”, researchgate.net, May 2015
[9] V. Srilakshmi, G. U. Kiran, M. Mounika, A. Sravanthi, N. V. K. Sravya, V. N. S. Akhil, M. Manasa, “Evolving Convolutional Neural Network with Meta-Heuristics for Transfer Learning in Computer Vision”, sciencedirect.com, 2023
[10] Z. C. Lipton, “A Critical Review of Recurrent Neural Network for sequence Learning”, researchgate.net, Jun. 2015
[11] C. S. K. Dash, A. K. Behera, S. Dehuri, S-B. Cho, “Radial basis function neural networks: a topical state-of-the-art survey”, 2016
[12] C. Wang and L. Wang, “Artificial Neural Network and Its Application in Image Recognition”, Journal of Engineering Research and Reports, Volume 24, Issue 2, Feb. 2023
[13] X. Li and X. Lv, “Research on Image Recognition Method of Convolutional Neural Network with Improved Computer Technology”, Journal of Physics: Conference Series 1744, 2021
[14] F. L. Borchardt, “Neural Network Computing and Natural Language Processing”, CALICO Journal, Jun. 1988
[15] R. G. Franklin, A. R. Doni, D. Poornima, S. I. S. Prabu, “The Use of Recurrent Neural Networks in the Optimization of Computer Science Algorithms”, IEEE International Conference on Emerging Research in Computational Science, 2023
[16] Z. Yan, “Research and Application on BP Neural Network Algorithm”, IEEE International Industrial Informatics and Computer Engineering Conference, 2015
[17] L. Liu, “Computer Network Routing Optimization Algorithm Based on Neural Network Mode”, IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Apr. 2023
[18] A. K. Swain, S. K. Jayasingh, “Neural Network in Fraud Detection”, Conference Paper, Aug. 2011
[19] M. L. Gambo, A. Zainal, M. N. Kassim, “A Convolutional Neural Network Model for Credit Card Fraud Detection”, Inter. Confer. on Data Science and Its Applications (ICoDSA), 2022
[20] B. F. Murorunkwere, O. Tuyishimire, D. Haughton, J. Nzabanita, “Fraud Detection Using Neural Networks: A Case Study of Income Tax”, MDPI, May 2022
[21] S. Yuan, X. Wu, J. Li, A. Lu, “Spectrum-based deep neural networks for fraud detection”, Jun. 2017
[22] Z. Zhang, X. Zhou, X. Zhang, L. Wang, P. Wang, “A Model Based on Convolutional Neural Network for Online Transaction Fraud Detection”, Aug. 2018
[23] M. Lu, Z. Han, Z. Zhang, Y. Zhao, Y. Shan, “Graph Neural Networks in Real-Time Fraud Detection with Lambda Architecture”, Oct. 2021
[24] P. Podder, S. Bharati, M. Rubaiyat Hossain Mondal, P. Kumar Paul, U. Kose, “Artificial Neural Network for Cybersecurity: A Comprehensive Review”, 2020
[25] T. A Tang, L. Mhamdi, D. McLernon, S. Ali Raza Zaidi, M. Ghogho, “Deep Recurrent Neural Network for Intrusion Detection in SDN-based Networks”, IEEE International Conference on Network Softwarization (NetSoft 2018) - Technical Sessions, 2018
[26] F. Jiang, Y. Fu, B. B. Gupta, Y. Liang, S. Rho, F. Lou, F. Meng, Z. Tian, “Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security”, IEEE Transactions On Sustainable Computing, April-Jun. 2020
[27] Y. Said, M. Barr, H. Eddine Ahmed, “Design of a Face Recognition System based on Convolutional Neural Network (CNN)”, Engineering, Technology & Applied Science Research, 2020
[28] S. Kanithan, N.A. Vignesh, E. Karthikeyan, N. Kumareshan, “An intelligent energy efficient cooperative MIMO-AF multi-hop and relay based communications for Unmanned Aerial Vehicular networks”, Comput. Commun., 2020
[29] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “On the effectiveness of deep transfer learning for Bangladeshi meat based curry image classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022
[30] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “Fine Tuned convolutional neural networks for Bangladeshi vehicle classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022
[31] S.T. Suganthi, M. U. A. Ayoobkhan, N. B., K. Venkatachalam, H. Štěpán, and T. Pavel, “Deep learning model for deep fake face recognition and detection”, PeerJ Computer Science, 2022
[32] T. Guo, J. Dong, H. Li, Y. Gao, “Simple convolutional neural network on image classification”, IEEE 2nd International Conference on Big Data Analysis (ICBDA), IEEE, Mar. 2017
[33] F. Sultana, A. Sufian, P. Dutta, “Advancements in image classification using convolutional neural network”, Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), IEEE, Nov. 2018
[34] Y. Pei, Y. Huang, Q. Zou, X. Zhang, S. Wang, “Effects of image degradation and degradation removal to CNN-based image classification”, IEEE Trans. Pattern Anal. Mach. Intel., 2019
[35] A. C. Navarrete, A. C. Gallegos, “Neural Network Algorithms for Fraud Detection: A Comparison of the Complementary Techniques in the Last Five Years”, 2021
[36] M. Cabrera-Bean, V. J. Santos, A. R. Llorach, S F. Bertolin, J. Vidal, and C. Violan, “Autoencoders for health improvement by compressing the set of patient features”, IEEE Engineering in Medicine & Biology Society, Sep. 2018
[37] B. Shickel, P. J. Tighe, A. Bihorac, and P. Rashidi, “Deep ehr: A survey of recent advances in deep learning techniques for electronic health record (ehr) analysis”, IEEE Journal of Biomedical and Health Informatics, Sep. 2018
[38] I. Zion, S. Ozuomba, P. Asuquo, “An Overview of Neural Network Architectures for Healthcare”, IEEE International Conference in Mathematics, Computer Engineering and Computer Science, Apr. 2020
[39] P. Vincent, H. Larochelle, Y. Bengio, P. A. Manzagol, “Extracting and Composing Robust Features with Denoising Autoencoders”, researchgate.net, University of Montreal, Jan. 2008
[40] S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio, “Contractive Auto-Encoders: Explicit Invariance During Feature Extraction”, scholar.google.com, ICML, Jan. 2008
[41] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modelling”, Cornell University, Sep. 2014
[42] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition”, Cornell University, Sep. 2014
[43] A. Pandit, A. Garg, “Artificial Neural Network in Healthcare: A Systematic Review”, IEEE International Conference on Cloud Computing, Data Science & Engineering, Mar. 2021
[44] S. K. Pandey, R. R. Janghel, “Recent Deep Learning Techniques, Challenges and Its Applications for Medical Healthcare System: A Review”, Department of Information Technology, India, Jan. 2019
[45] C. P. Kovesdy, “Epidemiology of Chronic Kidney Disease: an update 2022”, University of Tennessee Health Science Center, Memphis, USA, Apr. 2022
[46] H. Khalid, A. Khan, M. Z. Khan, G. Mehmood, M. S. Qureshi, “Machine Learning Hybrid Model for the Prediction of Chronic Kidney Disease”, National Library of Medicine, Mar. 2023
[47] S. M. Li, M. Y. Ren, J. Gan, S. G. Zhang, M. T. Kang, H. Li, D. A. Atchison, J. Rozema, A. Grzybowski, N. Wang, “Machine Learning to Determine Risk Factors for Myopia Progression in Primary School Children: The Anyang Childhood Eye Study”, National Library of Medicine, Apr. 2022
[48] D. Ravi, C. Wong, F. Deligianni, M. Berthelot, J. A. Perez, B. Lo, G. Z. Yang, “Deep Learning for Health Informatics”, IEEE Journal of Biomedical and Health Informatics, Dec. 2016
[49] F. Li, L. Tran, K. H. Thung, S. Ji, D. Shen, and J. Li, “A robust deep model for improved classification of ad/mci patients”, IEEE J. Biomed. Health Inform, Sep. 2015.
[50] D. Kuang and L. He, “Classification on ADHD with deep learning”, International Conference on Cloud Computing and Big Data, Nov 2014.
[51] G. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets”, Neural computation, Aug 2006.
[52] M. Havaei, N. Guizard, H. Larochelle, and P.-M. Jodoin, “Deep Learning Trends for Focal Brain Pathology Segmentation in MRI”, July 2016.
[53] J. Z. Cheng et al., “Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans”, https://www.nature.com/srep/, Apr. 2016
[54] J. Shan and L. Li, “A Deep Learning Method for microaneurysm detection in fundus images”, IEEE International Conference on Connected Health, Jun. 2016
[55] J. Futoma, J. Morris, and J. Lucas, “A comparison of models for predicting early hospital readmissions”, Journal of Biomedical Informatics, Aug. 2015.
[56] Z. C. Lipton, D. C. Kale, C. Elkan, and R. C. Wetzel, “Learning to diagnose with LSTM recurrent neural networks”, Cornell University, Nov. 2015
[57] S. Lakra, T. V. Prasad, G. Ramakrishna, “The Future of Neural Networks”, ResearchGate, 6th National Conference - Computing For Nation Development, INDIA, Feb 2012.
[58] K. Yamazaki, V. Vo-Ho, D. Bulsara, and N. Le, “Spiking Neural Networks and their Applications: A Review”, MDPI, Brain Sciences, Jun 2022.
[59] H. Hewamalage, C. Bergmeir, and K. Bandara, “Recurrent Neural Networks for Time series Forecasting: Current status and future directions”, ScienceDirect, Faculty of Information Technology, Monash University, Melbourne, Australia, Jan-Mar 2021.
[60] F. D. Rossi, M. C. Luizelli, A. Lorenzon, and M. Caicedo, “In-Network Neural Networks: Challenges and Opportunities for Innovation”, ResearchGate, IEEE Network, Nov-Dec 2021
[61] S. Agrawal, J. Agrawal, “Neural Network Techniques for Cancer Prediction: A Survey”, sciencedirect.com, 19th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, Dec 2015.
