ANN and Its Applications
Alexandros Vasileiadis, Eirini Alexandrou, Lydia Paschalidou, Maria Chrysanthou, Maria Hadjichristoforou
Abstract—This paper focuses on Artificial Neural Networks (ANNs) and their applications. It first
explores the core concepts of a neural network (NN), including its inspiration, basic structure, and
training process, along with an overview of the most commonly used models. The paper then examines
three fields in which ANNs play an important role: (1) Computer Science, (2) Security, and (3) Health
Care. These fields are singled out because they have a great impact on various aspects of society. For
each field, the paper discusses how NNs have been used to solve problems, the architectures employed,
notable applications of NNs within the domain, and the challenges raised by NN implementation.
Lastly, it discusses the future directions of ANNs, exploring potential advancements in architecture,
models, and applications across diverse domains.

Index Terms—Artificial Neural Networks, Neural Networks, Core Concepts, Training, Models,
Applications, Computer Science, Security, Health Care, Challenges, Architecture, Future Direction

I. INTRODUCTION

An Artificial Neural Network (ANN) is a machine learning model designed to emulate human
decision-making processes by simulating how biological neurons work. ANNs consist of interconnected
layers of units through which data flows in an orderly sequence. Specifically, the units can be
categorised into three kinds of layers: (1) an input layer, (2) a hidden layer, and (3) an output
layer. Even though ANNs are a simplified version of how our brain works, they are adept at learning to
solve difficult problems through training, using experiments and observations. This means they are
proficient at comprehending intricate patterns and connections. [1]

In particular, ANNs function by manipulating inputs and adjusting connections between neurons. They
perform multiple pattern recognition and mapping tasks. They can rebuild stored patterns from partial
or noisy inputs, associate a given pattern with another pattern in a temporal sequence, create new
patterns for complex problems, and group similar patterns into clusters by creating new pattern
representatives for them. [2]

There are various types of neural networks, including Feedforward Neural Networks (FNN),
Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). ANNs find applications
across various domains, most of which use feedforward ANN architectures and the backpropagation (BP)
training algorithm. Despite the numerous training techniques, establishing an optimal ANN for a
particular application remains a notable challenge. This challenge is underlined by compelling
evidence from both biological and technical perspectives, suggesting that the effectiveness of an ANN
in manipulating knowledge is affected by its design. [1]

This research will focus on neural network applications in computer science, security, and
healthcare. It will explore how ANNs can be used in these fields, delve into their impact and
challenges, and discuss their potential future.

Neural networks are used in computer science for problem-solving across various disciplines. Through
algorithms, they can execute various tasks, including image recognition, natural language processing
(NLP), machine translation, and speech recognition, and they help with developing language
translation systems. Building on their success in Computer Science, ANNs extended their applications
into the Security sector. Various types of ANNs, such as Convolutional Neural Networks (CNNs), Graph
Neural Networks (GNNs), and Recurrent Neural Networks (RNNs), play an important role in addressing
security matters such as fraud detection, cybersecurity threats, and facial recognition. Moreover,
ANN architectures have expanded into the realm of healthcare. Capitalising on their abilities, they
can help analyse medical images such as MRIs, CT scans, X-rays and ultrasounds, which supports early
clinical diagnosis. ANNs can also predict epidemic outbreaks, organise patients' health records and
personalise their medicine.

II. CORE CONCEPTS

The concept of the ANN takes its inspiration from the human brain, particularly its building blocks,
the neurons. The human brain is a powerful tool that can perform tasks such as thinking, recognizing,
and solving hard and complex problems, and in all of that the neuron - an electrically excitable
cell - plays a big part. Our brain consists of an estimated 90 to 100 billion neurons, where each
neuron is connected with 1 to 10 thousand others, which makes up to 10^15 interconnections. Neurons
communicate by sending electrical and chemical signals to their neighbours. Signals are sent through
the neuron's branch, the axon, which further extends into smaller segments called collaterals. At the
end of these collaterals, junctions known as synapses form connections with neighbouring neurons,
allowing the neuron to transfer a signal. Meanwhile, on the receiving end, the neuron’s dendrites
receive those signals via the synapses and merge
them within the soma, the neuron body. Neurons with stronger synaptic connections have a greater
impact on each other. This massive web of billions of interconnected neurons working together allows
the brain to achieve its amazing abilities. [3]

A. Structure of a neural network
In 1943, McCulloch and Pitts, often regarded as founding fathers of AI, developed the first
mathematical model of a neuron. It rests on an analogy between a nerve cell and an artificial neuron,
in which dendrites and axons symbolise the input and output of a neuron, synapses portray the weights
of a neuron, and the activity in the soma represents the threshold.

Building on the work of McCulloch and Pitts, Rosenblatt introduced the concept of the perceptron in
1958. This breakthrough marked the early artificial neural network structures, which can monitor,
learn, and operate by imitating human-like learning through example. It is an algorithm that gives
neurons the ability to learn while processing information efficiently, which helps them to learn
independently.

The perceptron is an algorithm built on numerical inputs together with weights and a bias. It
computes a weighted sum of the inputs, each multiplied by its weight, and adds the bias to that sum.
An activation function is then applied to the result to compute and deliver the final output value.

[...] neuron, two of which are also located in the multilayer perceptron architecture (MLP). These
intermediary layers, known as hidden layers, hold the hidden nodes, inside which the information
transformation occurs without direct access to the external environment. Following the same pattern
of neuron dynamics, the hidden neurons examine the information transmitted by the input nodes and
send it to the output layer. However, an MLP has a learning behaviour that is far more complex than
that of a single perceptron. Despite the increased complexity, the learning process is built upon the
simple perceptron algorithm, and the MLP is therefore able to handle non-linearities more
efficiently. [3]

Figure 2. MLP showing input, hidden and output layers and nodes
with feedforward links.
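
The perceptron computation described in this section (a weighted sum of the inputs plus a bias, passed through an activation function) and the feedforward flow from input nodes through hidden nodes to output nodes can be illustrated with a minimal Python sketch. The function names, the hand-picked weights, and the choice of step and sigmoid activations are assumptions made for illustration only; they are not taken from the cited sources.

```python
import math


def step(z):
    """Threshold activation: output 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Smooth activation commonly used in multilayer perceptrons."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


def mlp_forward(inputs, hidden_layer, output_layer):
    """Feedforward pass: input nodes -> hidden nodes -> output nodes.

    Each layer is a list of (weights, bias) pairs, one pair per neuron.
    """
    hidden = [neuron(inputs, w, b, sigmoid) for w, b in hidden_layer]
    return [neuron(hidden, w, b, sigmoid) for w, b in output_layer]


# A single perceptron acting as a logical AND gate (weights chosen by hand).
and_weights, and_bias = [1.0, 1.0], -1.5
and_outputs = [
    neuron([a, b], and_weights, and_bias, step) for a in (0, 1) for b in (0, 1)
]

# A tiny MLP with two inputs, two hidden nodes and one output node
# (arbitrary illustrative weights, not trained values).
hidden_layer = [([0.1, 0.2], 0.0), ([0.3, -0.4], 0.1)]
output_layer = [([1.0, -1.0], 0.0)]
mlp_output = mlp_forward([0.5, -1.0], hidden_layer, output_layer)
```

Here the single perceptron only fires when both inputs are 1, while the small MLP shows how the hidden nodes transform the inputs before the output layer sees them. In practice the weights would be learned by a training algorithm such as backpropagation rather than set by hand.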
[...] where $y_j$ is the actual output for neuron $j$, $\hat{y}_j$ is the predicted output for neuron
$j$, and $f'(z_j)$ is the derivative of the activation function applied at the output of neuron $j$,
$z_j$.

c. Game Development and Strategy Planning
BP networks are used to build stronger, more adaptable AIs in the gaming industry. By learning from
large databases of people playing, the network can predict human behaviour and offer a challenge or
counteraction in the game without the AI “cheating”. BP networks also assist in real-time decisions
within the game, helping optimise engine performance via predictive modelling.

d. Autonomous vehicles
Autonomous vehicles are one of the most critical areas where BP neural networks have led to
significant technological advancement. BP neural networks take input from sensors and cameras fitted
on a vehicle and make real-time decisions about how to navigate and, at a higher level, recognize
road signs and avoid obstacles such as other vehicles, facilitating safe driving. Because BP networks
can learn from different circumstances and scenarios, the innovation and improvement of autonomous
driving would be far more difficult without this technique.

e. Robotics
Another application in which BP neural networks have expanded the frontier of computational science
is robotics, particularly for robots performing complex tasks in which the environment changes
continuously. Assembly work, various missions, hazard sensing, space exploration and even surgery are
examples of such tasks: these robots interact with their surroundings in real time and, through a
neural network model, gain experience that lets them execute tasks with improved precision and
efficiency.

The above scenarios are just a few examples that indicate how BP algorithms can be used to model
intricate patterns and make informed decisions. Given their history of innovation and the efficiency
of their models, BP models are likely to remain a cornerstone of modern computational sciences. [16]

A5. Computer Network Routing Optimization Algorithm
The development of Internet technologies has not only changed the way we live, but also led to the
emergence of a particularly urgent problem: the need for high-quality network infrastructure. Due to
the steady increase in network requirements, one of the most pressing concerns is the optimization of
the network routing process. Traditional technologies rarely cope with network complexities, which
forces researchers to seek new approaches such as Artificial Neural Networks that optimise routing.

i. The Challenge of Network Routing
Present-day computer networks are complex systems in which packets pass through several nodes before
reaching their final destination. As network traffic grows, quality routing becomes a necessity,
because otherwise congestion and data loss can occur. At the same time, the difficulty of the routing
problem, which depends on factors such as network topology and load, suggests that a dynamic solution
will be optimal.

ii. ANN-Based Optimization Approach
This section describes a new method using Artificial Neural Networks to optimise network routing. It
uses the ANN learning algorithm, which, in turn, operates on the principle of artificially
reproducing the brain’s structural units. Such networks allow simulating the joint work of many
neurons and choosing between several options based on historical data.

iii. Methodology
The designed ANN model has a layered network structure in which each node represents a decision point
during the routing of data packets along the node paths. The network is trained on samples of network
conditions and the corresponding routing selections, which enables it to discover the most effective
routing patterns. Figure 4 below portrays the neural network’s architecture, which consists of input,
hidden, and output layers. The neural network implements the decision-making process based on these
layers.

Figure 4. Architecture of the Neural Network for Routing Optimization

The ANN employs a probabilistic model to dynamically form network connections, as described by the
following equation:

where $\Pi(i)$ is the probability that a new node will connect to an existing node $i$, $k_i$ is the
degree of node $i$, and $k_j$ represents the degree of node $j$. This formula helps the ANN in
predicting the most efficient pathways by optimising the network topology based on the likelihood of
node connections. Additionally, the system delay model used to minimise latency and optimise routing
is given by:

where $TC$ is the total system delay, $T$ represents the transmission delay, and $tb$ represents the
delay experienced due to inadequate bandwidth availability, signifying the waiting time for data
transmission. $P$ symbolises the delay induced by queuing at the Mobile Edge Computing (MEC)
infrastructure, and $t$ signifies the delay attributed to task
execution by the MEC server. These components are crucial for evaluating the efficiency of different
routing paths and are integral to the ANN’s decision-making process.

iv. Simulation Results
A comparative assessment against traditional network routing demonstrates the much higher efficacy of
the proposed model. According to the results, the ANN model reduces packet loss and delay to nearly
zero and does not require human intervention, which can significantly increase its effectiveness in
terms of routing.

v. Applications and Future Work
The ANN-based routing optimization model has extensive prospects in that it can be applied to both
small corporate networks and international Internet backbones. Its built-in scalability allows the
algorithm to be used for modern dynamic telecommunications. Future work will involve reducing the
dependency on human-provided data and speeding up the network's response to changing conditions. [17]

B. In Security
The adaptability and efficacy of ANNs have rendered them indispensable in safeguarding critical
systems, combating fraudulent activities, and enhancing security measures across diverse domains.
This section delves into the multifaceted applications of neural networks in security, focusing on
three pivotal areas: fraud detection, anomaly detection in cybersecurity, and facial recognition for
security purposes. The integration of neural networks in these realms not only augments traditional
security measures but also empowers organisations to proactively mitigate risks and fortify their
defences against evolving threats.

Neural networks have a great deal to offer in the security field. By focusing on the input-output
relationship and on a deep learning process inspired by the human brain, they acquire knowledge by
learning and store it within the connection strengths between neurons, known as synaptic weights.
Unlike traditional fit-for-purpose linear models, neural networks adapt flexibly to both non-linear
and linear correlations without using intermediate variables to model reality. This capability is
vital in cybersecurity, where risks develop at great speed and display nonlinear characteristics.
Through the use of networks that simulate actual biological systems, security infrastructures can
naturally adapt to new menaces, grasping patterns and anomalies in real time to strengthen defences
and hedge against risks that might occur. [18]

B1. Fraud Detection
Financial fraud continues to be an enduring threat in the financial sector, and it severely affects
individuals, institutions, and economies. Deep neural networks are known for autonomously learning
complex patterns and representations from raw data, so this technique could be very effective in
addressing the issue. The performance of neural networks in fraud detection is not just a
representation of their aptitude for detecting sophisticated patterns in large data sets but a
profound illustration of their excellence. Neural networks built on transactional data, user
behaviour, and historical patterns are capable of spotting anomalous activities that could indicate
fraudulent behaviour. They are also capable of adjusting to fresh cases of fraud and learning from
these new instances. [18] As a result, the anti-fraud mechanism constantly improves its detection
algorithms, and the efficacy of fraud prevention technologies goes up. It also removes the need for
labour-intensive manual feature engineering, which can be very time-consuming and domain specific. In
addition, deep learning approaches are good at handling multidimensional data and finding complex,
hidden relationships, which lets the system identify the subtle and covert signs that characterise
fraudulent behaviour. Different deep learning architectures, such as convolutional neural networks
(CNN) and graph neural networks (GNN), have recently been used to detect financial fraud. These
models are applied in different types of financial systems, for example to detect credit card fraud,
insurance fraud, and money laundering. Most notably, deep learning models have consistently
outperformed classic approaches, with a reported success rate of around 99%. [19] [20]

i. CNN in Fraud detection
The Convolutional Neural Network (CNN) is a popular deep learning algorithm that shows good results
in finding unobservable features of dubious transactions and helps to avoid overfitting of the model.
The CNN algorithm has three main types of layer: the convolution layer, the pooling layer, and the
fully connected layer. Normally, the role of the convolution and pooling layers is to perform feature
extraction. The third, the fully connected layer, maps the extracted features to the final output,
such as a classification. [19] [21]

Figure 5. Overall Network Structure

The network structure is designed so that the analytical tool can be applied to network transaction
data and criminal financial activities can be identified in a short time. In essence, it has an input
feature sequencing layer, a group of four convolutional layers interlaced with pooling layers, and a
fully connected layer (Fig. 5). The first of these is the feature sequencing layer, which processes
the input features according to their order. Different effects on the model accumulate whenever
feature inputs in different orders are convolved. The
filtering function of the convolutional layer is to detect the local features of the input data; in
this context, the model benefits from new features computed from the input features. These new
attributes are not defined physically but are certainly useful for data modelling. Pooling combines
the features from adjacent areas into a single higher-level feature, which is more efficient and uses
less data. The final layer, which is fully connected, is responsible for classification of stocks.
The number of nodes in each layer of a neural network varies from one input to another. The trained
network obtains its optimised model parameters from the training data. These optimised parameters can
then be applied directly to the detection of real trading data in real time. [22]

ii. GNNs applied for financial fraud detection
Graph neural networks (GNN) are attracting a growing pool of users who discover their utility in
learning from graphs. The structure of the graph naturally supports strong problem-solving and
modelling of complex relationships between nodes through message passing and aggregation. [23]
In the financial fraud detection scenario, the graph is usually made up of nodes that refer to
accounts and edges that represent transactions. Every node is a financial account, for example a bank
account, a credit card account, or any financial institution involved in a transaction. Nodes can
carry attributes such as type of account, transaction history, current balance, account owner
information, and other data relevant to fraud detection. Each edge between two nodes corresponds to a
financial exchange between two accounts. The edge label gives the amount transferred from account A
to account B. Edges may also carry weighted attributes representing the quantities transferred or
annotations of the transaction (e.g. the transaction mechanism used or the sums transferred).

Graph neural networks achieve this by using message passing procedures that disseminate information
across the network edges, processing it in a way that captures the graph topology and the
relationships between nodes. The GNN assigns fraud scores to nodes or transactions by performing
graph embedding operations on the financial transaction graph and learning its features. These
suspicion scores quantify how exposed an account or transaction is to fraud. The GNN predicts fraud
scores, and a threshold separating ordinary from suspicious transactions is applied. Transactions
with fraud scores above the threshold are flagged and investigated more deeply. The threshold may be
computed from the data distribution to limit both false positives and false negatives, while
respecting the ranges suggested by domain knowledge. Cooperation between automated detection by the
GNN and the expertise of professional human analysts gives any financial institution the means to
tackle financial fraud efficiently and effectively by being more proactive.

B2. Anomaly Detection in Cybersecurity
The cybersecurity domain is nowadays challenged by increasingly sophisticated, non-trivial attacks.
This is why research into defence mechanisms is now booming. Traditional detection systems that are
designed to work only with known attack templates are not effective enough against new threats or
changing attack strategies, which has prompted a search for more dynamic and smarter solutions.
Machine learning techniques, including neural networks, are a promising option for strengthening
intrusion detection systems, since such systems can learn and react to new threats in real time.
Through the application of data science and analytics, cybersecurity experts can extract more useful
information from vast data sets, which makes defence mechanisms more effective against continually
evolving cybersecurity threats. Neural networks provide an alternative solution through their ability
to observe the smallest deviations from well-established norms. Shifting from reactive to proactive
detection, neural networks automatically process historical information and datasets containing
malicious behaviour patterns, and are thus more capable of identifying and mitigating cyber threats
in real time.

i. RNN in cybersecurity
A recurrent neural network (RNN) is a type of neural network whose connections between nodes contain
loops, forming a directed graph. This structure allows it to capture the dynamic behaviour of a
sequence. Its internal memory processes the sequence of activations, so information can flow both
forwards and backwards through feedback loops in the network. Gradients are, however, more
complicated to deal with when training RNNs. Nevertheless, progress in architecture and training has
yielded several RNN variants that are somewhat easier to train. Long short-term memory (LSTM), an
improved form of RNN, was proposed in 1997 by Hochreiter and Schmidhuber. LSTM marked the start of
major advances in speech recognition and has outperformed some traditional models in niche
applications. It addresses a key drawback of RNNs, their short-term memory. LSTM units are connected
to the previous time step, and the configuration of units responsible for accumulating information is
called a memory cell [24] [25]. In Deep Learning Based Multi-Channel Intelligent Attack Detection for
Data Security [26] the authors recommend the following algorithm as seen below:
Algorithm 1: Training Neural Network
-----------------------------------------------------------
Input: Features X extracted from the training
dataset with labelled information

Initialization:
1. for channel = 1 to N do
2.     Train LSTM-RNN model
3.     Save the LSTM-RNN model as a classifier c
4. end for

Return: c

The detection algorithm is described by pseudocode, given as Algorithm 2.

Algorithm 2: Attack Detection
-----------------------------------------------------------
Input: Feature X extracted from test dataset with
labelled information

Initialization:
1. for channel = 1 to N do
2.     Load LSTM-RNN model as a classifier
3.     Get the result vector R of the classifier
4. end for

Vote to get the majority element v:
1. for r in R do
2.     Vote to get the majority element v
3. end for

Return: v

Algorithm 1 presents the process for training a Long Short-Term Memory Recurrent Neural Network
(LSTM-RNN) model. It takes the features X from the labelled training dataset as input. The algorithm
loops over all channels in the dataset, trains an LSTM-RNN model for each channel, and saves the
trained model as a classifier. It then returns the classifier c, which can make predictions.
Algorithm 2 explains the detection scheme, which uses the LSTM-RNN classifiers learned by
Algorithm 1. It takes as input the features X extracted from the test dataset, including the labelled
information. For each channel, the algorithm loads the corresponding LSTM-RNN model as a classifier
and obtains the result vector R by evaluating the classifier on the test dataset. It then goes
through all elements of R, applying a voting method to determine the majority element v, and returns
v as the result of the attack detection process. [26]

B3. Facial Recognition for Security Purposes
Facial recognition is the most critical function of video surveillance systems: it makes it possible
to determine whether an image is that of a particular person in a scene, mostly monitored through a
network of cameras. Such application has widespread use in border security, access control systems,
monitoring and law enforcement. This helps in addressing security related issues while keeping
privacy and accuracy a top priority. Scientists' growing interest in the use of faces in photos stems
both from its practical applications and from the challenge it presents to artificial vision
algorithms. Specialists have to be ready to deal with the extremely high diversity of facial
features, as well as the many different parameters of the image (angle, lighting, hairstyle, facial
expression, background, etc.). Currently, the most widely recognized face recognition methods utilise
Convolutional Neural Networks. This section describes the architecture of a Deep Learning model that
improves on the existing best programs in terms of accuracy and processing time.

i. CNN in Facial Recognition
The network is composed of two convolutional layers, then a fully connected layer and finally a
classification layer. Every convolution layer is followed by an activation layer and a pooling
operation. In addition, two regularisation techniques are added after each convolution layer: batch
normalisation and dropout. The fully connected layer is also followed by dropout, which reduces
overfitting and improves the performance of the proposed neural network model. [27]

For image processing, or any sort of prediction associated with images, a convolutional neural
network is the first choice. A standard convolutional neural network consists of a number of simple
layers, which may be repeated n times in the network depending on the topic to be predicted [28]
[29]. The first layer is a convolutional layer populated with filters that are applied to the pixels
of the image.

Usually, the image should be larger than the filter applied to it. The filter moves across the image
in the horizontal and vertical directions, one step at a time, and the values of the convolutional
layer are calculated with a dot product. The resulting convolutional layer outputs are then passed to
the next layer, called the pooling layer. In this step, the dimensions of the values taken from the
previous layer are reduced, leaving the features extracted to better describe the image. This is done
with a pooling filter that scans across the output of the previous layer. Depending on the topic to
be predicted, a convolutional layer and successive pooling layers are applied repeatedly to produce
the desired output. After the final pooling stage, the output is flattened. This flattened output
goes to the next layer, which is fully connected, where the prediction is made; the predicted output
is available at the last layer. In the present study, an exhaustive search of the image data produces
around 68 key points, which is the main asset of the study. The overall structure of the CNN model is
shown in Fig. 6. The image will be pre-processed for the proposed CNN
architecture, which has not been done in the previous stage [30] [31]. The RGB input image, whose
colour values lie in [0,255], is converted to grayscale and rescaled to [0,1]. To maintain the
consistency of the original information, which has a resolution of 224*224 pixels, this grayscale
data is resampled to the standard pixel size [32] [33] [34]. After these formatting steps, the
convolution model accepts the image. Key point extraction is then carried out by the CNN model whose
architecture is shown in Fig. 6.

Figure 6. CNN architecture for Facial Key point Prediction

B4. Challenges
Applying neural networks to security is, however, fraught with challenges despite their
effectiveness. One prominent difficulty lies in the choice of the network architecture. Researchers
have noticed that the number of layers in a model can negatively affect its accuracy. [20] This
highlights the importance of the model architecture: because it affects accuracy, an appropriate
architecture and tuning are required. For organisations that are prone to financial abuse, keeping up
with the latest neural network algorithms and solutions is also critical. [35] The constantly
changing nature of fraud schemes will continue to pose a challenge for financial institutions, since
criminals are always devising new means to carry out their scams. In other words, although neural
networks offer very attractive tools for fraud detection, anomaly detection, and similar tasks, their
adoption requires an in-depth understanding of their capabilities, limitations, and latest
developments to make them an effective weapon against crime.

B5. Conclusion
As we navigate an increasingly interconnected and digitised

[...] discussing the diverse range of their applications across various medical fields, as well as
analysing the challenges of applying deep learning in healthcare.

C1. Architectures
This section describes the various neural network architectures adapted for healthcare applications.
While Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) are extensively used in
healthcare, this section will focus on Autoencoders (AE), Restricted Boltzmann Machines (RBM) and
Long Short-Term Memory (LSTM).

i. Autoencoders (AE)
Autoencoders are one of the deep learning models that illustrate the idea of unsupervised
representation learning. Initially, they were introduced as an early tool for pre-training supervised
deep learning models when labeled data was scarce. They nevertheless remained useful for unsupervised
procedures such as phenotype discovery [36]. Specifically, an autoencoder is divided into two main
parts, the encoder and the decoder. The encoder includes the input layer, while the decoder includes
the output layer [37]. The input and output layers have the same number of nodes, and the number of
hidden units is smaller than that of the input or output layers, which is what achieves the purpose
of an AE. Autoencoders are designed to encode the input data into a lower-dimensional space [38]. By
training an AE on a dataset, the input data can be transformed into a format that stores only the
most important derived dimensions. In this way, they resemble standard dimensionality reduction
techniques such as singular value decomposition (SVD) and principal component analysis (PCA).
However, autoencoders have an important advantage for complicated problems because of the nonlinear
transformations applied by each hidden layer’s activation functions, although a single hidden layer
may be insufficient to represent all the data if the input is of high dimensionality.

Additionally, autoencoders stacked on top of each other can form a Deep Autoencoder (DAE)
architecture.

Numerous variants of AE have been proposed to make the learned representations more robust and
stable under small changes in the input pattern. One of
world, the integration of neural networks in security systems those mutations is the Sparse Autoencoder (SAE), which
promises to fortify defences, thwart malicious activities, and specializes in learning sparse representations of the input data.
safeguard critical assets. Through an exploration of their Sparse Autoencoders achieve sparsity by activating only a
applications in fraud detection, anomaly detection in small subset of neurons during encoding, making the classes
cybersecurity, and facial recognition for security purposes, this even more divisible. Vincent et al. [39] proposed another
section illuminates the transformative potential of neural mutation known as denoising autoencoders. This method
networks in shaping the future of security paradigms. remakes the input by bringing in noise to the patterns, forcing
the model to focus solely on capturing the formation of the
C. In Health Care input. A similar concept was introduced by Rifai et al. [40] in
In recent years, the technological advancements in health their proposal of contractive autoencoders. However, instead
systems and especially the integration of neural networks in of corrupting the training set with noise, this mutation adds an
healthcare have revolutionized the world of medicine. analytical contractive penalty to the error function. Lastly, in
In this section, we will focus on the influence neural networks Convolutional Autoencoders (CAE) [41] their weights are
have had in healthcare, emphasizing on the various neural shared amidst all locations in the input to maintain spatial
network architectures that are commonly used in medicine,
locality and accurately process two-dimensional (2-D) patterns.

ii. Restricted Boltzmann Machine (RBM)
The Restricted Boltzmann Machine is another unsupervised deep learning architecture for learning representations of the input data. Its aim is similar to that of autoencoders, but RBMs take a stochastic view by estimating the probability distribution of the input data. Because of this, they are frequently considered generative models, which aim to model the underlying process responsible for generating the data. Training an RBM usually involves stochastic optimization methods, such as Gibbs sampling, which gradually adjust the weights to minimize the reconstruction error. In an RBM, the visible and hidden units form a bipartite graph, which allows for the implementation of more effective and thorough training algorithms. Restricted Boltzmann Machines serve as learning models in two main deep learning configurations that have been proposed in the literature: the Deep Belief Network (DBN) and the Deep Boltzmann Machine (DBM).

Deep learning methods have been tailored to properly handle large and distributed datasets. The huge success of Deep Neural Networks (DNNs) lies in their ability to learn features and understand data representation in both supervised and unsupervised hierarchical modes. DNNs are also effective in processing multimodal information by simply integrating several components of their architecture. Consequently, it is not surprising that deep learning has rapidly been adopted in the area of medical informatics research.

Various applications demonstrate the adaptability of deep learning in medical informatics. For example, authors highlighted their system’s capability to predict the probability of patients developing certain conditions such as schizophrenia, cancer, and diabetes. Additionally, Futoma et

forget gates [42]. These gates regulate the flow of information within the network. They control how much information is stored in or discarded from the memory cell at each time step, enabling the model to learn long-term dependencies more effectively. One of the main motivations behind LSTM’s design is to address the vanishing gradient problem encountered in traditional RNNs. By introducing the memory cell and gating mechanism, LSTM can reduce the issue of vanishing gradients, allowing it to carry errors forward over extended sequences without the gradients diminishing to zero.

C2. Applications
This section explores the applications of neural networks in healthcare, focusing on three important areas: Medical Imaging, Medical Informatics, and Disease Diagnosis Prediction.

i. Medical Imaging
In modern medicine, automatic medical imaging analysis holds significant importance, since diagnosis based on the interpretation of images can be extremely subjective.

• In eye diseases, neural networks are invaluable in diagnosing conditions like diabetic retinopathy, as seen in the IDx-DR and IDx Technologies systems. These models use medical imaging data, particularly retinal images, for accurate diagnosis. Additionally, supervised algorithms such as the random forest algorithm are used for predicting myopia by drawing insights from electronic health records. This algorithm accurately predicted the development of adult myopia in children up to eight years in advance, with an accuracy rate ranging from 85% to 99% [43] [48].

• For cardiac irregularities, cloud-based artificial neural network algorithms like Cardio DL are used for diagnosing such conditions. These algorithms use medical image data, in this case Magnetic Resonance Imaging (MRI) scans of heart ventricles. They have displayed efficacy in
studying the functioning of heart ventricles and blood flow, contributing insights comparable to those of radiologists [43].

• In fractures, a machine learning model known as OsteoDetect is used for detecting radius fractures located away from the joint. This model uses wrist image data, specifically X-rays, for detection purposes. It has enhanced the efficiency of orthopedic clinicians in fracture diagnosis and management [43].

C3. Challenges
Applying deep learning in healthcare shows promising results. However, these applications also face many challenges, which are summarised in the following subsections.

i. Volume of Data
Deep learning models are often considered computationally intensive because of the large number of parameters they require. To train these models effectively, access to a wide range of clinical data is essential. However, due to confidentiality and ethical concerns, many researchers face challenges in obtaining medical records. Moreover, in underdeveloped countries, the lack of healthcare records and the insufficient training of healthcare workers further complicate the understanding of the relationships between diseases and symptoms. [38] [44]

ii. Temporality
Infections are continuously evolving in a non-deterministic manner. However, many existing deep learning models depend on static vector-based sources of information, which cannot handle temporal aspects. Developing deep learning approaches capable of handling temporal healthcare data is imperative and will require the creation of innovative solutions [38] [44].

iii. Data Quality
Data quality in healthcare varies from structured datasets in computing and information security. Electronic healthcare data often suffers from issues related to dataset quality, and these issues frequently lead to a paucity of data and to inconsistencies in disease condition assessments. [38] [44]

iv. Privacy
One of the most crucial challenges in applying deep learning in healthcare is to understand whether neural network models are vulnerable to privacy or security threats. Artificial intelligence models and privacy-preserving data mining are subjects under extensive research. False positive classifications for patients could lead to unnecessary concern. Moreover, if poisoning attacks are detected, dataset clients may take appropriate actions, such as dismissing the results of the machine learning algorithm or attempting to identify and eliminate any malicious data from the dataset. [43] [44]

C4. Conclusion
Over the last decade, machine learning and pattern recognition have grown significantly. In this section, we explored the wide variety of neural network architectures that are commonly utilized in healthcare and their applications. These architectures have demonstrated great efficacy, but despite all the advancements, they encounter many challenges, such as managing large volumes of data and ensuring privacy, that require ongoing research efforts to overcome. However, the potential of neural networks to transform healthcare continues to be very promising, due to the constant innovations of these technologies.

IV. FUTURE DIRECTIONS

ANNs hold great promise for the future. They have the potential to evolve in numerous fields, improving our lives and achieving remarkable feats beyond our current imagination.

Pulsed or Spiked Neural Networks (SNNs) are considered to be the next generation of neural networks. SNN research started after data from neurobiological experiments made clear that biological neural networks communicate through pulses, using their timing to send information and perform calculations. SNNs model the spiking behaviour of neurons and how their membrane changes electrically when influenced by external factors. They are a prime factor in the evolution of Computer Vision, where they are used for image classification, object detection, object tracking, object segmentation, and optical-flow estimation. SNNs are also employed in Robotic Control. They are used as a ‘brain’ for robots, which allows them to observe their surroundings and mimic the actions noted in their environment. For the robot to perform a task such as the movement system inspired by the biological system, the network can be customised and adjusted by hand. [57] [58]

Multi/Infinite Dimensional Neural Networks (MDNNs) are a new model of ANNs that generalises One-Dimensional Neural Networks (RNNs, CNNs, etc.). Their theory is still under development but is based on the generalisation of the gates from one-dimensional logic to multidimensional logic. The MDNN architecture is described by a Tensor State Space Representation, which is used to compute the output of each neuron. MDNNs use the BP algorithm, tailored to neural networks with complex values, depending on complex signum and sigmoid functions. MDNNs have applications in the foundation of the three core concepts of cybernetics: (1) Development of the unified theory of control, (2) Communication, and (3) Coding. They also have applications in the field of binary filters and the complex hypercube, which is a foundation of complex-valued neural associative systems. [57]

Forecasting methods in the context of NNs have both limitations and future innovations. There is a growing interest in the research community in exploring the application of probabilistic forecasting to reduce the uncertainty of NN predictions. Furthermore, multivariate forecasting will be necessary for complex scenarios since products are becoming more diverse, emphasising the need to explore multiple seasonality models, especially for high-frequency big data
contexts. Since RNNs were inefficient in modelling seasonality, researchers explored alternative approaches, such as combining CNN filters with customised attention algorithms. Temporal convolution networks (TCNs), an advanced type of CNN architecture, provide an efficient training process by combining convolutions with residual connections, resulting in improved efficiency for forecasting tasks. [59]

Future projects that employ ANNs aim to address challenges and advance capabilities in programmable network devices, such as hardware offloading, data plane virtualization, NN orchestration, incremental and online learning, as well as distributed and federated learning. [60]

To enhance the accuracy and efficiency of ANNs in the future, we can increase the number of hidden layers and vary the training and learning rules applied within them. ANN technology will advance over time, with most applications that utilise it becoming more advanced, while researchers invent new training methods and network architectures. [57]

V. CONCLUSION

ANNs are one of the greatest inventions to come from the combination of the Computer Science and Neuroscience fields. Thanks to their contributions, numerous fields including Computer Science, Security, and Health Care have benefited, unravelling many challenges in the process. Their ability to learn from data and adapt to new information makes them capable of solving complex problems, which is beneficial for most fields. Although they offer solutions to many problems, challenges that still await resolution remain, due to the complexities inherent in their implementation. As technology and science advance, we acquire a new understanding of the human brain, which originally inspired ANNs. This leads to the creation of architectures and training methods for ANNs that are more efficient and that potentially give solutions to problems encountered by previous models. [61]

REFERENCES
[1] R. Qamar, B. A. Zardari, “Artificial Neural Networks: An Overview”, ResearchGate, Mesopotamian Journal of Computer Science, Aug. 2023
[2] P. J. Denning, “The Science of Computing: Neural Networks”, American Scientist, Sigma Xi, The Scientific Research Honor Society, Sep-Oct 1992
[3] I. A. Basheer, M. Hajmeer, “Artificial neural networks: fundamentals, computing, design, and application”, sciencedirect.com, vol. 43 no. 1, Dec. 2000
[4] Q. Liu, Y. Wu, “Supervised Learning”, researchgate.net, Jan. 2012
[5] Y. Tishan, “Understanding the Difference Between Supervised and Unsupervised Learning Techniques”, Sep. 2023
[6] B. M. Devassy, S. George, P. Nussbaum, “Unsupervised Clustering of Hyperspectral Paper Data Using t-SNE”, researchgate.net, Journal of Imaging 6(5):29, May 2020
[7] K. Sivamayil, E. Rajaseker, B. Aljafari, S. Nikolovski, S. Vairavasundaram, I. Vairavasundaram, “A systematic study on Reinforcement Learning Based Applications”, mdpi.com, Feb. 2023
[8] M. H. Sazli, “A brief review of feed-forward neural networks”, researchgate.net, May 2015
[9] V. Srilakshmi, G. U. Kiran, M. Mounika, A. Sravanthi, N. V. K. Sravya, V. N. S. Akhil, M. Manasa, “Evolving Convolutional Neural Network with Meta-Heuristics for Transfer Learning in Computer Vision”, sciencedirect.com, 2023
[10] Z. C. Lipton, “A Critical Review of Recurrent Neural Network for Sequence Learning”, researchgate.net, Jun. 2015
[11] C. S. K. Dash, A. K. Behera, S. Dehuri, S-B. Cho, “Radial basis function neural networks: a topical state-of-the-art survey”, 2016
[12] C. Wang and L. Wang, “Artificial Neural Network and Its Application in Image Recognition”, Journal of Engineering Research and Reports, Volume 24, Issue 2, Feb. 2023
[13] X. Li and X. Lv, “Research on Image Recognition Method of Convolutional Neural Network with Improved Computer Technology”, Journal of Physics: Conference Series 1744, 2021
[14] F. L. Borchardt, “Neural Network Computing and Natural Language Processing”, CALICO Journal, Jun. 1988
[15] R. G. Franklin, A. R. Doni, D. Poornima, S. I. S. Prabu, “The Use of Recurrent Neural Networks in the Optimization of Computer Science Algorithms”, IEEE International Conference on Emerging Research in Computational Science, 2023
[16] Z. Yan, “Research and Application on BP Neural Network Algorithm”, IEEE International Industrial Informatics and Computer Engineering Conference, 2015
[17] L. Liu, “Computer Network Routing Optimization Algorithm Based on Neural Network Mode”, IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Apr. 2023
[18] A. K. Swain, S. K. Jayasingh, “Neural Network in Fraud Detection”, Conference Paper, Aug. 2011
[19] M. L. Gambo, A. Zainal, M. N. Kassim, “A Convolutional Neural Network Model for Credit Card Fraud Detection”, Inter. Confer. on Data Science and Its Applications (ICoDSA), 2022
[20] B. F. Murorunkwere, O. Tuyishimire, D. Haughton, J. Nzabanita, “Fraud Detection Using Neural Networks: A Case Study of Income Tax”, MDPI, May 2022
[21] S. Yuan, X. Wu, J. Li, A. Lu, “Spectrum-based deep neural networks for fraud detection”, Jun. 2017
[22] Z. Zhang, X. Zhou, X. Zhang, L. Wang, P. Wang, “A Model Based on Convolutional Neural Network for Online Transaction Fraud Detection”, Aug. 2018
[23] M. Lu, Z. Han, Z. Zhang, Y. Zhao, Y. Shan, “Graph Neural Networks in Real-Time Fraud Detection with Lambda Architecture”, Oct. 2021
[24] P. Podder, S. Bharati, M. Rubaiyat Hossain Mondal, P. Kumar Paul, U. Kose, “Artificial Neural Network for Cybersecurity: A Comprehensive Review”, 2020
[25] T. A. Tang, L. Mhamdi, D. McLernon, S. Ali Raza Zaidi, M. Ghogho, “Deep Recurrent Neural Network for Intrusion Detection in SDN-based Networks”, IEEE International Conference on Network Softwarization (NetSoft 2018) - Technical Sessions, 2018
[26] F. Jiang, Y. Fu, B. B. Gupta, Y. Liang, S. Rho, F. Lou, F. Meng, Z. Tian, “Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security”, IEEE Transactions On Sustainable Computing, April-Jun. 2020
[27] Y. Said, M. Barr, H. Eddine Ahmed, “Design of a Face Recognition System based on Convolutional Neural Network (CNN)”, Engineering, Technology & Applied Science Research, 2020
[28] S. Kanithan, N.A. Vignesh, E. Karthikeyan, N. Kumareshan, “An intelligent energy efficient cooperative MIMO-AF multi-hop and relay based communications for Unmanned Aerial Vehicular networks”, Comput. Commun., 2020
[29] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “On the effectiveness of deep transfer learning for Bangladeshi meat based curry image classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022
[30] M. Rahman, S. Rahman, M. U. A. Ayoobkhan, “Fine Tuned convolutional neural networks for Bangladeshi vehicle classification”, International Conference on Innovations in Science, Engineering and Technology (ICISET), IEEE, 2022
[31] S.T. Suganthi, M. U. A. Ayoobkhan, N. B., K. Venkatachalam, H. Štěpán, and T. Pavel, “Deep learning model for deep fake face recognition and detection”, PeerJ Computer Science, 2022
[32] T. Guo, J. Dong, H. Li, Y. Gao, “Simple convolutional neural network on image classification”, IEEE 2nd International Conference on Big Data Analysis (ICBDA), IEEE, Mar. 2017
[33] F. Sultana, A. Sufian, P. Dutta, “Advancements in image classification using convolutional neural network”, Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), IEEE, Nov. 2018
[34] Y. Pei, Y. Huang, Q. Zou, X. Zhang, S. Wang, “Effects of image degradation and degradation removal to CNN-based image classification”, IEEE Trans. Pattern Anal. Mach. Intel., 2019
[35] A. C. Navarrete, A. C. Gallegos, “Neural Network Algorithms for Fraud Detection: A Comparison of the Complementary Techniques in the Last Five Years”, 2021
[36] M. Cabrera-Bean, V. J. Santos, A. R. Llorach, S F. Bertolin, J. Vidal, and C. Violan, “Autoencoders for health improvement by compressing the set of patient features”, IEEE Engineering in Medicine & Biology Society, Sep. 2018
[37] B. Shickel, P. J. Tighe, A. Bihorac, and P. Rashidi, “Deep ehr: A survey of recent advances in deep learning techniques for electronic health record (ehr) analysis”, IEEE Journal of Biomedical and Health Informatics, Sep. 2018
[38] I. Zion, S. Ozuomba, P. Asuquo, “An Overview of Neural Network Architectures for Healthcare”, IEEE International Conference in Mathematics, Computer Engineering and Computer Science, Apr. 2020
[39] P. Vincent, H. Larochelle, Y. Bengio, P. A. Manzagol, “Extracting and Composing Robust Features with Denoising Autoencoders”, researchgate.net, University of Montreal, Jan. 2008
[40] S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio, “Contractive Auto-Encoders: Explicit Invariance During Feature Extraction”, scholar.google.com, ICML, Jan. 2008
[41] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modelling”, Cornell University, Sep. 2014
[42] H. Sak, A. Senior, F. Beaufays, “Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition”, Cornell University, Sep. 2014
[43] A. Pandit, A. Garg, “Artificial Neural Network in Healthcare: A Systematic Review”, IEEE International Conference on Cloud Computing, Data Science & Engineering, Mar. 2021
[44] S. K. Pandey, R. R. Janghel, “Recent Deep Learning Techniques, Challenges and Its Applications for Medical Healthcare System: A Review”, Department of Information Technology, India, Jan. 2019
[45] C. P. Kovesdy, “Epidemiology of Chronic Kidney Disease: an update 2022”, University of Tennessee Health Science Center, Memphis, USA, Apr. 2022
[46] H. Khalid, A. Khan, M. Z. Khan, G. Mehmood, M. S. Qureshi, “Machine Learning Hybrid Model for the Prediction of Chronic Kidney Disease”, National Library of Medicine, Mar. 2023
[47] S. M. Li, M. Y. Ren, J. Gan, S. G. Zhang, M. T. Kang, H. Li, D. A. Atchison, J. Rozema, A. Grzybowski, N. Wang, “Machine Learning to Determine Risk Factors for Myopia Progression in Primary School Children: The Anyang Childhood Eye Study”, National Library of Medicine, Apr. 2022
[48] D. Ravi, C. Wong, F. Deligianni, M. Berthelot, J. A. Perez, B. Lo, G. Z. Yang, “Deep Learning for Health Informatics”, IEEE Journal of Biomedical and Health Informatics, Dec. 2016
[49] F. Li, L. Tran, K. H. Thung, S. Ji, D. Shen, and J. Li, “A robust deep model for improved classification of ad/mci patients”, IEEE J. Biomed. Health Inform, Sep. 2015
[50] D. Kuang and L. He, “Classification on ADHD with deep learning”, International Conference on Cloud Computing and Big Data, Nov. 2014
[51] G. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets”, Neural computation, Aug. 2006
[52] M. Havaei, N. Guizard, H. Larochelle, and P.-M. Jodoin, “Deep Learning Trends for Focal Brain Pathology Segmentation in MRI”, July 2016
[53] J. Z. Cheng et al., “Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans”, https://www.nature.com/srep/, Apr. 2016
[54] J. Shan and L. Li, “A Deep Learning Method for microaneurysm detection in fundus images”, IEEE International Conference on Connected Health, Jun. 2016
[55] J. Futoma, J. Morris, and J. Lucas, “A comparison of models for predicting early hospital readmissions”, Journal of Biomedical Informatics, Aug. 2015
[56] Z. C. Lipton, D. C. Kale, C. Elkan, and R. C. Wetzel, “Learning to diagnose with LSTM recurrent neural networks”, Cornell University, Nov. 2015
[57] S. Lakra, T. V. Prasad, G. Ramakrishna, “The Future of Neural Networks”, ResearchGate, 6th National Conference - Computing For Nation Development, INDIA, Feb. 2012
[58] K. Yamazaki, V. Vo-Ho, D. Bulsara, and N. Le, “Spiking Neural Networks and their Applications: A Review”, MDPI, Brain Sciences, Jun. 2022
[59] H. Hewamalage, C. Bergmeir, and K. Bandara, “Recurrent Neural Networks for Time series Forecasting: Current status and future directions”, ScienceDirect, Faculty of Information Technology, Monash University, Melbourne, Australia, Jan-Mar 2021
[60] F. D. Rossi, M. C. Luizelli, A. Lorenzon, and M. Caicedo, “In-Network Neural Networks: Challenges and Opportunities for Innovation”, ResearchGate, IEEE Network, Nov-Dec 2021
[61] S. Agrawal, J. Agrawal, “Neural Network Techniques for Cancer Prediction: A Survey”, sciencedirect.com, 19th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, Dec. 2015
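
As a minimal, illustrative sketch of the autoencoder structure described in Section C1 (an encoder that compresses the input into fewer hidden units than there are inputs, a nonlinear hidden activation, and a decoder that reconstructs the input), the following Python/NumPy code trains a one-hidden-layer autoencoder by batch gradient descent. It is not taken from any of the cited works: the function names (train_autoencoder, encode, reconstruct), the tanh activation, the learning rate, the number of epochs and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def train_autoencoder(X, n_hidden=2, lr=0.2, epochs=2000, seed=0):
    """Train a one-hidden-layer autoencoder with batch gradient descent.

    All names and hyperparameters are illustrative assumptions.
    Returns (params, history), where history holds the mean squared
    reconstruction error measured at the start of every epoch.
    """
    rng = np.random.default_rng(seed)
    n_inputs = X.shape[1]
    # Encoder: n_inputs -> n_hidden (the bottleneck).
    W_enc = rng.normal(0.0, 0.1, size=(n_inputs, n_hidden))
    b_enc = np.zeros(n_hidden)
    # Decoder: n_hidden -> n_inputs (same width as the input layer).
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_inputs))
    b_dec = np.zeros(n_inputs)
    history = []
    for _ in range(epochs):
        # Forward pass: nonlinear encoding, linear decoding.
        H = np.tanh(X @ W_enc + b_enc)
        err = (H @ W_dec + b_dec) - X
        history.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / err.size
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)  # tanh'(a) = 1 - tanh(a)^2
        # Gradient descent step on every parameter.
        W_dec -= lr * (H.T @ g_out)
        b_dec -= lr * g_out.sum(axis=0)
        W_enc -= lr * (X.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return (W_enc, b_enc, W_dec, b_dec), history


def encode(X, params):
    """Map inputs to their lower-dimensional hidden representation."""
    W_enc, b_enc, _, _ = params
    return np.tanh(X @ W_enc + b_enc)


def reconstruct(X, params):
    """Encode the inputs and decode them back to the input space."""
    _, _, W_dec, b_dec = params
    return encode(X, params) @ W_dec + b_dec


if __name__ == "__main__":
    # Synthetic data (an assumption): 8 features driven by 2 hidden factors.
    rng = np.random.default_rng(1)
    factors = rng.uniform(-1.0, 1.0, size=(200, 2))
    X = factors @ (0.5 * rng.normal(size=(2, 8)))
    params, history = train_autoencoder(X, n_hidden=2)
    print("error before training:", history[0])
    print("error after training: ", history[-1])
    print("code shape:", encode(X, params).shape)
```

In this sketch the two hidden units play the role of the bottleneck discussed in Section C1: each 8-dimensional input is summarised by a 2-dimensional code. A denoising variant in the spirit of [39] could, under the same assumptions, be obtained by adding noise to X in the forward pass while still measuring the error against the clean X.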