Explaining Deep Neural Network Using Layer-Wise Relevance Propagation and Integrated Gradients
Abstract—Machine learning has become an integral part of technology in today's world. The field of artificial intelligence is the subject of research by a wide scientific community. In particular, through improved methodology, the availability of big data, and increased computing power, today's machine learning algorithms can achieve excellent performance that sometimes even exceeds the human level. However, due to their nested nonlinear structure, these models are generally considered to be "black boxes" that do not provide any information about what exactly leads them to a specific output. This raised the need to interpret these algorithms and understand how they work, as they are applied even in areas where they can cause critical damage. This article describes the Integrated Gradients [1] and Layer-wise Relevance Propagation [2] methods and presents individual experiments with them. In the experiments, we interpret the decisions of deep convolutional neural networks on image classification datasets.

I. INTRODUCTION

[...] increased attention. It is because of these areas that the field of Explainable AI (XAI) was born [7]. In this paper, we focus on the decisions of deep convolutional neural networks, using the EfficientNet [8] architecture on datasets such as MNIST [3], the Fashion-MNIST dataset [4], and Imagenette and Imagewoof, which are subsets of ImageNet [5]. The Explainable Artificial Intelligence techniques used in the experiments were Integrated Gradients [1] and Layer-wise Relevance Propagation [2].

II. EXPLAINABLE ARTIFICIAL INTELLIGENCE - INTERPRETABILITY

Artificial Intelligence approaches are nowadays used in every field, spanning from e-commerce [9] to computer games [...]
that are understandable to its users and developers, but to develop methods to obtain some form of explanation from complex models that are difficult, or even impossible, for users to understand [23]. The goals that Explainable Artificial Intelligence approaches should fulfil differ across the research community. Providing explainable models while maintaining a high level of learning performance, and enabling end-users to fully understand what leads to a decision and thus to build trust, are the main goals in [24]. Instead of creating methods for explaining black boxes [25], we should develop models that are inherently interpretable, as stated by Rudin [15]. Interpretability can differ from task to task [26, 27, 28]; however, there are also attempts to introduce general definitions [29]. Usually, a human can comprehend 7 ± 2 pieces of information at a time [30].

At the Fairness, Accountability, and Transparency in Machine Learning workshop, the main goal of explainability in machine learning was formulated as follows: it "is to ensure that algorithmic decisions, as well as any data driving those decisions, can be explained to end-users and other stakeholders in non-technical terms" [31].

Responsible AI (RAI) is about societal values and moral and ethical considerations. Responsible AI has three main pillars [32]:
• Accountability refers to the need to explain and justify the system's decisions and actions to its owners, consumers, and others with whom the program communicates.
• Responsibility refers to the role of the individuals themselves and to the capacity of AI systems to answer for their own decisions and to recognize errors or unexpected results.
• Transparency refers to the need to describe, inspect, and reproduce the mechanisms through which AI systems make decisions and learn to adapt to their environment, and to the governance of the data used and created [31]. In [23], transparency is one of the properties that can enable interpretability.

A. Why do we need XAI?

There are at least four reasons, based on the explored literature and current usage:
• Explain to justify - There were several controversies over AI/ML-powered systems that yielded biased or discriminatory results over the last several years [33]. The author showed an example from the field in which an AI system trained to predict the risk of human pneumonia reached a completely wrong conclusion. Applying this model in a black-box manner would not reduce but increase the number of deaths associated with pneumonia [34].
• Explain to control - Understanding system behavior provides greater visibility over unknown bugs and defects and helps identify and correct errors in low-criticality situations. Most data scientists are frustrated with low or inconsistent model performance; they want to understand why the model performs so poorly on certain inputs and to identify regions of the input space with lower average output. Besides, they seek advice on how to design new features, remove redundant features, and gather more data to enhance model performance. Others listed explainability as a debugging solution, as it may be a powerful tool to debug deep neural networks.
• Explain to improve - A model that can be clarified and understood is one that can be improved more easily.
• Explain to discover - Asking for explanations is a valuable tool for learning new facts, gathering information, and thus gaining knowledge.

Knowledge of and demand for explanations are growing in different domains, hence the questions "why is the use of XAI not systematic?" and "why is XAI not being used in every AI system?". It is currently a very difficult technical issue to add interpretability to AI systems. In some cases expert systems are explainable but inflexible and hard to use, so sometimes we use deep neural networks as a solution. The main advantage of these algorithms is that they are effective, but on the other hand it is virtually impossible to see inside them. Further solutions [35] were proposed to explain this family of models, but they again use a non-interpretable model to interpret or explain another non-interpretable model; the problem remains the same.

Advanced machine learning algorithms go to the opposite end of the spectrum, generating systems able to function solely from observations and to construct the world models on which they base their predictions. However, the complexity that gives ML algorithms exceptional predictive abilities also makes their results difficult to understand. Because of their structure and the way they operate, ML algorithms are hard to interpret. Intrinsically, ML algorithms consider high-degree interactions between input features, which makes it difficult to disaggregate such functions into humanly understandable forms. Take the most popular contemporary ML model, the DNN, as an example. A DNN has a typical nonlinear multi-layer structure consisting of many hidden layers with several neurons per layer; this architecture generates high-level predictions through multiple levels of linear transformations and nonlinear activations. While a single linear transformation can be interpreted by looking at the weights from the input features to each of the output groups, multiple layers and nonlinear correlations in each layer mean that an extremely complicated hierarchical structure has to be disentangled, which is a complex and theoretically problematic process [36].

III. INTEGRATED GRADIENTS

Integrated Gradients (IG) [1] is a deep neural network explainability technique that visualizes the importance of the input features that contribute to the model's prediction. It computes the integral of the gradients of the output prediction for the chosen class with respect to the input image pixels, and it requires no modification to the original deep neural network. One of the main advantages of the Integrated Gradients method is that it can be used on various types of data, such as images, text, or structured data. IG can be used for:
• Understanding feature importance by extracting rules from the network.
• Debugging deep learning model performance.
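As a rough illustration of the computation just described (a minimal sketch under our own assumptions, not the authors' implementation), the path integral of gradients can be approximated by a Riemann sum over images interpolated between a baseline x' (here a black image) and the input x. The snippet assumes a TensorFlow/Keras classifier `model` that maps a batch of images to class probabilities and a float32 input tensor `image` of shape (H, W, C) that is already preprocessed for the model:

```python
import tensorflow as tf

def integrated_gradients(model, image, target_class, baseline=None, steps=50):
    """Approximate IG_i(x) = (x_i - x'_i) * integral_0^1 dF_c(x' + a*(x - x')) / dx_i da
    with a trapezoidal sum over `steps` interpolation points."""
    if baseline is None:
        baseline = tf.zeros_like(image)                      # black image as the reference input x'
    # Interpolate between the baseline x' and the input x along the straight-line path.
    alphas = tf.reshape(tf.linspace(0.0, 1.0, steps + 1), (-1, 1, 1, 1))
    interpolated = baseline[None, ...] + alphas * (image[None, ...] - baseline[None, ...])

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        class_scores = model(interpolated)[:, target_class]  # F_c at every interpolation step
    grads = tape.gradient(class_scores, interpolated)        # dF_c/dx at every interpolation step

    avg_grads = tf.reduce_mean((grads[:-1] + grads[1:]) / 2.0, axis=0)
    return (image - baseline) * avg_grads                    # per-pixel attribution
```

The resulting attributions have the same shape as the input and can be rendered as a heatmap over the image pixels.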
Fig. 1. Today’s systems without XAI against today’s systems with XAI [24]
IV. LAYER-WISE RELEVANCE PROPAGATION

[...] A recursive rule that redistributes the relevance of one layer to the next layer is the ε-rule:

r_i^{(l)} = \sum_j \frac{z_{ji}}{\sum_{i'} z_{ji'} + b_j + \epsilon \cdot \mathrm{sign}\left( \sum_{i'} z_{ji'} + b_j \right)} \, r_j^{(l+1)}    (4)

where we define z_{ji} to be the weighted activation of neuron i onto neuron j in the next layer, and b_j the additive bias of unit j. A small amount of ε is added to the denominator of Equation 2 to avoid numerical instability [42].
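To make the redistribution step concrete, the following minimal NumPy sketch (our illustration, not the authors' code) applies the ε-rule of Equation 4 to one fully connected layer; the argument names are assumed placeholders for quantities saved during the forward pass:

```python
import numpy as np

def lrp_epsilon_dense(activations, W, b, relevance_upper, eps=1e-6):
    """One backward LRP step through a dense layer using the epsilon-rule (Eq. 4).

    activations     : (I,)   activations x_i of the lower layer
    W               : (I, J) weights w_ji connecting neuron i to neuron j
    b               : (J,)   biases b_j of the upper layer
    relevance_upper : (J,)   relevance scores r_j^(l+1) of the upper layer
    returns         : (I,)   relevance scores r_i^(l) of the lower layer
    """
    z = activations[:, None] * W             # z_ji = w_ji * x_i, stored as [i, j]
    denom = z.sum(axis=0) + b                # sum_i' z_ji' + b_j
    denom = denom + eps * np.sign(denom)     # epsilon stabilisation of the denominator
    return (z / denom) @ relevance_upper     # r_i = sum_j (z_ji / denom_j) * r_j
```

Starting from the relevance of the output layer (typically the score of the predicted class) and applying this step layer by layer down to the input yields a pixel-wise heatmap such as the one shown in Fig. 3.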
V. EXPERIMENTS

In this section, we describe the experiments we performed on various data. Our goal was to interpret the outputs of neural networks. We tried to identify which areas of the input were crucial for classification. Subsequently, we tried to create a cumulative map for a whole group of inputs (a class) and not just for one specific input. We consider this work to be important, as artificial intelligence is also used in critical areas where it can affect human life, so it is important to understand what is happening inside the individual algorithms.

A. LRP Experiments

In the first experiment, we used the ε-LRP method on the well-known MNIST dataset [3], which consists of handwritten digits from 0 to 9 and thus contains 10 classes. The images measure 28 x 28 x 1 pixels. The dataset contains 60,000 training and 10,000 test images. The neural network model had 2 hidden layers, each with 256 neurons. On the test dataset, we achieved an accuracy of 96.75%.
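The exact activation functions and training settings are not given in the text; the following Keras sketch (our illustration, not the authors' training code) shows one way the described two-layer architecture could be defined, assuming ReLU hidden layers and a softmax output:

```python
import tensorflow as tf

# Two hidden layers of 256 neurons, as described above; ReLU and softmax are assumptions.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),    # 28 x 28 x 1 MNIST images
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),      # 10 digit classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```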
Fig. 3. Visualization of output and heatmap by the ε-LRP method. Pixels are red if their value is positive, or blue if their value is negative. The intensity of the color depends on the distance of the value from zero. Dark red pixels had the greatest influence on the neural network decision when choosing a class. [40]

First, we took all the test images that belong to the "Coat" class of the Fashion-MNIST dataset [4]. For each image in this class, we used the LRP method, added up the individual relevance values at the same pixel positions, and then divided them by the number of images.

Fig. 4. Cumulative heatmap of the "Coats" class.

The output is the average heatmap for a given class, which should visualize which area of the image most influences the classification decision for that class in general. The problem of misclassification arises when we have similar heatmaps for two classes, e.g. two dog breeds. If our accuracy is not high enough, we would look at classes that have very similar cumulative heatmaps and, in the training pictures, try to cover the areas on which they learn. In this way, we could get our model to consider areas other than those identifying the class.
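The class-wise averaging just described can be sketched as follows (our illustration, assuming a hypothetical `lrp_heatmap(model, image)` helper that returns a per-pixel relevance map, e.g. built from the ε-rule step shown earlier, and an array `coat_test_images` holding the test images of the class):

```python
import numpy as np

def cumulative_heatmap(model, images, lrp_heatmap):
    """Average the LRP relevance maps of all images belonging to one class.

    images      : array of shape (N, 28, 28, 1) with every test image of the class
    lrp_heatmap : callable (model, image) -> (28, 28) relevance map for a single image
    """
    total = np.zeros(images.shape[1:3])        # running sum of relevance per pixel position
    for image in images:
        total += lrp_heatmap(model, image)     # add the relevance values at the same positions
    return total / len(images)                 # divide by the number of images

# Example usage, rendered like Fig. 4 with red for positive and blue for negative relevance:
# import matplotlib.pyplot as plt
# coat_map = cumulative_heatmap(model, coat_test_images, lrp_heatmap)
# plt.imshow(coat_map, cmap="bwr"); plt.colorbar(); plt.show()
```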
B. Integrated Gradients Experiments

In the first experiment, we used the pre-trained EfficientNet-B0 model, which was trained on ImageNet [5]. Transfer learning was used to change the last layer to a 10-class output and to train the model on the Imagenette dataset. Imagenette is a subset of 10 easily classified classes from ImageNet (tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, and parachute). For each class there are 1000 ± 40 images with a resolution of 320 x 320 px. On the validation data we achieved an accuracy of 90.05% after 20 epochs.
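A minimal sketch of this transfer-learning setup, and of applying IG to one validation image, might look as follows. This is our illustration with assumed preprocessing, optimizer, and dataset pipelines (`train_ds`, `val_ds` are hypothetical prepared datasets), not the authors' exact configuration; `integrated_gradients` refers to the sketch given in Section III:

```python
import tensorflow as tf

# Pre-trained EfficientNet-B0 backbone without its ImageNet classification head.
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(320, 320, 3), pooling="avg")

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(10, activation="softmax"),   # new last layer for the 10 Imagenette classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_ds, validation_data=val_ds, epochs=20)   # datasets assumed to be prepared elsewhere

# Attribution for a single validation image (shape (320, 320, 3)):
# attributions = integrated_gradients(model, image, target_class=predicted_class)
```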