UNIT-V

Post-hoc Interpretability and


Explanations
TOPICS IN UNIT V
I. Visual explanations: Partial Dependence Plots (PDP), Individual
Conditional Expectation (ICE) Plots, Ceteris Paribus (CP) Plots,
Accumulated Local Effects (ALE) Plots, Breakdown Plots.

II. Feature importance: Feature Interaction, Permutation Feature


Importance, Ablations: Leave-One-Covariate-Out, SHAP,
KernelSHAP, Anchors, Global Surrogate, LIME;

III. Example-Based: Contrastive Explanation, k-NN, Trust scores,


Counterfactuals, Prototypes/Criticisms, and Influential Instances;

IV. XAI challenges and future


POST-HOC TECHNIQUES
INTRODUCTION
• Definition: Post-hoc techniques, also known as post-hoc interpretability methods
or model-agnostic (model-independent) interpretability methods, are tools and
methodologies used to interpret the predictions made by machine learning
models, especially those that are considered "black-box" models.
• Types of Post-hoc Techniques:
I. Visual Explanations: Utilize plots, graphs, and diagrams to represent
relationships between input features and model predictions, making
complex model behavior more understandable.
II. Feature Importance Methods: Quantify the impact or importance of each
input feature on model predictions, providing insights into influential
features at both local (specific instances) and global (entire dataset) levels.
III. Example-based Explanations: Focus on explaining individual predictions
by highlighting similar instances from the dataset, offering insights into how
the model makes predictions and its behavior.
I. Visual Explanation
• Visual explanation encompasses a set of methods that allow us to visually inspect the
relationships between inputs and outputs or with other input features. This allows
us to better understand their influence and contribution to the predicted response.
• Visual explanations have the advantage that they are relatively easy to interpret,
especially for models with a small set of features. They can provide global or local
explanations, depending on the method. However, they can become complex with a
larger feature set and may not properly capture or visualize correlations between
features, resulting in erroneous attributions.
• Following are the five visual explanation methods discussed here:
1. Partial Dependence Plots (PDP)
2. Individual Conditional Expectation (ICE) Plots
3. Ceteris Paribus (CP) Plots
4. Accumulated Local Effects (ALE) Plots
5. Breakdown Plots
1. Partial Dependence Plots (PDP)
• Partial dependence plots (PDP) show the dependence between the target
response and a set of input features of interest.
• Due to the limits of human perception, the size of the set of input features of
interest must be small (usually, one or two) thus the input features of interest are
usually chosen among the most important features. The figure below shows two
one-way plots for the bike sharing dataset
One-way PDP
• One-way PDPs tell us about the
interaction between the target
response and an input feature
of interest
• The left plot in the figure shows the effect of temperature on the number of bike rentals; we can clearly see that a higher temperature is associated with a higher number of bike rentals.
• Similarly, we could analyze the effect of humidity on the number of bike rentals (right plot).
[Figure: one-way PDPs; the y-axis shows the number of bike rentals (the target feature).]
1. Partial Dependence Plots (PDP)
Two-way PDP
• PDPs with two input features of interest
show the interactions among the two
features.
• For example, the two-way partial
dependence plots in the figure shows the
dependence of the number of bike rentals
on joint values of temperature and
humidity.
• Here, fix the temperature at a particular value (e.g., 25) and vary humidity across its range of values. Calculate the average predicted bike rentals for each combination of temperature and humidity.
• In a 3D surface plot, the x-axis represents temperature, the y-axis represents humidity, and the z-axis (or color intensity in a heatmap) represents the number of bike rentals.
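The PDP computation described above (force the feature of interest to a grid value, average the model's predictions over all instances) can be sketched in a few lines. The dataset and model below are illustrative stand-ins, not the actual bike-sharing data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the bike-sharing data: rentals rise with temperature
rng = np.random.default_rng(0)
X = rng.uniform(0, 35, size=(500, 2))            # columns: temperature, humidity
y = 10 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 5, 500)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    pdp = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v                    # set the feature for ALL rows
        pdp.append(model.predict(X_mod).mean())  # average over the dataset
    return np.array(pdp)

grid = np.linspace(0, 35, 10)
pdp_temp = partial_dependence(model, X, feature=0, grid=grid)
# Expect an increasing curve: higher temperature -> more rentals
```

scikit-learn ships the same computation as `sklearn.inspection.partial_dependence`, with `PartialDependenceDisplay` for plotting.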
PDP ADVANTAGES AND LIMITATIONS
• Advantages of Partial Dependence Plots (PDPs)
• Simplicity and Interpretability: PDPs are easy to interpret as they provide a clear, average
effect of a feature on the predicted outcome. This makes it straightforward to understand how
changes in a feature impact the overall prediction.
• Global Insight: By averaging the effect of a feature over all instances, PDPs provide a
comprehensive view of the general trend and feature importance across the entire dataset,
which is useful for understanding the model's global behavior.
• Disadvantages of Partial Dependence Plots (PDPs)
• Loss of Individual Variability: PDPs do not capture individual differences between
instances. They can mask heterogeneous effects and interactions between features, leading to
potential misinterpretations when the average effect does not represent all individual
behaviors accurately.
• Assumption of Feature Independence: PDPs assume that the features are independent,
which might not be true in practice. If there are strong interactions between features, the
insights from PDPs could be misleading as they do not account for these interactions
effectively.
Explainable properties of partial dependence plots

1. Global: Partial dependence plots provide a global view of the relationship between a feature and the target variable across the entire dataset. They show how the average predicted outcome changes as the value of a specific feature varies, averaging over the values of all other features.
2.Linear: When PDPs show a linear
relationship, it means that the effect of the
feature on the target variable is consistent
and uniform across different values of the
feature. In other words, the relationship
between the feature and the target
variable can be adequately
approximated by a straight line.
Explainable properties of partial dependence plots
3) Monotonic: A monotonic relationship in PDPs means that the effect of the feature on
the target variable consistently increases or decreases with the feature's value. In a
monotonic increasing relationship, increasing the feature value leads to an increase in the
predicted outcome. Conversely, in a monotonic decreasing relationship, increasing the feature
value leads to a decrease in the predicted outcome.

4) Feature interactions captured = No: This property indicates whether the PDPs account
for interactions between features. PDPs do not capture feature interactions i.e. they only show
the marginal effect of individual features on the target variable without considering how
the effects of one feature may change depending on the values of other features.

5) Model complexity is Low: PDPs provide an interpretable way to understand the behavior of a complex machine learning model, such as a decision tree or a random forest, without requiring deep knowledge of the model's inner workings. PDPs visualize the model's predictions by averaging over the effects of all other features, making them relatively straightforward to interpret.
2. Individual Conditional Expectation (ICE) Plots

• The Individual Conditional Expectation (ICE) plot is analogous to the PD plot in that it visualizes the dependence of the output on a feature, but it plots a separate line for each instance, as shown in the figure.
• Figure 5.4 shows the ICE plot for
Glucose (A line is drawn for every
instance of Glucose feature). The PDP
line is the average of all lines.
• Purpose: To understand how the
model's prediction changes when a
specific feature is varied.
• To capture individual instance-level
predictions rather than an averaged
effect, revealing heterogeneous
relationships.
[Figure 5.4: ICE plot for Glucose; the target feature is Diabetes (Yes/No).]
Interpretation of ICE Plots

• Each Line Represents an Instance: Each line in the plot corresponds to an individual instance from the dataset.
• Slope and Shape:
• A line with a steep slope indicates a
strong effect of the feature on the
prediction for that instance.
• A flat line suggests the feature has
little or no effect on the prediction
for that instance.
• Non-linear patterns show complex
relationships between the feature
and the prediction.
• Variability: The spread of lines
shows variability in the model's
response to the feature across
different instances.
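The per-instance curves can be computed with the same trick as the PDP, just without the final averaging step. A minimal sketch on synthetic data (the diabetes data from the figure is not reproduced here):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 3))
y = 3 * X[:, 0] + rng.normal(0, 0.1, 200)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def ice_curves(model, X, feature, grid):
    """One prediction curve per instance; their mean is the PDP line."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, v in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = v             # same feature value for every instance
        curves[:, j] = model.predict(X_mod)
    return curves

grid = np.linspace(0, 1, 5)
curves = ice_curves(model, X, feature=0, grid=grid)  # one row per instance
pdp = curves.mean(axis=0)                            # averaging recovers the PDP
```

In scikit-learn, `PartialDependenceDisplay.from_estimator(..., kind="individual")` draws the same curves.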
ICE PLOTS: ADVANTAGES AND LIMITATIONS
1.Reading the Provided ICE Plot:
 X-Axis: Represents the values of the feature being
varied.
 Y-Axis: Represents the predicted outcomes.
 Lines: Each line corresponds to the predictions for
one instance as the feature varies.
• Advantages of ICE Plots:
• Instance-Level Insight: Provides detailed insights
for individual instances, which can uncover
heterogeneous effects hidden in PDPs.
• Detecting Interactions: Helps in identifying
interactions between features and their varying
impacts on predictions.
• Limitations:
• Complexity: Can be complex to interpret with too
many instances, leading to cluttered plots.
• Computationally Intensive: Requires multiple
predictions for each instance, which can be
computationally expensive.
EXPLAINABLE PROPERTIES OF
ICE PLOTS
1. Global: Similar to the PDP, ICE provides global explanations over the entire dataset for the selected features.
2. Linear: The relationship between the feature and the model's prediction is represented as a straight line in each individual ICE plot.
3. Monotonic: The relationship shown in each ICE plot consistently increases or decreases without any reversals. Example: if you're predicting customer satisfaction based on waiting time, a monotonic ICE plot would show that as waiting time increases, satisfaction always either increases or decreases, never decreasing and then increasing.
4. Feature interactions captured is No, because ICE plots show the relationship between each input feature and the target feature, not between the features themselves.
5. Model complexity is low, because ICE plots are based on simple computations that don't incorporate complex relationships or interactions between features.
3. Ceteris Paribus (CP) Plots
• Ceteris Paribus (CP) plots (Latin for "all else being equal")
are used to explain the predictions of a machine learning
model for a specific instance, while keeping all other features
constant and varying one or more features of interest. They are
also called "what-if" plots.
• These plots are part of local explainability techniques,
meaning they help understand how a model behaves for a
specific data point rather than globally across the dataset.
• The CP plot shows how the model's prediction changes for a
specific instance when the values of one or more features are
varied, with all other features held fixed.
WORKING OF CP PLOTS
1.Select an Instance: Pick a specific data point (observation)
to analyze.
2.Choose a Feature: Select the feature(s) you want to vary.
3.Fix Other Features: Keep all other feature values constant
at the original values of the instance.
4.Vary the Feature: Change the value of the selected
feature(s) across a range.
5.Predict: Use the trained model to calculate the prediction for
each varied value of the feature.
6.Plot the Results: Create a line or surface plot that shows the
prediction as the feature value changes.
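The six steps above amount to predicting on copies of a single instance with one feature swept across a grid. A sketch with an assumed house-price model (features and coefficients are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
# Illustrative columns: size (sq ft), bedrooms, age (years)
X = np.column_stack([
    rng.uniform(800, 2000, 300),
    rng.integers(1, 5, 300),
    rng.uniform(0, 30, 300),
])
y = 0.04 * X[:, 0] + 3 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0, 1, 300)
model = LinearRegression().fit(X, y)

def ceteris_paribus(model, instance, feature, grid):
    """Vary one feature of a single instance; hold the rest fixed."""
    rows = np.tile(instance, (len(grid), 1))  # copies of the instance
    rows[:, feature] = grid                   # sweep the chosen feature
    return model.predict(rows)

house = np.array([1200.0, 3.0, 5.0])          # the instance being explained
grid = np.linspace(800, 2000, 7)
cp = ceteris_paribus(model, house, feature=0, grid=grid)
# Plot grid vs. cp to obtain the CP ("what-if") curve for this house
```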
CP PLOTS: EXAMPLE

• Scenario: You want to understand how the price of a house is influenced by its size, for a specific house.
• Data point (house instance):
• Size: 1200 sq ft
• Bedrooms: 3
• Age: 5 years
• Location: Urban
• Predicted price by the model: ₹50 lakhs
• Steps
1.Fix Other Features: Keep Bedrooms, Age, and Location constant at
their current values.
2.Vary the Feature: Change the Size from 800 sq ft to 2000 sq ft.
3.Predict: Use the model to calculate the predicted price for each size while
keeping other features unchanged.
4.Plot: On the x-axis, plot the size; on the y-axis, plot the predicted price.
CP PLOTS: EXAMPLE
• Plot: On the x-axis, plot the size; on the
y-axis, plot the predicted price.
• Results: The CP plot might show:
• At 800 sq ft, the predicted price is ₹30 lakhs.
• At 1200 sq ft, the predicted price is ₹50 lakhs.
• At 2000 sq ft, the predicted price is ₹80 lakhs.
• Insights
• The CP plot shows how the size affects the
price for this specific house, revealing
whether the relationship is linear, non-linear,
or has diminishing returns.
• You can generate CP plots for other features
like bedrooms or age to understand their
local effects on the prediction.
3. Ceteris Paribus (CP) Plots
• Interpretation of Ceteris Paribus Plots:
CP plots are useful for explaining the feature
influence on model prediction for a specific
data instance. They assume independence
between the feature plotted and the
remaining model features.
• CP plots make the same underlying
assumption as PD plots in that little to no
correlation exists between model features.
• For dependent features, a change in one
will occur with a change in another, which is
not captured by CP plots.
• Figure 5.6 shows Glucose and BMI as most
influential for diabetes prediction, while
SkinThickness has the least impact; CP plots
suggest an inverse relationship between
BloodPressure/Insulin levels and diabetes
prediction, with higher levels predicting
lower risk.
Explainable properties of CP Plots
1. Local: CP plots focus on understanding the behavior of a model around a
specific data instance. Instead of looking at the entire dataset, CP plots zoom
in on how the model behaves for a particular case, providing insights into
local predictions.
2. Linear: CP plots display the relationship between a single feature and the
model's prediction in a linear manner. This means that as the feature value
changes, the model's prediction changes consistently along a straight line,
making it easier to interpret.
3. Monotonic: CP plots show monotonic relationships, meaning that as the value
of a feature increases or decreases, the model's prediction either consistently
increases or decreases, without reversing direction. This helps in
understanding how changes in a feature directly impact the model's output.
4. Feature interactions captured is "No": CP plots do not capture interactions
between features. They only focus on the relationship between a single feature
and the model's prediction while holding all other features constant. This
simplifies the interpretation by isolating the impact of individual features.
5. Model complexity is low: CP plots are based on simple models or
assumptions, making them easy to understand and interpret. They do not
involve complex calculations or algorithms, which enhances their usability for
explaining model behavior in a straightforward manner.
Comparison between PDP, ICE, and CP plots

• Scope. PDPs: global interpretation, averaging the effect over all instances. ICE: detailed view of individual predictions across the dataset. CP: local interpretation for a single instance.
• Visualization. PDPs: a single line (or surface) showing the average effect. ICE: multiple lines, one for each instance. CP: a single line showing the effect for one instance.
• Interpretability. PDPs: can obscure individual variations due to averaging. ICE: complex due to many lines, but shows individual effects. CP: straightforward for one instance, not for a global view.
• Use case. PDPs: understanding overall feature impact on predictions. ICE: capturing heterogeneity in feature impact. CP: "what-if" analysis for individual instances.
• How it works. PDPs: vary feature values while averaging the effects of others. ICE: vary feature values for each instance, keeping others fixed. CP: vary feature values for a selected instance, keeping others fixed.
• Ideal for. PDPs: identifying general trends and global effects. ICE: identifying subgroups and instance-specific behavior. CP: detailed analysis of model behavior for specific cases.
• Limitations. PDPs: misleading with correlated features; averages can obscure details. ICE: overwhelming with large datasets (many lines). CP: only provides insight for one instance at a time.
4. Accumulated Local Effects (ALE) Plots
• Accumulated Local Effects (ALE) plots are a visualization tool
used to interpret machine learning models. They show how
features influence model predictions on average, while
addressing some limitations of Partial Dependence Plots (PDPs),
particularly when features are correlated.
• Key Concept
• ALE plots display the local effect of a feature on the model
predictions by accumulating changes in the prediction over
the feature's range.
• Unlike PDPs, which rely on averaging predictions and can be
biased by feature correlation, ALE plots compute changes by
considering the local distribution of the feature.
ALE Plots working
1. Divide the Feature into Intervals: Split the range of the
feature of interest into intervals (bins).
2. Calculate Local Effects:
1. Within each interval, compute the change in model predictions
when the feature value moves from the lower to the upper
boundary of the interval.
2. This measures the feature's local impact.
3. Accumulate Effects:
1. Integrate (or accumulate) these changes across the intervals to
show the total effect of the feature.
2. The accumulated effects are centered around 0 for interpretability.
4. Plot Results:
1. The x-axis represents the feature values.
2. The y-axis shows the accumulated effect on the model's predictions.
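A minimal first-order ALE sketch following the steps above (quantile bins, local differences within each bin, accumulation, centering). The centering here subtracts the simple mean of the accumulated curve, a simplification of the count-weighted centering used by full implementations such as alibi:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.uniform(800, 2000, size=(400, 2))   # illustrative: size, second feature
y = 0.05 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 2, 400)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

def ale_1d(model, X, feature, n_bins=10):
    """First-order ALE: local prediction differences, accumulated and centered."""
    # Bin edges follow the feature's empirical quantiles
    edges = np.quantile(X[:, feature], np.linspace(0, 1, n_bins + 1))
    effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Only instances that actually fall in this bin (the local distribution)
        mask = (X[:, feature] >= lo) & (X[:, feature] <= hi)
        if not mask.any():
            effects.append(0.0)
            continue
        X_lo, X_hi = X[mask].copy(), X[mask].copy()
        X_lo[:, feature], X_hi[:, feature] = lo, hi
        effects.append((model.predict(X_hi) - model.predict(X_lo)).mean())
    ale = np.cumsum(effects)          # accumulate the local effects
    return edges, ale - ale.mean()    # center the curve around 0

edges, ale = ale_1d(model, X, feature=0)
```

Because each difference is computed only on instances that actually fall in the bin, correlated features are never pushed into unrealistic combinations, which is the advantage over PDPs.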
ALE PLOTS EXAMPLE

• Scenario: You want to analyze how the size of a house affects its predicted price, using ALE plots.
• Steps
1.Feature: House size (sq ft), ranging from 800 to 2000.
2.Model: A trained machine learning model predicting house prices.
3.Divide the Feature:
1. Split the size range into intervals: [800-1000], [1000-1200], ..., [1800-2000].
4.Compute Local Effects:
1. For each interval, calculate the average change in predicted price when size increases
from the lower to the upper bound of the interval.
1. Example: In the [800-1000] interval, the predicted price increases by ₹2 lakhs on average.
5.Accumulate Effects: Start from 0 and add the effects interval by interval to
get the accumulated local effect.
6.Plot:
1. x-axis: House size.
2. y-axis: Accumulated effect on price (centered around 0).
ALE PLOTS: IMPORTANCE
• What the Plot Might Show
• For small houses (<1200 sq ft), price increases sharply with size.
• For medium houses (1200-1600 sq ft), the price increase slows down.
• For large houses (>1600 sq ft), the price effect flattens, showing
diminishing returns.
• Why ALE is Useful
1.Handles Correlation: Unlike PDPs, ALE accounts for feature
correlations, providing a more accurate interpretation.
2.Local Insights: Focuses on local changes in predictions, making it
robust to uneven feature distributions.
3.Efficiency: Requires fewer computations compared to PDPs when
handling correlated features.
4. Accumulated Local Effects (ALE) Plots interpretation
• Consider a model predicting house prices based on square footage (x1) and the number of bedrooms (x2). We want to understand how square footage influences the predicted price while considering interactions with the number of bedrooms.
[Table in the figure: Square footage (x1), Number of bedrooms (x2), House price.]
• Suppose we divide the range of square footage into
intervals (vertical lines): [0-1000], [1000-1500],
[1500-2000], and so on.
• For each interval, we calculate the average
difference in predicted prices between adjacent
data points within the interval, considering the
interaction with the number of bedrooms (x2).
• ALE plots allow us to visualize the accumulated
local effects of individual features on model
predictions while considering interactions with
other features, providing valuable insights into the
model's behavior.
4. EXPLAINABLE PROPERTIES OF ALE PLOTS
1. Global: ALE plots provide insights into the overall behavior of the model across
the entire range of a feature. Unlike local interpretations that focus on specific
data instances, ALE plots offer a global perspective, showing how the feature
affects the model's predictions across its entire range.
2. Non-Linear: ALE plots can capture non-linear relationships between features
and model predictions. This means that the effect of changing a feature on the
model's output may not follow a straight line but can exhibit curves or other
non-linear patterns.
3. Non-Monotonic: ALE plots can represent non-monotonic relationships, where
the effect of a feature on the model's prediction does not consistently increase or
decrease with changes in the feature value. Instead, it may exhibit peaks,
valleys, or other irregular patterns.
4. Feature interactions captured is "Yes": ALE plots take into account
interactions between features. This means that they consider how changes in one
feature may influence the effect of another feature on the model's predictions.
By accounting for feature interactions, ALE plots provide a more comprehensive
understanding of how different features collectively impact the model's
behavior.
5. Model complexity is "Low": ALE plots are based on relatively simple
calculations and assumptions, making them easy to interpret and understand.
They do not involve complex algorithms or computations, which enhances their
usability for explaining the behavior of machine learning models in a
straightforward manner.
5. BREAKDOWN PLOTS
• A breakdown plot visually explains the contribution of
individual features in a model's prediction by showing
how each feature contributes to increasing or
decreasing the predicted value.
• They decompose the prediction into contributions from each feature,
showing how much each feature affects the final outcome.
• Let's consider a model f(x) that predicts an outcome based on d features. For a specific data instance x∗, we can express the model's prediction as:

f(x∗) = v0 + v1(x∗) + v2(x∗) + ... + vd(x∗)

• Here, v0 is the baseline (for example, the average model prediction) and vi(x∗) represents the contribution of feature i to the prediction for the instance x∗.
BREAKDOWN PLOTS: Example
• Scenario: Predicting House
Prices: Consider a machine
learning model that predicts the
price of a house based on the
following features:
1.Size of the house (sq. ft.)
2.Number of bedrooms
3.Distance to the city center
(miles)
4.Age of the house (years)
• Suppose the model predicted that a
specific house is worth $400,000.
Let’s break down how the model
arrived at this prediction.
BREAKDOWN PLOTS: Example
• This breakdown plot shows how each
feature contributes to the predicted
house price:
1.Base Value ($300,000): The starting
point, which represents the average or
baseline prediction.
2.Size of the house ($80,000): The
model adds $80,000 due to the large
size of the house.
3.Bedrooms ($15,000): The model adds
$15,000 for the number of bedrooms.
4.Distance to City (-$20,000): The
model subtracts $20,000 because the
house is far from the city center.
5.Age of the house ($25,000): The
model adds $25,000 since the house is
relatively new.
• The final predicted price is $400,000, combining all these contributions: $300,000 + $80,000 + $15,000 − $20,000 + $25,000 = $400,000.
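The additive decomposition in the example can be computed by fixing the instance's features one at a time and recording how the mean prediction moves. This is the idea behind break-down explanations in packages such as dalex; note that in general the contributions depend on the order in which features are fixed. A sketch on an invented linear model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 3))
y = 2 * X[:, 0] - 1 * X[:, 1] + 0.5 * X[:, 2]   # exact linear target
model = LinearRegression().fit(X, y)

def breakdown(model, X, instance):
    """Fix the instance's features one by one; each step's change in the
    mean prediction is that feature's contribution."""
    base = model.predict(X).mean()        # v0: baseline (average prediction)
    contributions, prev = [], base
    X_mod = X.copy()
    for i in range(X.shape[1]):
        X_mod[:, i] = instance[i]         # fix feature i at the instance value
        cur = model.predict(X_mod).mean()
        contributions.append(cur - prev)  # vi(x*): shift caused by feature i
        prev = cur
    return base, np.array(contributions)

x_star = X[0]
base, contrib = breakdown(model, X, x_star)
pred = model.predict(x_star.reshape(1, -1))[0]
# base + sum of contributions reconstructs the prediction for x_star
```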
5. Breakdown Plots: Explainable properties
1) Local or Global: "Local"
• Breakdown plots provide explanations for
individual data instances, making them a local
interpretability method. They focus on explaining
why a model made a specific prediction for a
particular data point, rather than providing
insights into the model's behavior across the
entire dataset.
• Example: Suppose we have a model that predicts
housing prices based on features like size,
location, and number of bedrooms. A breakdown
plot would explain why the model predicted a
specific price for a particular house, considering
only the features and characteristics of that house.
5. Breakdown Plots: Explainable properties
2) Linear or Non-linear: "Linear"
• Breakdown plots are often used with linear models or models that can be approximated as linear in
the vicinity of the data point being explained. In linear models, the relationship between features
and predictions follows a linear pattern, making it easier to interpret the impact of individual
features on the model's output.
• Example: Consider a linear regression model that predicts the salary of employees based on their
years of experience. In this case, the relationship between years of experience and salary is linear,
and breakdown plots can effectively show how changes in years of experience impact the predicted
salary.

3) Monotonic or Non-monotonic: "Monotonic"


• Breakdown plots assume a monotonic relationship between features and predictions. This means
that as the value of a feature increases (or decreases), the model's prediction also increases (or
decreases) consistently. Non-monotonic relationships, where the prediction may increase initially
and then decrease, are not typically captured by breakdown plots.
• Example: Let's say we have a model that predicts the likelihood of a customer making a purchase
based on their spending habits. If the model assumes that as the customer spends more money, their
likelihood of making a purchase increases (a monotonic relationship), breakdown plots can
effectively show how changes in spending habits affect the predicted likelihood of purchase.
5. Breakdown Plots: Explainable properties
4) Feature Interactions Captured: "No"
• Breakdown plots do not explicitly capture feature interactions, especially in their basic form.
They consider each feature's contribution to the prediction independently and do not account for
how the impact of one feature may depend on the value of another feature.
• Example: Imagine a model that predicts the performance of students on a test based on their
study hours and IQ scores. While breakdown plots can show how study hours and IQ scores
individually affect the predicted test performance, they may not capture how the effect of study
hours might differ based on different levels of IQ scores.

5) Model Complexity: "Low"


• Breakdown plots are suitable for models with low complexity, such as linear regression models
or decision trees with few levels. They are less effective for highly complex models like deep
neural networks, which may have intricate relationships between features and predictions.
• Example: A simple linear regression model that predicts the temperature based on historical
weather data has low complexity. Breakdown plots can provide clear insights into how changes
in features like humidity and wind speed influence the predicted temperature in such a model.
