Topic homework: Interpretability with SHAP

In this notebook we explore the open source library SHAP for interpreting black-box machine learning
models. SHAP comes with many clear benefits:

It allows us to understand the key factors of heterogeneity in complex models - such as neural
networks or boosted trees
It can be calculated quickly and expressed visually - no need to fit multiple models

Let's see an example of SHAP plots in action:

pip install econml

pip install shap

## Ignore warnings
import warnings
warnings.filterwarnings("ignore")

from econml.dml import CausalForestDML, LinearDML, NonParamDML
from econml.dr import DRLearner
from econml.metalearners import DomainAdaptationLearner, XLearner
from econml.iv.dr import LinearIntentToTreatDRIV
import numpy as np
import scipy.special
import matplotlib.pyplot as plt
import shap
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.linear_model import Lasso

import sklearn

np.random.seed(123)
n_samples = 5000
n_features = 10
true_te = lambda X: (X[:, 0] > 0) * X[:, 0]
X = np.random.normal(0, 1, size=(n_samples, n_features))
W = np.random.normal(0, 1, size=(n_samples, n_features))
T = np.random.binomial(1, scipy.special.expit(X[:, 0]))
y = true_te(X) * T + 5.0 * X[:, 0] + np.random.normal(0, .1, size=(n_samples,))
X_test = X[:min(100, n_samples)].copy()
X_test[:, 0] = np.linspace(np.percentile(X[:, 0], 1), np.percentile(X[:, 0], 99), min(100, n_samples))

Here we fit a Forest Double Machine Learning estimator, a forest model with residualization, to the
synthetic data. The data was generated so that the first feature has a strong causal effect,
while the other features are random noise with no effect on the outcome.
est = CausalForestDML(random_state=123)
est.fit(y, T, X=X, W=W)
shap_values = est.shap_values(X[:20])
shap.summary_plot(shap_values['Y0']['T0'])


It's important to note that a Shapley value is calculated for each of the 20 rows passed to the
est.shap_values() function. The plot shows those 20 points with a random vertical jitter to avoid
overlapping points. As a result, there is not a single Shapley value per feature, but a Shapley value per
feature per observation.
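
As a quick check of this, here is a minimal sketch (assuming, as in the indexing above, that est.shap_values() returns shap.Explanation objects keyed by outcome and treatment):

vals = shap_values['Y0']['T0'].values   # the raw Shapley value matrix
print(vals.shape)                       # one row per explained observation, one column per feature -> expected (20, 10)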

The SHAP plot clearly indicates that high values of the first feature have a significant impact on the
model output. But what does this impact mean?

We investigate the documentation of SHAP plots to better understand what SHAP represents.

A simpler example is to compare SHAP against a linear model, where we have coefficients to interpret:

# a classic housing price dataset
X, y = shap.datasets.boston()
X100 = shap.utils.sample(X, 100)  # 100 instances for use as the background distribution

# a simple linear model
model = sklearn.linear_model.LinearRegression()
model.fit(X, y)

print("Model coefficients:\n")
for i in range(X.shape[1]):
    print(X.columns[i], "=", model.coef_[i].round(4))

Model coefficients:

CRIM = -0.108
ZN = 0.0464
INDUS = 0.0206
CHAS = 2.6867
NOX = -17.7666
RM = 3.8099
AGE = 0.0007
DIS = -1.4756
RAD = 0.306
TAX = -0.0123
PTRATIO = -0.9527
B = 0.0093
LSTAT = -0.5248

Here we see the linear coefficients we are familiar with. However, the value of a coefficient depends on
the scale of its feature, so its absolute value is not indicative of its importance.
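
To make this concrete, here is a minimal sketch reusing X, y, and model from above (X_scaled, model2, and tax_idx are illustrative names): rescaling a feature changes its coefficient by the same factor, even though the fitted model and its predictions are unchanged.

# Rescale one feature and refit: the fit is identical, but the coefficient changes with the units
X_scaled = X.copy()
X_scaled["TAX"] = X_scaled["TAX"] / 1000.0             # express TAX in thousands
model2 = sklearn.linear_model.LinearRegression().fit(X_scaled, y)
tax_idx = list(X.columns).index("TAX")
print(model.coef_[tax_idx], model2.coef_[tax_idx])     # the second is ~1000x the first
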
Instead, the authors of SHAP suggest a partial dependence plot; here we see it plotted for one feature,
RM.

shap.partial_dependence_plot(
    "RM", model.predict, X100, ice=False,
    model_expected_value=True, feature_expected_value=True
)

We see that, because this model is linear, the expected value of the model's predictions as RM increases
(with all the other features marginalized out) follows the straight blue line.
To calculate SHAP values, we evaluate the model 𝑓 restricted to a subset of features 𝑆, which is
done by integrating out the other features using a conditional expectation, 𝑓_S(x_S) = E[𝑓(X) | X_S = x_S]. As a result, we
see how the predicted function changes as the given feature changes.
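
For reference, the quantity being approximated is the classic Shapley value: each feature's contribution is its average marginal effect over all subsets 𝑆 of the full feature set 𝐹,

$$ \phi_i \;=\; \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F|-|S|-1)!}{|F|!} \left[ f_{S \cup \{i\}}\!\left(x_{S \cup \{i\}}\right) - f_S\!\left(x_S\right) \right] $$

where 𝑓_S is the model with the features outside 𝑆 integrated out, as described above.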

explainer = shap.Explainer(model.predict, X100)

shap_values = explainer(X)

# make a standard partial dependence plot
sample_ind = 18
shap.partial_dependence_plot(
    "RM", model.predict, X100, model_expected_value=True,
    feature_expected_value=True, ice=False,
    shap_values=shap_values[sample_ind:sample_ind+1, :]
)

Permutation explainer: 507it [00:24, 15.98it/s]

From here, at a given observation 𝑥𝑖 (recall that Shapley values are calculated at each observation) the
deviation of the model from the model's mean prediction (shown by the red line above) is approximately
-3.09, which is the Shapley value for this observation and feature.
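
To connect this single number back to the prediction, Shapley values for an observation are additive: the explainer's expected value plus that row's per-feature SHAP values recovers the model's prediction. A minimal sketch, reusing model, X, sample_ind, and shap_values from above (the two printed numbers should agree up to the explainer's approximation error):

# Additivity check: base value + sum of the row's SHAP values ~= the model's prediction
row = shap_values[sample_ind]
print(model.predict(X.iloc[[sample_ind]])[0])
print(row.base_values + row.values.sum())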

shap.plots.waterfall(shap_values[sample_ind], max_display=14)

In this notebook we took a deeper dive into Shapley plots and learned that:

Shapley plots can show the impact of a feature, and are not affected by feature scale the way linear
coefficients are
Shapley values are calculated per observation and per feature; they marginalize out the other features
and look at the change in the prediction, given that feature's value, relative to the mean prediction
Shapley plots are a fast and visual way of making complex models more interpretable.
