Video 19 - Class Activation Mapping CAM

The document discusses Class Activation Mapping (CAM) as a method for interpreting classifier outputs, highlighting its implementation and comparison with Grad-CAM. It outlines the steps for extracting feature maps, projecting class weights, and overlaying heatmaps on images. The summary emphasizes the advantages of CAM in terms of speed and memory usage while noting its limitations compared to Grad-CAM.


Interpretability with Class Activation Mapping

Antonio Rueda-Toicen
Learning goals

● Use class activation mapping to interpret the output of classifiers


● Describe tradeoffs between CAM and Grad-CAM
Interpreting a prediction

A pretrained ResNet34 with IMAGENET1K_V1 weights predicts “tabby, tabby cat” with probability 0.59.

The last convolutional layer of the ResNet produces 512 activation maps of spatial size H = W = 7.
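
For context, a minimal sketch of how such a prediction could be obtained (the image path cat.jpg and the variable names are placeholders, not from the slides); the model, img, and input_tensor defined here are reused by the code on the following slides:

import torch
from PIL import Image
from torchvision import models

# Load a pretrained ResNet34 with ImageNet-1K weights
weights = models.ResNet34_Weights.IMAGENET1K_V1
model = models.resnet34(weights=weights)
model.eval()

# Preprocess an input image ("cat.jpg" is a placeholder path)
img = Image.open("cat.jpg").convert("RGB")
input_tensor = weights.transforms()(img).unsqueeze(0)  # [1, 3, 224, 224]

# Forward pass and softmax over the 1000 ImageNet classes
with torch.no_grad():
    probs = torch.softmax(model(input_tensor), dim=1)
prob, idx = probs.max(dim=1)
print(weights.meta["categories"][idx.item()], prob.item())  # e.g. "tabby" with probability ≈ 0.59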
Class Activation Mapping

GAP = Global Average Pooling

Image from Learning Deep Features for Discriminative Localization
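
From the CAM paper: for a class c, the class activation map is M_c(x, y) = Σ_k w_k^c · f_k(x, y), where f_k is the k-th feature map of the last convolutional layer and w_k^c is the fully connected weight connecting channel k to class c. Because global average pooling commutes with this weighted sum, the class score equals the spatial average of M_c (plus the fc bias), which is why the fc weights can be projected directly onto the feature maps to build the heatmap.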


CAM with global average pooling
import torch
import torch.nn.functional as F

# Suppose fc_weights has shape [num_classes, channels] and activations is a list holding one tensor
# of shape [batch_size, channels, H, W], captured by the forward hook shown on a later slide.
# We'll just show the relevant slices for the single 'class_idx' and the first image in the batch.
weight = fc_weights[class_idx] # shape [channels]
act = activations[0][0] # shape [channels, H, W] – 7x7 in the case of resnet34

# ----------------------------------------------------------------
# 1) The "global average pooling" from a usual forward pass:
# collapses (H, W) -> 1x1, giving us one value per channel.
# ----------------------------------------------------------------
pooled = F.adaptive_avg_pool2d(act.unsqueeze(0), 1) # shape [1, channels, 1, 1]
pooled = pooled.squeeze(0).squeeze(-1).squeeze(-1) # shape [channels]
# 'pooled' is the channel-wise average. Multiplying by 'weight' then summing would give the final logit for 'class_idx'.

score = (pooled * weight).sum() # The single scalar logit for class_idx

# ----------------------------------------------------------------
# 2) Building the CAM:
# multiply each channel map by its weight, then sum across channels.
# ----------------------------------------------------------------

cam = (act * weight.view(-1, 1, 1)).sum(dim=0) # shape [H, W]

print(cam.shape) # [H, W] this is now shape 7x7
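
As a quick sanity check (a sketch, not on the original slide; it assumes model and output as defined by the hook-extraction code shown on the following slides): the recovered score should match the model's logit once the fc bias is added back, and the spatial mean of the CAM reproduces the same pooled score:

# The fc layer also has a bias term, so add it back before comparing with the model's logit
print(torch.allclose(score + model.fc.bias[class_idx], output[0, class_idx], atol=1e-4))  # expected: True

# Averaging the CAM over (H, W) reproduces the pooled, weighted score
print(torch.allclose(cam.mean(), score, atol=1e-4))  # expected: True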


Resizing the map

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# 'cam' from the previous slide is a torch tensor; convert it to a NumPy array before resizing
cam_np = cam.detach().numpy()
cam_resized = np.array(Image.fromarray(cam_np).resize(img.size, resample=Image.BILINEAR))

plt.imshow(cam_resized, cmap="jet");
Extracting the activations with a hook

# Hook for extracting the activations from the last convolutional layer
activations = []
def hook_fn(module, input, output):
    activations.append(output)

# Register the hook on the last convolutional block
layer_name = 'layer4'
hook = model._modules.get(layer_name).register_forward_hook(hook_fn)

# Forward pass
output = model(input_tensor)

# Remove the hook
hook.remove()

# Get the weights of the fully connected layer
fc_weights = model.fc.weight.detach()

# Select the class index of the predicted class (e.g., index 0 would be 'tench')
class_idx = torch.argmax(output, dim=1).item()
Overlaying the resized CAM on the image
# Compute CAM
# fc_weights[class_idx] is 1-dimensional, but the einsum equation 'ij' expects 2 dimensions.
# We need to unsqueeze to add a dimension to fc_weights[class_idx]
cam = torch.einsum('ij,jkl->ikl', fc_weights[class_idx].unsqueeze(0), activations[0][0])

# Normalize CAM
cam = cam - cam.min()
cam = cam / cam.max()

# Resize CAM to match the input image
cam = cam.detach().numpy()
# Squeeze the cam array to remove the first dimension and convert to uint8
cam = cam.squeeze() # remove the first dimension
cam = (cam * 255).astype(np.uint8) # scale to 0-255 and convert to uint8
cam_resized = np.array(Image.fromarray(cam).resize(img.size, resample=Image.BICUBIC))

# Overlay CAM on the image
plt.imshow(img)
plt.imshow(cam_resized, cmap='jet', alpha=0.5)
plt.axis('off')
plt.show()
Guided Grad-CAM
Combining Grad-CAM and Guided Backprop
CAM vs Grad-CAM

● CAM needs a Global Average Pooling layer to be added to the model (and the model retrained) if it lacks one; Grad-CAM works with any architecture without changes, because it weights the feature maps with gradients instead of the fc weights (see the sketch below)

● Grad-CAM can visualize the outputs of any convolutional layer; CAM is limited to the final one

● CAM runs faster and requires less memory, since it needs only a forward pass and no gradient computation
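
For comparison, here is a minimal Grad-CAM sketch (not from the slides), assuming model, input_tensor, output, activations, and class_idx from the hook-extraction slide are still in scope and the forward graph has not been freed. The channel weights come from spatially averaged gradients of the class score rather than from the fc weights:

import torch
import torch.nn.functional as F

act = activations[0]                          # [1, channels, H, W], captured by the forward hook
score = output[0, class_idx]                  # logit of the target class

# Channel weights: global average pooling of d(score)/d(activations)
grads = torch.autograd.grad(score, act)[0]    # [1, channels, H, W]
alpha = grads.mean(dim=(2, 3), keepdim=True)  # [1, channels, 1, 1]

# Weighted sum of the feature maps, then ReLU to keep only positive evidence
grad_cam = F.relu((alpha * act).sum(dim=1)).squeeze(0)  # [H, W]; resize and overlay as before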


Lion or boar?

Image from Berlin 'lioness' on loose 'is a wild boar’ (BBC News)
Summary
Class Activation Mapping (CAM) is a tool for explainability
● CAM helps us understand whether a model's decisions are well supported or based on spurious correlations
● CAM exposes hidden connections between inputs and decisions that affect model reliability and safety
● Grad-CAM extends the method to any network without changes to the architecture

CAM implementation steps

● Extract feature maps from the final convolutional layer with a forward hook
● Project the class weights onto the activation maps
● Upsample the heatmap and overlay it on the input image
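
Putting these steps together, a compact end-to-end helper could look like the sketch below. It is an illustration under the same assumptions as the earlier slides: a ResNet-style model with a layer4 block and an fc head, a preprocessed input_tensor, and a target_size given as a PIL (width, height) tuple; the function name compute_cam is made up for this example.

import numpy as np
import torch
from PIL import Image

def compute_cam(model, input_tensor, target_size, class_idx=None):
    # 1) Extract feature maps from the final convolutional block with a forward hook
    feats = []
    hook = model.layer4.register_forward_hook(lambda module, inp, out: feats.append(out))
    with torch.no_grad():
        output = model(input_tensor)
    hook.remove()

    if class_idx is None:
        class_idx = output.argmax(dim=1).item()  # default to the predicted class

    # 2) Project the class weights onto the activation maps
    act = feats[0][0]                                            # [channels, H, W]
    weight = model.fc.weight[class_idx].detach().view(-1, 1, 1)  # [channels, 1, 1]
    cam = (act * weight).sum(dim=0)                              # [H, W]

    # 3) Normalize, upsample to the image size, and return as uint8
    cam = (cam - cam.min()) / (cam.max() - cam.min())
    cam = (cam.numpy() * 255).astype(np.uint8)
    return np.array(Image.fromarray(cam).resize(target_size, resample=Image.BILINEAR))

The result can then be overlaid exactly as on the earlier slide, e.g. cam_resized = compute_cam(model, input_tensor, img.size) followed by plt.imshow(img) and plt.imshow(cam_resized, cmap='jet', alpha=0.5).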
References

Learning Deep Features for Discriminative Localization

● https://openaccess.thecvf.com/content_cvpr_2016/html/Zhou_Learning_Deep_Features_CVPR_2016_paper.html

Class Activation Mapping explained

● https://github.com/fastai/fastbook/blob/master/18_CAM.ipynb

Basic guide to NumPy’s einsum

● https://ajcr.net/Basic-guide-to-einsum/

Class activation mapping on FiftyOne

● https://voxel51.com/blog/exploring-gradcam-and-more-with-fiftyone/
