Zombie Detection with RetinaNet API
Welcome to this week's programming assignment! You will use the Object Detection API and
retrain RetinaNet to spot zombies using just 5 training images. You will set up the model to restore
pretrained weights and fine-tune the classification layers.
Important: This colab notebook has read-only access so you won't be able to save your changes. If
you want to save your work periodically, please click File -> Save a Copy in Drive to create a
copy in your account, then work from there.
Exercises
Exercise 1 - Import Object Detection API packages
Exercise 2 - Visualize the training images
Exercise 3 - Define the category index dictionary
Exercise 4 - Download checkpoints
Exercise 5.1 - Locate and read from the configuration file
Exercise 5.2 - Modify the model configuration
Exercise 5.3 - Modify model_config
Exercise 5.4 - Build the custom model
Exercise 6.1 - Define Checkpoints for the box predictor
Exercise 6.2 - Define the temporary model checkpoint
Exercise 6.3 - Restore the checkpoint
Exercise 7 - Run a dummy image to generate the model variables
Exercise 8 - Set training hyperparameters
Exercise 9 - Select the prediction layer variables
Exercise 10 - Define the training step
Exercise 11 - Preprocess, predict, and post process an image
Installation
# uncomment the next line if you want to delete an existing models directory
# !rm -rf ./models/
# Compile the Object Detection API protocol buffers and install the necessary packages
!cd models/research/ && protoc object_detection/protos/*.proto --python_out=. && cp object
Imports
Let's now import the packages you will use in this assignment.
import matplotlib
import matplotlib.pyplot as plt
import os
import random
import zipfile
import io
import scipy.misc
import numpy as np
import glob
import imageio
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage
try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass
import tensorflow as tf
tf.get_logger().setLevel('ERROR')
label_map_util: utilities for working with the label map that maps class IDs to class names.
config_util: You'll use this to read model configurations from a .config file and then
modify that configuration.
visualization_utils: please give this the alias viz_utils , as this is what will be used in
some visualization code that is given to you later.
colab_utils: lets you annotate images with bounding boxes directly inside Colab.
model_builder: This builds your model according to the model configuration that you'll
specify.
### START CODE HERE (Replace Instances of `None` with your code) ###
# import the label map utility module
from None import None
Utilities
You'll define a couple of utility functions for loading images and plotting detections. This code is
provided for you.
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Args:
        path: a file path.

    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
def plot_detections(image_np,
                    boxes,
                    classes,
                    scores,
                    category_index,
                    figsize=(12, 16),
                    image_name=None):
    """Wrapper function to visualize detections.

    Args:
        image_np: uint8 numpy array with shape (img_height, img_width, 3)
        boxes: a numpy array of shape [N, 4]
        classes: a numpy array of shape [N]. Note that class indices are 1-based,
            and match the keys in the label map.
        scores: a numpy array of shape [N] or None. If scores=None, then
            this function assumes that the boxes to be plotted are groundtruth
            boxes and plots all boxes as black with no classes or scores.
        category_index: a dict containing category dictionaries (each holding
            category index `id` and category name `name`) keyed by category indices.
        figsize: size for the figure.
        image_name: a name for the image file.
    """
    image_np_with_annotations = image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_annotations,
        boxes,
        classes,
        scores,
        category_index,
        use_normalized_coordinates=True,
        min_score_thresh=0.8)
    if image_name:
        plt.imsave(image_name, image_np_with_annotations)
    else:
        plt.imshow(image_np_with_annotations)
# uncomment the next 2 lines if you want to delete an existing zip and training directory
# !rm training-zombie.zip
# !rm -rf ./training
Please replace instances of None below to load and visualize the 5 training images.
You can inspect the training directory (using the Files button on the left side of this Colab)
to see the filenames of the zombie images. The paths for the images will look like this:
./training/training-zombie1.jpg
./training/training-zombie2.jpg
./training/training-zombie3.jpg
./training/training-zombie4.jpg
./training/training-zombie5.jpg
To set file paths, you'll use os.path.join. As an example, if you wanted to create the path
'./parent_folder/file_name1.txt', you could write:
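os.path.join('./parent_folder', 'file_name1.txt')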
%matplotlib inline
### START CODE HERE (Replace Instances of `None` with your code) ###
# assign the name (string) of the directory containing the training images
train_image_dir = './training'
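# A minimal sketch of the loading loop (the filenames follow the list shown above);
# adapt it as needed when completing the exercise.
train_images_np = []
for i in range(1, 6):
    image_path = os.path.join(train_image_dir, 'training-zombie' + str(i) + '.jpg')
    train_images_np.append(load_image_into_numpy_array(image_path))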
# plot images
for idx, train_image_np in enumerate(train_images_np):
    plt.subplot(1, 5, idx+1)
    plt.imshow(train_image_np)
plt.show()
In this section, you will create your ground truth boxes. You can either draw your own boxes or use
a prepopulated list of coordinates that we have provided below.
If the box is too big, the model might learn the features of the background (e.g. door,
road, etc) in determining if there is a zombie or not.
As an example, scroll to the beginning of this notebook to look at the bounding box around
the zombie.
If you choose to draw your own boxes, you can use the annotation tool from colab_utils, then
check that a box was recorded for each of the 5 images:

gt_boxes = []
colab_utils.annotate(train_images_np, box_storage_pointer=gt_boxes)

# check that a box has been drawn for each of the 5 training images
for gt_box in gt_boxes:
    try:
        assert(gt_box is not None), ("There are less than 5 sets of box coordinates. "
                                     "Please re-draw the bounding boxes.")
    except AssertionError as e:
        print(e)
        break

Below is the prepopulated list of coordinates mentioned above:
ref_gt_boxes = [
np.array([[0.27333333, 0.41500586, 0.74333333, 0.57678781]]),
np.array([[0.29833333, 0.45955451, 0.75666667, 0.61078546]]),
np.array([[0.40833333, 0.18288394, 0.945, 0.34818288]]),
np.array([[0.16166667, 0.61899179, 0.8, 0.91910903]]),
np.array([[0.28833333, 0.12543962, 0.835, 0.35052755]]),
]
You can also use this list if you opt not to draw the boxes yourself. The cell below falls back to
the predefined coordinates if any of your own boxes are missing:

# use the predefined ref_gt_boxes if any of your own boxes are missing
for gt_box in gt_boxes:
    try:
        assert(gt_box is not None)
    except:
        gt_boxes = ref_gt_boxes
        break
Whether you chose to draw your own or use the given boxes, please check your list of ground truth
box coordinates.
Below, we add the class annotations. For simplicity, we assume just a single class, though it
should be straightforward to extend this to handle multiple classes. We will also convert
everything to the format that the training loop expects (e.g., conversion to tensors, one-hot
representations, etc.).
If there is ever a 'background' class, it could be assigned the integer 0, but in this case,
you're just predicting the one zombie class.
Since you are just predicting one class (zombie), please assign 1 to the zombie class
ID.
category_index: Please define the category_index dictionary, which will have the same
structure as this:
{human_class_id :
{'id' : human_class_id,
'name': 'human_so_far'}
}
Define category_index similar to the example dictionary above, except for zombies.
This will be used by the succeeding functions to know the class id and name of
zombie images.
num_classes: Since you are predicting one class, please assign 1 to the number of classes
that the model will predict.
This will be used during data preprocessing and again when you configure the model.
### START CODE HERE (Replace instances of `None` with your code) ###

# assign the zombie class ID (an integer)
zombie_class_id = None

# define the category_index dictionary
category_index = None

# specify the number of classes that the model will predict
num_classes = None

### END CODE HERE ###

# TEST CODE:
print(category_index[zombie_class_id])
Expected Output:

{'id': 1, 'name': 'zombie'}
Data preprocessing
You will now do some data preprocessing so it is formatted properly before it is fed to the model:
label_id_offset = 1
train_image_tensors = []
# lists containing the one-hot encoded classes and ground truth boxes
gt_classes_one_hot_tensors = []
gt_box_tensors = []

for (train_image_np, gt_box_np) in zip(train_images_np, gt_boxes):
    # convert training image to tensor, add batch dimension, and add to list
    train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(
        train_image_np, dtype=tf.float32), axis=0))
    # convert the ground truth box to a tensor and add to list
    gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
    # one-hot encode the zero-indexed ground truth class and add to list
    zero_indexed_classes = tf.convert_to_tensor(
        np.ones(shape=[gt_box_np.shape[0]], dtype=np.int32) - label_id_offset)
    gt_classes_one_hot_tensors.append(tf.one_hot(zero_indexed_classes, num_classes))
# give the ground truth boxes a dummy score of 100%
dummy_scores = np.array([1.0], dtype=np.float32)

# use the `plot_detections()` utility function to draw the ground truth boxes
for idx in range(5):
    plt.subplot(2, 4, idx+1)
    plot_detections(
        train_images_np[idx],
        gt_boxes[idx],
        np.ones(shape=[gt_boxes[idx].shape[0]], dtype=np.int32),
        dummy_scores, category_index)
plt.show()
When working with models that are at the frontiers of research, the models and checkpoints may
not yet be organized in a central location like the TensorFlow Garden
(https://github.com/tensorflow/models).
You'll often read a blog post from the researchers, which will usually point you to things like the
pretrained checkpoints and the configuration files you need.
It's good practice to do some of this "detective work", so that you'll feel more comfortable when
exploring new models yourself! So please try the following steps:
If you want some help getting started, please click on the "Initial Hints" cell to get some hints.
Initial Hints
More Hints
In the Colab, on the left side table of contents, click on the folder icon to display the file
browser for the current workspace.
Navigate to models/research/object_detection/configs/tf2 . The folder has multiple
.config files.
Look for the file corresponding to ssd resnet 50 version 1 640x640.
You can double-click the config file to view its contents. This may help you as you complete
the next few code cells to configure your model.
Set the pipeline_config to a string that contains the full path to the resnet config file, in
other words: models/research/.../... .config
configs
If you look at the config_util module that you imported, it contains the function
get_configs_from_pipeline_file.
Please use this function to load the configuration from your pipeline_config .
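For example (a sketch, assuming pipeline_config holds the path you set above):

# read the configs dictionary from the pipeline .config file
configs = config_util.get_configs_from_pipeline_file(pipeline_config)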
tf.keras.backend.clear_session()
From the configs dictionary, access the object associated with the key 'model'.
model_config now contains an object of type
object_detection.protos.model_pb2.DetectionModel .
If you print model_config , you'll see something like this:
ssd {
num_classes: 90
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
feature_extractor {
...
...
freeze_batchnorm: false
### START CODE HERE ###
# Read in the object stored at the key 'model' of the configs dictionary
model_config = None
num_classes is nested under ssd. You'll need to use dot notation (obj.x) and NOT
bracket notation (obj['x']) to access num_classes.
Freeze batch normalization
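A sketch of these two modifications (the field names follow the printed config above):

# set the number of classes to the number of zombie classes
model_config.ssd.num_classes = num_classes

# freeze batch normalization, since you are fine-tuning on only 5 images
model_config.ssd.freeze_batchnorm = True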
# See what model_config now looks like after you've customized it!
model_config
You'll use model_builder to build the model according to the configurations that you have
just downloaded and customized.
model_config: Set this to the model configuration that you just customized.
is_training: Set this to True.
You can keep the default value for the remaining parameter.
Note that it will take some time to build the model.
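For reference, the build call has this general shape (a sketch; you'll fill in the exercise cell
yourself):

detection_model = model_builder.build(
    model_config=model_config, is_training=True)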
### START CODE HERE (Replace instances of `None` with your code) ###
detection_model = None
### END CODE HERE ###
print(type(detection_model))
Expected Output:
<class 'object_detection.meta_architectures.ssd_meta_arch.SSDMetaArch'>
Your end goal is to create a custom model which reuses parts of, but not all of, the layers of
RetinaNet (currently stored in the variable detection_model).
First, take a look at the type of the detection_model and its Python class.
Hopefully you'll find the meta_architectures folder, and within it you'll notice a file
named ssd_meta_arch.py .
Please open and view this ssd_meta_arch.py file.
vars(detection_model)
...
_box_predictor': <object_detection.predictors.convolutional_keras_box_predictor.WeightShared
...
_feature_extractor': <object_detection.models.ssd_resnet_v1_fpn_keras_feature_extractor.SSDR
Inspect _feature_extractor
# Line 302
feature_extractor: a SSDFeatureExtractor object.
Also
# Line 380
self._feature_extractor = feature_extractor
Inspect _box_predictor
View the ssd_meta_arch.py file (which is the source code for detection_model)
Notice that in the init constructor for class SSDMetaArch(model.DetectionModel),
...
box_predictor: a box_predictor.BoxPredictor object
...
self._box_predictor = box_predictor
Inspect _box_predictor
object_detection.predictors.convolutional_keras_box_predictor.WeightSharedConvolutionalBoxPr
object_detection/predictors
Notice that there is a file named convolutional_keras_box_predictor.py. Please open that file.
vars(detection_model._box_predictor)
...
_base_tower_layers_for_heads
...
_box_prediction_head
...
_prediction_heads
In the source code for convolutional_keras_box_predictor.py that you just opened, look at the
source code to get a sense for what these three variables represent.
Inspect base_tower_layers_for_heads
# line 302
self._base_tower_layers_for_heads = {
BOX_ENCODINGS: [],
CLASS_PREDICTIONS_WITH_BACKGROUND: [],
}
# Line 377
# Stack the base_tower_layers in the order of conv_layer, batch_norm_la
# and activation_layer
base_tower_layers = []
for i in range(self._num_layers_before_predictor):
So detection_model._box_predictor._base_tower_layers_for_heads contains:
The layers for the prediction before the final bounding box prediction
The layers for the prediction before the final class prediction.
Inspect _box_prediction_head
Inspect _prediction_heads
# Line 121
self._prediction_heads = {
BOX_ENCODINGS: box_prediction_heads,
CLASS_PREDICTIONS_WITH_BACKGROUND: class_prediction_heads,
}
# Line 83
class_prediction_heads: A list of heads that predict the classes.
Remember that you are reusing the model for its feature extraction and bounding box detection.
You will create your own classification layer and train it on zombie images.
So you won't need to reuse the class prediction layer of detection_model .
tf.train.Checkpoint(
**kwargs
)
Pretend that detection_model contains these variables for which you want to restore weights:
detection_model._ice_cream_sundae
detection_model._pies._apple_pie
detection_model._pies._pecan_pie
If you just want the ice cream sundae and apple pie variables (and not the pecan pie) then you can
do the following:
tmp_pies_checkpoint = tf.train.Checkpoint(
_apple_pie = detection_model._pies._apple_pie
)
tmp_model_checkpoint = tf.train.Checkpoint(
_pies = tmp_pies_checkpoint,
_ice_cream_sundae = detection_model._ice_cream_sundae
)
Finally, define a checkpoint that uses the key model and takes in the tmp_model_checkpoint.
checkpoint = tf.train.Checkpoint(
model = tmp_model_checkpoint
)
You'll then be ready to restore the weights from the checkpoint that you downloaded.
The base tower layers (the layers that precede both the class prediction and bounding
box prediction layers).
The box prediction head (the prediction layer for bounding boxes).
Note, you won't include the class prediction layer.
Important: Be careful to avoid typos in the key names for the checkpoint. For example, if
there is a layer called _apple_pies and you accidentally added an extra "t" like this:
tmp_pies_checkpoint = tf.train.Checkpoint( _apple_piest =
detection_model._box_predictor._apple_pies ) then, when you restore the
checkpoint, it will update the variable _apple_piest , instead of _apple_pies like you
intended. This will likely make the model train slower in Exercise 10 later.
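Here is one way the box predictor checkpoint could look, as a sketch based on the two pieces
listed above (adapt it as you see fit):

tmp_box_predictor_checkpoint = tf.train.Checkpoint(
    _base_tower_layers_for_heads=detection_model._box_predictor._base_tower_layers_for_heads,
    _box_prediction_head=detection_model._box_predictor._box_prediction_head)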
tmp_box_predictor_checkpoint = None
# Expected output:
# tensorflow.python.training.tracking.util.Checkpoint
Expected output
You should expect to see a list of variables that include the following:
...
'_base_tower_layers_for_heads': {'box_encodings': ListWrapper([]),
'class_predictions_with_background': ListWrapper([])},
'_box_prediction_head': <object_detection.predictors.heads.keras_box_head.WeightSharedConvo
...
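Next, following the same pattern, define a temporary model checkpoint that points to the model's
feature extractor and to the box predictor checkpoint you just created. A sketch (consistent with
the pies example above):

tmp_model_checkpoint = tf.train.Checkpoint(
    _feature_extractor=detection_model._feature_extractor,
    _box_predictor=tmp_box_predictor_checkpoint)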
tmp_model_checkpoint = None
# Expected output
# tensorflow.python.training.tracking.util.Checkpoint
Expected output
checkpoint_path:
Using the "files" browser in the left side of Colab, navigate to models -> research ->
object_detection -> test_data .
If you completed the previous code cell that downloads and moves the checkpoint,
you'll see a subfolder named "checkpoint".
checkpoint
ckpt-0.data-00000-of-00001
ckpt-0.index
Please set checkpoint_path to the full path that ends in ckpt-0 , in other words models/.../ckpt-0
Notice that you don't want to include a file extension after ckpt-0 .
IMPORTANT: Please don't set the path to include the .index extension in the
checkpoint file name.
Finally, call this checkpoint's .restore() function, passing in the path to the checkpoint.
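For example (a sketch, assuming checkpoint_path has been set as described below):

checkpoint = tf.train.Checkpoint(model=tmp_model_checkpoint)
checkpoint.restore(checkpoint_path)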
checkpoint_path = None
preprocess(): takes in a raw image tensor and returns the preprocessed image together with its
true shape.
predict(): takes in the image and shapes which are created by the preprocess() function call,
and returns a prediction in a Python dictionary. This will pass the dummy image through the
forward pass of the network and create the model variables.
postprocess(): takes the prediction dictionary and the shapes, and turns the raw predictions
into final detections.
Note: Please use the recommended variable names, which include the prefix tmp_ , since these
variables won't be used later, but you'll define similarly-named variables later for predicting on
actual zombie images.
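A sketch of this step (the 640x640 all-zeros image here is just an assumption; any dummy image
with a batch dimension works, since the model resizes its input):

tmp_image, tmp_shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))
tmp_prediction_dict = detection_model.predict(tmp_image, tmp_shapes)
tmp_detections = detection_model.postprocess(tmp_prediction_dict, tmp_shapes)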
### START CODE HERE (Replace instances of `None` with your code)###
# use the detection model's `preprocess()` method and pass a dummy image
tmp_image, tmp_shapes = None
print('Weights restored!')
# Test Code:
assert len(detection_model.trainable_variables) > 0, "Please pass in a dummy image to crea
print(detection_model.weights[0].shape)
print(detection_model.weights[231].shape)
print(detection_model.weights[462].shape)
Expected Output:
(3, 3, 256, 24)
(512,)
(256,)
batch_size: You can increase the batch size up to 5, since you have just 5 images for training.
num_batches: You can use 100.
You can increase the number of batches, but the training will take longer to complete.
learning_rate: When you run the training loop later, notice how the initial loss INCREASES before
decreasing.
You can try a lower learning rate to see if you can avoid this increased loss.
optimizer: You can use tf.keras.optimizers.SGD.
Training will be fairly quick, so we do encourage you to experiment a bit with these
hyperparameters!
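One possible configuration, as a sketch (these particular values are suggestions, not the only
valid choice):

batch_size = 4
num_batches = 100
learning_rate = 0.01
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)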
tf.keras.backend.set_learning_phase(True)
### START CODE HERE (Replace instances of `None` with your code)###
Notice that there are some layers whose names are prefixed with the following:
WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead
...
WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead
...
WeightSharedConvolutionalBoxPredictor/BoxPredictionTower
...
WeightSharedConvolutionalBoxPredictor/ClassPredictionTower
...
Among these, which do you think are the prediction layers at the "end" of the model?
Recall that when inspecting the source code to restore the checkpoints
(convolutional_keras_box_predictor.py) you noticed that:
_base_tower_layers_for_heads : refers to the layers that are placed right before the
prediction layer
_box_prediction_head refers to the prediction layer for the bounding boxes
_prediction_heads : refers to the set of prediction layers (both for classification and
for bounding boxes)
So you can see that in the source code for this model, "tower" refers to layers that are before the
prediction layer, and "head" refers to the prediction layers.
The bounding box head variables (which predict bounding box coordinates)
The class head variables (which predict the class/category)
detection_model.trainable_variables[92]
tmp_list = []
for v in detection_model.trainable_variables:
    if v.name.startswith('ResNet50V1_FPN/bottom_up_block5'):
        tmp_list.append(v)
Hint: There are a total of four variables that you want to fine tune.
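One possible way to collect them, sketched using the prefixes shown above (adapt as needed):

to_fine_tune = []
prefixes_to_train = [
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead',
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead']
for v in detection_model.trainable_variables:
    if any(v.name.startswith(prefix) for prefix in prefixes_to_train):
        to_fine_tune.append(v)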
### START CODE HERE (Replace instances of `None` with your code) ###
# define a list that contains the layers that you wish to fine tune
to_fine_tune = None
# Test Code:
print(to_fine_tune[0].name)
print(to_fine_tune[2].name)
Expected Output:
WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/kernel:0
WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/kern
Train your model
You'll define a function that handles training for one batch, which you'll later use in your training
loop.
First, walk through these code cells to learn how you'll perform training using this model.
The detection_model is of class SSDMetaArch, and its source code shows that it has this
preprocess function.
This preprocesses the images so that they can be passed into the model (for training or
prediction):
You can preprocess each image and save the outputs into two separate lists:
preprocessed_image_list = []
true_shape_list = []
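A sketch of that loop (assuming the train_image_tensors list from the data preprocessing step):

for image_tensor in train_image_tensors:
    processed_image, true_shape = detection_model.preprocess(image_tensor)
    preprocessed_image_list.append(processed_image)
    true_shape_list.append(true_shape)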
Make a prediction
The detection_model also has a .predict function. According to the source code for predict
Returns:
prediction_dict: a dictionary holding "raw" prediction tensors:
1) preprocessed_inputs: the [batch, height, width, channels] image
tensor.
2) box_encodings: 4-D float tensor of shape [batch_size, num_anchors,
box_code_dimension] containing predicted boxes.
3) class_predictions_with_background: 3-D float tensor of shape
[batch_size, num_anchors, num_classes+1] containing class predictions
(logits) for each of the anchors. Note that this tensor *includes*
background class predictions (at class index 0).
4) feature_maps: a list of tensors where the ith tensor has shape
[batch, height_i, width_i, depth_i].
5) anchors: 2-D float tensor of shape [num_anchors, 4] containing
the generated anchors in normalized coordinates.
6) final_anchors: 3-D float tensor of shape [batch_size, num_anchors, 4]
containing the generated anchors in normalized coordinates.
If self._return_raw_detections_during_predict is True, the dictionary
will also contain:
7) raw_detection_boxes: a 4-D float32 tensor with shape
[batch_size, self.max_num_proposals, 4] in normalized coordinates.
8) raw_detection_feature_map_indices: a 3-D int32 tensor with shape
[batch_size, self.max_num_proposals].
"""
Notice that .predict takes its inputs as tensors. If you try to pass in the lists of preprocessed
images and true shapes directly, you'll get an error.
# Try to call `predict` and pass in lists; look at the error message
try:
    detection_model.predict(preprocessed_image_list, true_shape_list)
except AttributeError as e:
    print("Error message:", e)
But don't worry! You can check how to properly use predict :
Notice that the source code documentation says that preprocessed_inputs and
true_image_shapes are expected to be tensors and not lists of tensors.
One way to turn a list of tensors into a tensor is to use tf.concat
tf.concat(
values, axis, name='concat'
)
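For example (the names follow the lists you defined above):

preprocessed_image_tensor = tf.concat(preprocessed_image_list, axis=0)
true_shape_tensor = tf.concat(true_shape_list, axis=0)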
Now you can make predictions for the images. According to the source code, predict returns a
dictionary containing the prediction information, including:
print("keys in prediction_dict:")
for key in prediction_dict.keys():
print(key)
Calculate loss
Now that your model has made its prediction, you want to compare it to the ground truth in order
to calculate a loss.
It takes in the prediction dictionary and the true shape tensor.
Try calling .loss . You'll see an error message that you'll address in order to run the .loss
function.
try:
    losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)
except RuntimeError as e:
    print(e)
So you'll first want to set the ground truth (true labels and true bounding boxes) before you
calculate the loss.
This makes sense, since the loss is comparing the prediction to the ground truth, and so the
loss function needs to know the ground truth.
The source code for providing the ground truth is located in the parent class of
SSDMetaArch , model.DetectionModel .
Here is the relevant part of the code for provide_groundtruth:
def provide_groundtruth(
        self,
        groundtruth_boxes_list,
        groundtruth_classes_list,
        ...):  # more parameters not shown here
    """
Args:
groundtruth_boxes_list: a list of 2-D tf.float32 tensors of shape
[num_boxes, 4] containing coordinates of the groundtruth boxes.
Groundtruth boxes are provided in [y_min, x_min, y_max, x_max]
format and assumed to be normalized and clipped
relative to the image window with y_min <= y_max and x_min <= x_max.
groundtruth_classes_list: a list of 2-D tf.float32 one-hot (or k-hot)
tensors of shape [num_boxes, num_classes] containing the class targets
with the 0th index assumed to map to the first non-background class.
"""
You'll set two parameters in provide_groundtruth : groundtruth_boxes_list and
groundtruth_classes_list.
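A sketch of setting the ground truth and then computing the loss (assuming the tensors you
prepared during data preprocessing):

detection_model.provide_groundtruth(
    groundtruth_boxes_list=gt_box_tensors,
    groundtruth_classes_list=gt_classes_one_hot_tensors)
losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)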
You can now calculate the gradient and optimize the variables that you selected to fine tune.
Use tf.GradientTape
# calculate the gradient of each model variable with respect to each loss
gradients = tape.gradient([some loss], variables to fine tune)
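Putting these pieces together, one possible shape for the gradient step looks like this. Treat it as
a sketch: the loss dictionary keys are the ones returned by SSDMetaArch's loss, and it assumes the
ground truth has already been provided as above.

with tf.GradientTape() as tape:
    # the prediction must happen inside the tape so gradients can flow
    prediction_dict = detection_model.predict(preprocessed_image_tensor, true_shape_tensor)
    losses_dict = detection_model.loss(prediction_dict, true_shape_tensor)
    total_loss = (losses_dict['Loss/localization_loss'] +
                  losses_dict['Loss/classification_loss'])

# compute gradients only for the variables you chose to fine tune, then apply them
gradients = tape.gradient(total_loss, to_fine_tune)
optimizer.apply_gradients(zip(gradients, to_fine_tune))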
# Let's just reset the model so that you can practice setting it up yourself!
detection_model.provide_groundtruth(groundtruth_boxes_list=[], groundtruth_classes_list=[])
Args:
image_list: A list of [1, height, width, 3] Tensor of type tf.float32.
Note that the height and width can vary across images, as they are
reshaped within this function to be 640x640.
groundtruth_boxes_list: A list of Tensors of shape [N_i, 4] with type
tf.float32 representing groundtruth boxes for each image in the batch.
groundtruth_classes_list: A list of Tensors of shape [N_i, num_classes]
with type tf.float32 representing groundtruth classes for each image in
the batch.
Returns:
A scalar tensor representing the total loss for the input batch.
"""
preprocessed_image_tensor = None
true_shape_tensor = None
# Make a prediction
prediction_dict = None
# Calculate the total loss (sum of both losses)
total_loss = None
return total_loss
    if idx % 10 == 0:
        print('batch ' + str(idx) + ' of ' + str(num_batches)
              + ', loss=' + str(total_loss.numpy()), flush=True)
print('Done fine-tuning!')
Expected Output:
Total loss should be decreasing and should be less than 1 after fine tuning. For example:
Start fine-tuning!
batch 0 of 100, loss=1.2559178
batch 10 of 100, loss=16.067217
batch 20 of 100, loss=8.094654
batch 30 of 100, loss=0.34514275
batch 40 of 100, loss=0.033170983
batch 50 of 100, loss=0.0024622646
batch 60 of 100, loss=0.00074224477
batch 70 of 100, loss=0.0006149876
batch 80 of 100, loss=0.00046916265
batch 90 of 100, loss=0.0004159231
Done fine-tuning!
You will load these images into numpy arrays to prepare them for inference.
test_image_dir = './results/'
test_images_np = []
# load images into a numpy array. this will take a few minutes to complete.
for i in range(0, 237):
image_path = os.path.join(test_image_dir, 'zombie-walk' + "{0:04}".format(i) + '.jpg')
print(image_path)
test_images_np.append(np.expand_dims(
load_image_into_numpy_array(image_path), axis=0))
Exercise 11: Preprocess, predict, and post process an image
Define a function that returns the detection boxes, classes, and scores.
Args:
input_tensor: A [1, height, width, 3] Tensor of type tf.float32.
Note that height and width can be anything since the image will be
immediately resized according to the needs of the model within this
function.
Returns:
A dict containing 3 Tensors (`detection_boxes`, `detection_classes`,
and `detection_scores`).
"""
preprocessed_image, shapes = detection_model.preprocess(input_tensor)
prediction_dict = detection_model.predict(preprocessed_image, shapes)
### START CODE HERE (Replace instances of `None` with your code) ###
# use the detection model's postprocess() method to get the final detections
detections = None
### END CODE HERE ###
return detections
You can now loop through the test images and get the detection scores and bounding boxes to
overlay in the original image. We will save each result in a results dictionary and the autograder
will use this to evaluate your results.
# Note that the first frame will trigger tracing of the tf.function, which will
# take some time, after which inference should be fast.
label_id_offset = 1
results = {'boxes': [], 'scores': []}
for i in range(len(test_images_np)):
    input_tensor = tf.convert_to_tensor(test_images_np[i], dtype=tf.float32)
    detections = detect(input_tensor)
    plot_detections(
        test_images_np[i][0],
        detections['detection_boxes'][0].numpy(),
        detections['detection_classes'][0].numpy().astype(np.uint32)
        + label_id_offset,
        detections['detection_scores'][0].numpy(),
        category_index, figsize=(15, 20),
        image_name="./results/gif_frame_" + ('%03d' % i) + ".jpg")
    results['boxes'].append(detections['detection_boxes'][0][0].numpy())
    results['scores'].append(detections['detection_scores'][0][0].numpy())
# TEST CODE
print(len(results['boxes']))
print(results['boxes'][0].shape)
print()
Expected Output: Ideally the three boolean values at the bottom should be True . But if you only
get two, you can still try submitting. This compares your resulting bounding boxes for each zombie
image to some preloaded coordinates (i.e. the hardcoded values in the test cell above). Depending
on how you annotated the training images, it's possible that some of your results differ for these
three frames but still get good results overall when all images are examined by the grader. If two or
all are False, please try annotating the images again with a tighter bounding box, or use the
predefined gt_boxes list.
237
(4,)
True
True
True
You can also check if the model detects a zombie class in the images by examining the scores
key of the results dictionary. You should get higher than 88.0 here.
x = np.array(results['scores'])
You can also display some still frames and inspect them visually. If you don't see a bounding box around
the zombie, please consider re-annotating the ground truth or using the predefined gt_boxes here.
print('Frame 0')
display(IPyImage('./results/gif_frame_000.jpg'))
print()
print('Frame 5')
display(IPyImage('./results/gif_frame_005.jpg'))
print()
print('Frame 10')
display(IPyImage('./results/gif_frame_010.jpg'))
# zip the result frames (the archive name below is just illustrative)
zipf = zipfile.ZipFile('./results.zip', 'w', zipfile.ZIP_DEFLATED)
filenames = glob.glob('./results/gif_frame_*.jpg')
filenames = sorted(filenames)
for filename in filenames:
    zipf.write(filename)
zipf.close()
imageio.plugins.freeimage.download()
anim_file = './zombie-anim.gif'
filenames = glob.glob('./results/gif_frame_*.jpg')
filenames = sorted(filenames)
last = -1
images = []
for filename in filenames:
    images.append(imageio.imread(filename))
# write the animated GIF (the frame rate is an illustrative choice)
imageio.mimsave(anim_file, images, 'GIF-FI', fps=10)
Unfortunately, using IPyImage in the notebook (as you've done in the rubber ducky detection
tutorial) to display the large GIF generated here will disconnect the runtime. To view the animation,
you can instead use the Files pane on the left and double-click on zombie-anim.gif . That will open a
preview page on the right. It will take 2 to 3 minutes to load before you see the walking zombie.
Run the cell below to save your results. Download the results.data file and upload it to the
grader in the classroom.