Patrick Gray (patrick.c.gray at duke) - https://github.com/patrickcgray
Chapter 3: Plotting and visualizing your data
with matplotlib
Introduction
matplotlib is avery powerful plotting library for making amazing visualizations for publications,
personal use, or even web and desktop applications. matplotlib can create almost any two
dimensional visualization you can think of, including histograms, scatter plots, bivariate plots, and
image displays. For some inspiration, check out the natplotlit example gallery which includes
the source code required to generate each example.
matplotlib API - state-machine versus object-oriented
One part of matplotlib that may be initially confusing is that matplotlib contains two main
methods of making plots - the object-oriented method, and the state-machine method.
A very good overview of the difference between the two usages is provided by Jake Vanderplas,
Specifically,
# Pyplot state-machine
© Object-oriented
f you need a primer on matplotlib beyond what is here | suggest: Python Like you Mean tor the
matplotlib users guide,
in general, | think you should use the object-oriented API. While more complicated, is a much more
powerful way of creating plots and should be used when developing more complicatec
visualizations. | always recommend the OO API.
Image display
We will begin by reading our example image into a NumPy memory array as shown in Chapter 2
import rasterio
import numpy as np
# open our raster dataset
dataset = rasterio.open(' . . /data/L£70220491999322EDCO1_stack.gtif")
image = dataset. read()
Let's check out some of this datasets characteristics# How many bands does this image have?
‘nun_bands = dataset.count
print("Number of bands in image: (n}\n'format(nsnum_bands))
# How many rows and columns?
rows, cols = dataset.shape
print('Inage size is: (r} rows x (c} colunns\n’ format (r=rows, ¢=cols))
# what driver was used to open the raster?
driver = dataset.driver
print('Raster driver: {4)\n" «format (d=driver))
# What is the raster's projection?
proj = dataset.crs
print(‘*Inage projection:')
print(proj)
Number of bands in image: 8
Image size is: 25@ rows x 25@ columns
Raster driver: GTift
Image projection:
=PSG:32615
Note that if you have the full dataset read in with you can do:
red_band = image[2, :, :] # this pulls out the 3rd band
print(red_band. shape)
(250, 258)
This pulls out the band at index 2 which is the 3rd band because python indexing starts at 0
Which is equal to simply doing
red_band_read = dataset.read(3) # this pulls out the 3rd band
print (red_band_read. shape)
if mp.array_equal(red_band_read, red_band): + ore they equal?
print(‘They are the sane.)
(250, 250)
They are the same.
f you have a large dataset the second dataset read(3) may be preferable because you save on
memory
Basic plotting
First thing to do is to import matplot Lit into our namespace. | will be using a special feature of
the [Python utility which allows me to “inline" matplotlib figures by entering the %natplotlib
inline command.5]: Amport matplotlib.pyplot as plt
Xmatplotlib inline
With natplotlib imported, we can summon up a figure and make our first plot:
# Array of @ - 9
x = np.arange(1@)
# 10 random numbers, between @ and 10
y = np.randon.randint (2, 10, size=10)
# plot them as Lines
plt.plot(x, y)
5]: [
specify "Ls" (“Linestyle") as a null string
# plot them as just points
plt.plot(x, y, ‘ro',
>), [ematplotlib.lines.Line2d at @x7f35a225d100>
8 .
.
6 . .
4
.
2
.
°
0 2 a 6 3
As | mentioned earlier, we'll be using Matplotib’s object oriented API, instead of its function-based
API. Let's briefly draw a distinction between these two APIs:
# prepare 50 x-coordinates and 58 y-coordinates
x = np.Linspace(-np.pi, np.pi, 50)8
y = np.sin(x)
# Plot using motplotlib's functional apr:
# a single function call produces a plot; conventent but Less flexible
plt.plot(x, y)
[
100
075
050
025
0.00
025
-0850
-075
100
# Plot using matplotlib's object-oriented API:
# we generate o figure and axis object: “fig’ and “ax”
fig, ax = plt-subplots()
# we then use these objects to draw-on and manipulate our plot
ax.plot(x, y)
[
100
075
050
025
0.00
025
-0850
-075
100
3 2 4 o© @ 2 3
Although the code that invokes the functional API is simpler, its far less powerful and flexible than
the object-oriented API, which produces figure (fig) and axes (ax) objects that we can leverage to
customize our plot. You will ikely see tutorials utilize the functional API in their examples, so itis
useful to understand the distinction here, Shortly, you will learn how to leverage Matplotlib’s object-
oriented API in powerful ways
Plotting 2D arraysOne typical thing that we might want to do would be to plot one band against another. In order to
do this, we will need to transform, or flatten , our 2 dimensional arrays of each band's values into
1 dimensional arrays
red
image[3, :, :]
image[4, :, :]
print(*Array shape before: {shp} (size is {sz})'.format(shp=red.shape, szered.size))
red_flat = np.ndarray.flatten(red)
nir_flat = np.ndarray.flatten(nir)
print(‘Array shape after: {shp} (size is {sz})'.format(shp=red_flat.shape, sz=red_flat.
Array shape before: (250, 250) (size is 62500)
Array shape after: (62500,) (size is 62500)
We have retained the number of entries in each of these raster bands, but we have flattened them
from 2 dimensions into 1.
Now we can plot them. Since we just want points, we can use scatter for a scatterplot. Since
there are no lines in a scatterplot, it has a slightly different syntax.
Fig, ax = plt.subplots()
# Make the plot
ax.scatter(red_flat, nir_flat, color:
# Add some oxis Labels
ax.set_xlabel( ‘Red Reflectance" )
ax.set_ylabel(‘NIR label’)
# Add a title
ax.set_title(‘Tasseled Cap, eh?')
Text(0.5, 1.0, ‘Tasseled Cap, eh?')
Tasseled Cap, eh?
#08
200
3000
NR tabet
y
1000
300 1500 2000 2500 3000 3500 4000 4500 5000
Red Reflectance
f we wanted the two axes to have the same limits, we can calculate the limits and apply themfig, ax = plt.subplots()
# Make the plot
ax.scatter(red_flat, nir_flat, color="r', marker='0")
# Calculate min and max
plot_min = min(red.min(), nir.min())
plot_max = max(ned.max(), nir.max())
ax.set_xLim((plot_min, plot_max))
ax.set_ylim((plot_min, plot_max))
# Add some oxis Labels
ax.set_xlabel( ‘Red Reflectance" )
ax.set_ylabel(‘NIR label’)
# Add a title
ax.set_title(‘Tasseled Cap, eh?')
Text(0.5, 1.0, ‘Tasseled Cap, eh?")
Tasseled Cap, eh?
s000
4000
2200
. 2000
1000
1000 ~~=«2000~=~=«ODS*«OSC«C
Red Reflectance
What if we want to view average intensities of each band?
# numbers 1-8
x = np.arange(1,9)
# Lets get the average value of each band
y = npemean(inage, axis=(1,2))
Fig, ax = plt.subplots()
# plot them as Lines
ax.plot(x, y)
# Add some axis Labels
ax.set_xlabel( ‘Band #')
ax,set_ylabel( ‘Reflectance Value")
w Add a title
ax,set_title("Band Intensities")
Text(@.5, 1.0, "Band Intensities")In
Band Intensities
Reflectance Value
Ba 8
ss 8
500
1 2 3 4 § 6 9 68
Band =
‘tis common to have this high reflectance in the NIR band (band 4 here)
Plotting 2D arrays - images
With so much data available to look at, it can be hard to understand what is going on with the mess
of points shown above. Luckily our datasets aren't just a mess of points - they have a spatia
structure.
To show the spatial structure of our images, we could make an image plot of one of our bands using
imshow to display an image on the axes:
# use the matpLotltb.pyplot function “imshow" for an image -- nir at first
fig, ax = plt.subplots()
ax.imshow(image[3, :, :])
0 50 «100150200
Well, it looks like there is something going on - maybe a river in the center and some bright
vegetation to the bottom left of the image. What's lacking is any knowledge of what the colors
meanLuckily, matplotlib can provide us a colorbar,
# use "imshow" for an image -- nir at first
fig, ax = plt-subplots()
ing = ax.imshow(image[3, :, :])
fig.colorbar(img, ax=ax) # we have to pass the current plot as an argument thus have to
0150-200
ff we want a greyscale image, we can manually specify a colormap:
Well also begin showing some more advanced plotting capabilities in matplotlb such as putting
‘multiple plots in a single output. Here we'll compare the NIR to Red in greyscale.
fig, (ax, ax2) = plt-subplots(1,2, figsize=(10,3)) # 2 axes on a 1x2 grid
# note that we could also use indexing for our axes:
# fig, ox = plt.subplots(1, 2)
# axl = ax[@]
# find max reflectance to put them on the same colorbar scale
max_ref = np.anax([np.anax(image[3:,:]), np-amax(image[2:,:])])
# nin in first subplot
nir = ax1.imshow(image[3, :,
axd.set_title("NIR Sand!
nir.set_clim(vmin=@, vmax=max_ref)
J, crap=plt.cm.Greys)
fig.colorbar(nir, ax=ax1)
# Now red band in the second subplot
ed = ax2.imshow(inage[2, :, :], chap=plt.cm.Greys)
ax2.set_title("Red Band")
ed.set_clim(vmin=@, vmax=max_ref)
Fig.colorbar(red, ax=ax2)
NIR Band Red Band
o 5000 0 5000
% 000 0 / 000
100 3000 100 . , 3000
150 200 80 2000
20 1000 20 1000
° °
os 100 150. 200 o 2 io so ao
Plotting 3D arrays - multispectral images
Greyscale images are nice, but the most information we can receive comes from looking at the
interplay among different bands. To accomplish this, we can map different spectral bands to the
Red, Green, and Blue channels on our monitors
Before we can do this, the natplotlib imshow help tells us that we need to normalize our
bands into a 0 - 1 range. To do so, we will perform a simple linear scale fitting 0 reflectance to 0 and
80% reflectance to 1, clipping anything larger or smaller.
Remember
If we are going from a Int16 datatype (e.g., reflectance scaled by
10,000x) to a decimal between @ and 1, we will need to use a Float!
from rasterio.plot import reshape_as_raster, reshape_as_image
# Extract reference to SWIRL, NIR, and Red bands
index = np.array([4, 3, 2))
colors = image[index, :, :].astype(np.floatss)
max_val = 5000
mmin_val
# Enforce maximum and minimum values
colors{colors{:, :, :] > max_val] = max_val
colors{colors(:, :, :] < min_val] = min_val
for b in range(colors.shape(@]):
colors(b, :, :] = colors[b, :, :] *1/ (max_val = min_val)
# rasters are in the format (bands, rows, cols] whereas images are typically [rows, col
# and so our array needs to be reshaped
print(colors.shape)
colors_reshaped = reshape_as_image(colors)
print(colors_reshaped. shape)
(3, 250, 250)
(250, 250, 3)
We've got a correctly shaped and normalized image now.Let's calculate NDVI on this dataset to compare it to the color image.
In [18. np.seterr(divide='ignore’, invalid=' ignore’)
bandNIR
banded
ndvi = (bandNIR.astype( float) -bandRed. astype( float) )/(bandNIR.astype(float)+bandRed.ast
Now let's plot it all
1) 18)" 44g, axs = plt.subplots(1, 2, Flgsize=(10, 5))
# Show the color image
axs[@].imshow(colors_reshaped)
axs[@].set_title(‘Color Image’)
# Show Nor
axs[1].imshow(ndvi, cmap="Rdy1Gn")
axs[1].set_title('NOVI')
outfie): Text(@.5, 1.8, "NDVI')
Color image
100
350
200
so 00200
Pretty informative now!
There is a slighly easier way to do this
rasterio has its own show function that is built to handle rasters. We saw this in chapter 1 briefly.
in [201 4 this functions build on matplotlib and make them custom for rasterio
from rasterio.plot import show
In [21
Fig, ax = plt. subplots (Figsize=(6,6))
# display just band 4 (WIR)
show((dataset.read(4)), axeax)
° 0 300 150 700
And if we want to show a color image we can use raste
adjust_band function
from rasterio.plot import adjust_band
rgb = image[0:3] # read in red, green, blue
rgb_norm = adjust_band(rgb) # normalize bands to range between 1.0 to 0.0
rgb_reshaped = reshape_as_image(rgb_norm) # reshape to [rows, cols, bands]
Fig, axs = plt.subplots(1, 2, figsize=(10, 5))
# the rasterio show function takes in [bands, rows, cols] so we don't need to reshape
show(rgb_norm, axzaxs[@])
axs(0].set_title("RG8 from rasterio show")
# plot with normal matpLotlib functions
axs[ 1] -imshow(rgb_reshaped)
axs[1]-set_title("RG8 in matplotlib inshow")
Text(@.5, 1.0, ‘RGB in matplotlib imshow' )RGB in matplotlib imshow
RGB from rasterio show
50 50
100 100
150 150
200 200
30 S000
displaying all three bands side by side
(22,7)
red channel" )
green channel" )
subplots(1,3, figsiz
ds’, title
‘Greens’, title:
cmap="Blues', title='blue channel’)
fig, (axr, axg, axb) = plt.
show( (dataset, 1), axsaxr,
show((dataset, 2), ax=axg,
show((dataset, 3), ax=axb,
‘center’: "blue channel"}>
If we want to see a histogram of the data we use the show_|
function
from rasterio.plot inport show hist
Fig, ax = plt.subplots(#igsize=(19,5))
show_hist(dataset, ax-ax)Histogram
mE _. at Ox7135a05c5e40>
3 100
20000
10000 |
° al
0 2000 3000)
ow
Let's look at an overlapping histogram, maybe that'll be more informative:
fig, ax = plt.subplots(Figsize=(10,5))
show_hist(dataset, ax=ax, bins=50, lw=0.0,
histtype='stepfilled’, title
World Histogram overlaid
tacked=False, alpha=0.3,
"World Histogram overlai
3000
“. at Ox7F35a0619d60>
0 1000 2000 3000 2000
ow
Let's look at another raster
world = rasterio.open("../data/world.rgb.tif")
print(world. shape)
fig, ax = plt.subplots(Figsize=(10,5))
show((world), axeax, cnap='viridis")
(256, 512)
3000In
Check out more documentation for the rasteric plotting functions here:
https://rasterio.readthedocs.io/en/latest/api/rasterio.plot html#rasterio.plot show
Wrapup
We've seen how natplotlib canbe combined with aumpy and rasterio to easily visualize
and explore our remote sensing data. In the next chapter (link to webpage or Notebook) we wil
cover how to open and read vector data