-
Notifications
You must be signed in to change notification settings - Fork 552
Description
Issue
When passing a masked array to warp.reproject, the mask is not properly taken into account.
Looking at the source code at this line, it seems that the behavior of the function is:
- set the src_nodata to the masked_array fill_value - This is not desirable, as the Raster's nodata value and masked_arrays fill_value are two independent concepts. Moreover, a masked value is not necessarily set to fill_value (e.g. if the mask has been updated)
- the mask is not actually taken into account, at L 267
source = np.array(source, copy=False)means that the - for UInt8 rasters using the whole range 0-255, it can be annoying to have to set a nodata value that might actually be used.
Expected behavior and actual behavior.
- The mask should be used during the reprojection, not only the fill_value.
- If src_nodata is not set, should raise an error or warning
- It should be possible to run the reprojection without setting a nodata value, especially for UInt8 types
Point 1 is easy to address by replacing source = np.array(source, copy=False) with source = np.ma.masked_array(source).filled(src_nodata).
Point 2 is also straightforward.
Point 3 might be more complex.
Steps to reproduce the problem.
Below is simple reproducible example:
import numpy as np
import rasterio as rio
from rasterio import warp
# Create fake raster
width = height = 10
src_data = np.random.randint(0, 255, (height, width), dtype="uint8")
src_transform = rio.transform.from_bounds(0, 0, 1, 1, width, height)
crs = rio.crs.CRS.from_epsg(4326)
nodata = 255
# Shift by a 1/3 pixel to the right, with nearest output should be equal
dst_transform = rio.transform.from_bounds(0 + src_transform[0]/3., 0, 1, 1, width, height)
dst_data, _ = warp.reproject(src_data, src_transform=src_transform, resampling=warp.Resampling.nearest, src_crs=crs, dst_crs=crs, dst_transform=dst_transform, destination=np.ones_like(src_data), dst_nodata=nodata, src_nodata=255)
np.array_equal(dst_data, src_data) # -> True
# Same but with 3 left columns masked
src_data2 = np.ma.masked_array(src_data)
src_data2[:, :3] = np.ma.masked
dst_data2, _ = warp.reproject(src_data2, src_transform=src_transform, resampling=warp.Resampling.nearest, src_crs=crs, dst_crs=crs, dst_transform=dst_transform, destination=np.ones_like(src_data), dst_nodata=nodata, src_nodata=255)
dst_data2 = np.ma.masked_array(dst_data2, mask=(dst_data2==nodata))
np.array_equal(src_data2.data, dst_data2.data) # -> True, the underlying pixels are unchanged
np.array_equal(src_data2.mask, dst_data2.mask) # -> False, the mask is not preserved
# Fixing the inapprorpiate behavior
src_data3 = np.ma.masked_array(src_data2).filled(255)
dst_data3, _ = warp.reproject(src_data3, src_transform=src_transform, resampling=warp.Resampling.nearest, src_crs=crs, dst_crs=crs, dst_transform=dst_transform, destination=np.ones_like(src_data), dst_nodata=nodata, src_nodata=255)
dst_data3 = np.ma.masked_array(dst_data3, mask=(dst_data3==nodata))
np.array_equal(src_data3, dst_data3.data) # -> True, the original pixels are unchanged
np.array_equal(src_data2.mask, dst_data3.mask) # -> True, the mask is preserved
Environment Information
rasterio info:
rasterio: 1.3.2
GDAL: 3.5.1
PROJ: 9.0.1
GEOS: 3.11.0
PROJ DATA: /usr/local/Caskroom/miniconda/base/envs/xdem/share/proj
GDAL DATA: /usr/local/Caskroom/miniconda/base/envs/xdem/share/gdal
System:
python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:52) [Clang 13.0.1 ]
executable: /usr/local/Caskroom/miniconda/base/envs/xdem/bin/python
machine: macOS-11.6.4-x86_64-i386-64bit
Python deps:
affine: 2.3.1
attrs: 22.1.0
certifi: 2022.06.15
click: 8.0.4
cligj: 0.7.2
cython: None
numpy: 1.22.4
snuggs: 1.4.7
click-plugins: None
setuptools: 65.3.0
Installation Method
Installed from conda.