-
Notifications
You must be signed in to change notification settings - Fork 551
Description
Expected behavior and actual behavior.
Calling read with boundless=True, masked=True, window=window on a all-valid raster with no NODATA tag incorrectly masks 0-valued pixels.
Expected behaviour is for there to be no masked values in an all-valid array.
read_masks returns the correct mask (so a workaround is to read unmasked, read the mask, then do your own masking).
Bug is present in 1.4.0, 1.4.1, and 1.4.2. But not present < 1.4.0.
This branch looks like the culprit:
Lines 697 to 699 in ff66850
| mask_vrt_doc = _boundless_vrt_doc( | |
| self, nodata=0, | |
| width=max(self.width, window.width) + 1, |
Steps to reproduce the problem.
Testcase:
import numpy as np
import rasterio
# Test array: random data > 0, with one pixel set to zero.
shape = (180, 360)
transform = rasterio.transform.from_bounds(-180, -90, 180, 90, 360, 180)
a = np.random.RandomState(1).uniform(1, 2, shape)
a[100, 100] = 0
# Save to geotiff.
path = '/tmp/rasterio-boundless-masked.tif'
profile = {
"driver":'GTiff',
"height": a.shape[0],
"width": a.shape[1],
"crs": rasterio.crs.CRS.from_epsg(4326),
"transform": transform,
"dtype": np.float32,
"count": 1,
}
with rasterio.open(path, 'w', **profile) as f_out:
f_out.write(a, 1)
# Read with different options, check no nan values area present.
with rasterio.open(path) as f:
bounds_full = f.bounds
shape_full = f.shape
window = rasterio.windows.Window.from_slices((1, 180), (1, 360), boundless=True)
a_full = f.read(1, masked=True).filled(np.nan)
a_window = f.read(1, window=window, masked=True).filled(np.nan)
a_boundless = f.read(1, boundless=True, masked=True).filled(np.nan)
a_window_boundless = f.read(1, window=window, boundless=True, masked=True).filled(np.nan)
mask_full = f.read_masks(1)
mask_window = f.read_masks(1, window=window)
mask_boundless = f.read_masks(1, boundless=True)
mask_window_boundless = f.read_masks(1, window=window, boundless=True)
print(f"{shape_full=}")
print(f"{window=}")
print()
print(f"{np.isnan(a_full).sum()=:d}")
print(f"{np.isnan(a_window).sum()=:d}")
print(f"{np.isnan(a_boundless).sum()=:d}")
print(f"{np.isnan(a_window_boundless).sum()=:d}")
print()
print(f"{(mask_full == 0).sum()=:d}")
print(f"{(mask_window == 0).sum()=:d}")
print(f"{(mask_boundless == 0).sum()=:d}")
print(f"{(mask_window_boundless == 0).sum()=:d}")Output:
shape_full=(180, 360)
window=Window(col_off=1, row_off=1, width=359, height=179)
np.isnan(a_full).sum()=0
np.isnan(a_window).sum()=0
np.isnan(a_boundless).sum()=0
np.isnan(a_window_boundless).sum()=1 # ⚠️ this should be zero
(mask_full == 0).sum()=0
(mask_window == 0).sum()=0
(mask_boundless == 0).sum()=0
(mask_window_boundless == 0).sum()=0
Bug persists if using a truly boundless window larger than the raster bounds.
Bug disappears if setting the single pixel to any value other than zero (a[100, 100] = -1) or setting a nodata tag on the raster ("nodata": -9999).
Environment Information
rasterio info:
rasterio: 1.4.2
GDAL: 3.9.3
PROJ: 9.4.1
GEOS: 3.11.1
PROJ DATA: /home/user/.local/lib/python3.12/site-packages/rasterio/proj_data
GDAL DATA: None
System:
python: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:12:24) [GCC 11.2.0]
executable: /opt/conda/bin/python
machine: Linux-5.15.0-119-generic-x86_64-with-glibc2.39
Python deps:
affine: 2.4.0
attrs: 24.2.0
certifi: 2024.08.30
click: 8.1.7
cligj: 0.7.2
cython: None
numpy: 2.0.2
click-plugins: None
setuptools: 69.5.1
Installation Method
pip install rasterio==1.4.2 in conda py312_24.5.0-0 in ghcr.io/osgeo/gdal:ubuntu-small-3.9.2.