Severe slicing bug when slicing a cube with a boolean ndarray #6251

@valeriupredoi

🐛 Bug Report

Hi folks, buggy-bug here, please see the following:

MRE

import netCDF4
import iris
import dask
print(dask.__version__)
import numpy as np


fil = "tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"

def do_lazy_iris(c):
    bo = np.zeros(c.core_data().shape[0], dtype=bool)
    bo[1560:1860] = True
    d = c[bo, ...]
    print("Lazy Iris")
    print(d[100, 30, 50].data)
    print(d[56, 30, 50].data)
    print(d[144, 30, 50].data)


def do_realized_iris(c):
    bo = np.zeros(c.data.shape[0], dtype=bool)
    bo[1560:1860] = True
    d = c[bo, ...]
    print("Realized Iris")
    print(d[100, 30, 50].data)
    print(d[56, 30, 50].data)
    print(d[144, 30, 50].data)


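def do_netCDF4():
    # Reference values read straight from the file. The mask starts at 1560,
    # so sliced indices 100, 56 and 144 map to 1660, 1616 and 1704.
    with netCDF4.Dataset(fil) as ds:
        print(ds["tos"][1660, 30, 50])
        print(ds["tos"][1616, 30, 50])
        print(ds["tos"][1704, 30, 50])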


cub = iris.load_cube(fil)
do_lazy_iris(cub)
print("\n")
do_realized_iris(cub)
print("\n")
do_netCDF4()

The file can be downloaded directly from https://esgf.ceda.ac.uk/thredds/catalog/esg_cmip6/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/tos/gn/v20190308/catalog.html?dataset=esg_cmip6/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/tos/gn/v20190308/tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc

Environment

Things that matter:

  • iris==3.11.0
  • dask: 2024.8 or 2024.11
  • netCDF4: latest version (the exact version doesn't matter)
  • everything else in the environment should be the same

Results

Run the above Python code for the two versions of Dask (2024.8 and 2024.11) and you'll see the differences 😁
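For reference, the netCDF4 indices used in the MRE are the positions in the boolean-sliced cube mapped back onto the original time axis. The NumPy-only sketch below shows that mapping; `n_time = 1980` is an assumption based on the monthly 185001-201412 range in the file name, and the names `idx`, `sliced_positions` and `original_positions` are purely illustrative.

```python
import numpy as np

# Assumption: 1980 monthly time steps (185001-201412), as the file name suggests.
n_time = 1980

# Same boolean mask as in the MRE.
bo = np.zeros(n_time, dtype=bool)
bo[1560:1860] = True

# Integer positions of the True entries along the original time axis.
idx = np.flatnonzero(bo)

# Positions inspected in the sliced cube, mapped back to the original axis.
sliced_positions = [100, 56, 144]
original_positions = idx[sliced_positions]
print(original_positions)  # [1660 1616 1704]

# In plain NumPy, boolean and integer indexing select the same elements.
x = np.arange(n_time)
assert np.array_equal(x[bo], x[idx])
```

So a correct `d[100, 30, 50]` should match `ds["tos"][1660, 30, 50]`, and likewise for the other two points.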

Fixing the issue

Hints:

  • The lazy result is a deterministically rotated copy of the realized one: the data point values are the same, but the indices are offset by exactly 44 (in this case). I'm sure that if you look at the code you will find some extra indexing that's not needed 😃
  • There are no such issues when slicing with int ndarrays; the index has to be boolean.
  • It'd be good to have a test that checks against netCDF4 values, like in my MRE.
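To put a number on the offset, one possible diagnostic is to search for the cyclic shift that maps the expected series onto the observed one. This is only a sketch on synthetic data, assuming the mismatch really is a rotation along the first axis; `find_cyclic_offset` is a hypothetical helper, not part of Iris or Dask, and the sign of the shift in the real data is not something this example establishes.

```python
import numpy as np


def find_cyclic_offset(expected, actual):
    """Return the smallest k with np.roll(expected, k, axis=0) == actual, else None.

    Hypothetical helper for illustration only.
    """
    expected = np.asarray(expected)
    actual = np.asarray(actual)
    if expected.shape != actual.shape:
        return None
    for k in range(expected.shape[0]):
        if np.array_equal(np.roll(expected, k, axis=0), actual):
            return k
    return None


# Synthetic stand-ins for a 300-step time series at one grid point.
expected = np.arange(300.0)       # what the realized slice would give
actual = np.roll(expected, 44)    # same values, rotated by 44 positions

offset = find_cyclic_offset(expected, actual)
print(offset)  # 44
```

With real data, `expected` could be the realized slice at one grid point and `actual` the lazy one. Note that `np.array_equal` treats NaN as unequal to itself, so an unmasked ocean point is the safer choice for such a comparison.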

Good luck 🍺
