Description
What happened:
When I ran dask.array.from_array() on a 2D numpy.ndarray and passed a specific list of chunk sizes to chunks, construction of the Dask Array failed with the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/jlamb/miniconda3/lib/python3.8/site-packages/dask/array/core.py", line 3054, in from_array
chunks = normalize_chunks(
File "/Users/jlamb/miniconda3/lib/python3.8/site-packages/dask/array/core.py", line 2726, in normalize_chunks
raise ValueError(
ValueError: Chunks do not add up to shape. Got chunks=((67, 6), (33, 6)), shape=(100, 6)
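To make the message concrete, here is a rough sketch of the kind of check it seems to describe. It assumes that each inner tuple in chunks is summed and compared against the length of the corresponding axis. The helper name chunks_add_up is hypothetical and this is not Dask's actual implementation.

```python
def chunks_add_up(chunks, shape):
    """Return True if the chunk sizes listed for each axis sum to that axis length.

    Hypothetical helper for illustration only; not part of Dask.
    """
    if len(chunks) != len(shape):
        return False
    return all(
        sum(axis_chunks) == axis_len
        for axis_chunks, axis_len in zip(chunks, shape)
    )


shape = (100, 6)

# Chunks as passed in this report: the per-tuple sums are 73 and 39.
print(chunks_add_up(((67, 6), (33, 6)), shape))

# Chunks grouped by axis instead: the per-tuple sums are 100 and 6.
print(chunks_add_up(((67, 33), (6,)), shape))
```

Under that assumption, ((67, 6), (33, 6)) sums to (73, 39), which does not match the shape (100, 6) reported in the error.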
What you expected to happen:
I expected that running da.from_array(X, chunks=((67, 6), (33, 6))) on a numpy array with shape (100, 6) would create a Dask Array with two chunks, where the first chunk had size (67, 6) and the second chunk had size (33, 6).
I expected to be able to do this based on the documentation at https://docs.dask.org/en/latest/array-api.html#dask.array.from_array, which includes the following in the list of valid values for the chunks argument to dask.array.from_array():
Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).
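One possible reading of that sentence is that chunks holds one tuple per dimension, not one tuple per block. The sketch below assumes that reading and uses only the standard library. The helper names implied_shape and block_shapes are hypothetical and are not Dask functions.

```python
from itertools import product


def implied_shape(chunks):
    """Total array shape implied by per-dimension chunk sizes."""
    return tuple(sum(axis_chunks) for axis_chunks in chunks)


def block_shapes(chunks):
    """Shape of every block, assuming one tuple of sizes per dimension."""
    return list(product(*chunks))


doc_example = ((1000, 1000, 500), (400, 400))
print(implied_shape(doc_example))      # (2500, 800)
print(len(block_shapes(doc_example)))  # 6 blocks: 3 along axis 0, 2 along axis 1

# Under the same reading, blocks of shape (67, 6) and (33, 6)
# would be written with the row sizes grouped together.
print(block_shapes(((67, 33), (6,))))  # [(67, 6), (33, 6)]
```

If this reading is right, the two blocks I wanted would be spelled chunks=((67, 33), (6,)) and not chunks=((67, 6), (33, 6)).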
Minimal Complete Verifiable Example:
import dask.array as da
import numpy as np
X = np.random.random((100, 6))
da.from_array(
    X,
    chunks=((67, 6), (33, 6))
)
Anything else we need to know?:
I did search for other issues with this error message, and didn't find any that seemed like the same question. I don't know for sure if the error I hit is the root cause of #6709 or not, but it seems possible. I don't even know for sure if it's an error or if I just misinterpreted the documentation 😬 .
I tried to find tests on the pattern "pass a list of specific chunk sizes to .from_array()", to see if those tests used that feature differently and maybe I was misunderstanding the docs. I ran git grep from_array dask/tests and saw that there don't appear to be any tests currently on that pattern.
Environment:
- Dask version:
dask: 2021.2.0
numpy: 1.20.1
- Python version: 3.8.3
- Operating System: MacOS Mojave (10.14.6)
- Install method (conda, pip, source):
pip
conda info
active environment : None
user config file : /Users/jlamb/.condarc
populated config files : /Users/jlamb/.condarc
conda version : 4.9.2
conda-build version : not installed
python version : 3.8.3.final.0
virtual packages : __osx=10.14.6=0
__unix=0=0
__archspec=1=x86_64
base environment : /Users/jlamb/miniconda3 (writable)
channel URLs : https://repo.anaconda.com/pkgs/main/osx-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/osx-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /Users/jlamb/miniconda3/pkgs
/Users/jlamb/.conda/pkgs
envs directories : /Users/jlamb/miniconda3/envs
/Users/jlamb/.conda/envs
platform : osx-64
user-agent : conda/4.9.2 requests/2.24.0 CPython/3.8.3 Darwin/18.7.0 OSX/10.14.6
UID:GID : 501:20
netrc file : None
offline mode : False
Thanks for your time and help with this.
