@jameslamb

  • Closes #xxxx
  • Tests added / passed
  • Passes black dask / flake8 dask

I recently needed to convert a 2D NumPy array into a 2-chunk Dask Array with exactly specified, differently sized chunks. I consulted the documentation at https://docs.dask.org/en/latest/array-api.html#dask.array.from_array but couldn't work out how to do this correctly, which led me to open #7310. The behavior was explained to me in #7310, and that issue was closed.

Based on my experience with this feature of Dask Array, in this PR I'd like to propose two small changes:

  • an example in the documentation for from_array() that shows how to explicitly control chunk sizes
  • a unit test on the behavior of explicitly controlling chunk sizes (there are not currently any such tests in this repo, as far as I can tell, based on git grep from_array dask/tests)
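
For readers following along, here is a minimal sketch of the kind of usage the proposed docstring example is about. It is not the text added in this PR; the array shape and the chunk sizes (7 and 3 rows) are assumptions chosen for illustration. The chunks argument takes one tuple per dimension, and each tuple lists the chunk sizes along that axis.

```python
import numpy as np
import dask.array as da

# Illustrative input: a 10 x 2 NumPy array (shape chosen for this sketch).
x = np.arange(20).reshape(10, 2)

# One tuple per dimension. Each tuple lists the chunk sizes along that axis
# and must sum to the length of that axis: 7 + 3 == 10 rows, 2 == 2 columns.
dx = da.from_array(x, chunks=((7, 3), (2,)))

# The resulting array has exactly two chunks, with 7 and 3 rows respectively.
print(dx.chunks)
print(dx.numblocks)
```

Passing a single integer or a flat tuple such as (7, 2) instead would request uniform blocks of that shape, which is why unequal chunk sizes need the nested form.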

Thanks for your time and consideration.


@jsignell left a comment


Thanks for opening this @jameslamb! I think the docstring change is definitely an improvement, but the test case is already covered. Most of the tests for arrays are nested within the array dir. You can persuade yourself that the error case is being tested (this is what I just did) by changing ValueError to KeyError here

raise ValueError(

and running pytest -k array -n auto to see the failures.
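
As a rough illustration of the error case mentioned above, the sketch below passes explicit chunk sizes that do not sum to the array's shape. The input array and the bad chunk sizes are assumptions made for this example, and the thread does not confirm that this is the exact raise statement linked in the comment.

```python
import numpy as np
import dask.array as da

# Illustrative input: a 10 x 2 NumPy array (shape chosen for this sketch).
x = np.arange(20).reshape(10, 2)

# 7 + 4 == 11, which does not match the 10 rows of x, so Dask rejects it.
try:
    da.from_array(x, chunks=((7, 4), (2,)))
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")
else:
    raised = False

print(raised)
```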

@jameslamb

Oh! I did not realize that the tests were spread across multiple directories, so I only checked with git grep from_array dask/tests.

removed in c1dcb56

@jsignell jsignell merged commit 7584d58 into dask:master Mar 8, 2021
dcherian added a commit to dcherian/dask that referenced this pull request Mar 18, 2021
* upstream/main:
  Change default branch from master to main (dask#7198)
  Add Xarray to CI software environment (dask#7338)
  Update repartition argument name in error text (dask#7336)
  Run upstream tests based on commit message (dask#7329)
  Use pytest.register_assert_rewrite on util modules (dask#7278)
  add example on using specific chunk sizes in from_array() (dask#7330)
  Move numpy skip into test (dask#7247)