Skip to content

Conversation

@jrbourbeau
Copy link
Member

This PR adds xarray to our CI software environments so that the few tests in dask/array/tests/test_xarray.py will run. FWIW Xarray's dependencies, NumPy, pandas, and setuptools, are already installed in CI so we shouldn't need to pull in any additional packages other than Xarray itself.

cc @shoyer @crusaderky for visibility

@jakirkham
Copy link
Member

Would this cause a dependency cycle? Wondering since Xarray also makes use of Dask

Also do we do any other downstream dependency testing? If so, how do we handle that currently? If not, are there other downstream dependencies we would want to test and if so how should we manage that?

@jsignell
Copy link
Member

jsignell commented Mar 8, 2021

Would this cause a dependency cycle? Wondering since Xarray also makes use of Dask

xarray doesn't depend on dask, it's an optional import, so should be fine.

> conda search xarray --info
xarray 0.17.0 pyhd3eb1b0_0
--------------------------
file name   : xarray-0.17.0-pyhd3eb1b0_0.conda
name        : xarray
version     : 0.17.0
build       : pyhd3eb1b0_0
build number: 0
size        : 510 KB
license     : Apache-2.0
subdir      : noarch
url         : https://repo.anaconda.com/pkgs/main/noarch/xarray-0.17.0-pyhd3eb1b0_0.conda
md5         : 8702480113641ff680e55c3e70f78ea3
timestamp   : 2021-03-01 18:58:22 UTC
dependencies: 
  - numpy >=1.17
  - pandas >=0.25
  - python >=3.6
  - setuptools >=41.2

@jrbourbeau
Copy link
Member Author

There are a few places in Dask where we have some Xarray-specific logic. For example

dask/dask/array/core.py

Lines 4109 to 4110 in b4f2e42

elif type(a).__module__.startswith("xarray.") and hasattr(a, "data"):
return asarray(a.data)

Our tests in tests_xarray.py just test that those code paths work as expected, so it seemed reasonable to me that we would run those tests in our CI since they are testing logic inside of Dask.

@jakirkham
Copy link
Member

Yep agree it's worth testing. Just trying to figure out how best to handle downstream dependencies like Xarray

@jrbourbeau
Copy link
Member Author

Gotcha. Today we've mostly been relying on downstream projects to do their own upstream testing (e.g. Xarray does this) and surface errors to us when changes to Dask cause downstream breakages. Overall, this setup has been benefitial -- for example this issue was reported last week and we were able to fix things before releasing.

I have less experience on how to do downstream testing well. I know Bokeh has a downstream CI build. We might want to ask them what their experience with that has been like.

@gforsyth
Copy link
Contributor

gforsyth commented Mar 8, 2021

I know numba uses this to check on downstream breakage: https://github.com/numba/texasbbq

*edit:
which, given that https://github.com/jrbourbeau/dask-integration-testing is linked in the README, I'm going to gather you're familiar with 😁

@jakirkham
Copy link
Member

Well let's stick with this then to keep things simple for now. If we start collecting downstream test dependencies, we may want to split these out into their own CI job

@jakirkham
Copy link
Member

Sorry I forgot to ask, did we only need to add the dependency for tests to run or are there other changes needed as well?

@jrbourbeau
Copy link
Member Author

Just adding xarray causes the tests to be run 👍

Thanks all for reviewing!

@jrbourbeau jrbourbeau merged commit a3f7601 into dask:master Mar 8, 2021
@jrbourbeau jrbourbeau deleted the xarray-ci branch March 8, 2021 20:02
dcherian added a commit to dcherian/dask that referenced this pull request Mar 18, 2021
* upstream/main:
  Change default branch from master to main (dask#7198)
  Add Xarray to CI software environment (dask#7338)
  Update repartition argument name in error text (dask#7336)
  Run upstream tests based on commit message (dask#7329)
  Use pytest.register_assert_rewrite on util modules (dask#7278)
  add example on using specific chunk sizes in from_array() (dask#7330)
  Move numpy skip into test (dask#7247)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants