Conversation

@jsignell (Member) commented Feb 18, 2021

It came up in #7170 (comment) that the skip should be inside the test in case numpy is not installed in the testing env.
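For context, a minimal sketch of the pattern being discussed (the test name below is illustrative, not one of the actual dask tests): a module-level numpy import, or a skip condition evaluated at import time, breaks collection when numpy is missing, while calling pytest.importorskip inside the test body just skips that one test.

import pytest

def test_needs_numpy():
    # Skipping inside the test keeps collection working even when numpy
    # is absent from the environment; the test is reported as skipped
    # instead of erroring during module import.
    np = pytest.importorskip("numpy")
    assert np.arange(3).sum() == 3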

@bnavigator (Contributor)

I can confirm that applying this patch to 2021.2.0 works for our test suite in openSUSE Tumbleweed.

@crusaderky (Collaborator)

This PR largely duplicates #7243, with the notable exception of test_distributed.py, which #7243 does not cover.
I think this makes a good case for adding a new workflow to the mindeps suite that runs the tests with dask[distributed] but with neither numpy nor pandas installed.

@jsignell (Member, Author)

This PR duplicates for the most part #7243

Yay! Somehow I missed that work. Thanks for doing that.

@jsignell (Member, Author)

I just rebased and reverted the commits that should already be handled. I'll add another mindeps test that only adds distributed.

@crusaderky (Collaborator)

See the discussion on #7244 about the failures you're getting on fsspec.

@jsignell (Member, Author)

Yeah, thanks for mentioning that. I saw a reference to that in the PR from yesterday. I added a skip to get these tests passing in the meantime, because I think there's value in knowing the current state of the dependencies.

@jsignell (Member, Author)

@crusaderky do you have any idea about this failure? I can't repro locally.

=================================== FAILURES ===================================
______________________________ test_clone[False] _______________________________

layers = False

    @pytest.mark.parametrize("layers", [False, True])
    def test_clone(layers):
        dsk1 = {("a", h1): 1, ("a", h2): 2}
Exception ignored in: <coroutine object gen_cluster.<locals>._.<locals>.test_func.<locals>.coro at 0x7f8f6ca2e0e0>
Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/utils_test.py", line 917, in coro
    await c._close(fast=s.status == Status.closed)
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/client.py", line 1347, in _close
    asyncio.shield(self._handle_scheduler_coroutine), 0.1
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/asyncio/tasks.py", line 429, in wait_for
    fut = ensure_future(fut, loop=loop)
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/asyncio/tasks.py", line 614, in ensure_future
    raise ValueError('loop argument must agree with Future')
ValueError: loop argument must agree with Future
        dsk2 = {"b": (add, ("a", h1), ("a", h2))}
        dsk3 = {"c": 1}
        if layers:
            dsk1 = HighLevelGraph.from_collections("a", dsk1)
            dsk2 = HighLevelGraph(
                {"a": dsk1, "b": dsk2}, dependencies={"a": set(), "b": {"a"}}
            )
            dsk3 = HighLevelGraph.from_collections("c", dsk3)
        else:
            dsk2.update(dsk1)
    
        t1 = Tuple(dsk1, [("a", h1), ("a", h2)])
        t2 = Tuple(dsk2, ["b"])
        t3 = Tuple(dsk3, ["c"])
    
        c1 = clone(t2, seed=1, assume_layers=layers)
        c2 = clone(t2, seed=1, assume_layers=layers)
        c3 = clone(t2, seed=2, assume_layers=layers)
        c4 = clone(c1, seed=1, assume_layers=layers)  # Clone of a clone has different keys
        c5 = clone(t2, assume_layers=layers)  # Random seed
        c6 = clone(t2, assume_layers=layers)  # Random seed
        c7 = clone(t2, omit=t1, seed=1, assume_layers=layers)
    
        assert c1.__dask_graph__() == c2.__dask_graph__()
        assert_no_common_keys(c1, t2, layers=layers)
        assert_no_common_keys(c1, c3, layers=layers)
        assert_no_common_keys(c1, c4, layers=layers)
        assert_no_common_keys(c1, c5, layers=layers)
        assert_no_common_keys(c5, c6, layers=layers)
        assert_no_common_keys(c7, t2, omit=t1, layers=layers)
>       assert dask.compute(t2, c1, c2, c3, c4, c5, c6, c7) == ((3,),) * 8

dask/tests/test_graph_manipulation.py:185: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dask/base.py:564: in compute
    results = schedule(dsk, keys, **kwargs)
/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/client.py:2655: in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/client.py:1970: in gather
    asynchronous=asynchronous,
/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/client.py:839: in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

loop = <tornado.platform.asyncio.AsyncIOLoop object at 0x7f8f6d4a4fd0>
func = <bound method Client._gather of <Client: scheduler='tcp://127.0.0.1:44241'>>
callback_timeout = None
args = ([[<Future: pending, key: b>], [<Future: pending, key: 1f658ba46913a753b7777a91b03e3b51>], [<Future: pending, key: f60...Future: pending, key: c5c3f8e93c0364523f3c032146cf9c30>], [<Future: pending, key: aefd54a16c15e02a26aa9cedaaee9ada>]],)
kwargs = {'direct': None, 'errors': 'raise', 'local_worker': None}

    def sync(loop, func, *args, callback_timeout=None, **kwargs):
        """
        Run coroutine in loop running in separate thread.
        """
        callback_timeout = parse_timedelta(callback_timeout, "s")
        # Tornado's PollIOLoop doesn't raise when using closed, do it ourselves
        if PollIOLoop and (
            (isinstance(loop, PollIOLoop) and getattr(loop, "_closing", False))
            or (hasattr(loop, "asyncio_loop") and loop.asyncio_loop._closed)
        ):
            raise RuntimeError("IOLoop is closed")
        try:
            if loop.asyncio_loop.is_closed():  # tornado 6
>               raise RuntimeError("IOLoop is closed")
E               RuntimeError: IOLoop is closed

/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/utils.py:305: RuntimeError
--------------------------- Captured stderr teardown ---------------------------
distributed.utils - ERROR - loop argument must agree with Future
Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/utils_test.py", line 912, in coro
    result = await future
GeneratorExit

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/utils.py", line 655, in log_errors
    yield
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/site-packages/distributed/client.py", line 1347, in _close
    asyncio.shield(self._handle_scheduler_coroutine), 0.1
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/asyncio/tasks.py", line 429, in wait_for
    fut = ensure_future(fut, loop=loop)
  File "/usr/share/miniconda3/envs/test-environment/lib/python3.7/asyncio/tasks.py", line 614, in ensure_future
    raise ValueError('loop argument must agree with Future')
ValueError: loop argument must agree with Future

ref: https://github.com/dask/dask/pull/7247/checks?check_run_id=1936658665

@bnavigator (Contributor)

Maybe related to dask/distributed#4212, pytest-dev/pytest-asyncio#168

@crusaderky (Collaborator)

@jsignell all of those tests are supposed to run on the threaded scheduler. Your log says that a completely unrelated test set a distributed.Client as the default scheduler and that it was not properly cleaned up afterwards.
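For illustration only (not something this PR changes), a hedged sketch of how a test can avoid leaking a Client as the default scheduler, and how a threaded-scheduler test can pin its scheduler; the test names and cluster options here are hypothetical.

import dask
from distributed import Client

def test_uses_distributed_client():
    # Passing set_as_default=False and closing the client via the context
    # manager keeps it from lingering as the global default scheduler for
    # later tests in the same worker process.
    with Client(processes=False, set_as_default=False) as client:
        assert client.submit(sum, [1, 2, 3]).result() == 6

def test_threaded_only():
    # A test that should never touch distributed can pin the scheduler
    # explicitly, so a leaked Client elsewhere cannot hijack it.
    with dask.config.set(scheduler="threads"):
        assert dask.delayed(sum)([1, 2, 3]).compute() == 6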

@jsignell (Member, Author)

a completely unrelated test assigned a distributed.Client as the default scheduler and afterwards it was not properly cleaned up.

Ah, OK, that makes sense. I probably run the tests in a different order locally, because I use pytest -n8 rather than the -n3 used on CI.

@jsignell (Member, Author) commented Mar 5, 2021

I think this can be safely merged.

try:
    import fsspec  # noqa: F401
except ImportError:
    collect_ignore_glob.extend(["dask/bag/*", "dask/bytes/*"])
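(For readers unfamiliar with the hook: collect_ignore_glob is a list that pytest reads from conftest.py; test files matching these glob patterns are excluded from collection entirely, so modules that unconditionally import fsspec are never imported in an environment without it.)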
Member

I'm surprised to see us skipping all the Dask bag tests if fsspec isn't installed. There are tests in, for example, dask/bag/tests/test_bag.py which don't require any filesystem operations.

@jsignell (Member, Author) Mar 5, 2021

Yeah, I agree, but we import bytes in bag/core.py and bytes needs fsspec. We could separate it out, but in light of #7244 it didn't seem worth it.
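To make the trade-off concrete, "separating it out" would roughly mean making the fsspec-backed import in bag/core.py optional, along the lines of the sketch below. This is only an illustration of the guarded-import pattern, not the actual dask code, and read_text_lines is a hypothetical wrapper.

try:
    # dask.bytes pulls in fsspec, so guard the import rather than failing
    # at `import dask.bag` time.
    from dask.bytes import read_bytes
except ImportError:
    read_bytes = None

def read_text_lines(urlpath, **kwargs):
    # Hypothetical I/O entry point: raise a clear error only when file
    # reading is actually requested, so the non-I/O bag tests keep working.
    if read_bytes is None:
        raise ImportError("fsspec is required to read files with dask.bag")
    ...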

Member

I see, thanks for clarifying. Yeah, agreed, in light of #7244 this seems okay.

@jrbourbeau (Member)

I'll add another mindeps test that only adds distributed

Do we want to include that build in this PR? I see we've added a continuous_integration/environment-mindeps-distributed.yaml file for it, but not a corresponding CI build.

@jsignell (Member, Author) commented Mar 5, 2021

I'll add another mindeps test that only adds distributed

I originally added it, but it didn't pass yet, so given the plan to add more deps I took it off the CI build.

@jrbourbeau (Member) left a comment

Thanks @jsignell!

@jrbourbeau jrbourbeau merged commit ee273db into dask:master Mar 8, 2021
@jsignell jsignell deleted the skip-numpy branch March 8, 2021 15:53
dcherian added a commit to dcherian/dask that referenced this pull request Mar 18, 2021
* upstream/main:
  Change default branch from master to main (dask#7198)
  Add Xarray to CI software environment (dask#7338)
  Update repartition argument name in error text (dask#7336)
  Run upstream tests based on commit message (dask#7329)
  Use pytest.register_assert_rewrite on util modules (dask#7278)
  add example on using specific chunk sizes in from_array() (dask#7330)
  Move numpy skip into test (dask#7247)