Make trivial dependencies mandatory #7244

@crusaderky

This is a follow-up from a very old discussion on #5511, which resurfaced again today on #7243.

dask has several trivial optional dependencies: cloudpickle, fsspec, toolz, and partd. "Trivial" here means that

  1. they are either pure Python or offer a transparent pure-Python fallback,
  2. they are tiny,
  3. all of their own dependencies also meet the two criteria above.

As they are optional, when somebody types pip install dask (as a distracted user will likely do) they are not installed.
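To see which of these packages a given environment actually ended up with, a check along the following lines could be used. This is only a minimal sketch: the helper name `missing_trivial_deps` is hypothetical and not part of dask; the package list is the one named above.

```python
from importlib.util import find_spec

# The four optional dependencies discussed in this issue.
TRIVIAL_DEPS = ("cloudpickle", "fsspec", "toolz", "partd")


def missing_trivial_deps(names=TRIVIAL_DEPS):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration only; it does not import the modules,
    it only asks the import system whether they can be located.
    """
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_trivial_deps()` in a fresh environment after a plain pip install dask would list whichever of the four packages were not pulled in.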

There is nothing in our CI, except for the superficial continuous_integration/scripts/test_imports.sh, verifying that dask can actually work without them.

toolz is a hard dependency of dask.base, so I strongly suspect there is nobody out there who doesn't use it: we're talking about a hypothetical user who either manually builds dask graphs or wrote a custom dask collection, and who doesn't use any of the many fundamental tools in dask.base.

Proposed change

  • Make cloudpickle, fsspec, toolz, and partd mandatory dependencies.
  • The [bag] and [delayed] targets in setup.py would become empty and would be kept only for backwards compatibility.
  • Clean up the codebase of all conditional imports of dask.bag, dask.delayed, or one of the above libraries.
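
As a rough illustration of the first two bullets, the dependency metadata in setup.py might end up looking something like the sketch below. This is an assumption about the shape of the change, not the actual patch: version pins, any other existing requirements, and the remaining extras are omitted because they are not stated here.

```python
# Hypothetical sketch of the relevant setup.py metadata after the change.
# Version constraints and unrelated entries are intentionally left out.

# The four formerly optional packages become hard requirements.
install_requires = ["cloudpickle", "fsspec", "toolz", "partd"]

# The extras stay defined so that `pip install dask[bag]` and
# `pip install dask[delayed]` keep working, but they no longer add anything.
extras_require = {
    "bag": [],
    "delayed": [],
}
```

With this layout, existing requirements files that reference dask[bag] or dask[delayed] would continue to resolve; they would simply install the same set of packages as a plain pip install dask.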

CC @ryan-williams ; please pull in any other known "light" dask users that install neither numpy nor pandas.
