Skip to content

Conversation

@fjetter
Copy link
Member

@fjetter fjetter commented Nov 28, 2024

This migrates Blockwise to the Task class (xref #9969)

@fjetter fjetter changed the title Make blockwise refactor Blockwise uses Task class Nov 28, 2024
@fjetter fjetter changed the title Blockwise uses Task class Blockwise uses Task class Nov 28, 2024
@github-actions
Copy link
Contributor

github-actions bot commented Nov 28, 2024

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

     15 files  ± 0       15 suites  ±0   4h 3m 59s ⏱️ - 9m 54s
 13 258 tests + 3   12 196 ✅ + 3   1 062 💤 ±0  0 ❌ ±0 
164 476 runs  +45  141 555 ✅ +41  22 921 💤 +4  0 ❌ ±0 

Results for commit d1501d7. ± Comparison against base commit 307038b.

♻️ This comment has been updated with latest results.

Comment on lines 458 to 471
def substitute(self, subs):
if self.key in subs or self.target.key in subs:
sub_key = subs.get(self.key, self.key)
val = subs.get(self.target.key, self.target.key)
if sub_key == self.key and val == self.target.key:
return self
if isinstance(val, GraphNode):
return val.substitute({}, key=sub_key)
return Alias(sub_key, val)
return self
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The substitution logic could be factored out into a separate PR on request. It's not used anywhere else so I chose to keep in in here but if it helps with review, I'll break it out

io_deps=None,
keys=None,
):
"""Tensor operation
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • consider putting doc string back. This is now an internal function so I want to only put it back if it helps developers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the dosctring around would be helpful to future me.

Comment on lines +88 to +92
return [
"Comparing two dask graph nodes:",
f" left: {left.key} right: {right.key}",
" Diff:",
] + diff
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still doesn't look great but it's something

@fjetter fjetter force-pushed the make_blockwise_refactor branch from 2d1f186 to 9cecb08 Compare November 29, 2024 10:34
io_deps=None,
keys=None,
):
"""Tensor operation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the dosctring around would be helpful to future me.

@fjetter
Copy link
Member Author

fjetter commented Dec 2, 2024

Comparing performance to main I ran the test_climatology::test_highlevel_api on small scale and collected profiles. Primarily I looked at the initial ramp up time between starting graph construction client side and when the computation kicks off cluster side. Below are more detailed notes about the differences. Broadly speaking, picking is much faster because we're now deduplicating the subgraphs/tasks properly again (that's a regression in 2024.11.X) and saves us a bit more than a minute. Beyond this, performance stays pretty much the same. What we payed for the conversion on main we're roughly paying now for the substitution (which is where the blockwise materialization happens). Substitution in TaskSpec-land is still more expensive than the tuple based substitution and therefore this is more expensive. I'm still working on comparisons to how this all looked like before the introduction of task spec (on main we're already converting to this during optimization).

Note that the numbers below are collected with this regression coiled/benchmarks#1622 which blows up the graph size. We're dealing here with 500k tasks (after optimziation it's 300k)

Blockwise TaskSpec

Client side

  • 37s pickle
  • 22s store / optimization / materialization
    • 3.5s HLG culling
    • 4.8s convert_legacy_graph
    • 8.8s make_blockwise_graph / materialization (mostly substitution / new objects from substitution)
    • 4.3 linear fusion

Scheduler side

  • 29s unpickle
  • 33.72s total offload time on scheduler
    • ?? order
      • 4.3s process runnables
      • 14.46s _connecting_to_roots
      • 3.3s _connecting_to_roots second/first pass
      • 3.5s ndependencies
      • 2s revert dependencies
    • 1.5 materialization (dependencies)
    • 1.3s materialization (convert legacy, source unknown)

Main TaskSpec

Client side

  • 26s store / optimization / materialization
    • 3s HLG culling
    • 20s convert_legacy_graph
    • 2.9s linear fusion
    • 0.5s make_blockwise_graph / materialization
  • 1:30min pickle

Scheduler side

  • 43s unpickle
  • 36s total offload time on scheduler
    • Pretty similar to above. This makes sense since we already converted the graph client side

Comment on lines 243 to 249
Applies a task, ``func``, across blocks from many different input
collections. We arrange the pattern with which those blocks interact with
sets of matching indices. E.g.::
make_blockwise_graph(task, 'z', 'i', 'x', 'i', 'y', 'i')
yield an embarrassingly parallel communication pattern and is read as
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having those doc strings for make_blockwise_graph was a little off. We now only have this function as public interface so I moved the docs over here and migrated the doctests to the current version. Some of them are a little tedious so I skipped them but overall this should be working

@fjetter fjetter merged commit 652829d into dask:main Dec 3, 2024
6 of 7 checks passed
@fjetter fjetter deleted the make_blockwise_refactor branch December 3, 2024 10:17
@martindurant
Copy link
Member

Can we please ask for a migration guide, and how to make our code backward compatible?

@fjetter
Copy link
Member Author

fjetter commented Dec 4, 2024

Can we please ask for a migration guide, and how to make our code backward compatible?

An update to the documentation is overdue and something I plan for in the next couple of days. Can you point to a package or issue that is affected (I assume dask-awkward)? blockwise isn't used that widely as a user facing API so I expected the splash zone to be small to moderate in size. I'm also happy to help with a migration (most of the time this is straight forward)

Edit: Just saw dask-contrib/dask-awkward#556 I'll hop over there and will provide help (tomorrow)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants