Skip to content

Conversation

@jakirkham
Copy link
Member

To avoid reading data when it is not needed, try checking nbytes and types beforehand. If the metadata is already there, continue on without reading data. Otherwise fallback to the reading data, but do make sure to cache the results of that read to avoid doing it again.

cc @madsbk @mrocklin

To avoid reading `data` when it is not needed, try checking `nbytes` and
`types` beforehand. If the metadata is already there, continue on
without reading `data`. Otherwise fallback to the reading `data`, but do
make sure to cache the results of that read to avoid doing it again.
value = self.data[key]
except KeyError:
value = self.actors[key]
nbytes = self.nbytes[key] or sizeof(value)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appeared to just use getitem on nbytes, which AFAICT is a dict. So maybe getting value is never needed? If it is, maybe we should be changing this to .get(...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good catch. I don't know.

Copy link
Member Author

@jakirkham jakirkham Mar 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and used .get(...) instead and raised issue ( #3629 ) to determine whether the fallback to get value is needed.

If it is needed, this should fix a bug. If not, we can go back and remove it separately 🙂

@mrocklin
Copy link
Member

+1 from me. Thanks @jakirkham

@jakirkham
Copy link
Member Author

Thanks for the guidance Matt 😄

@mrocklin
Copy link
Member

mrocklin commented Mar 24, 2020 via email

Appears GitHub CI failed to checkout the code and clicking restart in
the UI does not work. So pushing a dummy commit to restart it.
Copy link
Member

@pentschev pentschev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a very sneaky piece of code, nice catch @jakirkham !

Copy link
Contributor

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@quasiben
Copy link
Member

Thanks everyone for the reviews and @jakirkham for the work. Merging in

@quasiben quasiben merged commit 706de86 into dask:master Mar 24, 2020
@jakirkham jakirkham deleted the avoid_value_req_in_send_task_state_to_scheduler branch March 24, 2020 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants