-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Small improvements to da.pad
#6213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…Based on the implementation of numpy. Also change signature of pad helper functions slightly. Also add additional check to udf_pad_test.
jcrist
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mark-boer, overall this looks good to me.
dask/array/creation.py
Outdated
| result_idx = broadcast_to(result_idx, pad_shape, chunks=pad_chunks) | ||
|
|
||
| if mode == "mean": | ||
| result_idx = _round_if_needed(result_idx, array.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since _round_if_needed is only used here and is rather short, I have a small preference to inline it here. There is a maintenance and understanding cost to many-small-methods (see http://matthewrocklin.com/blog/work/2019/06/23/avoid-indirection).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, no worries. I saw this function in the numpy code and I copied the basic functionality.
dask/array/creation.py
Outdated
| def pad_stats(array, pad_width, mode, *args): | ||
| def _round_if_needed(arr, dtype): | ||
| if np.issubdtype(dtype, np.integer): | ||
| rint(arr, out=arr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight preference to avoid using out=arr here. It doesn't make things more efficient, and I find it harder to read (I have to know that rint is a ufunc, and that the out parameter means I'm mutating arr).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering whether to use the out=arr functionality. Personally I don't know what the performance gain is to doing it this way. I'll just use the out=rint(arr) instead.
|
Thx, for the quick review. I changed your suggestions. |
|
Thanks! |
This PR makes a couple of small improvements to
dask.array.padda.pad#6127black dask/flake8 daskThis PR does not address the issue with wrap, reflect and symmetric. That will require a rewrite of pad_reuse, I'll have a look at this in a next PR.