Skip to content

Conversation

@eric-wieser
Copy link
Member

@eric-wieser eric-wieser commented Apr 17, 2018

Fixes gh-10450.

This is the alternative to #10470 and #10488.

I've made these changes on top of one another, so the commit diffs are largely meaningless (will squash before merging)

The effect here is that all of these will pass args=(in1, in2) to __array_prepare__ and __array_wrap__:

  • ufunc(in1, in2) - unchanged
  • ufunc(in1, in2, None) - previously:
    • prepare: (in1, in2, None)
    • wrap: (in1, in2, None)
  • ufunc(in1, in2, out=None) - previously:
    • prepare: (in1, in2, None)
  • ufunc(in1, in2, out=(None,)) - previously
    • prepare: (in1, in2, (None,))

Whereas these will pass args=(in1, in2, out);

  • ufunc(in1, in2, out) - unchanged
  • ufunc(in1, in2, out=out) - previously
    • wrap: (in1, in2)
  • ufunc(in1, in2, out=(out,)) - previously
    • prepare: not called on out at all!
    • wrap: (in1, in2)

@mhvk
Copy link
Contributor

mhvk commented Apr 17, 2018

Overall, I think this does the right thing and it looks good. My main question is whether it is really all that helpful to introduce the new struct with in and out attributes. It would seem that if full_args is simply a tuple of length either nin or nin+nout, things would work even easier as it then just is the correct arr_prep_args (and the one nice consequence, checks for args.out == NULL, could just be replaced with size checks).

Whether or not the new code is worth may depend mostly on where you see it going: as you note, there is substantial refactoring still to be done, and if you think the structure will make that much easier, then perhaps it is worth keeping it.

@eric-wieser
Copy link
Member Author

This structure makes it more suitable for use with __array_ufunc__, so I'd be inclined to keep it.

Copy link
Member Author

@eric-wieser eric-wieser Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhvk: The separation between in and out makes this a little more transparent (no idx_offset or start_idx)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but at the cost of a new function that combines in and out to remake arr_prep_args; just correcting the implementation of that would have a much smaller footprint.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That combination had to be added somewhere - part of the original bug was that only *args was being used, rather than a normalized args struct. Either we combine them at the call site, or we do it during the parse step.

Since we treat in and out differently in a bunch of places, it seems most useful to keep them separate till the last minute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I should leave this test as it was - this was inherited from the earlier implementations

@eric-wieser eric-wieser force-pushed the __array_wrap__-non-None-outputs branch from 8a0866f to 765fd45 Compare April 20, 2018 05:35
@eric-wieser
Copy link
Member Author

Alright, squashed

@eric-wieser eric-wieser force-pushed the __array_wrap__-non-None-outputs branch from 765fd45 to 5d8b135 Compare April 20, 2018 06:07
@eric-wieser
Copy link
Member Author

And now with much clearer tests, a comparison to the old behavior, and a couple new test cases

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Only one suggestion remaining, but at it is purely a suggested short-cut, I'll approve now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't one just do

full_args->in = PyTuple_Slice(args, 0, nin);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I see no reason why that wouldn't work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I remember why - PyTuple_Slice does a tonne of extra checks we don't need:

https://github.com/python/cpython/blob/master/Objects/tupleobject.c#L400-L440

Given that ufunc overhead seems to be something that comes up a lot, I though that this was worth optimizing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the overhead is negligible compared to other things (like just looking up a method...) -- after all, it is just a few checks of ints -- and one has to weigh the overhead against the advantage of clarify, but I do not really mind either way. If you don't change it, maybe add a comment? (saying that we do what PyTuple_Slice would do, but without the overhead of its checks?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, adding a comment seems reasonable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it back, PyTuple_GetSlice removes the need for my optimization above too - so I'll switch

@eric-wieser eric-wieser force-pushed the __array_wrap__-non-None-outputs branch from 5d8b135 to 665087f Compare April 21, 2018 03:20
@eric-wieser
Copy link
Member Author

Updated with PyTuple_GetSlice and a few extra comments in the source

@mhvk
Copy link
Contributor

mhvk commented Apr 22, 2018

OK, the slice indeed also had the logic for the full slice in place, so now makes it quite a bit shorter, nice. Sadly, it needs a rebase now...

…repare__ or __array_wrap__ arguments

This fixes numpygh-10450.

The effect here is that all of these will pass `args=(in1, in2)` to `__array_prepare__` and `__array_wrap__`:

* `ufunc(in1, in2)` - unchanged
* `ufunc(in1, in2, None)` - previously:
   * prepare: `(in1, in2, None)`
   * wrap: `(in1, in2, None)`
* `ufunc(in1, in2, out=None)` - previously:
   * prepare: `(in1, in2, None)`
* `ufunc(in1, in2, out=(None,))` - previously
   * prepare: `(in1, in2, (None,))`

Whereas these will pass `args=(in1, in2, out)`;

* `ufunc(in1, in2, out)` - unchanged
* `ufunc(in1, in2, out=out)` - previously
   * wrap: `(in1, in2)`
* `ufunc(in1, in2, out=(out,))` - previously
   * prepare: not called on out at all!
   * wrap: `(in1, in2)`
@eric-wieser eric-wieser force-pushed the __array_wrap__-non-None-outputs branch from 665087f to 40868df Compare April 22, 2018 19:41
@eric-wieser
Copy link
Member Author

Rebased.

make_full_arg_tuple(
ufunc_full_args *full_args,
npy_intp nin, npy_intp nout,
PyObject *args, PyObject *kwds)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the output arguments be last in the declaration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting them first matches memcpy(dst, src, n) and f(dst, args...) more closely resembles dst = f(args...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly wondered why the routine doesn't just return full_args (and NULL if it fails).

(Beyond that, @mattip, are you OK with the current implementation? My sense is that it is ready to go in.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an improvement, from here refactoring will be easier

Py_INCREF(args);
return args;
/* This should have been checked by the caller */
assert(nin <= nargs && nargs <= nin + nout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert will be ignored in release

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, I don't think this can ever happen

out_kwd = kwds ? PyDict_GetItem(kwds, npy_um_str_out) : NULL;

if (out_kwd != NULL) {
assert(nargs == nin);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a check elsewhere? assert will be ignored in release

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this check already happens in the first round of argument parsing

@eric-wieser
Copy link
Member Author

@mhvk: Anything else needed here?

@eric-wieser
Copy link
Member Author

I'm hoping this might make fixing #9022 easier

@mhvk
Copy link
Contributor

mhvk commented Apr 26, 2018

OK, let's get it in!

@charris charris changed the title BUG: Pass non-None outputs to __array_prepare__ and __array_wrap__ BUG: Pass non-None outputs to __array_prepare__ and __array_wrap__ Jun 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants