Initial sort() by support and complex sort() change#16700
Initial sort() by support and complex sort() change#16700vrakesh wants to merge 6 commits intonumpy:mainfrom
Conversation
|
Some initial performance comparisions In [5]: carr = np.arange(27, dtype=np.complex128)[::-1].reshape(3,3,3)
# New keysort() c function internally handling complex numbers
In [6]: %timeit carr.sort()
11.2 µs ± 749 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [7]: carr = np.arange(27, dtype=np.complex128)[::-1].reshape(3,3,3)
# Complex sorting with using lexsort+ take_ along_axis
In [8]: %timeit np.take_along_axis(carr, np.lexsort((carr.imag, carr.real,)), axis=0)
14.9 µs ± 724 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) |
seberg
left a comment
There was a problem hiding this comment.
Thanks, I like this approach much more. It could be interesting how it holds up with other use-cases, such as:
c = np.random.random(size=(1000, 1000)) * (1 - 1j)
%timeit np.sort(c, by=(c.real, c.imag), axis=0)
%timeit np.sort(c, by=(c.real, c.imag), axis=0)
There was a problem hiding this comment.
I think you can delete rit entirely. It is only actually used in the !needcopy copy, since in the other path indbuffer is used, and there is no reason to actually copy it back. You can use indbuffer identically in that branch, since we do not actually return the index array.
There was a problem hiding this comment.
Deleting rit includes deleting ret entirely.
There was a problem hiding this comment.
Yes, I was also thinking the same. Will get it done
There was a problem hiding this comment.
This seems incorrect, self can have any datatype, so it should be selstride != PyArray_DESCR(self)->elsize.
There was a problem hiding this comment.
I have to admit, I am curious if we cannot just replace the below code with a manual memmove, you have to move elements (elsize), we know that every item ends up in exactly one other place, and we do not have to check the indices for out-of-bound values...
so a simple for loop with a memmove inside could be just as well. If it turns out a bit slow for some (smaller) dtypes, we could take the approach we have in npy_fastputmask to have the compiler generate different versions for the most common dtype sizes.
After doing that, we could experiment with an in-place algorithm, which seems like it should be possible as long as we also modify indbuffer at the same time.
In any case, right now I think a simple for-loop with a memmove inside is probably just as fast (no overhead and no index checking) or aster. And probably even slightly easier to read.
There was a problem hiding this comment.
Ah will, look into this.
hameerabbasi
left a comment
There was a problem hiding this comment.
Question: Do we want to allow the sorting keys to be broadcastable?
numpy/core/fromnumeric.py
Outdated
There was a problem hiding this comment.
| if by is not None and not isinstance(by, tuple): | |
| if by is not None and not isinstance(by, (tuple, list)): |
To match block.
There was a problem hiding this comment.
Nit:
| PyArray_DIMS(mps[0]), | |
| PyArray_NDIM(mps[0])))) { | |
| PyArray_DIMS(mps[0]), | |
| PyArray_NDIM(mps[0])))) { |
hameerabbasi
left a comment
There was a problem hiding this comment.
A couple of notes, and a question, do we want the sorting keys to be broadcastable to the array shape?
|
@hameerabbasi I am not sure, if we want it to be broadcast able, thoughts?? @seberg |
|
Broadcastable sounds nice, but unless lexsort has it (which I do not think), it also doesn't strike me as a vital feature for sorting? |
8e00f51 to
69aabd9
Compare
|
On the most recent commit. Performance numbers In [1]: import numpy as np
In [2]: c = np.random.random(size=(1000, 1000)) * (1 - 1j)
...: %timeit np.sort(c, by=(c.real, c.imag), axis=0)
65.6 ms ± 556 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: c = np.random.random(size=(1000, 1000)) * (1 - 1j)
In [4]: %timeit np.take_along_axis(c, np.lexsort((c.imag, c.real,)), axis=0)
70 ms ± 617 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) |
There was a problem hiding this comment.
So this is not much faster, but I still like it a bit more, plus there is slightly more opportunity to speed it up if we ever want to, and less memory overhead.
Can't you do this in the for-loop directly? Maybe we could also make a small helper for it, since I think it should look identical in both paths of the needscopy if/else.
There was a problem hiding this comment.
should the shape of individual keys should be same as shape of the array we are sorting by ?
There was a problem hiding this comment.
Yes they should, IIUC. We're not broadcasting. #16700 (comment)
There was a problem hiding this comment.
will this do stable sort even if other sort kinds provided and is that expected behavior ?
There was a problem hiding this comment.
That is a very good point, we have to use stable sort (except for the first sort I guess), otherwise the approach does not work. We should maybe give an error if by is passed together with the sort-kind (or sort-kind other than "stable").
There was a problem hiding this comment.
can we choose a better name to describe what this int is supposed to do ?
There was a problem hiding this comment.
Is there a reason we want to deprecate this ?
There was a problem hiding this comment.
I think this was copied over from lexsort(?). 0-d arrays can't be reasonably sorted along an axis (except None, which has ravel behavior), but we allow it.
There was a problem hiding this comment.
@vrakesh this also means that I think we should just go through with the deprecation. Which I think just means deleting the first if block so that check_and_adjust_axis is always run. np.array(0).sort(axis=None) fails, so it should fail here as well.
There was a problem hiding this comment.
This one is important I think, and should get a test-case.
There was a problem hiding this comment.
nit:
| for(i=0;i<N;i++){ | |
| for (i = 0; i < N; i++) { |
There was a problem hiding this comment.
| for(i=0;i<N;i++){ | |
| for (i = 0; i < N; i++) { |
numpy/core/fromnumeric.py
Outdated
There was a problem hiding this comment.
I think it would help here to call out what order the keys are sorted in (similar to lexsort documentation). Since trying to follow lexsort it seems like to sort lexicograpphically with keys for complex numbers one should do arr.sort(by=(c.imag, c.real))
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
maybe also add the carr.real, carr.imag test mentioned above. Also, would help to check for different shapes of keys and mismatch between keys and arr shape for error assertions.
There was a problem hiding this comment.
Can we push these declarations down-file to where they are first used, C99-style?
There was a problem hiding this comment.
Not sure on this one, every other function in the same file does not follow it, shouldn't be a problem changing, only question is uniformity.
There was a problem hiding this comment.
That's because other functions were written before the language allowed it. IMO we should write all new code in c99 style.
There was a problem hiding this comment.
| swaps = malloc(NPY_LIKELY(n > 0) ? n * sizeof(int) : 1); | |
| swaps = malloc(NPY_LIKELY(n > 0) ? n * sizeof(*swaps) : 1); |
There was a problem hiding this comment.
I agree the swaps variable can be removed, will do it.
There was a problem hiding this comment.
| if(sortedbuffer == NULL) { | |
| if (sortedbuffer == NULL) { |
There was a problem hiding this comment.
This should be addressed at the point those things happen, not here
There was a problem hiding this comment.
Should be removed. We try to err on the side of not changing the C API if possible.
numpy/core/fromnumeric.py
Outdated
There was a problem hiding this comment.
+1 for not doing this conversion. The argument is documented as as sequence.
b114f5f to
2396c7a
Compare
|
@seberg Got the in-place part done as well. Does this look good to go? |
There was a problem hiding this comment.
I think it would be good to refactor this into a small helper. Deduplicates the code/explains what is going on in this case (through the function name), and adds an obvious place to add a comment of whats going on, and maybe a reference for the approach (if you have).
I will look at the rest soon, sorry, please ping me if I forget...
There was a problem hiding this comment.
Sure, will refactor it into a helper.
319fc3f to
edaa2ea
Compare
numpy/core/fromnumeric.py
Outdated
There was a problem hiding this comment.
This seems like it should be unnecessary, because the check should live in ndarray.sort().
I am actually happy with this decision: I.e. force a tuple() right now, and think about generalizing later. In which case I think we should go with explicitly tuple, and even disallow lists.
There was a problem hiding this comment.
Thanks Sebastian, will address these changes
There was a problem hiding this comment.
| #include "mapping.h" |
Probably not necessary anymore.
There was a problem hiding this comment.
Some style nitpicks, like a space before the *. I like to try and do some doxygen style comments nowadays, but no need (style nitpicks also apply below, always space before and after the *).
There was a problem hiding this comment.
You actually mean "contiguous" here and not aligned (although aligned is also true). I think you can actually just remove the stride part of the comment, stride always means the same thing in the NumPy code base.
There was a problem hiding this comment.
Not sure if it helps without thinking too much about it, but maybe we can write: The following code reorders the data with respect to the index. The inner while loop places a single element to the right place until it reaches a fully cycle (an already ordered element). Each element may be visited twice, but will be sorted on the first visit. The second visit finds it already sorted and immediately continues.
numpy/core/src/multiarray/methods.c
Outdated
There was a problem hiding this comment.
Still holds, this path should be removed, IMO (it can be replaced with a deprecation warning, but it is likely nicer to do that in a followup)
numpy/core/src/multiarray/methods.c
Outdated
There was a problem hiding this comment.
Style nit: No need for the \ and I would probably just put the text on the next line with 8 space indentation.
numpy/core/tests/test_multiarray.py
Outdated
There was a problem hiding this comment.
This test should be transformed into one using by=, lets not modify the complex right now.
There was a problem hiding this comment.
This one is important I think, and should get a test-case.
numpy/core/src/multiarray/methods.c
Outdated
There was a problem hiding this comment.
Would be good to add a test for this choice, and I think if we limit so much (which I like), we might as well limit it to only tuples. Otherwise "sequence" would be arbitrary python sequences.
numpy/__init__.pyi
Outdated
There was a problem hiding this comment.
| by: Union[None, Sequence[ndarray], ndarray] = ..., | |
| by: Optional[Sequence[ArrayLike]] = ..., |
Currently only sequences (i.e. lists and tuples) of array-like objects are allowed, correct?
There was a problem hiding this comment.
Thanks, I am thinking if we should just make it tuple here. I don't mind making it sequences, but not sure there is much reason for not being strict (but I may be missing some earlier discussion).
There was a problem hiding this comment.
Makes sense, will make the change
There was a problem hiding this comment.
If we're going with tuple then Optional[Tuple[ArrayLike, ...]] should do the trick.
edaa2ea to
ab2214b
Compare
|
@seberg Addressed the changes, hopefully this is good to go |
|
I guess we should close this PR, unfortunately. We discussed this and maybe we should not block deprecating complex by having this new API. It was a nice idea, but likely complex sorting efficiently may not be important enough to avoid I don't hate an API like this though and this was a great push! But pushing this forward seems like a larger thing and probably not vital. |
This is the first step in addressing gh-15981
Related mailing list thread can be found here
As stated in the original thread, We need to start by having a sort() function for complex numbers that can do it based on keys, rather than plain arithmetic ordering.
There are two broad ways to approach a sorting function that supports keys (Not just for complex numbers).
keykwarg to thesort()(function and method). To support key based sorting on arrays.sortby(c_arr, key=(c_arr.real, c_arr.imag)In this PR I have chosen approach 1 for the following reasons
Approach 1 means it is more easier to deal with both in-place method and the function. Since we can make the change in the c-sort function, have minimal change in the python layer. This I hope results, minimal impact on current code that handles complex sorting. One example within numpy is is
linalgmodule'ssvd()function.With approach 2 when we deprecate complex arithmetic ordering, existing methods using sort() for complex types, need to update their signature.
As it stands the PR does the following 3 things within the Python-C Array method implementation of sort
key(When no key is passed) which mimics the current arithmethic ordering in Numpy .Py_LexSortand generate indices.take_along_axisvia C call back and copy over the result to the original array (pseudo in-place).I am requesting feedback/help on implementing
take_along_axislogic in C level in a in-place manner and the approach in general.This will further feed into
max()andmin()as well. Once we figure this out. Next step would be to deprecate arithmetic ordering for complex types (Which I think will be a PR on it's own)UPDATE:
The latest version uses A new Function PyArray_Keysort() to argsort a 1D slice of indices, and then use the indices to move contents of the same 1D slice