ENH: define a gufunc for vecdot (with BLAS support) by mhvk · Pull Request #25416 · numpy/numpy

mhvk · 2023-12-18T15:42:00Z

Instead of changing the python definition of vecdot, as I tried to do in #25411, it seemed better to just define a proper gufunc for vecdot, so that we can have good blas support and just use the standard definition of axis. This PR does that in the second commit (the first reverts the addition of a python version from #25155).

Note that I don't think I have added a ufunc before (or at least forgotten about it), so please check that my implementation makes sense. In particular, it would seem that the inner loop could be selected more logically, since mostly I'm just wrapping loops defined in arraytypes.c.src. Hence, ping @seberg.

seberg · 2023-12-18T15:56:41Z

so please check that my implementation makes sense. In particular, it would seem that the inner loop could be selected more logically, since mostly I'm just wrapping loops defined in arraytypes.c.src

That seems all OK to me on first sight, nice! I remember that we have set_matmul_flags which might be important for some niche cases also for vecdot.
Matmul sometimes falls back to einsum which might be nicer. Overall, I would also be happy to move the definitions here and maybe even remove them from arraytypes.c.src (leaving the implementation NULL).

Also ping @mtsokol as I think he was maybe considering looking into this.

numpy/_core/code_generators/generate_umath.py

numpy/_core/tests/test_ufunc.py

mhvk · 2023-12-18T17:54:07Z

@seberg - thanks! I checked the flag setting - I don't think the reaons apply to vecdot --

numpy/numpy/_core/src/umath/ufunc_object.c

Lines 263 to 275 in 2cc2112

    
               /* 
        
                * The default output flag NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE allows 
        
                * perfectly overlapping input and output (in-place operations). While 
        
                * correct for the common mathematical operations, this assumption is 
        
                * incorrect in the general case and specifically in the case of matmul. 
        
                * 
        
                * NPY_ITER_UPDATEIFCOPY is added by default in 
        
                * PyUFunc_GeneralizedFunction, which is the variant called for gufuncs 
        
                * with a signature 
        
                * 
        
                * Enabling NPY_ITER_WRITEONLY can prevent a copy in some cases. 
        
                */ 
        
               ((PyUFuncObject *)matmul)->op_flags[2] = (NPY_ITER_WRITEONLY |

mhvk · 2023-12-18T17:59:31Z

Matmul sometimes falls back to einsum which might be nicer.

You mean to fall back to einsum when blas is not present?

Overall, I would also be happy to move the definitions here and maybe even remove them from arraytypes.c.src (leaving the implementation NULL).

You mean move all _dot definitions from arraytypes.c.src? I can do that, but perhaps first see whether there are other comments.

Also, I saw that matmul had a comment that one could (I think) avoid the outer loop --

numpy/numpy/_core/src/umath/matmul.c.src

Lines 442 to 446 in 2cc2112

    
                   /* 
        
                    * TODO: refactor this out to a inner_loop_selector, in 
        
                    * PyUFunc_MatmulLoopSelector. But that call does not have access to 
        
                    * n, m, p and strides. 
        
                    */

This seems even more relevant for vecdot, but is this easy to do?

seberg · 2023-12-18T18:37:22Z

Yeah, don't worry about the arraytypes.c.src, was just thinking it might be nicer to move towards a future where we rely mostly on the ufunc. I don't think it is necessary here.

Using einsum could be nice, because einsum may actually have vectorized versions for these kernels, I somewhat think matmul does that, but I am not quite sure how matmul does so without checking.

This seems even more relevant for vecdot, but is this easy to do?

I am not 100% sure again, but basically, you will have to make this a "new-style" ufunc, to use the get_loop() function. That function is passed the fixed_strides arrays so it should have the strides available. OTOH, I am not sure I took care that all core dimensions are filled for gufuncs.

So, it should be doable, but I think its only useful if you have the interest and time to tinker with it.

seberg

The only larger code block is effectively a copy of other code. So, I think this is almost good to go. Yes, it isn't super nice to reach all over the code for these inner-loops, but it is also not terrible and a gufunc is much nicer.

Some smaller notes to maybe think about:

IIRC, vecdot was actually defined twice, all namespaces should only have true aliases to the ufunc.
Do we actually still need the inner1d test at all for anything, or can we replace it everywhere? (Maybe better as a follow-up).
We need to make a call on Object, I think either call .conjugate() or just not define it (for now) I don't have an opinion either way (we can always add it easily after all).

numpy/_core/code_generators/generate_umath.py

numpy/_core/tests/test_ufunc.py

mhvk · 2023-12-19T16:35:44Z

Some smaller notes to maybe think about:

* IIRC, vecdot was actually defined twice, all namespaces should only have true aliases to the ufunc.

That is now dealt with - there remains a function np.linalg.vecdot that has a more limited, array-API compliant signature.

* Do we actually still need the `inner1d` test at all for anything, or can we replace it everywhere?  (Maybe better as a follow-up).

Right now the only usage left is for a version with a docstring. That can certainly be changed, but maybe separate from this PR?

* We need to make a call on Object, I think either call `.conjugate()` or just not define it (for now) I don't have an opinion either way (we can always add it easily after all).

I added a new loop which calls .conjugate() on the first argument.

Thanks!

seberg · 2023-12-19T16:49:07Z

np.linalg.vecdot that has a more limited, array-API compliant signature.

If the only difference is being more limited, lets make it an alias? But thanks. Need another closer look, but I suspect this is fine to go in especially if anyone else has a look and likes it.

EDIT: That is just my opinion/tendency, I don't care too much.

mhvk · 2023-12-19T17:43:19Z

EDIT: That is just my opinion/tendency, I don't care too much.

I don't care too much either, but in linalg, we do the same for matmul, so for this PR probably best to just follow precedent...

Who else would be good for review?

ngoldbaum · 2023-12-20T03:46:15Z

I’ll try to take a look at this.

seberg

Had a bit of a closer look, the object loop needs a bit of cleanup IMO. But otherwise I am comfortable with merging, although if Nathan can have a quick look that would be great.

seberg · 2023-12-20T16:40:59Z

numpy/_core/src/umath/matmul.c.src

+        }
+        else {
+            PyObject *obj1conj = PyObject_CallMethod(*((PyObject **)ip1),
+                                                     "conjugate", NULL);


Would be nice to make this use an interned string, creating that is quite a bit of overhead. OTOH, tricky here for subinterpreters, so I don't care.

Decided not to bother...

seberg · 2023-12-20T16:42:03Z

numpy/_core/src/umath/matmul.c.src

+    for (i = 0; i < n; i++, ip1 += is1, ip2 += is2) {
+        if ((*((PyObject **)ip1) == NULL) || (*((PyObject **)ip2) == NULL)) {
+            tmp1 = Py_False;
+            Py_INCREF(Py_False);


NULL is None not False. This looks stolen, but I tihnk it shold be:

obj1 = *((PyObject **)ip1 if (obj1 == NULL) { obj1 = Py_None; }

and the same for obj2. No need for any incref/decref at all.

This comes straight from the loop in arraytypes - is it even possible for the pointers to be NULL?

Assuming it is, I don't think we can replace by Py_None, since adding None to anything is not possible. Might it then make sense to just error straight away? At least that would be obvious.

But for matmul we do the same, of just assigning Py_None, so reasonable to do that here too... Will try that.

Yes, and that doesn't matter: NULL is synonymous for None if it should happen. Yes, it will error, but explicitly erroring is actually also wrong (because the error will at least mention None, while mentioning NULL has no meaning in python).

Basically, None isn't an assumption, it is the correct definition... maybe we should have a npy_load_object() helper to do it. If arraytypes does it differently, it is simply buggy...

seberg · 2023-12-20T17:57:30Z

numpy/_core/src/umath/matmul.c.src

+        }
+        else {
+            tmp2 = PyNumber_Add(tmp, tmp1);
+            Py_XDECREF(tmp);


Could use result = NULL and then use Py_SETREF(result, PyNumber_Add(result, ...) also. Just a bit less awkard with tmp = tmp2 below IMO.

Yes, much clearer!

seberg · 2023-12-20T17:58:20Z

numpy/_core/src/umath/matmul.c.src

+    }
+    tmp3 = (PyObject**) op;
+    tmp2 = *tmp3;
+    *((PyObject **)op) = tmp;


Also here: Py_SETREF(*((PyObject **)op), tmp)

seberg · 2023-12-20T19:25:33Z

numpy/_core/src/umath/matmul.c.src

+        /*
+         * TODO: refactor this out to an inner_loop_selector?
+         */
+        @TYPE@_@DOT@(ip1, is1_n, ip2, is2_n, op, dn, NULL);


Ah, missed one thing...

This is annoying, because we really should have an error return, for object, the error paths are not particularly well defined (might do more operations).

It needs moving to the new API to fix, so maybe you can add a PyErr_Occured() check here for object (as much as I dislike it)?

EDIT: Pretty much any array filled with nonsense should be a valid test for this (i.e. uninitialized None). But it needs to be more than 1-D, so we get the outer loop.

OK, done (and test added)

ngoldbaum

Nice, I'm glad we nerdsniped you into writing a BLAS-accelerated gufunc! 😛

Everything looks great to me, I just had two minor comments.

numpy/_core/tests/test_ufunc.py

ngoldbaum · 2023-12-20T21:28:20Z

numpy/linalg/_linalg.py

+    """
+    Computes the vector dot product.
+
+    This function is Array API compatible, contrary to :func:`numpy.vecdot`.


Should this wrinkle should be mentioned in the array API compat notes?

I could, but don't know where those live - here, I just copied what was done for linalg.matmul.

I was thinking of this table.

Looking at that, I don't see special text for matmul -- I think this falls under np.vecdot being a superset of what the array api allows (except for the definition of the axis argument, but that is subject of data-apis/array-api#724

But maybe I should rephrase this part of the docstring to something like

This function is restricted to arguments compatible with the Array API, contrary to :func:`numpy.vecdot`.

or something similar?

Yeah, that language is clearer I think. No need to touch the table I linked to.

seberg · 2023-12-20T21:52:06Z

numpy/_core/src/umath/matmul.c.src

+         */
+        char *ip1=args[0], *ip2=args[1], *op=args[2];
+        @TYPE@_@DOT@(ip1, is1, ip2, is2, op, n_inner, NULL);
+#if @TYPE@ == OBJECT


Hmmm, does this work if OBJECT is undefined, I think? I sometimes use #define IS_@TYPE@

OK, just goes to show how I don't really grok pre-processing... I just added an extra iteration item...

mhvk · 2023-12-20T22:51:47Z

OK, should now be all set (after too many pushes to try to get object be special-cased and add clongdouble in the repeat loop).

numpy/_core/code_generators/ufunc_docstrings.py

Also use it in the tests instead of the old inner1d from the test umath functions.

ngoldbaum · 2023-12-21T19:24:59Z

Thanks @mhvk!

mhvk added 01 - Enhancement component: numpy.ufunc labels Dec 18, 2023

mhvk mentioned this pull request Dec 18, 2023

BUG: Vecdot subclasses and axis #25411

Closed

mhvk force-pushed the vecdot-ufunc branch from 0747ee5 to 4bf5604 Compare December 18, 2023 15:45

mhvk commented Dec 18, 2023

View reviewed changes

numpy/_core/code_generators/generate_umath.py Show resolved Hide resolved

mhvk commented Dec 18, 2023

View reviewed changes

numpy/_core/tests/test_ufunc.py Show resolved Hide resolved

seberg reviewed Dec 19, 2023

View reviewed changes

numpy/_core/code_generators/generate_umath.py Show resolved Hide resolved

numpy/_core/tests/test_ufunc.py Show resolved Hide resolved

REVERT: revert python vecdot implementation

00a6b96

mhvk force-pushed the vecdot-ufunc branch from 4bf5604 to 4b95e3f Compare December 19, 2023 16:32

mhvk added this to the 2.0.0 release milestone Dec 19, 2023

mhvk requested a review from mtsokol December 19, 2023 18:58

seberg reviewed Dec 20, 2023

View reviewed changes

mhvk force-pushed the vecdot-ufunc branch 2 times, most recently from a43334a to 75456d8 Compare December 20, 2023 21:11

ngoldbaum reviewed Dec 20, 2023

View reviewed changes

mhvk force-pushed the vecdot-ufunc branch 2 times, most recently from a72b945 to 8456f57 Compare December 20, 2023 21:48

seberg reviewed Dec 20, 2023

View reviewed changes

mhvk force-pushed the vecdot-ufunc branch 3 times, most recently from 4c68744 to 87fd5cf Compare December 20, 2023 22:46

mhvk force-pushed the vecdot-ufunc branch from 87fd5cf to cf5f7cf Compare December 20, 2023 22:49

ngoldbaum reviewed Dec 21, 2023

View reviewed changes

numpy/_core/code_generators/ufunc_docstrings.py Show resolved Hide resolved

ENH: Define a vecdot gufunc

843a5e3

Also use it in the tests instead of the old inner1d from the test umath functions.

mhvk force-pushed the vecdot-ufunc branch from cf5f7cf to 843a5e3 Compare December 21, 2023 17:41

ngoldbaum merged commit 1fcda82 into numpy:main Dec 21, 2023

mhvk deleted the vecdot-ufunc branch December 21, 2023 20:15

ngoldbaum mentioned this pull request Dec 22, 2023

ENH: Add a dedicated ufunc for numpy.vecdot #25393

Closed

mhvk mentioned this pull request Dec 22, 2023

TypeError: unknown ufunc vecdot (9 failures with numpy 2.0.dev) astropy/astropy#15784

Closed

Uh oh!

Conversation

mhvk commented Dec 18, 2023

Uh oh!

seberg commented Dec 18, 2023

Uh oh!

Uh oh!

Uh oh!

mhvk commented Dec 18, 2023

Uh oh!

mhvk commented Dec 18, 2023

Uh oh!

seberg commented Dec 18, 2023

Uh oh!

seberg left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mhvk commented Dec 19, 2023

Uh oh!

seberg commented Dec 19, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mhvk commented Dec 19, 2023

Uh oh!

ngoldbaum commented Dec 20, 2023

Uh oh!

seberg left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg Dec 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg Dec 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ngoldbaum left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

seberg commented Dec 19, 2023 •

edited

Loading

seberg Dec 20, 2023 •

edited

Loading

seberg Dec 20, 2023 •

edited

Loading