Skip to content

Conversation

@charris
Copy link
Member

@charris charris commented Aug 22, 2020

This removes the old default routines in 'strfunc.c' that were never
used and looked to have unicode/byte problems. The functions now
self initialize from arrayprint when called, so the explicit initialization
previously done when arrayprint module was loaded is no longer needed.

Closes #17138.

@eric-wieser
Copy link
Member

eric-wieser commented Aug 22, 2020

The danger here is that now we can't debug issues with arrays at module load time, as attempting to print them will hit cyclic imports.

I think we might want to move arrayprint out of lib and into core

@charris
Copy link
Member Author

charris commented Aug 22, 2020

@eric-wieser Why do you think that would be a problem? The original didn't work properly until arrayprint was loaded because the fallback was buggy and that initialization took place after the _umath_multiarray module was loaded.

@eric-wieser
Copy link
Member

If the fallback was really unusable, then I guess it makes no difference.

Comment on lines 40 to 70
Copy link
Member

@eric-wieser eric-wieser Aug 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something like this would be better:

NPY_NO_EXPORT PyObject *
array_repr(PyArrayObject *self)
{
    if (PyArray_ReprFunction != NULL) {
        return PyObject_CallFunctionObjArgs(PyArray_ReprFunction, self, NULL);
    }

    /*
     * We need to do a delayed import here as initialization on module load
     * leads to circular import problems.
     */
    static PyObject *repr = NULL;
    npy_cache_import("numpy.core.arrayprint", "_default_array_repr", &repr);
    if (repr != NULL) {
        return PyObject_CallFunctionObjArgs(repr, self, NULL);
    }

    // Chain an explanation onto the `ImportError`, since something has gone very wrong,
    // likely due to something already being wrong for someone making changes to numpy.
    PyObject *exc, *val, *tb;
    PyErr_Fetch(&exc, &val, &tb);
    PyErr_Format(PyExc_RuntimeError, "Unable to configure ndarray.__repr__");
    npy_PyErr_ChainExceptionsCause(exc, val, tb);
    return NULL;
}

There's no need to do the import if we've configured a repr function by other means. This also results in a better traceback, without silencing the ImportError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that using master and inserting

array([1,2,3]).__repr__()

in arrayprint before setting the repr function results in a segfault, the current default isn't working.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's good to know. I'm in favor of removing the broken fallback, but I think the above implementation is safer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at PEP 0490 and implemented a helper function npy_PyExc_SetStringChained. It is local to the strfuncs file but we may want to move it after discussion to npy_3kcompat.

Copy link
Member

@eric-wieser eric-wieser Aug 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP 490 was rejected, PEP 3134 is the one that got accepted. I don't particularly like npy_PyExc_SetStringChained, because:

  • It is named after a rejected PEP
  • We only use it in two places
  • It saves only three lines
  • It doesn't generalize well, you end up with a Chained variant for every one of the PyExc_... functions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, AFAICT, PEP 0490 was a followup to PEP 3134 (nee 344) extending it to C code. In any case, there are 30+ uses of npy_cache_import and the thought was that we might want to deal with some of those uses also. If not, we should just follow current practice and return NULL on failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, sorry for accusing you of having the wrong PEP.

In any case, there are 30+ uses of npy_cache_import and the thought was that we might want to deal with some of those uses also.

That's a fair argument. I'd maybe consider this failing import special, as someone who hits it is likely already trying to debug a bigger issue due to a change they made. Either is fine by me though - both are better than your original patch which discarded the ImportError.

This removes the old default routines in 'strfunc.c' that were never
used and looked to have unicode/byte problems. The functions now
self initialize when called, so the explicit initialization when the
arrayprint module is loaded is no longer needed.
@mattip mattip merged commit 67c68b8 into numpy:master Aug 23, 2020
@mattip
Copy link
Member

mattip commented Aug 23, 2020

Thanks @charris



static void
npy_PyErr_SetStringChained(PyObject *type, const char *message)
Copy link
Member

@eric-wieser eric-wieser Aug 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PyErr_FormatFromCause seems to be the name used internally by python, added in python/cpython@467ab19

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a better name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we use the limited API, I think we can just use _PyErr_FormatFromCause directly here and elsewhere.

@charris charris deleted the cleanup-strfuncs branch September 3, 2020 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove array_{repr,str}_builtin functions from strfuncs.c

3 participants