DISCUSS: remove remaining usages of character codes in array reprs #25508

@ngoldbaum

I realize this is a little late in the game to get this done, but in case there's appetite for it, I think it might be a good idea to change array reprs to no longer use dtype character codes in NumPy 2.0. Specifically, character codes still show up in the reprs for np.str_, np.bytes_, and np.void:

>>> np.array(["hello", "world"])
array(['hello', 'world'], dtype='<U5')

I think it would fit in better with the numeric dtypes to instead print this:

array(['hello', 'world'], dtype=np.dtypes.StrDType(5))

And similarly for the bytes dtype and unstructured voids.

One wrinkle is we don't actually support doing this with unstructured voids right now:

In [2]: np.dtypes.VoidDType(10)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 np.dtypes.VoidDType(10)

TypeError: Preliminary-API: Flexible/Parametric legacy DType '<class 'numpy.dtypes.VoidDType'>' can only be instantiated using `np.dtype(...)`

But I think updating the reprs of bytes_ and str_ is probably more important, so that would still be worth doing if we don't want to allow a similar syntax with void.
