ENH: add a StringDType scalar type that wraps a UTF-8 string #28165
Describe the issue:
Hi @ngoldbaum,
I wonder if it would make sense to show a warning when dtype=str is passed to array/asarray saying that dtype=np.str_ is preferred.
np.dtypes.StringDType.type gives str, but when str is passed as the dtype to e.g. np.asarray(..., dtype=...), it resolves to the np.str_ dtype rather than StringDType. WDYT?
Reproduce the code example:
import numpy as np
arr1 = np.array([1, 2, 3], dtype=np.dtypes.Int32DType)
assert np.asarray(arr1, dtype=arr1.dtype.type).dtype == arr1.dtype
arr2 = np.array(["foo", "bar"], dtype=np.dtypes.StringDType)
np.asarray(arr2, dtype=arr2.dtype.type)
Error message:
TypeError Traceback (most recent call last)
TypeError: Casting from StringDType to a fixed-width dtype with an unspecified size is not currently supported, specify an explicit size for the output dtype instead.
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Cell In[30], line 6
3 assert np.asarray(arr1, dtype=arr1.dtype.type).dtype == arr1.dtype
5 arr2 = np.array(["foo", "bar"], dtype=np.dtypes.StringDType)
----> 6 np.asarray(arr2, dtype=arr2.dtype.type)
TypeError: cannot cast dtype StringDType() to <class 'numpy.dtypes.StrDType'>.
Python and NumPy Versions:
2.3.0.dev0+git20250115.1e10174
3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Runtime Environment:
No response
Context for the issue:
No response