-
-
Notifications
You must be signed in to change notification settings - Fork 12k
MAINT: Refactor of numpy/core/_type_aliases.py
#24651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
We talked a bit about this and I tried to remove some C-side stuff and change the exposure a bit via: https://github.com/seberg/numpy/tree/typeinfo-remove (it could probably be simplified more, but at least 300 lines of weird C-code gone 🎉). That also removes the redefinition of the "integer priority" list in Python. I am a bit unsure about float128. On the one hand, I hate it and want everyone to use longdouble. OTOH, I think it may be used (unnecessarily) quite a lot. |
@seberg Thank you! I built |
|
It should return the dtype, you are fetching the |
|
In that table, I'd argue |
|
In fact, the table is also wrong on Linux where |
|
@eric-wieser thank you for your feedback! |
numpy/core/_type_aliases.pynumpy/core/_type_aliases.py~~
numpy/core/_type_aliases.py~~numpy/core/_type_aliases.py
Sorry for confusion! |
Hi @seberg @rgommers,
I'm starting here a new PR for reviewing as the discussion in the original one concluded.
Here I'm sharing a proposition of
numpy/core/_type_aliases.pyfile refactor.While working on removing other scalar aliases I had some difficulties understanding how
allTypes/sctypeDictdictionaries are being created. For example, I noticed that thefloatalias is originally present insctypeDict, then it's removed and then added again.Also, as NEP 52 suggest,
float96andfloat128could be removed (extended precision) and it would also cover modifying_type_aliases.py. This refactor makes it easier to introduce new changes to the available aliases list.I decided to try to refactor this file to decompose the creation of these dictionaries into well-defined stages.
I think the purpose of this file is straightforward: Read types from the compiled
multiarraymodule, and expose them as Python dictionaries, while mapping C naming conventions to Python conventions (e.g. interpretfloatasdoubleinstead ofsingle).I identified which "group" of entries is present in each dictionary (could it be modified for NumPy 2.0?):
words(e.g. "float", "cdouble", "int_")words+bits(e.g. "int16", "complex64")symbols(e.g. "p", "L")symbols+bytes(e.g. "c8", "b1")abstracts(e.g. "inexact", "integer")numbers(e.g. 1, 2, 3)aliases(e.g. "complex_", "longfloat")extra/other aliases(e.g. "bool")Additionally, here's a decomposition of
wordsandwords+bitsinto three columns:In the refactored file I build each of these groups to then join them into final dicts. I do it in a loop in
def _build_dicts(), where sizedint&uintaliases must take under consideration a "priority" of types - in case at least two of them are same bitsize.After that I rename and add aliases to the created groups. In the last stage I clean entries that aren't present in the original implementation (these are custom cases).
The last, and separate, stage creates
sctypeswhich I copied, as the implementation was clear to me.Important to mention, this modifies a crucial piece of code that could be tricky to debug (e.g. canonical int names are aliased incorrectly on some architecture (I tested it locally only on macbook with Intel)).
I renamed the old implementation to
_expired_type_aliases.pyand added a testtest_new_vs_expired_type_aliasesthat checks if dictionaries created by the new implementation are exactly the same as done by the previous code. This should flag in the CI if the new implementation fails to mimic the previous one on a specific architecture. (After some time, the original impl could be completely removed)Please share your feedback!