-
-
Notifications
You must be signed in to change notification settings - Fork 11.8k
BUG: np.dtype and np.dtypes.* runtime signatures (C)
#30146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
LGTM after a quick scan, but let's ping @seberg . |
This fixes `inspect.signature` for - `np.random.BitGenerator` - `np.random.Generator` - `np.random.MT19937` - `np.random.PCG64` - `np.random.PCG64DXSM` - `np.random.Philox` - `np.random.RandomState` - `np.random.SFC64` - `np.random.SeedSequence` - `np.random.bit_generator.SeedlessSeedSequence` This also fixes a typo in `bit_generator.pxd` that accidentally defined an empty unused class `np.random.bit_generator.SeedlessSequence`. Related to #30104, #30114, #30121, #30124, #30126, #30137, #30138, #30140, #30143, #30146, #30147, and #30155
seberg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, should be fine if a bit awkward, but then this whole construct is awkward, so...
We could move it to Python, but unless we want to add type specfic docs it doesn't seem super worthwhile, so I don't have a strong opinion.
| break; | ||
| case NPY_DATETIME: | ||
| case NPY_TIMEDELTA: | ||
| signature = "(unit, /)"; // datetime64, timedelta64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should support it, but it's actually not I think?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea that's true. In the stubs it's annotated in such a way that it'll be rejected. But I figured that since it'll always error right now, we might as well prepare for when it'll actually be supported while we're at it.
But I don't mind removing this if you think that's better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, yeah, mostly noticed it, I think I might lean slightly to keep it correct and not include it for now, but if you prefer this, then also good.
| } | ||
| snprintf(new_doc, new_doc_size, "%s%s\n--\n\n%s", | ||
| short_name, signature, base_doc); | ||
| ((PyTypeObject *)dtype_class)->tp_doc = new_doc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit awkward for user dtypes like rational, since they have no good name, they still just use numpy.dtype[name]. Although, I suppose it's technically memory safe at least :).
I am wondering if we should just not set any tp_doc here and then just use add_newdoc from Python (it doesn't like when docs are set).
At least for the builtin ones (user dtypes are a bit too bad, but also probably OK to just not have docs, heh.).
But I don't mind it too much, since we leak the reference everywhere you could just use PyUnicode_Format("%s%s\n--\n\n", ...) and then leak it to avoid the annoying size claculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit awkward for user dtypes like
rational, since they have no good name, they still just usenumpy.dtype[name].
Hmm I'm not sure I follow
I am wondering if we should just not set any
tp_dochere and then just useadd_newdocfrom Python (it doesn't like when docs are set).
The add_newdoc seems a bit hacky to me, and I can imagine that there's also a bit of overhead involved at import time. Tbh I put in in C instead of Python because that's where the docs are currently set at. But I don't mind moving it to Python
But I don't mind it too much, since we leak the reference everywhere you could just use
PyUnicode_Format("%s%s\n--\n\n", ...)and then leak it to avoid the annoying size claculation.
Eh not sure what you mean with that 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check what rational is, it's called numpy.dtype[rational] not numpy.dtypes.<SomeName>. It's kinda fine because there are exactly those two schemes, but the naming scheme is not guaranteed here.
The add_newdoc seems a bit hacky to me,
Oh, it is hacky, but it is a hack we use a lot and it's there for a reason (it's a pain to do large docstrings in C). So I feel we use it for everything anyway, so there is no point in avoiding it.
But... this matters only if we have real docs, if we add just this, then it's not too useful.
Eh not sure what you mean with that
Was thinking to just replace the snprintf and custom malloc by creating a Python string and then fetching its data. But OK.
| } | ||
|
|
||
| /* Set the text signature for inspect.signature() */ | ||
| const char *base_doc = ((PyTypeObject *)dtype_class)->tp_doc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well move the base doc from above here, since we set tp_doc later anyway I think.
|
I think I'll create an alternative PR for this in pure Python, so we can see which one we like better. |
np.dtype and np.dtypes.* runtime signaturesnp.dtype and np.dtypes.* runtime signatures (C)
|
There is #10167 to refactor the hacky |
|
closing in favor of #30169 |
This fixes `inspect.signature` for - `np.random.BitGenerator` - `np.random.Generator` - `np.random.MT19937` - `np.random.PCG64` - `np.random.PCG64DXSM` - `np.random.Philox` - `np.random.RandomState` - `np.random.SFC64` - `np.random.SeedSequence` - `np.random.bit_generator.SeedlessSeedSequence` This also fixes a typo in `bit_generator.pxd` that accidentally defined an empty unused class `np.random.bit_generator.SeedlessSequence`. Related to numpy#30104, numpy#30114, numpy#30121, numpy#30124, numpy#30126, numpy#30137, numpy#30138, numpy#30140, numpy#30143, numpy#30146, numpy#30147, and numpy#30155
This fixes `inspect.signature` for - `np.random.BitGenerator` - `np.random.Generator` - `np.random.MT19937` - `np.random.PCG64` - `np.random.PCG64DXSM` - `np.random.Philox` - `np.random.RandomState` - `np.random.SFC64` - `np.random.SeedSequence` - `np.random.bit_generator.SeedlessSeedSequence` This also fixes a typo in `bit_generator.pxd` that accidentally defined an empty unused class `np.random.bit_generator.SeedlessSequence`. Related to numpy#30104, numpy#30114, numpy#30121, numpy#30124, numpy#30126, numpy#30137, numpy#30138, numpy#30140, numpy#30143, numpy#30146, numpy#30147, and numpy#30155
For this one I had to go out of my confort zone, into the scary lands of C. So be sure to check if I didn't mess up anywhere. I've checked a whole nbunch of related signatures in IPython to ensure that everything still works as it should, and the tests are pretty thorough, but C tends to strictly follow Murphy's law for some reason, so I guess you never know 🤷
Anyway, this fixes the
inspect.signatureerrors when called onnp.dtype,np.dtype.newbyteorder, and any of thenumpy.dtypessubclasses ofdtype(includingStringDType)Related to #30104, #30114, #30121, #30124, #30126, #30137, #30138, #30140, and #30143