MAINT: Avoid assumptions about how memory is allocated #29925
Conversation
seberg left a comment
Wondering slightly if a more raw thing might be helpful, but not sure; then you still need to update the shape correctly. Although... all of these functions that reasonably resize clearly have to update shape all the time anyway?
(But otherwise the None return is a thorn, although I am not strictly sure it's worth removing...)
One thing: I think you should just use dims[NPY_MAXDIMS] at least for any of the temporary ones just to simplify the code...
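For illustration, a minimal sketch of the dims[NPY_MAXDIMS] pattern being suggested; the helper name and surrounding code are assumed, not taken from the PR:

```c
#include <Python.h>
#include <string.h>
#include <numpy/arrayobject.h>

/* Hypothetical helper (not numpy code): resize `arr` along axis 0 using
 * a stack shape buffer of NPY_MAXDIMS entries, so building the new
 * shape itself cannot fail and needs no allocation check. */
static int
grow_first_axis(PyArrayObject *arr, npy_intp new_length)
{
    npy_intp dims_buffer[NPY_MAXDIMS];
    PyArray_Dims new_dims = {dims_buffer, PyArray_NDIM(arr)};

    /* Copy the current shape and update only the resized dimension. */
    memcpy(dims_buffer, PyArray_DIMS(arr),
           PyArray_NDIM(arr) * sizeof(npy_intp));
    dims_buffer[0] = new_length;

    /* PyArray_Resize returns a new reference to None on success. */
    PyObject *res = PyArray_Resize(arr, &new_dims, 0, NPY_CORDER);
    if (res == NULL) {
        return -1;
    }
    Py_DECREF(res);
    return 0;
}
```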
Force-pushed db194f6 to 545a8e5
mhvk left a comment
@seberg - thanks for the comments! With your suggestion of just using a fixed dims[NPY_MAXDIMS] the code is indeed clearer (and more correct, in not forgetting to check for allocation errors...).
Note that I found another culprit, the textreader (which only uses UserRENEW, not UserNEW, hence an earlier grep failed to find it...). Here the use of PyArray_Resize is even nicer, since the equivalent of dims is already available in the form of result_shape.
```c
            goto error;
        }
        if (needs_init) {
            memset(PyArray_BYTES(data_array), 0, PyArray_NBYTES(data_array));
```
There is no need to initialize the memory here if the dtype needs it - that is already done by PyArray_SimpleNewFromDescr.
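A hedged illustration of this point, with placeholder names (ndim, shape, descr, error are assumptions, not the PR's code): PyArray_SimpleNewFromDescr goes through the regular array constructor, which already hands back zeroed memory for dtypes flagged NPY_NEEDS_INIT, so a manual memset afterwards is redundant.

```c
/* Sketch (placeholder names): for dtypes flagged NPY_NEEDS_INIT, such
 * as object arrays, the constructor already returns zeroed memory. */
Py_INCREF(descr);  /* PyArray_SimpleNewFromDescr steals this reference */
PyArrayObject *data_array = (PyArrayObject *)PyArray_SimpleNewFromDescr(
        ndim, shape, descr);
if (data_array == NULL) {
    goto error;
}
/* No memset needed here, even when
 * PyDataType_FLAGCHK(descr, NPY_NEEDS_INIT) is true. */
```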
```c
        data_ptr = PyArray_DATA(data_array) + row_count * row_size;
        data_allocated_rows = new_rows;
        if (needs_init) {
            memset(data_ptr, '\0', (new_rows - row_count) * row_size);
```
PyArray_Resize also takes care of initialization. Nice number of lines saved!
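For concreteness, a hedged before/after sketch of the textreader change being described; all names are approximations of the real code, not the exact diff:

```c
/* Before (approximate): manual realloc plus explicit zero-fill. */
char *new_data = PyDataMem_UserRENEW(
        PyArray_DATA(data_array), new_rows * row_size,
        PyArray_HANDLER(data_array));
if (new_data == NULL) {
    goto error;
}
if (needs_init) {
    memset(new_data + row_count * row_size, '\0',
           (new_rows - row_count) * row_size);
}

/* After (approximate): one PyArray_Resize call reallocates and
 * zero-fills the newly grown region; result_shape already holds the
 * target dimensions. */
result_shape[0] = new_rows;
PyArray_Dims new_dims = {result_shape, PyArray_NDIM(data_array)};
PyObject *res = PyArray_Resize(data_array, &new_dims, 0, NPY_CORDER);
if (res == NULL) {
    goto error;
}
Py_DECREF(res);
```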
Force-pushed 545a8e5 to 3b81728
seberg left a comment
OK, a bit closer look: I am happy with this overall, a few comments (but I can make a pass tomorrow or so).
(Would be curious if this is noticeable speed-wise, but may try myself also.)
numpy/_core/src/multiarray/ctors.c (outdated)

```diff
-        new_data = PyDataMem_UserRENEW(
-                PyArray_BYTES(ret), nbytes, PyArray_HANDLER(ret));
+        dims[0] = elcount;
+        res = PyArray_Resize(ret, &new_dims, 0, 0);
```
Maybe we should put NPY_CORDER as the last argument for clarity.
Now that I've created the internal version, I just omitted that entry - it was unused.
numpy/_core/src/multiarray/ctors.c (outdated)

```c
        elcount = (i >> 1) + (i < 4 ? 4 : 2) + i;
        npy_intp nbytes = 0;
        PyObject *res = NULL;
        if (!npy_mul_sizes_with_overflow(&nbytes, elcount, elsize)) {
```
This npy_mul_sizes_with_overflow protects against overflows (mainly on 32-bit systems).
PyArray_Resize() should already do that (I dunno if I trust it to, though).
So this branch should be unnecessary now (which also cleans up the slightly weird error logic).
Good catch, indeed it does check. And I think we should trust it...
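For reference, a hedged sketch of the guard made redundant here; npy_mul_sizes_with_overflow is a numpy-internal helper that, to my understanding, returns nonzero when the product overflows, and elcount/elsize/error come from the surrounding hunk:

```c
/* The now-redundant pattern: check elcount * elsize for overflow before
 * reallocating.  PyArray_Resize performs an equivalent check itself
 * when computing the new allocation size. */
npy_intp nbytes;
if (npy_mul_sizes_with_overflow(&nbytes, elcount, elsize)) {
    PyErr_NoMemory();
    goto error;
}
```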
```c
        /* Fool deallocator not to delete these */
        ((PyArrayObject_fields *)temp)->nd = 0;
        ((PyArrayObject_fields *)temp)->dimensions = NULL;
        Py_INCREF(newtype);
```
I guess we still mess with one error, but yeah I don't mind doing this slightly longer dance here.
Agreed it is still not perfect, but it doesn't fit resize and here I just wanted not to mess with temp.
I've got a PR halfway that changes setstate to basically deallocate all but itself (using that part of array_dealloc) and hopefully then allocate using the relevant part of PyArray_FromDescr_int.
For here, though, I'd suggest to leave it at this -- at least the deallocation of temp is now not messed with.
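A hedged sketch of the direction being taken, not the PR's actual diff: instead of blanking temp's fields so its deallocator skips them, copy the shape out and let temp free its own buffers normally.

```c
/* Sketch (not the PR's exact code): copy the shape out of temp rather
 * than stealing its dimensions buffer, so temp's deallocator can run
 * normally and no field surgery is required. */
npy_intp dims_copy[NPY_MAXDIMS];
int nd = PyArray_NDIM(temp);
memcpy(dims_copy, PyArray_DIMS(temp), nd * sizeof(npy_intp));
/* ... install dims_copy/nd on the target array here ... */
Py_DECREF(temp);  /* frees its own data/dimensions; nothing to fool */
```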
```diff
         }
         new_data = PyDataMem_UserRENEW(PyArray_DATA(self),
-                newnbytes == 0 ? elsize : newnbytes,
+                newnbytes == 0 ? 1 : newnbytes,
```
Nice catch, was changed everywhere else!
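As a hedged illustration of the edge case (the surrounding code is approximate): when an array shrinks to zero elements, the reallocation request must still be at least one byte, since asking a realloc-style API for zero bytes has implementation-defined behavior.

```c
/* Sketch: shrinking to zero elements must keep a valid, nonzero-size
 * allocation.  Clamp to 1 byte, not elsize, which would over-allocate
 * for nothing. */
char *new_data = PyDataMem_UserRENEW(
        PyArray_DATA(self),
        newnbytes == 0 ? 1 : newnbytes,
        PyArray_HANDLER(self));
if (new_data == NULL) {
    PyErr_NoMemory();
    return NULL;
}
```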
Force-pushed 3b81728 to e259933
mhvk left a comment
Thanks for the detailed review! The multiple times I forgot to decref the None from PyArray_Resize convinced me that we should after all have an internal version...
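The internal version itself isn't quoted in this excerpt; a minimal sketch of what such a wrapper could look like (name and exact signature are my assumption, and the helper actually added may differ):

```c
/* Hypothetical internal wrapper: report success/failure as an int and
 * absorb the awkward new-reference-to-None that PyArray_Resize
 * returns, so callers cannot forget the decref. */
static int
resize_inplace(PyArrayObject *self, PyArray_Dims *newshape)
{
    PyObject *res = PyArray_Resize(self, newshape, 0, NPY_CORDER);
    if (res == NULL) {
        return -1;
    }
    Py_DECREF(res);  /* nothing useful in the returned None */
    return 0;
}
```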
Looking at the freeing bug, there is one fun little difference here for objects. Unlike […]
Hmm, I actually think it is better to be safe. But really, what should happen should be the same between construction of an array and resizing. I saw that […]
I don't mind cutting this corner (since we size it to exactly what we wrote, everything else would be horribly wrong in more than one way). But... it is maybe not worthwhile avoiding this (at least for shrinking). FWIW, for growing, I wouldn't mind being able to just init rather than zero fill.
@seberg - was looking at this PR and trying to see what needs to be done. I think we were down to wondering whether or not objects should be initialized as NULL or […]
Yes, agreed. I tried timing briefly, reading a csv file with: […] Which went from 26 ms to 29 ms on my computer (hot file cache); 46 to 37 if I use […]. That isn't nothing, but also not a crazy slowdown, I guess. So yeah, I am fine with just doing this. I suppose conceptually, I still think these internal uses are a bit different from Python-side resizing (where a […]
Force-pushed e259933 to 9b8027f
I think the reading slowdowns are probably at an acceptable level, especially as they only affect objects, which are not very common. I also think we can solve them if we allow passing something in to tell […]
p.s. Rebased to get rid of the merge conflict.
An additional helper […]. But yeah, I am tempted to add a flag to the […]. Anyway, we can just put it in when this passes; I may follow up.
Agreed with the "init_if_needed" idea. And also with doing it in follow-up 😺 (test passed so I went ahead and merged...) |
Thanks for merging, sorry for forgetting.
While trying to store data on the instance, I found a few places where code needlessly relies on details of memory allocation. Those cases are worth fixing even though I'm considering a better scheme that would not have these problems, so I thought I might as well put them in a separate PR.
For ease of reference, the three commits are:
MAINT: Avoid "fooling deallocator" in array_descr_set
Found this caused problems while trying to implement storing the
dimensions and strides on the instance. Splitting it out since it is
generally bad form to assume too much about the inner workings of, e.g.,
the deallocator, and it has no real benefits beyond trying to be clever.
MAINT: Use PyArray_Resize to grow arrays in FromFile and FromIter
Found this while trying to implement storage on the instance, but
factored it out since it seems better form to just have one place
in numpy where arrays are resized inplace (which ideally would not
happen at all, but here is quite logical).
For PyArray_FromFile, there is little benefit beyond that, but in
PyArray_FromIter, it allows removing quite a bit of code that
duplicated what PyArray_Resize does.
MAINT: Ensure PyArray_Resize replaces an empty allocation with 1 byte.
Minimum allocation should be 1, not elsize. Also use the default
function to deallocate dimension & strides.