
Conversation

@mhvk (Contributor) commented Oct 10, 2025

While trying to store data on the instance, I found a few places where code needlessly relies on details of memory allocation. Those cases are worth fixing even though I'm considering a better scheme that would avoid the problems entirely, so I thought I might as well put them in a separate PR.

For ease of reference, the three commits are:

MAINT: Avoid "fooling deallocator" in array_descr_set
Found this caused problems while trying to implement storing the
dimensions and strides on the instance. Splitting it out since it is
generally bad form to assume too much about the inner workings of, e.g.,
the deallocator, and it has no real benefits beyond trying to be clever.

MAINT: Use PyArray_Resize to grow arrays in FromFile and FromIter
Found this while trying to implement storage on the instance, but
factored it out since it seems better form to have just one place
in numpy where arrays are resized in place (which ideally would not
happen at all, but here it is quite logical).

For PyArray_FromFile, there is little benefit beyond that, but in
PyArray_FromIter, it allows removing quite a bit of code that
duplicated what PyArray_Resize does.

MAINT: Ensure PyArray_Resize replaces empty allocation with 1 byte.
Minimum allocation should be 1, not elsize. Also use the default
function to deallocate dimension & strides.
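
For context, the "fooling" that the first commit removes is a pattern like the following (a schematic reconstruction based on the snippet discussed below, not the exact code): before dropping the temporary array, its fields are cleared so that the deallocator will not free the dimensions buffer that has been handed over to another array.

    /* Schematic of the anti-pattern: clear temp's fields so that
     * array_dealloc skips freeing the dimensions/strides buffers,
     * which have been transplanted onto another array.  This only
     * keeps working as long as the deallocator's internals never
     * change. */
    ((PyArrayObject_fields *)temp)->nd = 0;
    ((PyArrayObject_fields *)temp)->dimensions = NULL;
    Py_DECREF(temp);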

@seberg (Member) left a comment


Wondering slightly if a more raw thing might be helpful, but not sure; then you still need to update the shape correctly. Although... all of these functions that reasonably resize clearly have to update the shape all the time anyway?

(But otherwise the None return is a thorn, although I am not strictly sure it's worth removing...)

One thing: I think you should just use dims[NPY_MAXDIMS] at least for any of the temporary ones just to simplify the code...
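
A minimal sketch of that suggestion for the 1-d case (illustrative only; the names elcount and new_dims are taken from the diffs further down):

    /* Sketch: a fixed-size scratch shape avoids a heap allocation and
     * the corresponding allocation-error check for temporaries;
     * NPY_MAXDIMS bounds the dimensionality of any ndarray. */
    npy_intp dims[NPY_MAXDIMS];
    PyArray_Dims new_dims = {dims, 1};
    dims[0] = elcount;  /* desired new length of the 1-d array */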

@mhvk mhvk force-pushed the array-avoid-assumptions branch from db194f6 to 545a8e5 on October 12, 2025 02:10
@mhvk (Contributor Author) left a comment


@seberg - thanks for the comments! With your suggestion of just using a fixed dims[NPY_MAXDIMS] the code is indeed clearer (and more correct, in not forgetting to check for allocation errors...).

Note that I found another culprit, the textreader (which only uses UserRENEW, not UserNEW, hence an earlier grep failed to find it...). Here the use of PyArray_Resize is even nicer, since the equivalent of dims is already available in the form of result_shape.

        goto error;
    }
    if (needs_init) {
        memset(PyArray_BYTES(data_array), 0, PyArray_NBYTES(data_array));
@mhvk (Contributor Author):


There is no need to initialize the memory here even if the dtype needs it - that is already done by PyArray_SimpleNewFromDescr.

    data_ptr = PyArray_DATA(data_array) + row_count * row_size;
    data_allocated_rows = new_rows;
    if (needs_init) {
        memset(data_ptr, '\0', (new_rows - row_count) * row_size);
@mhvk (Contributor Author):


PyArray_Resize also takes care of initialization. Nice number of lines saved!
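
For reference, a sketch of the resulting pattern, assuming the grow step from this reader (data_array and new_rows appear in the snippets above; dims and new_dims as in the earlier sketch; the Py_DECREF of the None return is the thorn mentioned earlier):

    /* Sketch: grow in place via PyArray_Resize instead of
     * PyDataMem_UserRENEW plus a manual memset.  On success it
     * returns a new reference to None, which must be released, and
     * it fills the newly grown part as np.zeros would. */
    dims[0] = new_rows;
    PyObject *res = PyArray_Resize(data_array, &new_dims, 0, NPY_CORDER);
    if (res == NULL) {
        goto error;
    }
    Py_DECREF(res);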

@mhvk mhvk force-pushed the array-avoid-assumptions branch from 545a8e5 to 3b81728 on October 12, 2025 02:19
@seberg (Member) left a comment


OK, took a bit closer look; I am happy with this overall. A few comments (but I can make a pass tomorrow or so).

(Would be curious whether this is noticeable speed-wise, but I may try myself also.)

-    new_data = PyDataMem_UserRENEW(
-            PyArray_BYTES(ret), nbytes, PyArray_HANDLER(ret));
+    dims[0] = elcount;
+    res = PyArray_Resize(ret, &new_dims, 0, 0);
@seberg (Member):


Maybe we should put NPY_CORDER as the last argument, for clarity.

@mhvk (Contributor Author):


Now that I've created the internal version, I just omitted that entry - it was unused.

        elcount = (i >> 1) + (i < 4 ? 4 : 2) + i;
        npy_intp nbytes = 0;
        PyObject *res = NULL;
        if (!npy_mul_sizes_with_overflow(&nbytes, elcount, elsize)) {
@seberg (Member):


This mul_sizes_with_overflow protects against overflows (mainly on 32-bit systems).
PyArray_Resize() should already do that (I dunno if I trust it to, though).

So this branch should be unnecessary now (which also cleans up the slightly weird error logic).

@mhvk (Contributor Author):


Good catch, indeed it does check. And I think we should trust it...

    /* Fool deallocator not to delete these */
    ((PyArrayObject_fields *)temp)->nd = 0;
    ((PyArrayObject_fields *)temp)->dimensions = NULL;
    Py_INCREF(newtype);
@seberg (Member):


I guess we still mess with it in one error path, but yeah, I don't mind doing this slightly longer dance here.

@mhvk (Contributor Author):


Agreed it is still not perfect, but it doesn't fit resize, and here I just wanted not to mess with temp.

@mhvk (Contributor Author) commented Oct 15, 2025


I've got a PR halfway done that changes setstate to basically deallocate everything but the object itself (using that part of array_dealloc) and hopefully then allocate using the relevant part of PyArray_FromDescr_int.

For here, though, I'd suggest leaving it at this -- at least the deallocation of temp is now not messed with.
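
For reference, a hypothetical sketch of that slightly longer dance (PyDimMem_NEW is the public dimension allocator; the actual commit may use numpy's internal dimension cache instead): self gets its own copy of temp's shape and strides, and temp is then released without any trickery.

    /* Hypothetical sketch: copy temp's shape/strides into a buffer
     * owned by self instead of stealing temp's buffers and NULLing
     * its fields.  (Freeing self's previous buffer is elided.) */
    PyArrayObject_fields *fa = (PyArrayObject_fields *)self;
    int nd = PyArray_NDIM(temp);
    npy_intp *newdims = PyDimMem_NEW(2 * nd);  /* dims, then strides */
    if (newdims == NULL) {
        Py_DECREF(temp);
        return -1;
    }
    memcpy(newdims, PyArray_DIMS(temp), nd * sizeof(npy_intp));
    memcpy(newdims + nd, PyArray_STRIDES(temp), nd * sizeof(npy_intp));
    fa->dimensions = newdims;
    fa->strides = newdims + nd;
    Py_DECREF(temp);  /* deallocates normally, frees its own buffers */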

     }
     new_data = PyDataMem_UserRENEW(PyArray_DATA(self),
-                                   newnbytes == 0 ? elsize : newnbytes,
+                                   newnbytes == 0 ? 1 : newnbytes,
@seberg (Member):


Nice catch, was changed everywhere else!

@mhvk mhvk force-pushed the array-avoid-assumptions branch from 3b81728 to e259933 on October 12, 2025 21:42
@mhvk (Contributor Author) left a comment


Thanks for the detailed review! The multiple times I forgot to decref the None returned by PyArray_Resize convinced me that we should, after all, have an internal version...
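
A minimal sketch of what such an internal version can look like (the name resize_inplace is made up; the helper actually added in the PR may differ):

    /* Sketch: same semantics as PyArray_Resize, but with an int
     * status return, so callers cannot leak the None return. */
    static int
    resize_inplace(PyArrayObject *arr, PyArray_Dims *newshape)
    {
        PyObject *res = PyArray_Resize(arr, newshape, 0, NPY_CORDER);
        if (res == NULL) {
            return -1;
        }
        Py_DECREF(res);  /* PyArray_Resize returns a new reference to None */
        return 0;
    }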


@seberg (Member) commented Oct 15, 2025

Looking at the freeing bug, there is one fun little difference here for objects. Unlike Resize, the old code always just NULLed everything for initialization, which means shrinking doesn't need to clean up here.
But Resize actually fills as if you called np.zeros, which is a bit weird. Dunno if that should worry us; if we have objects, we probably don't need to worry much about the speed of that.
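
To make the difference concrete, a hypothetical illustration (zero_object_buffer is a made-up name): np.zeros-style filling of an object buffer stores a real reference in every slot, which cleanup code must later release, whereas NULL slots can simply be skipped.

    /* Hypothetical illustration: np.zeros-style initialization of an
     * object buffer inserts a reference (Py_None in the discussion
     * above) into every slot, unlike leaving the slots NULL. */
    static void
    zero_object_buffer(PyObject **items, npy_intp n)
    {
        for (npy_intp i = 0; i < n; i++) {
            Py_INCREF(Py_None);
            items[i] = Py_None;
        }
    }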

@mhvk (Contributor Author) commented Oct 15, 2025

Hmm, I actually think it is better to be safe. But really, what happens should be the same between construction of an array and resizing. I saw that PyArray_FromDescr_int has a whole dance between calloc and fill_zero_info.func, which seems very similar to what PyArray_ZeroContiguousBuffer does (and maybe could be factored out?). So it seems reasonable to do the same for resizing of arrays in FromIter and FromFile.

@seberg (Member) commented Oct 15, 2025

Hmm, I actually think it is better to be safe.

I don't mind cutting this corner (since we size it to exactly what we wrote, everything else would be horribly wrong in more than one way). But... it is maybe not worthwhile avoiding this (at least for shrinking). FWIW, for growing, I wouldn't mind being able to just init rather than zero fill.

PyArray_FromDescr_int could maybe use the PyArray_ZeroContiguousBuffer helper. But it has to do parts of the same setup to be able to decide whether it can use calloc or not.

@mhvk (Contributor Author) commented Nov 11, 2025

@seberg - was looking at this PR and trying to see what needs to be done. I think we were down to wondering whether or not objects should be initialized as NULL or None - my sense for now is that this only matters for speed in a corner case where speed is not very important, so maybe keep it simple here? If we ever want to, we can rewrite resize to initialize objects to NULL instead of None...

@seberg (Member) commented Nov 11, 2025

Yes, agreed. I tried timing briefly, reading a CSV file with:

In [3]: with open("/tmp/asdf.csv", "w") as f:
   ...:     f.write("1\n" * 1000000)
   ...: 

In [4]: %timeit np.loadtxt("/tmp/asdf.csv", dtype=object)

This went from 26 ms to 29 ms on my computer (hot file cache), and from 46 ms to 37 ms if I use dtype="O,i", delimiter="," and a two-column file.

That isn't nothing, but also not a crazy slowdown, I guess. So yeah, I am fine with just doing this. I suppose conceptually, I still think these internal uses are a bit different from Python side resizing (where a np.zeros style API is also weird, but full initialization makes more sense).

mhvk added 4 commits November 12, 2025 12:36
@mhvk mhvk force-pushed the array-avoid-assumptions branch from e259933 to 9b8027f on November 12, 2025 17:55
@mhvk (Contributor Author) commented Nov 12, 2025

I think the reading slowdowns are probably at an acceptable level, especially as they only affect object dtypes, which are not very common. I also think we can solve them if we allow passing something in to tell PyArray_ZeroContiguousBuffer whether zeroing an object means writing NULL or inserting Py_None. Indeed, that might well help code elsewhere...

@mhvk (Contributor Author) commented Nov 12, 2025

P.S. Rebased to get rid of the merge conflict.

@seberg (Member) commented Nov 12, 2025

I also think we can solve them if we allow passing something in to tell PyArray_ZeroContiguousBuffer whether zeroing an object means writing NULL, or inserting Py_None. Indeed, that might well help code elsewhere...

An additional helper PyArray_InitContiguousBufferIfNeeded() may be useful, but these seem like two very different concepts. ZeroContiguousBuffer is explicitly for np.zeros(), which isn't used in many places, so I don't think it would be helpful to mix up the two.

But yeah, I am tempted to add a flag to the _int version (call it "unsafe" if you like) that causes only init-if-needed for growing and maybe also promises to only shrink unused space.
Basically, because any sane code where this isn't OK is buggy anyway (i.e. file readers/fromiter).

Anyway, we can just put it in when this passes; I may follow up.

@mhvk mhvk merged commit 60878d2 into numpy:main Nov 12, 2025
77 checks passed
@mhvk (Contributor Author) commented Nov 12, 2025

Agreed with the "init_if_needed" idea. And also with doing it in a follow-up 😺 (tests passed, so I went ahead and merged...)

@mhvk mhvk deleted the array-avoid-assumptions branch November 12, 2025 19:20
@seberg (Member) commented Nov 13, 2025

Thanks for merging, sorry for forgetting.

