Skip to content

Conversation

@mwtoews
Copy link
Member

@mwtoews mwtoews commented May 14, 2024

Recent CI failures have picked up a recent NumPy 2.0 change, noted in the migration guide under "Adapting to changes in the copy keyword".

A simple fix appears to be to simply add copy=None to the __array__ signature. This value is not used by the method.


Here is an example failure:

________________________ TestCoords.test_data_promotion ________________________

self = <shapely.tests.geometry.test_coords.TestCoords object at 0x7f6bbc72fb90>

    def test_data_promotion(self):
        coords = np.array([[12, 34], [56, 78]], dtype=np.float32)
>       processed_coords = np.array(LineString(coords).coords)
E       DeprecationWarning: __array__ implementation doesn't accept a copy keyword, so passing copy=False failed. __array__ must implement 'dtype' and 'copy' keyword arguments.

shapely/tests/geometry/test_coords.py:16: DeprecationWarning

Note that test failures with macos in this PR are unrelated.

@jorisvandenbossche
Copy link
Member

Thanks! I had noted this as a TODO item in #1972, but didn't yet get to it (and didn't expect our CI to fail for it, but so this is a good reminder ;))

Ideally we would honor the copy keyword if passed. I think in our case, this just means raising an error if copy is False, because the coordinates we return are always a copy?
But this is less critical, and could also be done later in a follow-up.

@mwtoews
Copy link
Member Author

mwtoews commented May 14, 2024

I had a play with various copy=True/False/None versions, and it all seems minor impact, mostly because CoordinateSequence objects aren't really used as separate objects. Each .coords property access creates a new sequence object by default.

A simple implementation would be:

    def __array__(self, dtype=None, copy=None):
        if copy:
            return self._coords.copy()
        return self._coords

then tested (somewhere):

def test_coords_array():
    coord_seq = point.coords
    assert np.array(coord_seq) is not np.array(coord_seq)
    assert np.array(coord_seq, copy=False) is np.array(coord_seq, copy=False)
    assert np.array(coord_seq, copy=True) is not np.array(coord_seq, copy=True)

@coveralls
Copy link

coveralls commented May 14, 2024

Pull Request Test Coverage Report for Build 9087673717

Details

  • 6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall first build on numpy-copy at 88.138%

Totals Coverage Status
Change from base Build 9076561797: 88.1%
Covered Lines: 2623
Relevant Lines: 2976

💛 - Coveralls

@jorisvandenbossche
Copy link
Member

Hmm, yes, it depends on how you interpret the CoordinateSequence object. Is that just a wrapper around an already materialized numpy array self._coords (how it is implemented now in practice) or is it a logical abstraction around the geometry's coord seq (how it was actually also implemented in shapely 1.x)?

Because in the former case, you can indeed say that np.array(geom.coords, copy=False) is fine because getting the .coords object in geom.coords itself has already made a copy, and so the subsequent conversion to numpy does not require a copy.
But in the latter case you could argue that np.array(geom.coords, copy=False) should raise an error because it can never the the geom.coords without involving a copy of the actual values (like, np.array(geom.coords, copy=False) is np.array(geom.coords, copy=False) will never be true).

Now, since this CoordinateSequence class is a separate object from the Geometry objects, it might be that the former interpretation is the only viable one.

@mwtoews mwtoews changed the title BUG: add copy=None to CoordinateSequence.__array__ for NumPy 2 ENH: add copy=None to CoordinateSequence.__array__ May 14, 2024
@mwtoews
Copy link
Member Author

mwtoews commented May 14, 2024

From The __array__() method docs "where copy=False should raise an exception if a copy is needed". And the basic dispatch example raises ValueError for this case. This PR is now updated, renamed as an enhancement.

@mwtoews
Copy link
Member Author

mwtoews commented Jun 13, 2024

Any further review/comment @jorisvandenbossche ?

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this fell off my radar. Let's get this in to fix the CI failures

Comment on lines +52 to +53
elif copy is True:
return self._coords.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On thing I am still wondering: if we say that a copy is always created (in the error message above), then maybe we should not actually do the explicit copy here?

Of course, if one saved the .coords separately, one might still expect this actual copy ..

coords = geom.coords
coords_arr1 = np.asarray(coords, copy=True)
coords_arr2 = np.asarray(coords, copy=True)

i.e. one would expect in the case above that those two numpy arrays are not views of each other .. So I assume the above explicit copy is fine. We are just a bit inconsistent about the copy=False vs copy=True

@jorisvandenbossche jorisvandenbossche merged commit 07fde27 into shapely:main Jun 14, 2024
@mwtoews mwtoews deleted the numpy-copy branch June 14, 2024 09:38
mwtoews added a commit to mwtoews/shapely that referenced this pull request Jul 7, 2024
@mwtoews mwtoews mentioned this pull request Aug 23, 2024
@theroggy theroggy added this to the 2.1 milestone Feb 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants