Skip to content

Comments

Revert to using non-Numy affine transformation operation#3313

Closed
sgillies wants to merge 1 commit intomaint-1.4from
issue3310
Closed

Revert to using non-Numy affine transformation operation#3313
sgillies wants to merge 1 commit intomaint-1.4from
issue3310

Conversation

@sgillies
Copy link
Member

Resolves #3310

@sgillies sgillies self-assigned this Feb 17, 2025
@sgillies
Copy link
Member Author

@groutr what do you think? Looks like we have a bad interaction between Python and Numpy math on a particular platform.

@groutr
Copy link
Contributor

groutr commented Feb 18, 2025

Very interesting. I can take a deeper look at it tomorrow when I'm back at a computer.

@groutr
Copy link
Contributor

groutr commented Feb 18, 2025

After thoroughly reading the bug reports more in-depth, I'm a big 👎 on this solution. I doubt this is an issue of mixing Python and Numpy and seems to point to the BLAS/LAPACK library. I don't think universally reverting a big improvement of rasterio 1.4 is the correct way to resolve this.

If we are adamant about the issue being fixed in rasterio, I think a less intrusive fix would be to limit the precision by rounding before applying the floor operation. There may be value in first scrutinizing how BLAS/LAPACK are being packaged on opensuse as mentioned in https://bugzilla.suse.com/show_bug.cgi?id=1236116#c5. If the test suite for BLAS/LAPACK is having issues, then everything that relies on BLAS/LAPACK will also be affected.

@lpechacek the reason numpy is used for matrix multiplication here is because, without it, the process of transforming any large number of points is unbearably slow.

@lpechacek
Copy link

lpechacek commented Feb 20, 2025

After thoroughly reading the bug reports more in-depth, I'm a big 👎 on this solution. I doubt this is an issue of mixing Python and Numpy and seems to point to the BLAS/LAPACK library. I don't think universally reverting a big improvement of rasterio 1.4 is the correct way to resolve this.

Understood. I also originally thought that the problem is in BLAS and how Numpy uses it. That is why I originally titled my openSUSE report as an "unreliable calculation". After reading comments from colleagues whose opinion I value, I changed my mind and accepted the idea that there are different kinds of "floating point number" (beyond float, double, and long double).

If we are adamant about the issue being fixed in rasterio, I think a less intrusive fix would be to limit the precision by rounding before applying the floor operation.

Whatever solves the issue. :)

There may be value in first scrutinizing how BLAS/LAPACK are being packaged on opensuse as mentioned in https://bugzilla.suse.com/show_bug.cgi?id=1236116#c5. If the test suite for BLAS/LAPACK is having issues, then everything that relies on BLAS/LAPACK will also be affected.

Well, we can work on openSUSE's BLAS/LAPACK but it won't help Debian. I haven't researched on the impact on distributions but I guess that all Linux distributions that support x86 (32 bit) are affected.

@lpechacek the reason numpy is used for matrix multiplication here is because, without it, the process of transforming any large number of points is unbearably slow.

Thanks for the explanation!

@lpechacek
Copy link

@lpechacek the reason numpy is used for matrix multiplication here is because, without it, the process of transforming any large number of points is unbearably slow.

Thanks for the explanation!

Pardon me, I fail to understand what "any large number of points" means. I tested the performance of rasterio.io index() and xy() and in all cases I got shorter processing times with the patch from this pull request than with the Numpy implementation.

Mind that I just a packager and know essentially nothing about the Rasterio module and related topics. Still I've got CS education and when I see an inner-loop routine that performs complicated preparation steps, I know that its performance will be degraded for individual values. Then I expect some economies of scale from processing large vectors (here I used sets of 300-3M coordinates) but I fail to detect them too.

In any case, the performance for single value lookup (e.g. index() does not accept vectors) is better with the patch here. Just saying. :)

For the record, my testing procedure:

user@vbox:~/rasterio-1.4.3> PYTHONPATH=build/lib.linux-i686-cpython-311 python3 
Python 3.11.11 (main, Dec 04 2024, 21:44:34) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> import numpy as np
>>> import rasterio
>>> src = rasterio.open('tests/data/RGB.byte.tif')
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
10.638185453999995
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
10.657873143000415
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
10.645034958999986
>>> 
user@vbox:~/rasterio-1.4.3> wget --quiet https://github.com/rasterio/rasterio/pull/3313/commits/7608a04567c9960d626b6836ae99b2665e978cee.patch
user@vbox:~/rasterio-1.4.3> patch build/lib.linux-i686-cpython-311/rasterio/transform.py < 7608a04567c9960d626b6836ae99b2665e978cee.patch
patching file build/lib.linux-i686-cpython-311/rasterio/transform.py
Hunk #1 FAILED at 6.
1 out of 1 hunk FAILED -- saving rejects to file build/lib.linux-i686-cpython-311/rasterio/transform.py.rej
patching file build/lib.linux-i686-cpython-311/rasterio/transform.py
user@vbox:~/rasterio-1.4.3> # The patch rejects is the changelog entry
user@vbox:~/rasterio-1.4.3> PYTHONPATH=build/lib.linux-i686-cpython-311 python3 
Python 3.11.11 (main, Dec 04 2024, 21:44:34) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> import numpy as np
>>> import rasterio
>>> src = rasterio.open('tests/data/RGB.byte.tif')
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
8.270643490000111
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
8.358940623999843
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(300), np.random.random_sample(300)), number = 100000)
8.308591027000148
>>> 
>>> # Maybe the 300-item vector is too short and Rasterio will choke on a larger vector?
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(30000), np.random.random_sample(30000)), number = 1000)
3.994692398999632
>>> timeit.timeit(lambda: src.xy(np.random.random_sample(3000000), np.random.random_sample(3000000)), number = 1)
0.295577516999856
>>> 

The single-point lookup:

user@vbox:~/rasterio-1.4.3> cat timing.py 
import timeit
import rasterio

with rasterio.open('tests/data/RGB.byte.tif') as src:
    left, bottom, right, top = src.bounds
    print(timeit.timeit(lambda: src.index(left, top), number=500000))

user@vbox:~/rasterio-1.4.3> PYTHONPATH=build/lib.linux-i686-cpython-311 python3 timing.py 
34.141054769999755
user@vbox:~/rasterio-1.4.3> patch build/lib.linux-i686-cpython-311/rasterio/transform.py < 7608a04567c9960d626b6836ae99b2665e978cee.patch
patching file build/lib.linux-i686-cpython-311/rasterio/transform.py
Hunk #1 FAILED at 6.
1 out of 1 hunk FAILED -- saving rejects to file build/lib.linux-i686-cpython-311/rasterio/transform.py.rej
patching file build/lib.linux-i686-cpython-311/rasterio/transform.py
user@vbox:~/rasterio-1.4.3> PYTHONPATH=build/lib.linux-i686-cpython-311 python3 timing.py 
27.806108064
user@vbox:~/rasterio-1.4.3> echo $[100*28/34]
82

@groutr
Copy link
Contributor

groutr commented Feb 24, 2025

If we performed this computation in a way that avoids directly computing an inverse, we can test if the accuracy errors are being introduced by the inverse transform. Since I don't have 32bit hardware to test on, I'm curious to see what the following results in.

import numpy as np
import rasterio
# This is the transform from the RGB.byte.tif test file.
test_file = rasterio.open("./tests/data/RGB.byte.tif")
transform = np.array(test_file.transform).reshape(3,3)
bounds = test_file.bounds
height = test_file.height
width = test_file.width
# transform = [300.0379266750948,0.0,101985.0,0.0,-300.041782729805, 2826915.0, 0.0, 0.0, 1.0]
# bounds = (101985.0, 2611485.0, 339315.0, 2826915.0)
# height = 791
# width = 718

pxs = np.array([[bounds[0], bounds[3]], [bounds[2], bounds[3]], [bounds[2], bounds[1]], [bounds[0], bounds[1]]])
coords = np.linalg.solve(transform, pxs)   # solve for the coordinates, but without directly inverting the transform
expected = np.array([[0, width, width, 0], [0, 0, height, height]])
np.allclose(np.floor(coords[:2]), expected)

If that test passes, then perhaps AffineTransformer could be improved by not relying on affine to directly compute an inverse, which would be preferred.

class AffineTransformer(TransformerBase):
    """A pure Python class related to affine based coordinate transformations."""
    def __init__(self, affine_transform):
        super().__init__()
        if not isinstance(affine_transform, Affine):
            raise ValueError("Not an affine transform")
        self._transformer = affine_transform
        self._transformer_arr = np.array(affine_transform, dtype='float64').reshape(3, 3)

    def _transform(self, xs, ys, zs, transform_direction):
        bi = np.broadcast(xs, ys)
        input_matrix = np.empty((3, bi.size))
        input_matrix[0] = bi.iters[0]
        input_matrix[1] = bi.iters[1]
        input_matrix[2] = 1

        if transform_direction is TransformDirection.forward:
            transformed = np.matmul(self._transformer_arr, input_matrix) 
        elif transform_direction is TransformDirection.reverse:
            transformed = np.linalg.solve(self._transformer_arr, input_matrix)
        return transformed[0], transformed[1]

@lpechacek
Copy link

Thanks for the proposal!

I'm curious to see what the following results in.

Triggers a run time error:

lpechacek@fmn:/data/src/rasterio> cat > /tmp/rasterio-test-script.py
import numpy as np
import rasterio
# This is the transform from the RGB.byte.tif test file.
test_file = rasterio.open("./tests/data/RGB.byte.tif")
transform = np.array(test_file.transform).reshape(3,3)
bounds = test_file.bounds
height = test_file.height
width = test_file.width
# transform = [300.0379266750948,0.0,101985.0,0.0,-300.041782729805, 2826915.0, 0.0, 0.0, 1.0]
# bounds = (101985.0, 2611485.0, 339315.0, 2826915.0)
# height = 791
# width = 718

pxs = np.array([[bounds[0], bounds[3]], [bounds[2], bounds[3]], [bounds[2], bounds[1]], [bounds[0], bounds[1]]])
coords = np.linalg.solve(transform, pxs)   # solve for the coordinates, but without directly inverting the transform
expected = np.array([[0, width, width, 0], [0, 0, height, height]])
np.allclose(np.floor(coords[:2]), expected)
lpechacek@fmn:/data/src/rasterio> python3 /tmp/rasterio-test-script.py
Traceback (most recent call last):
  File "/tmp/rasterio-test-script.py", line 15, in <module>
    coords = np.linalg.solve(transform, pxs)   # solve for the coordinates, but without directly inverting the transform
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/site-packages/numpy/linalg/_linalg.py", line 413, in solve
    r = gufunc(a, b, signature=signature)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: solve: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 4 is different from 3)

If that test passes, then perhaps AffineTransformer could be improved by not relying on affine to directly compute an inverse, which would be preferred.

This makes the tests pass. The patch is https://build.opensuse.org/projects/home:LPechacek:branches:Application:Geo/packages/python-rasterio/files/v1.4.3-transform-hotfix.patch?expand=1

The build results, which include running the tests, are at https://build.opensuse.org/package/show/home:LPechacek:branches:Application:Geo/python-rasterio .

@groutr
Copy link
Contributor

groutr commented Feb 26, 2025

@lpechacek Thank you for testing the np.linalg.solve method! I'm relieved to hear that it passes the tests. I'm a +1 on that method as I think it's a more mathematically and numerically sound approach.

@sgillies I can submit the PR later this evening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants