Introducing cache.invalidate() and invalidate_all() #3276

Merged
sgillies merged 3 commits into main from issue3275
Dec 15, 2024

Conversation

@sgillies sgillies commented Dec 14, 2024

I like "invalidate" as a method name. These two methods help solve the problem pointed out in the last sentence of RFC 9111 Section 4.4. If you combine Rasterio with, say, uploading files to S3 using boto3, those PUTs don't go through Rasterio's cache and thus can't be smartly invalidated.

Resolves #3275
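
The scenario described above, a write that bypasses the reader's cache followed by an explicit invalidation, can be sketched roughly as follows. This is a minimal sketch, not code from the pull request: `put_and_invalidate` and the `S3Uploader` protocol are hypothetical names, and the invalidation callable is injected so the snippet makes no claim about the exact signature of `cache.invalidate()`. The only boto3 call assumed is `put_object(Bucket=..., Key=..., Body=...)`.

```python
from typing import Callable, Protocol


class S3Uploader(Protocol):
    """The one boto3 S3 client method this sketch relies on."""

    def put_object(self, *, Bucket: str, Key: str, Body: bytes) -> object: ...


def put_and_invalidate(
    s3_client: S3Uploader,
    invalidate: Callable[[str], None],
    bucket: str,
    key: str,
    body: bytes,
) -> str:
    """Upload an object, then drop cached responses for the same location.

    Hypothetical helper for illustration; not part of Rasterio or boto3.
    """
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    url = f"s3://{bucket}/{key}"
    # The PUT above did not travel through the reader's cache, so any
    # cached response for this object must be invalidated explicitly.
    invalidate(url)
    return url
```

With the real libraries this might be called as `put_and_invalidate(boto3.client("s3"), rasterio.cache.invalidate, "my-bucket", "data/scene.tif", payload)`, assuming `invalidate()` accepts an `s3://` URL; that assumption rests only on the remark in this thread that it uses Rasterio's usual path parsing.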

@sgillies sgillies added this to the 1.5.0 milestone Dec 14, 2024
@sgillies sgillies self-assigned this Dec 14, 2024

def test_invalidate_all():
    """Cache is entirely invalidated."""
    cache.invalidate_all()
Member Author

It's a smoke test. I'm inclined to rely on GDAL's testing. invalidate_all() simply calls a GDAL method with no arguments. There's no way to use it wrongly. invalidate() uses the same path parsing pattern used elsewhere in rasterio. It's reliable.

@snowman2 snowman2 left a comment

Looks good. Thanks 👍

@vincentsarago vincentsarago self-requested a review December 14, 2024 09:41

@vincentsarago vincentsarago left a comment

🚀

@sgillies
Member Author

@snowman2 @vincentsarago I'm glad we agree! I'd like to improve the documentation. The exact semantics of VSICurlPartialClearCache() are undocumented and the underlying function is not easy for me to read: https://github.com/OSGeo/gdal/blob/f9095eae45ce0f943bf4521f9981d20c49b039f4/port/cpl_vsil_curl.cpp#L3972.

@rouault can you explain the nuances of the function? For example, does the prefix /vsis3/f clear responses to both /vsis3/foo/... and /vsis3/fu/... requests? Does the prefix /vsis clear responses to all S3 requests? Or do path components of URLs need to be fully spelled out?
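
To make the question concrete, here is a toy model of the two candidate matching rules it distinguishes. Both functions are hypothetical illustrations; neither is claimed to be what VSICurlPartialClearCache() actually does, which is exactly what the question asks.

```python
def raw_prefix_match(prefix: str, cached_path: str) -> bool:
    """Candidate rule A: plain string-prefix comparison."""
    return cached_path.startswith(prefix)


def component_prefix_match(prefix: str, cached_path: str) -> bool:
    """Candidate rule B: the prefix must end on a path-component boundary."""
    prefix = prefix.rstrip("/")
    return cached_path == prefix or cached_path.startswith(prefix + "/")


# Made-up cache keys, for illustration only.
cached = [
    "/vsis3/foo/a.tif",
    "/vsis3/fu/b.tif",
    "/vsicurl/https://example.com/c.tif",
]

for prefix in ["/vsis3/f", "/vsis", "/vsis3/foo"]:
    rule_a = [p for p in cached if raw_prefix_match(prefix, p)]
    rule_b = [p for p in cached if component_prefix_match(prefix, p)]
    print(f"{prefix!r}: raw={rule_a} component={rule_b}")
```

Under rule A, `/vsis3/f` would clear both the `foo` and `fu` entries and `/vsis` would clear every S3 entry; under rule B, neither prefix would clear anything and only a fully spelled-out component such as `/vsis3/foo` would match.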

rouault commented Dec 14, 2024

> can you explain the nuances of the function?

Does OSGeo/gdal#11493 clarify things?

rouault added a commit to rouault/gdal that referenced this pull request Dec 14, 2024
rouault added a commit to OSGeo/gdal that referenced this pull request Dec 14, 2024
@sgillies sgillies merged commit 64474c1 into main Dec 15, 2024
@sgillies sgillies deleted the issue3275 branch December 15, 2024 02:03

Development

Successfully merging this pull request may close these issues.

Give users control over the GDAL vsicurl cache
