Revert "Revert "Use `urllib` handler for `s3://` and `gs://`, improve `url_exists` through HEAD requests (#34324)"" by haampie · Pull Request #34498 · spack/spack

haampie · 2022-12-13T13:08:36Z

This reverts commit 8035eeb.

It also removes a dubious initial HEAD request in `read_from_url`. That request was meant to prevent a follow-up GET request, but it was broken 4 years ago in fce1c41. The idea was to avoid a GET request for potentially large files; this is typically not an issue, since attachments are often downloaded only once you start reading from the stream. So let's just remove it.
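To illustrate the point about streams, here is a minimal sketch, not Spack's actual `read_from_url`. It assumes a plain `urllib` GET and a hypothetical helper name `open_and_check`. `urlopen` returns once the response headers have been parsed, so the content type can be checked before the caller reads the body.

```python
import urllib.request


def open_and_check(url, accept_content_type=None, timeout=10):
    """Hypothetical helper: one GET request, no preceding HEAD.

    Returns the open response, or None when a content type was
    requested and the response headers advertise a different one.
    """
    # urlopen returns after the headers are parsed; the body is only
    # consumed when the caller reads from the returned stream.
    response = urllib.request.urlopen(url, timeout=timeout)
    content_type = response.headers.get("Content-Type", "")

    if accept_content_type and not content_type.startswith(accept_content_type):
        # Give up without reading the body.
        response.close()
        return None

    return response
```

The caller is responsible for closing the returned response, for example with a `with` statement.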

haampie · 2022-12-14T17:52:18Z

@spackbot run pipeline

spackbot-app · 2022-12-14T17:52:26Z

I've started that pipeline for you!

scottwittenburg · 2022-12-14T19:23:44Z

lib/spack/spack/util/web.py

-        # It would be nice to do this with the HTTP Accept header to avoid
-        # one round-trip.  However, most servers seem to ignore the header
-        # if you ask for a tarball with Accept: text/html.
-        req.get_method = lambda: "HEAD"


In the commit you said broke stuff, is it the `if accept_content_type:` check that was the problem? I think that was just to avoid always checking for `text/html`, but I'm not following what the problem was. At any rate, if you say there's no need for this anymore anyway, I'm fine with this particular change.

haampie

How it's supposed to work: in certain cases (`spack checksum`), it should only send a GET request if the content-type is something specific, which is determined through a cheap, initial HEAD request. It was broken because the GET request was no longer conditional. In the general case it's not necessary to send a HEAD request first, and I think this is how this function is used almost always -- this was not broken.
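The intended flow described above can be sketched as follows. This is a hedged illustration using only the standard library, not the code that was in `lib/spack/spack/util/web.py`; the function name `read_from_url_sketch` and its parameters are assumptions made for the example.

```python
import urllib.request


def read_from_url_sketch(url, accept_content_type=None, timeout=10):
    """Hypothetical helper showing a conditional HEAD-then-GET flow.

    Returns the open GET response, or None when a content type was
    requested and the HEAD response advertises a different one.
    """
    if accept_content_type is not None:
        # Cheap initial request: the server sends headers only.
        head = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(head, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")

        # The GET below has to be skipped on a mismatch. Issuing it
        # unconditionally is the breakage described in the comment above.
        if not content_type.startswith(accept_content_type):
            return None

    # General case: no content type requested, so no HEAD round-trip.
    return urllib.request.urlopen(url, timeout=timeout)
```

With `accept_content_type=None` the function sends a single GET, which matches the "general case" in the reply.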

scottwittenburg

Other than the bit I commented on (the bit that's not just reverting a revert), this all looks as good to me today as it did last week 😆 Thanks @haampie!

… `url_exists` through HEAD requests (spack#34324)"" (spack#34498) This reverts commit 8035eeb. And also removes logic around an additional HEAD request to prevent a more expensive GET request on wrong content-type. Since large files are typically an attachment and only downloaded when reading the stream, it's not an optimization that helps much, and in fact the logic was broken since the GET request was done unconditionally.

spackbot-app bot added the binary-packages, commands, core, environments, fetching, stage, tests, and utilities labels Dec 13, 2022

haampie force-pushed the revert-revert-url_exists-2 branch from 5ca84bc to 6affc28 on December 13, 2022 13:31

haampie mentioned this pull request Dec 13, 2022

Use file paths/urls correctly #34452

Merged

haampie force-pushed the revert-revert-url_exists-2 branch from 6affc28 to cfb9ff1 on December 13, 2022 16:44

haampie added 2 commits December 14, 2022 09:54

Revert "Revert "Use urllib handler for s3:// and gs://, improve…

64802e6

… `url_exists` through HEAD requests (spack#34324)"" This reverts commit 8035eeb.

Remove extra head request

8cf56eb

In the past 4 years the initial HEAD request was never used to prevent a GET request on wrong content-type. So let's just remove the HEAD request then.

haampie force-pushed the revert-revert-url_exists-2 branch from cfb9ff1 to 8cf56eb on December 14, 2022 08:55

haampie requested a review from scottwittenburg December 14, 2022 09:38

scottwittenburg reviewed Dec 14, 2022

scottwittenburg approved these changes Dec 14, 2022

haampie merged commit ea02944 into spack:develop Dec 14, 2022

haampie deleted the revert-revert-url_exists-2 branch December 14, 2022 22:47
