Skip to content

Revert "Revert "Use urllib handler for s3:// and gs://, improve url_exists through HEAD requests (#34324)""#34498

Merged
haampie merged 2 commits intospack:developfrom
haampie:revert-revert-url_exists-2
Dec 14, 2022
Merged

Revert "Revert "Use urllib handler for s3:// and gs://, improve url_exists through HEAD requests (#34324)""#34498
haampie merged 2 commits intospack:developfrom
haampie:revert-revert-url_exists-2

Conversation

@haampie
Copy link
Copy Markdown
Member

@haampie haampie commented Dec 13, 2022

revert-revert-final-FINAL-2.docx

This reverts commit 8035eeb.

It also now removes a dubious initial HEAD request in read_from_url, which used to prevent a follow-up GET request, but was broken 4 years ago in fce1c41. The idea there was to prevent a GET request to potentially large files; this is typically not an issue since attachments are often downloaded only once you start reading from the stream. So, let's just remove.

@spackbot-app spackbot-app bot added binary-packages commands core PR affects Spack core functionality environments fetching stage tests General test capability(ies) utilities labels Dec 13, 2022
@haampie haampie force-pushed the revert-revert-url_exists-2 branch from 5ca84bc to 6affc28 Compare December 13, 2022 13:31
@haampie haampie force-pushed the revert-revert-url_exists-2 branch from 6affc28 to cfb9ff1 Compare December 13, 2022 16:44
… `url_exists` through HEAD requests (spack#34324)""

This reverts commit 8035eeb.
In the past 4 years the initial HEAD request was never used to prevent a
GET request on wrong content-type. So let's just remove the HEAD request
then.
@haampie haampie force-pushed the revert-revert-url_exists-2 branch from cfb9ff1 to 8cf56eb Compare December 14, 2022 08:55
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Dec 14, 2022

@spackbot run pipeline

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Dec 14, 2022

I've started that pipeline for you!

# It would be nice to do this with the HTTP Accept header to avoid
# one round-trip. However, most servers seem to ignore the header
# if you ask for a tarball with Accept: text/html.
req.get_method = lambda: "HEAD"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit you said broke stuff, is it the check if accept_content_type: that was the problem? I think that was just to avoid always checking for text/html, but I'm not following what the problem was. At any rate, if you say there's no need for this anymore anyway, I'm fine with this particular change.

Copy link
Copy Markdown
Member Author

@haampie haampie Dec 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How it's supposed to work: in certain cases (spack checksum), it should only send GET requests if the content-type is something specific, which is obtained through a cheap, initial HEAD request. It was broken cause the GET request was no longer conditional. In the general case it's not necessary to send a HEAD request first, and I think this is how this function is used almost always -- this was not broken.

Copy link
Copy Markdown
Contributor

@scottwittenburg scottwittenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the bit I commented on (the bit that's not just reverting a revert), this all looks as good to me today as it did last week 😆 Thanks @haampie!

@haampie haampie merged commit ea02944 into spack:develop Dec 14, 2022
@haampie haampie deleted the revert-revert-url_exists-2 branch December 14, 2022 22:47
benkirk pushed a commit to benkirk/spack that referenced this pull request Dec 16, 2022
… `url_exists` through HEAD requests (spack#34324)"" (spack#34498)

This reverts commit 8035eeb.

And also removes logic around an additional HEAD request to prevent
a more expensive GET request on wrong content-type. Since large files
are typically an attachment and only downloaded when reading the
stream, it's not an optimization that helps much, and in fact the logic
was broken since the GET request was done unconditionally.
amd-toolchain-support pushed a commit to amd-toolchain-support/spack that referenced this pull request Feb 16, 2023
… `url_exists` through HEAD requests (spack#34324)"" (spack#34498)

This reverts commit 8035eeb.

And also removes logic around an additional HEAD request to prevent
a more expensive GET request on wrong content-type. Since large files
are typically an attachment and only downloaded when reading the
stream, it's not an optimization that helps much, and in fact the logic
was broken since the GET request was done unconditionally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

binary-packages commands core PR affects Spack core functionality environments fetching stage tests General test capability(ies) utilities

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants