s3: cache client instance by haampie · Pull Request #34372 · spack/spack

haampie · 2022-12-07T15:10:03Z

Use a cache of the form

(mirror name, fetch/push) -> Client(...)

for S3 connections, to avoid the overhead of creating it on every request.
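As a rough illustration of the cache shape described above, here is a minimal sketch. It is not the code from this PR: `FakeClient`, `_client_cache`, and `get_client` are hypothetical names, and the stand-in class replaces whatever real S3 client object Spack constructs. The only idea taken from the description is the key, a `(mirror name, fetch/push)` tuple, so that a client is built once per key and reused afterwards.

```python
from typing import Dict, Tuple


class FakeClient:
    """Hypothetical stand-in for an S3 client that is expensive to construct."""

    instances_created = 0

    def __init__(self, mirror_name: str, method: str) -> None:
        FakeClient.instances_created += 1
        self.mirror_name = mirror_name
        self.method = method


# Map (mirror name, method) tuples to client instances (hypothetical name).
_client_cache: Dict[Tuple[str, str], FakeClient] = {}


def get_client(mirror_name: str, method: str = "fetch") -> FakeClient:
    """Return the cached client for (mirror_name, method), creating it on first use."""
    if method not in ("fetch", "push"):
        raise ValueError(f"method must be 'fetch' or 'push', got {method!r}")
    key = (mirror_name, method)
    client = _client_cache.get(key)
    if client is None:
        # Only the first request for this key pays the construction cost.
        client = FakeClient(mirror_name, method)
        _client_cache[key] = client
    return client
```

Under these assumptions, repeated calls such as `get_client("my-mirror", "fetch")` return the same object, while `get_client("my-mirror", "push")` gets its own entry because fetch and push may use different connection settings.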

With this PR:

In [1]: import spack.util.web

In [2]: %timeit spack.util.web.url_exists("s3://spack-binaries/does/not/exist")
105 ms ± 7.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Before:

In [1]: import spack.util.web

In [2]: %timeit spack.util.web.url_exists("s3://spack-binaries/does/not/exist")
2.2 s ± 238 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

scottwittenburg

Thanks @haampie! Caching s3 clients is a great idea, and this looks to be a fairly straightforward (let's put that in air quotes though) adaptation of what we had before to support that. I wasn't really familiar with this bit of the code before, so I added some questions to check my understanding. Feel free to ignore them unless you have time to answer.

Otherwise, LGTM. I'm not sure how well the unit tests mock s3 read/write, so maybe I should keep an eye on the pipelines after this is merged 😜

scottwittenburg · 2022-12-08T20:05:03Z

lib/spack/spack/test/web.py

-    }
+    mirror = spack.mirror.Mirror.from_dict(
+        {
+            "fetch": {


Something I'm confused about is why you changed from an arbitrary dictionary of connection data (mock_connection_data) to this thing that is actually a Mirror. For one thing, my understanding was that the endpoint_url was optional (and only used/needed for S3 buckets that don't live in AWS S3, but rather in MinIO or similar). If this were a real S3 mirror config, would it have another URL, i.e. the s3:// URL?

Also, I'm surprised it's valid to specify all three auth methods simultaneously; I would have thought you'd want to specify only one of access_token, profile, or access_pair. Maybe specifying all three just helps with testing though?

Many different urls correspond to one (or no) mirror, so I thought it was an easier key to use

scottwittenburg · 2022-12-08T20:06:22Z

lib/spack/spack/s3_handler.py


 def _s3_open(url):
    parsed = url_util.parse(url)
-    s3 = s3_util.create_s3_session(parsed, connection=s3_util.get_mirror_connection(parsed))


Was it a bug that this didn't pass url_type="fetch" previously?

I think so yes

scottwittenburg · 2022-12-08T20:07:40Z

lib/spack/spack/util/web.py

    if url_result.scheme == "s3":
        # Check for URL-specific connection information
-        s3 = s3_util.create_s3_session(
-            url_result, connection=s3_util.get_mirror_connection(url_result)


Is this another case where url_type="fetch" should have been passed before?

scottwittenburg · 2022-12-08T20:14:42Z

lib/spack/spack/util/s3.py

+import spack.config
 import spack.util.url as url_util

+#: Map (mirror name, method) tuples to s3 client instances.


Based on our conversation in Slack, I was expecting a mapping from (url, method) to S3 client instance, rather than (name, method) to client instance.

scottwittenburg · 2022-12-08T20:17:26Z

lib/spack/spack/util/s3.py

+    mirrors = [
+        (name, mirror)
+        for name, mirror in all_mirrors.items()
+        if url_str.startswith(get_mirror_url(mirror))


Maybe I get it. This is where you've converted from the provided URL to a mirror name, which you've possibly done in support of the future goal of using mirror names (rather than mirror URLs) everywhere?

Actually the old code was already like this, but maybe a comment got lost
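The URL-to-mirror lookup discussed in this thread can be sketched as follows. This is only an approximation under stated assumptions: `find_mirror_name` is a hypothetical helper, mirrors are reduced to a plain `name -> url` dictionary instead of Spack's Mirror objects, and breaking ties by the longest matching prefix is my own choice, not something the diff confirms. The prefix test itself (`url_str.startswith(...)`) is the part taken from the snippet above.

```python
from typing import Dict, Optional


def find_mirror_name(url_str: str, mirror_urls: Dict[str, str]) -> Optional[str]:
    """Return the name of a mirror whose URL is a prefix of url_str, or None.

    mirror_urls maps a mirror name to its URL (a simplification of a mirror config).
    """
    matches = [
        (name, mirror_url)
        for name, mirror_url in mirror_urls.items()
        if url_str.startswith(mirror_url)
    ]
    if not matches:
        # Many URLs correspond to no configured mirror at all.
        return None
    # Assumption: prefer the most specific (longest) matching mirror URL.
    name, _ = max(matches, key=lambda item: len(item[1]))
    return name
```

With a configuration such as `{"public": "s3://spack-binaries"}`, every URL under that bucket resolves to the single name `public`, which is why the name makes a more compact cache key than the URL.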

haampie · 2022-12-08T22:15:02Z

@spackbot run pipeline

spackbot-app · 2022-12-08T22:15:05Z

I'm sorry, gitlab does not have your latest revision yet, I can't run that pipeline for you right now.

One likely possibility is that your PR pipeline has been temporarily deferred, in which case, it is awaiting a develop pipeline, and will be run when that finishes.

Please check the gitlab commit status message to see if more information is available.

Details

Unexpected response from gitlab: {'message': '404 Commit Not Found'}

spackbot-app bot added the core, fetching, tests, and utilities labels Dec 7, 2022

haampie mentioned this pull request Dec 7, 2022

Various binary cache improvements #34371

Open

haampie force-pushed the fix/cache-s3-connection branch from 964f348 to 59bef1d December 7, 2022 15:11

s3: cache client instance

ee6732c

haampie force-pushed the fix/cache-s3-connection branch from 59bef1d to ee6732c December 7, 2022 15:17

haampie requested review from josephsnyder and scottwittenburg December 7, 2022 16:36

haampie added 2 commits December 8, 2022 09:22

Merge branch 'develop' into fix/cache-s3-connection

1523847

circular import issues

6a60b3e

haampie force-pushed the fix/cache-s3-connection branch from b4e554c to 6a60b3e December 8, 2022 10:04

order

a206aff

scottwittenburg approved these changes Dec 8, 2022


haampie merged commit 7e054cb into spack:develop Dec 9, 2022

haampie deleted the fix/cache-s3-connection branch December 9, 2022 07:50

amd-toolchain-support pushed a commit to amd-toolchain-support/spack that referenced this pull request Feb 16, 2023

s3: cache client instance (spack#34372)

fb8b52f

haampie merged 4 commits into spack:develop from haampie:fix/cache-s3-connection
