spack repo update: fix broken partial package repository fetching #50997
becker33 merged 15 commits into spack:develop
Did you try this? Is it relevant? The comment is from 9 years ago.
We do see repositories getting throttled at LLNL from GitHub, but I'm not sure how I'd test the load on GitHub's side. I think it's worth erring on the side of caution here until we hear otherwise from GitHub. I looked at brew and it still looks like they're preferring full clones over shallow clones for taps.
In my experience the initial shallow fetch is significantly faster, and for updates I expect the same. There is also the use case of CI with ephemeral environments where full clones make little sense.
The initial fetch is faster, but shallow updates don't really work. (I thought they'd work, but it turns out they don't after living with […].) If more than 20 commits have happened since you last ran […] To correctly do shallow updates, we'd have to modify the depth argument dynamically to be the number of commits since the user's HEAD. I believe that's what folks are talking about here, and that operation is computationally expensive on GitHub's side. If we drop the depth argument altogether the second time we run […]
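To make the failure mode above concrete, here is a toy model of a linear branch. It is only a sketch: it does not call git, the function names are made up for illustration, and the exact boundary in real git may differ by one commit from this simplified window.

```python
def shallow_fetch_connects(local_head: int, remote_tip: int, depth: int) -> bool:
    """Toy model: commits on a linear branch are numbered 1..remote_tip.

    A fetch limited to `depth` commits brings in only the newest `depth`
    commits. The update can fast-forward only if the commit we already
    have (local_head) is among them; otherwise the two histories look
    disconnected.
    """
    oldest_fetched = max(1, remote_tip - depth + 1)
    return oldest_fetched <= local_head <= remote_tip


def depth_needed(local_head: int, remote_tip: int) -> int:
    """Smallest depth (in this model) whose window still reaches local_head."""
    return remote_tip - local_head + 1


# 5 new commits upstream, depth 20: the local commit is still inside the window.
print(shallow_fetch_connects(local_head=100, remote_tip=105, depth=20))
# 50 new commits upstream, depth 20: the local commit falls outside the window.
print(shallow_fetch_connects(local_head=100, remote_tip=150, depth=20))
```

In this model, a fixed depth only works while the number of new commits stays small, and `depth_needed` is the "dynamic depth" idea: the client would have to know how far behind it is before fetching.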
For CI/CD use cases I agree it would be really nice to keep the shallow cloning; I just don't know how to do that yet. (Edit: testing with blobless clones I'm seeing near identical download times and comparable disk usage.)
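For anyone who wants to compare the three clone styles discussed here, a small sketch that builds the corresponding `git clone` command lines. The helper name, the strategy labels, and the example URL are placeholders, not anything from Spack; `--depth` and `--filter=blob:none` are standard `git clone` options.

```python
from __future__ import annotations


def clone_args(url: str, dest: str, strategy: str = "full") -> list[str]:
    """Build a `git clone` command for one of three strategies.

    full     -> complete history and all blobs
    shallow  -> only the newest 20 commits (--depth=20)
    blobless -> complete commit history, blobs fetched on demand
    """
    args = ["git", "clone"]
    if strategy == "shallow":
        args.append("--depth=20")
    elif strategy == "blobless":
        args.append("--filter=blob:none")
    elif strategy != "full":
        raise ValueError(f"unknown strategy: {strategy!r}")
    return args + [url, dest]


# Placeholder URL and destination, for illustration only.
for name in ("full", "shallow", "blobless"):
    print(" ".join(clone_args("https://example.com/packages.git", "packages", name)))
```

Timing each command against the same remote would be one way to reproduce the download-time comparison mentioned in the edit above.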
becker33 left a comment:
The depth argument in _clone_or_pull is now vestigial.
How robust is the testing for this?
@alecbcs can we:
Doing this would:
Thoughts?
I've dropped the […] I've tested this on a shallow download, and it successfully updated the repository by deepening it to a full clone.
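One way the "deepen a shallow download to a full clone on update" behavior could be sketched is below. This is not the code merged in this PR: `fetch_args` and `repo_is_shallow` are hypothetical helpers, and the sketch assumes a git new enough to support `git rev-parse --is-shallow-repository`. `git fetch --unshallow` is the standard option for converting a shallow repository into a complete one.

```python
from __future__ import annotations

import subprocess


def repo_is_shallow(repo_dir: str) -> bool:
    """Ask git whether the repository at repo_dir is a shallow clone."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--is-shallow-repository"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == "true"


def fetch_args(repo_dir: str, is_shallow: bool) -> list[str]:
    """Pick the fetch command: deepen shallow repos, fetch normally otherwise.

    `--unshallow` is only passed for shallow repositories because git
    rejects it on a repository that already has full history.
    """
    args = ["git", "-C", repo_dir, "fetch"]
    if is_shallow:
        args.append("--unshallow")
    return args


# Example with a placeholder path; nothing is executed here.
print(fetch_args("/path/to/packages", is_shallow=True))
print(fetch_args("/path/to/packages", is_shallow=False))
```

A caller would combine the two as `subprocess.run(fetch_args(path, repo_is_shallow(path)), check=True)`.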
Signed-off-by: Alec Scott <[email protected]>
Co-authored-by: Todd Gamblin <[email protected]> Signed-off-by: Alec Scott <[email protected]>
…pack#50997)
* Switch to full clones / fetches to improve ironically download speed
Signed-off-by: Alec Scott <[email protected]>
Co-authored-by: Todd Gamblin <[email protected]>
Signed-off-by: Angelica Loshak <[email protected]>
When we first added spack repo update, I tried to be clever and perform partial clones to save users bandwidth.
After merging the initial spack repo update functionality in #50868, I started to notice periodic errors when updating my repository with the command and discovered that partial updates don't work how I thought they would. Right now we run git fetch --depth=20, and if there have been more than 20 commits to the packages repository since you last pulled the develop branch, you'll get an error from git about disconnected histories.
This happens because git only downloads the latest 20 commits and doesn't know how to fast-forward your existing repository if there are gaps in the history.
(TL;DR: spack repo update is broken right now and we shouldn't release v1.0 as it currently is. Sorry!!)
To fix this, we should instead perform a full fetch when updating an existing repository, so that we get the full history. Both CocoaPods and brew tried doing partial downloads of their package repositories and ultimately moved away from them after GitHub asked them to reduce compute time on its backend resources. We're not nearly as big as either of those projects, but I think we should learn from them and follow suit by performing full clones for now, until we hear otherwise from GitHub.