Add --dry-run option to fetch
#5973
Conversation
This was modelled after the `git-lfs push --dry-run` option. I've removed some apparently unused code (computing and reporting of ready nodes, and the output channel of the main fetch function which was always `nil`). For testing, I added `--dry-run` testing to a couple of cases, but I wonder if we should add it for _all_ cases. This is a part of what was discussed in git-lfs#5888.
Hi @redsun82! Thanks again for all these PRs! I am starting to work my way through them and thought I'd begin here. I just wanted to let you know I haven't forgotten about these PRs and that we really appreciate all your hard work (and patience!)
chrisd8088 left a comment:
This PR looks great, thank you so much! We also really appreciate the removal of unnecessary, stale code; that's a very pleasant bonus.
I had a few small suggestions before we merge this PR, but other than those minor details, I think this is all set. Thank you again!
```go
cfg.Remote(), tq.WithProgress(meter), tq.DryRun(fetchDryRunArg),
)

if out != nil {
```
t/t-fetch.sh (Outdated)
| grep "Git LFS fsck OK" fsck.log | ||
|
|
||
| git lfs fetch --dry-run 2>&1 | tee fetch.log | ||
| ! grep "fetch .* => a\.dat" fetch.log |
The `!` negation here does work as expected, because this happens to be the last command in the test. However, it's a tricky situation: any command we add in the future after this one would turn the check into one that passes in all cases, even if the `grep` finds the pattern in the file and succeeds, meaning the `!` should result in a failing exit code from the pipeline.
That would happen because we have the `set -e` option set, and as Bash's documentation states, that option has the following behaviour:
> Exit immediately if a pipeline ... returns a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a `while` or `until` keyword, part of the test in an `if` statement, part of any command executed in a `&&` or `||` list except the command following the final `&&` or `||`, any command in a pipeline but the last, or if the command's return status is being inverted with `!`.
Over time we inadvertently accumulated a number of these silently non-functional checks due to the use of negated pipelines, which we resolved in commit a5d20de of PR #5282.
In order to try to avoid that problem in the future, I'd suggest we change this check and the one in the test below to use a different idiom:
| ! grep "fetch .* => a\.dat" fetch.log | |
| grep "fetch .* => a\.dat" fetch.log && exit 1 | |
| true |
Good catch! I actually thought that a `!` command would count as a single command inverting the exit code as far as `-e` was concerned, but I stand corrected.
> I actually thought that a `!` command would count as a single command inverting the exit code as far as `-e` was concerned

I think pretty much everyone who's worked on this project, including myself, has had the same expectation. And it's sneaky because when a command with `!` is the last one in the test, everything appears to work as you'd expect: the test fails if the command succeeds and `!` inverts the exit code. But that's only because the inverted exit code from that last command is what's returned from the test; `-e` doesn't get involved in this special case.
I like the `command && exit 1 || true` technique you've used. It's a little tidier than the two-line idiom we used in a number of other places, and I don't think it will cause any problems if someone later adds subsequent commands to the test and forgets to remove the `|| true` from the preceding command list.
Intriguingly, we're getting a flaky test result from the modified test: the expected output line for one of the objects is sometimes missing from the dry-run output.
Hmm, that's indeed odd. I couldn't reproduce locally... Yeah, needs some more looking into.
@chrisd8088 oh silly me, I had added the printing goroutine without waiting for it to finish. Should be fixed now!
Indeed, I just discovered that same issue. Adding a …
chrisd8088 left a comment:
Thanks for tackling the flaky test problem!
I think we'll have to make a few more changes, as I've suggested, to avoid the reverse situation where the command never finishes in the non-dry-run case. But with those additional changes, I think we'll be all set.
I'm happy to push a commit to your branch, if you grant me permission; otherwise I'll leave this in your hands. And thank you again very much for working on these PRs!
@chrisd8088 I've implemented your suggestions, but also went ahead and gave you access to my fork, so feel free to add any commits! Thanks for the feedback, it's good to brush up on some Go knowledge 🙂
In prior commits in this PR we refactored and simplified the `fetchAndReportToChan()` function in our "commands" package so it is now just called `fetch()`, and also revised this function to introduce support for a new `--dry-run` option to our `git lfs fetch` command.

When the `--dry-run` command-line option is specified, the `fetch()` function starts an anonymous goroutine to output the path and object ID of each Git LFS object the command would normally retrieve via a transfer queue. At the moment, this goroutine calls the `Watch()` method of the `TransferQueue` structure in our "tq" package to create and register a "watcher" channel with the queue, and then starts a loop to read the values sent over the channel by the queue, each of which represents an object the queue would transfer if it was not executing in a dry-run mode.

So long as the goroutine calls the `Watch()` method before the transfer queue begins operating, this design works as we intend. However, if the goroutine's execution is delayed by the Go scheduler, the queue may perform all of its actions before the goroutine calls the `Watch()` method, and the `fetch()` function could invoke the `Wait()` method of the `TransferQueue` structure before the goroutine calls the `Watch()` method. Should this happen, the "watcher" channel created by the `Watch()` method will never be closed, because that only happens in the `Wait()` method, and that method has already run at a time when there were no extant channels to close. Therefore the goroutine's loop over the channel will never exit, the goroutine will never decrement the count of the `WaitGroup` structure by calling its `Done()` method, and in turn the `fetch()` function's call to the `WaitGroup`'s `Wait()` method will never complete, so the `git lfs fetch` command will stall indefinitely.

To ensure we handle this scheduling condition, we create the "watcher" channel in the `fetch()` function before starting our anonymous goroutine, so the channel will always be registered with the `TransferQueue` before its `Wait()` method is called. Thus the channel will always be found and closed by that method, and the goroutine cannot enter an infinite wait state on the channel.
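To make the ordering concrete, here is a minimal, self-contained Go sketch of the pattern the commit message describes. The `dryRunQueue` type, its string payload, and the object names are invented for illustration and are not the real `tq.TransferQueue` API; the point is only that the watcher channel is obtained before the goroutine starts, so the queue's `Wait()` can always find and close it.

```go
package main

import (
	"fmt"
	"sync"
)

// dryRunQueue is a stand-in for the transfer queue described above; the
// Watch/Add/Wait names mirror that pattern, but this is a simplified sketch,
// not the git-lfs implementation (the real queue is concurrency-safe and
// sends richer transfer objects than plain strings).
type dryRunQueue struct {
	watchers []chan string
	pending  []string
}

// Watch registers and returns a channel that will receive each object the
// queue would otherwise transfer.
func (q *dryRunQueue) Watch() <-chan string {
	c := make(chan string, 64)
	q.watchers = append(q.watchers, c)
	return c
}

// Add records an object; in dry-run mode it is only reported, never downloaded.
func (q *dryRunQueue) Add(name string) {
	q.pending = append(q.pending, name)
}

// Wait sends every pending object to the watchers registered so far and then
// closes them. A watcher registered only *after* Wait returns is never closed,
// which is exactly the stall described in the commit message above.
func (q *dryRunQueue) Wait() {
	for _, c := range q.watchers {
		for _, name := range q.pending {
			c <- name
		}
		close(c)
	}
}

func main() {
	q := &dryRunQueue{}

	// Register the watcher channel *before* starting the goroutine, so it is
	// guaranteed to exist by the time q.Wait() runs, regardless of how the
	// goroutine is scheduled.
	watch := q.Watch()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for name := range watch { // loop ends when Wait() closes the channel
			fmt.Printf("fetch: would download %s\n", name)
		}
	}()

	q.Add("a.dat")
	q.Add("b.bin")

	q.Wait()  // closes the watcher channel
	wg.Wait() // so the printing goroutine is guaranteed to finish
}
```

If `q.Watch()` were instead called inside the goroutine, a delayed goroutine could register its channel only after `q.Wait()` had already run, and the `range` loop (and therefore `wg.Wait()`) would never finish.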
chrisd8088 left a comment:
This is looking really good!
Thank you for the invitation to collaborate; I've pushed up what I hope is one final change to deal with an edge case where the queue's `Wait()` method is called before the anonymous goroutine has a chance to register a new watcher channel. Without that change, I found that inserting a `time.Sleep(1 * time.Second)` at the start of the goroutine was enough to make the command stall indefinitely.
commands/command_fetch.go (Outdated)
```go
out <- p
}
defer wg.Done()
for p := range q.Watch() {
```
I tried this myself last night, and noticed that if I injected an artificial delay before the `for ... range` loop, as if the goroutine was not dispatched immediately, the loop never exited and the command would stall indefinitely.
I thought I'd worked around this by calling `q.Watch()` in advance of the loop and assigning it to a variable, but in the fresh light of morning I see that that actually makes no difference. :-)
What we need to do instead is call `q.Watch()` before starting the goroutine at all. Otherwise, there's an edge case where the goroutine doesn't start executing immediately and the entire dry-run transfer process completes in the meantime. In this instance, the `q.Wait()` call below is made before the goroutine starts, and because it finds no watcher channels (as the goroutine hasn't created one yet), it returns without closing any.
Since `q.Wait()` will never be called again to close the channel the goroutine creates, the goroutine's loop never exits, `wg.Done()` is never called, and `wg.Wait()` never returns.
I'll push up a commit which makes this change and adds some explanation in the description.
OK, I think we're all good here ... except that, sigh, our Windows CI job is failing for an unrelated reason. I'll work on getting that fixed ASAP, and then we can re-run the CI jobs here and merge this.