-
Notifications
You must be signed in to change notification settings - Fork 2.2k
commands: add logging to git-lfs-migrate-info(1) #2329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // queue is the incoming, unbuffered queue of tasks to enqueue. | ||
| queue chan Task | ||
| // tasks is the set of tasks to process. | ||
| tasks chan Task |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you skip queue and just use tasks, it should remove a lot of of complexity from consume(). The comment sounds like you intended for tasks to be buffered, but it's not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I added the queue and tasks channel so that new tasks could be enqueued before the current task is finished processing.
Looking back through this, I realized that I removed the callsite in the code that originally necessitated this, but I think if we want to use this package as a general purpose logging utility for LFS, it's worthwhile to keep the extra complexity.
I added a test to ensure that this non-blocking behavior is in place. Before 930c151:
~/g/git-lfs (migrate-logger) $ go test -v ./git/githistory/log -run TestLoggerLogsMultipleTasksWithoutBlocking
=== RUN TestLoggerLogsMultipleTasksWithoutBlocking
# ...
and with 930c151:
~/g/git-lfs (migrate-logger) $ go test -v ./git/githistory/log -run TestLoggerLogsMultipleTasksWithoutBlocking
=== RUN TestLoggerLogsMultipleTasksWithoutBlocking
--- PASS: TestLoggerLogsMultipleTasksWithoutBlocking (0.00s)
PASS
ok github.com/git-lfs/git-lfs/git/githistory/log 0.013s
git/githistory/log/log.go
Outdated
|
|
||
| var last string | ||
| for last = range task.Updates() { | ||
| l.logLine(last) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth watching your CPU on an intense write load. The existing spinner that fetch updates the terminal no more frequently than every 200ms. It's plenty time to represent progress to humans, and doesn't write to the terminal nearly as frequently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I tried to implement this with channels, but the code ended up being simpler with time arithmetic. As of fc1faca, you can specify a throttle which is the minimum amount of time that must pass between logs. Additional log updates will be dropped on the ground.
Even if updates aren't logged, however, the logger will still remember them, so if a PercentageTask only gets to 95% within the throttle window, it will always finish as: msg: 100% ..., done.
In commit e90d54d of PR git-lfs#2329 the "git/githistory/log" package was added, including the PercentageTask structure and associated methods. Then in commit 122b765 of PR git-lfs#2610 the Entry() method was added to provide support for verbose logging in the "git lfs migrate impot" command. However, the name of the receiver argument for the Entry() method differs from that of the other methods for the PercentageTask structure, so we update it to match now, as well as fixing a typo which remains from the original commit in PR git-lfs#2329. Note that the "git/githistory/log" package was later renamed to the "tasklog" package in commits b9ab79e and 45c580e of PR git-lfs#2747.
The *Logger.consume() method in the "tasklog" package was first introduced in commit e90d54d of PR git-lfs#2329, and since that time has included logic to manage an internal "pending" list of tasks; however, that logic is never exercised because tasks are only appended to the "pending" array when the "next" variable is non-nil, but that variable is initialized as nil and can only be set to another value if the "pending" array is non-empty. As a result, we can remove both variables and the related logic and simplify the consume() method as a result.
In commit e90d54d of PR git-lfs#2329 the "git/githistory/log" package was added, including the PercentageTask structure and associated methods, but unlike the WaitingTask which was added at the same time, no Complete() method was defined for the PercentageTask structure. (Note that the "git/githistory/log" package was later renamed to the "tasklog" package in commits b9ab79e and 45c580e of PR git-lfs#2747.) Other task types such as the ListTask and SimpleTask structures (introduced in commit 31ffeb9 of PR git-lfs#2335 and commit 7a760b6 of PR git-lfs#2756, respectively) provide a Complete() method, and the Meter task type of the "tq" package, which implemented the Task interface in commit 7c0f9e2 of PR git-lfs#2732, provides an equivalent Finish() method. These methods allow the caller to explicitly close the channel returned by the Updates() method, on which the anonymous goroutine started by the Logger.consume() method for the task is waiting, in a "range" loop on the channel in the Logger.logTask() method. One key use of the PercentageTask structure is in the methods of the Rewriter structure from the "git/githistory" package. It is initialized with the number of commits to be rewritten during a "git lfs migrate" command's traversal of a Git repository's history. As each commit is rewritten, the Count() method is called to increment the percentage completed value. When every commit has been rewritten, the count reaches the expected total, and the Count() method closes the channel, thus allowing the task receiving updates to finish and exit its goroutine. Under exceptional circumstances, though, the Rewrite() method of the Rewriter structure may never finish iterating through all the expected commits, and return in error or via a panic call. When this happens, the goroutine waiting on the PercentageTask's updates channel can never exit, leading to a hung process which never fully exits. In order to allow the Rewriter's Rewrite() method to define a deferred function which will always be called when it returns, under any circumstances, we add a Complete() method for the PercentageTask structure which sets the number of completed elements to the expected total in an atomic swap, and then closes the updates channel if the number of completed elements was previously less than the expected total. We also add a test to validate this new method's behaviour, and update the existing TestPercentageTaskCallsDoneWhenComplete() function to also confirm that calling the new Complete() method after the number of completed elements has reached the expected total does not cause a second attempt to close the updates channel.
The *Logger.consume() method in the "tasklog" package was first introduced in commit e90d54d of PR git-lfs#2329, and since that time has included logic to manage an internal "pending" list of tasks; however, that logic is never exercised because tasks are only appended to the "pending" array when the "next" variable is non-nil, but that variable is initialized as nil and can only be set to another value if the "pending" array is non-empty. As a result, we can remove both variables and the related logic and simplify the consume() method as a result.
In commit e90d54d of PR git-lfs#2329 the "git/githistory/log" package was added, including the PercentageTask structure and associated methods, but unlike the WaitingTask which was added at the same time, no Complete() method was defined for the PercentageTask structure. (Note that the "git/githistory/log" package was later renamed to the "tasklog" package in commits b9ab79e and 45c580e of PR git-lfs#2747.) Other task types such as the ListTask and SimpleTask structures (introduced in commit 31ffeb9 of PR git-lfs#2335 and commit 7a760b6 of PR git-lfs#2756, respectively) provide a Complete() method, and the Meter task type of the "tq" package, which implemented the Task interface in commit 7c0f9e2 of PR git-lfs#2732, provides an equivalent Finish() method. These methods allow the caller to explicitly close the channel returned by the Updates() method, on which the anonymous goroutine started by the Logger.consume() method for the task is waiting, in a "range" loop on the channel in the Logger.logTask() method. One key use of the PercentageTask structure is in the methods of the Rewriter structure from the "git/githistory" package. It is initialized with the number of commits to be rewritten during a "git lfs migrate" command's traversal of a Git repository's history. As each commit is rewritten, the Count() method is called to increment the percentage completed value. When every commit has been rewritten, the count reaches the expected total, and the Count() method closes the channel, thus allowing the task receiving updates to finish and exit its goroutine. Under exceptional circumstances, though, the Rewrite() method of the Rewriter structure may never finish iterating through all the expected commits, and return in error or via a panic call. When this happens, the goroutine waiting on the PercentageTask's updates channel can never exit, leading to a hung process which never fully exits. In order to allow the Rewriter's Rewrite() method to define a deferred function which will always be called when it returns, under any circumstances, we add a Complete() method for the PercentageTask structure which sets the number of completed elements to the expected total in an atomic swap, and then closes the updates channel if the number of completed elements was previously less than the expected total. We also add a test to validate this new method's behaviour, and update the existing TestPercentageTaskCallsDoneWhenComplete() function to also confirm that calling the new Complete() method after the number of completed elements has reached the expected total does not cause a second attempt to close the updates channel.
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to use the methods of the Logger structure of the "log" package to output progress messages. This package was originally located within the "git/githistory" source directory, but was later moved and renamed to the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747. As part of the changes in PR git-lfs#2329, the methods of the Rewriter structure in the "githistory" package were updated to use the methods of the Logger and PercentageTask types in the "log" package to generate progress messages which began with the Git LFS subcommand name, for example, "migrate: Rewriting commits". Since that PR, multiple other informational and error messages output by the "git lfs migrate" command have been added, and while most begin with the "migrate:" prefix, not all do. For instance, running the "git lfs migrate import --verbose" command outputs commit SHAs and file paths with the prefix, but Git references and the mapping of their old and new SHAs without the prefix. Two other Git LFS commands, the "git lfs prune" and "git lfs fetch" commands, follow the same design and output their subcommand names as prefixes to their progress messages, but no other commands do this. Git commands also do not prefix their progress messages with the name of the subcommand (although messages from a Git remote may be reported with the prefix "remote:"). To help bring all our commands' progress messages into alignment, we simply remove the "migrate:" prefix from the messages output by the "git lfs migrate" command. We have similarly altered the progress messages of the "git lfs prune" and "git lfs fetch" commands in previous commits in this PR.
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to use the methods of the Logger structure of the "log" package to output progress messages. This package was originally located within the "git/githistory" source directory, but was later moved and renamed to the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747. As part of the changes in PR git-lfs#2329, the methods of the Rewriter structure in the "githistory" package were updated to use the methods of the Logger and PercentageTask types in the "log" package to generate progress messages which began with the Git LFS subcommand name, for example, "migrate: Rewriting commits". Since that PR, multiple other informational and error messages output by the "git lfs migrate" command have been added, and while most begin with the "migrate:" prefix, not all do. For instance, running the "git lfs migrate import --verbose" command outputs commit SHAs and file paths with the prefix, but Git references and the mapping of their old and new SHAs without the prefix. Two other Git LFS commands, the "git lfs prune" and "git lfs fetch" commands, follow the same design and output their subcommand names as prefixes to their progress messages, but no other commands do this. Git commands also do not prefix their progress messages with the name of the subcommand (although messages from a Git remote may be reported with the prefix "remote:"). To help bring all our commands' progress messages into alignment, we simply remove the "migrate:" prefix from the messages output by the "git lfs migrate" command. We have similarly altered the progress messages of the "git lfs prune" and "git lfs fetch" commands in previous commits in this PR.
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to use the methods of the Logger structure of the "log" package to output progress messages. This package was originally located within the "git/githistory" source directory, but was later moved and renamed to the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747. As part of the changes in PR git-lfs#2329, the methods of the Rewriter structure in the "githistory" package were updated to use the methods of the Logger and PercentageTask types in the "log" package to generate progress messages which began with the Git LFS subcommand name, for example, "migrate: Rewriting commits". Since that PR, multiple other informational and error messages output by the "git lfs migrate" command have been added, and while most begin with the "migrate:" prefix, not all do. For instance, running the "git lfs migrate import --verbose" command outputs commit SHAs and file paths with the prefix, but Git references and the mapping of their old and new SHAs without the prefix. Two other Git LFS commands, the "git lfs prune" and "git lfs fetch" commands, follow the same design and output their subcommand names as prefixes to their progress messages, but no other commands do this. Git commands also do not prefix their progress messages with the name of the subcommand (although messages from a Git remote may be reported with the prefix "remote:"). To help bring all our commands' progress messages into alignment, we simply remove the "migrate:" prefix from the messages output by the "git lfs migrate" command. We have similarly altered the progress messages of the "git lfs prune" and "git lfs fetch" commands in previous commits in this PR.
This pull request introduces a new package,
github.com/git-lfs/git-lfs/git/githistory/logfor Git-style progress based logging, like this:This pull request is based off the
migrate-subcommand-infobranch, since I wanted to use theinfo(1)subcommand to test out the logger in a real-world setting. Once #2313 is merged, I'll change the base of this pull request tomaster.The API works as follows: given a
*log.Loggerwith asink io.Writer, ask for one of two types of logging tasks:*WaitingTask(via(*log.Logger) Waiter()): a task for which the upper bound of items to process is not known. Logged asmessage: ...andmessage: ..., done.*PercentageTask(via(*log.Logger) Percentage()): a task for which the upper bound of items is known. Logged asmessage: 1% (1/100)andmessage: 100% (100/100), done.When a new task is created (via either
Waiter()orPercentage()) that task is queued by the logger and becomes the primary task being logged. Only one task may be logged at a time, though new tasks can be created before the existing task is completed. This, together, creates the two axioms of*log.Logger:The finer details of how this works are described in the
consume()logger function.One other note: this was originally designed as a utility for the
githistorypackage, but I think that it could have other uses throughout LFS. I'd like to eventually consider what making this a top-levelgithub.com/git-lfs/git-lfs/logpackage could look like./cc @git-lfs/core
/ref #2146, #2313