Skip to content

Conversation

@ttaylorr
Copy link
Contributor

This pull request introduces a new package, github.com/git-lfs/git-lfs/git/githistory/log for Git-style progress based logging, like this:

This pull request is based off the migrate-subcommand-info branch, since I wanted to use the info(1) subcommand to test out the logger in a real-world setting. Once #2313 is merged, I'll change the base of this pull request to master.

The API works as follows: given a *log.Logger with a sink io.Writer, ask for one of two types of logging tasks:

  • *WaitingTask (via (*log.Logger) Waiter()): a task for which the upper bound of items to process is not known. Logged as message: ... and message: ..., done.
  • *PercentageTask (via (*log.Logger) Percentage()): a task for which the upper bound of items is known. Logged as message: 1% (1/100) and message: 100% (100/100), done.

When a new task is created (via either Waiter() or Percentage()) that task is queued by the logger and becomes the primary task being logged. Only one task may be logged at a time, though new tasks can be created before the existing task is completed. This, together, creates the two axioms of *log.Logger:

  1. Only one task may be logged at once, and that task is logged until completion.
  2. "Infinite" tasks may be created before the current task is finished logging, creating a pseudo-infinite channel buffer.

The finer details of how this works are described in the consume() logger function.

One other note: this was originally designed as a utility for the githistory package, but I think that it could have other uses throughout LFS. I'd like to eventually consider what making this a top-level github.com/git-lfs/git-lfs/log package could look like.


/cc @git-lfs/core
/ref #2146, #2313

@ttaylorr ttaylorr added this to the v2.2.0 milestone Jun 15, 2017
@ttaylorr ttaylorr requested a review from technoweenie June 15, 2017 16:46
// queue is the incoming, unbuffered queue of tasks to enqueue.
queue chan Task
// tasks is the set of tasks to process.
tasks chan Task
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you skip queue and just use tasks, it should remove a lot of of complexity from consume(). The comment sounds like you intended for tasks to be buffered, but it's not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. I added the queue and tasks channel so that new tasks could be enqueued before the current task is finished processing.

Looking back through this, I realized that I removed the callsite in the code that originally necessitated this, but I think if we want to use this package as a general purpose logging utility for LFS, it's worthwhile to keep the extra complexity.

I added a test to ensure that this non-blocking behavior is in place. Before 930c151:

~/g/git-lfs (migrate-logger) $ go test -v ./git/githistory/log -run TestLoggerLogsMultipleTasksWithoutBlocking
=== RUN   TestLoggerLogsMultipleTasksWithoutBlocking
# ...

and with 930c151:

~/g/git-lfs (migrate-logger) $ go test -v ./git/githistory/log -run TestLoggerLogsMultipleTasksWithoutBlocking
=== RUN   TestLoggerLogsMultipleTasksWithoutBlocking
--- PASS: TestLoggerLogsMultipleTasksWithoutBlocking (0.00s)
PASS
ok      github.com/git-lfs/git-lfs/git/githistory/log   0.013s


var last string
for last = range task.Updates() {
l.logLine(last)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth watching your CPU on an intense write load. The existing spinner that fetch updates the terminal no more frequently than every 200ms. It's plenty time to represent progress to humans, and doesn't write to the terminal nearly as frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I tried to implement this with channels, but the code ended up being simpler with time arithmetic. As of fc1faca, you can specify a throttle which is the minimum amount of time that must pass between logs. Additional log updates will be dropped on the ground.

Even if updates aren't logged, however, the logger will still remember them, so if a PercentageTask only gets to 95% within the throttle window, it will always finish as: msg: 100% ..., done.

@ttaylorr ttaylorr changed the base branch from migrate-subcommand-info to master June 16, 2017 18:11
@ttaylorr ttaylorr merged commit 374a6ef into master Jun 19, 2017
@ttaylorr ttaylorr deleted the migrate-logger branch July 4, 2017 22:32
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 26, 2023
In commit e90d54d of PR git-lfs#2329 the
"git/githistory/log" package was added, including the PercentageTask
structure and associated methods.  Then in commit
122b765 of PR git-lfs#2610 the Entry()
method was added to provide support for verbose logging in the
"git lfs migrate impot" command.  However, the name of the
receiver argument for the Entry() method differs from that of
the other methods for the PercentageTask structure, so we update
it to match now, as well as fixing a typo which remains from the
original commit in PR git-lfs#2329.

Note that the "git/githistory/log" package was later renamed to the
"tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 26, 2023
The *Logger.consume() method in the "tasklog" package was first
introduced in commit e90d54d of
PR git-lfs#2329, and since that time has included logic to manage an
internal "pending" list of tasks; however, that logic is never
exercised because tasks are only appended to the "pending" array
when the "next" variable is non-nil, but that variable is
initialized as nil and can only be set to another value if the
"pending" array is non-empty.

As a result, we can remove both variables and the related logic
and simplify the consume() method as a result.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 26, 2023
In commit e90d54d of PR git-lfs#2329 the
"git/githistory/log" package was added, including the PercentageTask
structure and associated methods, but unlike the WaitingTask which
was added at the same time, no Complete() method was defined for
the PercentageTask structure.

(Note that the "git/githistory/log" package was later renamed to the
"tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.)

Other task types such as the ListTask and SimpleTask structures
(introduced in commit 31ffeb9 of
PR git-lfs#2335 and commit 7a760b6 of PR git-lfs#2756,
respectively) provide a Complete() method, and the Meter task type
of the "tq" package, which implemented the Task interface in commit
7c0f9e2 of PR git-lfs#2732, provides an
equivalent Finish() method.  These methods allow the caller to
explicitly close the channel returned by the Updates() method,
on which the anonymous goroutine started by the Logger.consume() method
for the task is waiting, in a "range" loop on the channel in the
Logger.logTask() method.

One key use of the PercentageTask structure is in the methods of
the Rewriter structure from the "git/githistory" package.  It is
initialized with the number of commits to be rewritten during
a "git lfs migrate" command's traversal of a Git repository's history.
As each commit is rewritten, the Count() method is called to
increment the percentage completed value.  When every commit has
been rewritten, the count reaches the expected total, and the
Count() method closes the channel, thus allowing the task receiving
updates to finish and exit its goroutine.

Under exceptional circumstances, though, the Rewrite() method of
the Rewriter structure may never finish iterating through all
the expected commits, and return in error or via a panic call.
When this happens, the goroutine waiting on the PercentageTask's
updates channel can never exit, leading to a hung process which
never fully exits.

In order to allow the Rewriter's Rewrite() method to define a
deferred function which will always be called when it returns,
under any circumstances, we add a Complete() method for the
PercentageTask structure which sets the number of completed elements
to the expected total in an atomic swap, and then closes the
updates channel if the number of completed elements was previously
less than the expected total.

We also add a test to validate this new method's behaviour, and
update the existing TestPercentageTaskCallsDoneWhenComplete()
function to also confirm that calling the new Complete() method
after the number of completed elements has reached the expected
total does not cause a second attempt to close the updates channel.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 26, 2023
The *Logger.consume() method in the "tasklog" package was first
introduced in commit e90d54d of
PR git-lfs#2329, and since that time has included logic to manage an
internal "pending" list of tasks; however, that logic is never
exercised because tasks are only appended to the "pending" array
when the "next" variable is non-nil, but that variable is
initialized as nil and can only be set to another value if the
"pending" array is non-empty.

As a result, we can remove both variables and the related logic
and simplify the consume() method as a result.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request May 26, 2023
In commit e90d54d of PR git-lfs#2329 the
"git/githistory/log" package was added, including the PercentageTask
structure and associated methods, but unlike the WaitingTask which
was added at the same time, no Complete() method was defined for
the PercentageTask structure.

(Note that the "git/githistory/log" package was later renamed to the
"tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.)

Other task types such as the ListTask and SimpleTask structures
(introduced in commit 31ffeb9 of
PR git-lfs#2335 and commit 7a760b6 of PR git-lfs#2756,
respectively) provide a Complete() method, and the Meter task type
of the "tq" package, which implemented the Task interface in commit
7c0f9e2 of PR git-lfs#2732, provides an
equivalent Finish() method.  These methods allow the caller to
explicitly close the channel returned by the Updates() method,
on which the anonymous goroutine started by the Logger.consume() method
for the task is waiting, in a "range" loop on the channel in the
Logger.logTask() method.

One key use of the PercentageTask structure is in the methods of
the Rewriter structure from the "git/githistory" package.  It is
initialized with the number of commits to be rewritten during
a "git lfs migrate" command's traversal of a Git repository's history.
As each commit is rewritten, the Count() method is called to
increment the percentage completed value.  When every commit has
been rewritten, the count reaches the expected total, and the
Count() method closes the channel, thus allowing the task receiving
updates to finish and exit its goroutine.

Under exceptional circumstances, though, the Rewrite() method of
the Rewriter structure may never finish iterating through all
the expected commits, and return in error or via a panic call.
When this happens, the goroutine waiting on the PercentageTask's
updates channel can never exit, leading to a hung process which
never fully exits.

In order to allow the Rewriter's Rewrite() method to define a
deferred function which will always be called when it returns,
under any circumstances, we add a Complete() method for the
PercentageTask structure which sets the number of completed elements
to the expected total in an atomic swap, and then closes the
updates channel if the number of completed elements was previously
less than the expected total.

We also add a test to validate this new method's behaviour, and
update the existing TestPercentageTaskCallsDoneWhenComplete()
function to also confirm that calling the new Complete() method
after the number of completed elements has reached the expected
total does not cause a second attempt to close the updates channel.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 25, 2025
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to
use the methods of the Logger structure of the "log" package to output
progress messages.  This package was originally located within the
"git/githistory" source directory, but was later moved and renamed to
the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747.

As part of the changes in PR git-lfs#2329, the methods of the Rewriter
structure in the "githistory" package were updated to use the methods
of the Logger and PercentageTask types in the "log" package to generate
progress messages which began with the Git LFS subcommand name, for
example, "migrate: Rewriting commits".

Since that PR, multiple other informational and error messages output
by the "git lfs migrate" command have been added, and while most begin
with the "migrate:" prefix, not all do.  For instance, running the
"git lfs migrate import --verbose" command outputs commit SHAs and
file paths with the prefix, but Git references and the mapping of their
old and new SHAs without the prefix.

Two other Git LFS commands, the "git lfs prune" and "git lfs fetch"
commands, follow the same design and output their subcommand names as
prefixes to their progress messages, but no other commands do this.
Git commands also do not prefix their progress messages with the name
of the subcommand (although messages from a Git remote may be reported
with the prefix "remote:").

To help bring all our commands' progress messages into alignment, we
simply remove the "migrate:" prefix from the messages output by the
"git lfs migrate" command.  We have similarly altered the progress
messages of the "git lfs prune" and "git lfs fetch" commands in
previous commits in this PR.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 25, 2025
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to
use the methods of the Logger structure of the "log" package to output
progress messages.  This package was originally located within the
"git/githistory" source directory, but was later moved and renamed to
the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747.

As part of the changes in PR git-lfs#2329, the methods of the Rewriter
structure in the "githistory" package were updated to use the methods
of the Logger and PercentageTask types in the "log" package to generate
progress messages which began with the Git LFS subcommand name, for
example, "migrate: Rewriting commits".

Since that PR, multiple other informational and error messages output
by the "git lfs migrate" command have been added, and while most begin
with the "migrate:" prefix, not all do.  For instance, running the
"git lfs migrate import --verbose" command outputs commit SHAs and
file paths with the prefix, but Git references and the mapping of their
old and new SHAs without the prefix.

Two other Git LFS commands, the "git lfs prune" and "git lfs fetch"
commands, follow the same design and output their subcommand names as
prefixes to their progress messages, but no other commands do this.
Git commands also do not prefix their progress messages with the name
of the subcommand (although messages from a Git remote may be reported
with the prefix "remote:").

To help bring all our commands' progress messages into alignment, we
simply remove the "migrate:" prefix from the messages output by the
"git lfs migrate" command.  We have similarly altered the progress
messages of the "git lfs prune" and "git lfs fetch" commands in
previous commits in this PR.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Feb 25, 2025
In PR git-lfs#2329 the output of the "git lfs migrate" command was revised to
use the methods of the Logger structure of the "log" package to output
progress messages.  This package was originally located within the
"git/githistory" source directory, but was later moved and renamed to
the "tlog" package and then to the current "tasklog" package in PR git-lfs#2747.

As part of the changes in PR git-lfs#2329, the methods of the Rewriter
structure in the "githistory" package were updated to use the methods
of the Logger and PercentageTask types in the "log" package to generate
progress messages which began with the Git LFS subcommand name, for
example, "migrate: Rewriting commits".

Since that PR, multiple other informational and error messages output
by the "git lfs migrate" command have been added, and while most begin
with the "migrate:" prefix, not all do.  For instance, running the
"git lfs migrate import --verbose" command outputs commit SHAs and
file paths with the prefix, but Git references and the mapping of their
old and new SHAs without the prefix.

Two other Git LFS commands, the "git lfs prune" and "git lfs fetch"
commands, follow the same design and output their subcommand names as
prefixes to their progress messages, but no other commands do this.
Git commands also do not prefix their progress messages with the name
of the subcommand (although messages from a Git remote may be reported
with the prefix "remote:").

To help bring all our commands' progress messages into alignment, we
simply remove the "migrate:" prefix from the messages output by the
"git lfs migrate" command.  We have similarly altered the progress
messages of the "git lfs prune" and "git lfs fetch" commands in
previous commits in this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants