Skip to content

Conversation

@chrisd8088
Copy link
Member

@chrisd8088 chrisd8088 commented May 27, 2023

This PR revises our handling of structures that implement the Task interface of the tasklog package so that in all cases we always immediately register a deferred call to their Complete() method after allocating and initializing a new structure. This ensures that the channels created for the structures are always closed, which prevents Logger instances waiting on those channels from hanging indefinitely following a panic or other exceptional error condition.

As well, we simplify some instances of the construction of these types of structures, and clean up some details in the implementation of the tasklog package.


In the case of the PercentageTask structure, we first add a new Complete() method, which does not currently exist (although it was discussed in #2757 (comment), in relation to the git-lfs-prune(1) command). We can then register a call to our new *PercentageTask.Complete() method with defer following the creation of a PercentageTask structure, such as occurs in the Rewrite() method of the Rewrite structure in the git/githistory package.

This particular use case resolves one of the problems reported in #5332, where after a panic occurs (due to an unrelated bug while processing Git macro attributes) in the Git history rewriting phase of a git lfs migrate import --fixup command, the command hangs because the deferred *Logger.Close() call of the "main" goroutine is stuck waiting on a semaphore which will never decrement to zero. Under normal circumstances the *Rewriter.Rewrite() method would iterate through all the commits selected for rewriting, and so the final call to the *PercentageTask.Count() would cause the task's channel to be closed, thus notifying the anonymous function that is waiting on the channel that is should cease waiting and begin to exit. But if the *Rewriter.Rewrite() method panics during its processing of commits, it never closes the channel of the PercentageTask. By adding an explicitly deferred call to our new Complete() method for that task, though, we can guarantee that the channel will always be closed by the time the *Rewriter.Rewrite() method returns.

(The macro attribute processing bug from #5332, and the discrepancy between how the import and info subcommands of the git-lfs-migrate(1) command manage the cleanup of their *Logger structures, will be dealt with in subsequent PRs.)

Also, as in the case of the SimpleTask structure used by the git-lfs-prune(1) command's prune() function, we refactor our code so as to introduce small utility functions which encapsulate the initialization and cleanup of structures implementing the Task interface. This allows us to always establish a deferred call to the structure's Cleanup() method immediately after initializing and enqueuing the structure, so the return from the utility function guarantees the structure's channel is closed in all instances.

This pattern was discussed in #2757 (comment) as a desirable one to follow for all resource allocation and deallocation, and corresponds to the resource management approach used elsewhere, such as in the UpdateRefs() method of the refUpdater structure in the git/githistory package, where a ListTask is created and a matching deferred call to its Complete() method is then defined, and in the scanAll() and pointersToFetchForRefs() functions of the git-lfs-fetch(1) command, where SimpleTask structures are created and a deferred call to their Complete() method is immediately established. (Note, though, that we refine these functions by making use of the *Logger.Simple() method, which was introduced in commit 0cad488 of PR #2767 but is not used anywhere at present.)


This PR also fixes some minor issues relating to the tasklog package, removing unused or unnecessary code and correcting code comments, and simplifying the initialization of some SimpleTask structures with the *Logger.Simple() method, as mentioned above.

One notable change is the removal of the pending and next variables and the code to populate and process them from the *Logger.consume() method of the tasklog package. This method was introduced in the initial form of the tasklog package in commit e90d54d of PR #2329, and included the internal pending slice and the logic to push and pop tasks from it in much the same form as that code still exists today.

However, that logic which manages the pending slice is never exercised because tasks are only appended in a conditional block which executes if the next variable is non-nil, but that variable is initialized as nil and can only be set to another value if the pending slice is non-empty. As a result, neither condition is ever true, and so only the in the conditional block for when the next variable is nil ever executes, and we can remove the rest.


It may be easiest to review this PR commit-by-commit, as each commit has a detailed description and should be logically independent and bisectable, but it should also be possible to review this PR as a single diff.

/cc #5332 where the git lfs migrate import --fixup hang behaviour was reported.
/cc @ttaylorr for insight on the design of the tasklog and its buffering logic.

chrisd8088 added 10 commits May 26, 2023 01:36
We can update several code comments in the "tasklog" package to
reflect their corresponding functions and fix typos.
In commit e90d54d of PR git-lfs#2329 the
"git/githistory/log" package was added, including the PercentageTask
structure and associated methods.  Then in commit
122b765 of PR git-lfs#2610 the Entry()
method was added to provide support for verbose logging in the
"git lfs migrate impot" command.  However, the name of the
receiver argument for the Entry() method differs from that of
the other methods for the PercentageTask structure, so we update
it to match now, as well as fixing a typo which remains from the
original commit in PR git-lfs#2329.

Note that the "git/githistory/log" package was later renamed to the
"tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.
In commit 3129b9a of PR git-lfs#2537 the
"githistory" package was updated so the WithLogger global option
variable now required a *log.Logger argument instead of calling
log.NewLogger().  This was done so the "git lfs migrate" commands
could pass in a previously initialized logger to which the commands
had already written log messages; that change was introduced in
PR git-lfs#2538.

In the same commit of PR git-lfs#2537, a WithLoggerTo global option function
was added to preserve the existing behaviour.  However, this option
has never been utilized, and so to reduce unnecessary complexity
we just remove it now.

Note that the use of functional options in the Git LFS project
stems from PR git-lfs#2295 and is inspired by the design proposed in several
articles from 2014; see, for reference:

https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis
https://commandcenter.blogspot.com/2014/01/self-referential-functions-and-design.html
The SimpleTask structure of the "tasklog" package, which was introduced
in commit 7a760b6 of PR git-lfs#2756, is used
at several points in the implementation of the "git lfs fetch" and
"git lfs prune" commands, and in each case is constructed with a
call to the NewSimpleTask() function, and then passed to a Logger
structure's Enqueue() method.

However, in commit 0cad488 of PR git-lfs#2767,
a Logger.Simple() method was added to both create a new SimpleTask
structure and pass it to a Logger structure's Enqueue() method.  This
Logger.Simple() method was added to be consistent with other existing
similar methods such as Logger.List() and Logger.Percentage().

These other methods are used consistently throughout our code
to create and enqueue tasks, so we now update the locations where
we call NewSimpleTask() to use the Logger.Simple() method instead.

This also provides an opportunity to tidy some of the surrounding
code and whitespace.
In commit f100293 of PR git-lfs#2732 a local
maxInt() function was added to the "git/githistory/log" package in order
to avoid a circular dependancy on the "tools" package, which provides
a common MaxInt() implementation.  See, for reference, the discussion
in this PR comment:

git-lfs@f100293#r151551639

The "git/githistory/log" package was subsequently renamed to
the "tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.

At present, though, there is no cyclic dependency which prevents
the use of the standard tools.MaxInt() function in the "tasklog"
package, so we revert to its use now.
The *Logger.consume() method in the "tasklog" package was first
introduced in commit e90d54d of
PR git-lfs#2329, and since that time has included logic to manage an
internal "pending" list of tasks; however, that logic is never
exercised because tasks are only appended to the "pending" array
when the "next" variable is non-nil, but that variable is
initialized as nil and can only be set to another value if the
"pending" array is non-empty.

As a result, we can remove both variables and the related logic
and simplify the consume() method as a result.
In commit e90d54d of PR git-lfs#2329 the
"git/githistory/log" package was added, including the PercentageTask
structure and associated methods, but unlike the WaitingTask which
was added at the same time, no Complete() method was defined for
the PercentageTask structure.

(Note that the "git/githistory/log" package was later renamed to the
"tasklog" package in commits b9ab79e
and 45c580e of PR git-lfs#2747.)

Other task types such as the ListTask and SimpleTask structures
(introduced in commit 31ffeb9 of
PR git-lfs#2335 and commit 7a760b6 of PR git-lfs#2756,
respectively) provide a Complete() method, and the Meter task type
of the "tq" package, which implemented the Task interface in commit
7c0f9e2 of PR git-lfs#2732, provides an
equivalent Finish() method.  These methods allow the caller to
explicitly close the channel returned by the Updates() method,
on which the anonymous goroutine started by the Logger.consume() method
for the task is waiting, in a "range" loop on the channel in the
Logger.logTask() method.

One key use of the PercentageTask structure is in the methods of
the Rewriter structure from the "git/githistory" package.  It is
initialized with the number of commits to be rewritten during
a "git lfs migrate" command's traversal of a Git repository's history.
As each commit is rewritten, the Count() method is called to
increment the percentage completed value.  When every commit has
been rewritten, the count reaches the expected total, and the
Count() method closes the channel, thus allowing the task receiving
updates to finish and exit its goroutine.

Under exceptional circumstances, though, the Rewrite() method of
the Rewriter structure may never finish iterating through all
the expected commits, and return in error or via a panic call.
When this happens, the goroutine waiting on the PercentageTask's
updates channel can never exit, leading to a hung process which
never fully exits.

In order to allow the Rewriter's Rewrite() method to define a
deferred function which will always be called when it returns,
under any circumstances, we add a Complete() method for the
PercentageTask structure which sets the number of completed elements
to the expected total in an atomic swap, and then closes the
updates channel if the number of completed elements was previously
less than the expected total.

We also add a test to validate this new method's behaviour, and
update the existing TestPercentageTaskCallsDoneWhenComplete()
function to also confirm that calling the new Complete() method
after the number of completed elements has reached the expected
total does not cause a second attempt to close the updates channel.
In a prior commit in this PR, we added a Complete() method for
the PercentageTask structure which sets the number of completed
elements of the task to the task's expected total in an atomic swap,
and then closes the task's updates channel if the number of completed
elements was less than the expected total.

As noted in that same commit, the PercentageTask structure is used
in the methods of the Rewriter structure from the "git/githistory"
package.  It is initialized with the number of commits to be rewritten
during a "git lfs migrate" command's traversal of a Git repository's
history.  As each commit is rewritten, the Count() method is called to
increment the percentage completed value.  When every commit has been
rewritten, the count reaches the expected total, and the Count() method
closes the channel, thus allowing the task receiving updates to finish
and exit its goroutine.

Under exceptional circumstances, though, the Rewrite() method of
the Rewriter structure may never finish iterating through all
the expected commits, and return in error or via a panic call.
When this happens, the anonymous goroutine started by the consume()
method of the Logger structure in "tasklog" package that is waiting
on the PercentageTask's updates channel can never exit, leading to
a hung process which never fully exits.

We can resolve this problem now by defining a deferred call to the
PercentageTask's new Complete() method after we create the task
structure in the Rewrite() method of the Rewriter structure.
As a result, even if the Rewrite() method suffers a panic() or
other error condition causing an early return, the PercentageTask's
updates channel is closed and the waiting goroutine will complete
and exit normally.

This usage corresponds to the resource management approach used
elsewhere, such as in the scanAll() and pointersToFetchForRefs()
functions of the "git lfs fetch" command, where SimpleTask objects
are created and a deferred call to their Complete() method is
immediately established.  This technique is also used in the
UpdateRefs() method of the refUpdater structure in the "git/githistory"
package, where a ListTask is created and a matching deferred call
to its Complete() method is then defined.
In a prior commit in this PR, we added a Complete() method for
the PercentageTask structure which sets the number of completed
elements of the task to the task's expected total in an atomic swap,
and then closes the task's updates channel if the number of completed
elements was less than the expected total.

As noted in that same commit, this allows us to establish a deferred
call to the Complete() method after creating a PercentageTask structure,
akin to how we handle other "tasklog" package task structures such as
ListTask and WaiterTask.  Use of this idiom ensures that even if the
calling function returns due to an unexpected error or panic condition,
the anonymous goroutine started by the consume() method of the Logger
structure that is waiting on the task's updates channel will exit
cleanly, preventing a hung process.

We therefore now update the other use of the PercentageTask structure,
in the "git lfs prune" command's pruneDeleteFiles() function, to
follow the same approach as we have already implemented for the
Rewrite() method of the Rewriter structure in the "git/githistory"
package.
In the implementation of several commands we create structures with
the Task interface from the "tasklog" package, and enqueue them to be
reported by an anonymous goroutine created by the Logger structure's
consume() method for each task.  These goroutines only exit when the
channel returned by the the Task's Updates() method is closed.

In order to ensure that we always close the channel associated with
each task, we refactor the creation and use of these tasks into small
utility functions.  We can then establish a deferred call to the
tasks' Complete() methods immediately after creating the tasks.
As these deferred calls will always be executed when returning from
the new utility functions, even under exceptional conditions, the
waiting goroutines will never fail to exit and thus cause the entire
program to hang.

This usage corresponds to the resource management approach used
elsewhere, such as in the scanAll() and pointersToFetchForRefs()
functions of the "git lfs fetch" command, where SimpleTask objects
are created and a deferred call to their Complete() method is
immediately established.  This technique is also used in the
UpdateRefs() method of the refUpdater structure in the "git/githistory"
package, where a ListTask is created and a matching deferred call
to its Complete() method is then defined, and in previous commits
in this PR we updated the locations in our code where we create
PercentageTask structures to follow the same pattern too.
@chrisd8088 chrisd8088 requested a review from a team as a code owner May 27, 2023 03:19
@chrisd8088 chrisd8088 merged commit 05926b1 into git-lfs:main May 30, 2023
@chrisd8088 chrisd8088 deleted the update-tasklog-complete-all-tasks branch May 30, 2023 16:09
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 1, 2023
Support for Git macro attributes was added in a series of commits
in PR git-lfs#3391, including commit 9d3e52d,
where the Line structure of the "git/gitattr" package was updated
to include a Macro element which would be set non-nil for macro
definition lines in a .gitattributes file file, while the existing
Pattern element would be set non-nil for all other lines.

The "git lfs track" command, among others, was then adjusted to
create a MacroProcessor structure (from the same "git/gitattr" package)
and call its ProcessLines() method to resolve any macro references
and thus convert the "raw" parsed Line structures into a set for
which the Pattern element was always non-nil, and no Macro elements
appeared.

Later, the "git lfs fsck" command gained the ability to process
macro definitions in .gitattributes files, in PR git-lfs#4525.

However, the "git lfs migrate import" command was not adjusted,
specifically in the implementation of its "--fixup" option, which
initializes a Tree structure (also of the "git/gitattr" package) for
the root tree of each commit in a repository's history using the
package's New() function.  This function traverses all the trees
in the hierarchy and finds and parses all the .gitattributes files
in them.  Then, when the command visits each file within the
commit's tree using the Rewrite() method of the Rewriter structure
in the "git/githistory" package, it calls the (*Tree).Applied()
method to match the file's path against any applicable Git attributes,
to see if the file should be treated as a Git LFS object.

This lack of support for macro attributes in the "git lfs migrate
import --fixup" command was then propagated to the "git lfs migrate
info --fixup" command in commit 4800c5e
of PR git-lfs#4501, when the "git lfs migrate info" command was updated to
respect the --fixup option.

As a result, both of these commands (when used with the --fixup option)
would panic if they encountered a .gitattributes file with any macro
definition, as they would call the (*Tree).Applied() method and it
would attempt to access the nil Pattern element of the lines with
non-nil Macro elements.  (Prior to the changes in commit
c374d1f of PR git-lfs#5375 the "git lfs migrate
import --fixup" command would then stall indefinitely, but it now
also exits after the panic condition.)  These problems were reported
in issue git-lfs#5332.

To resolve this problem and avoid similar ones in the future, we
refactor the Line structure into a Line interface, which only provides
a Attrs() method to retrieve a slice of Attr attributes, and no other
methods.  We then also define two additional interfaces, each of which
embeds the Line interface, PatternLine and MacroLine, with corresponding
getter methods for their respective elements.

The ParseLine() function of the "git/gitattr" package now returns a
slice of generic Line types, each of which is either a PatternLine
or a MacroLine, but never both.  Callers like the Applied() method of
the Tree structure therefore need to perform type assertions or
switches to determine which type of Line they are handling, which
ensures they always access the line's data through safe methods.

We then update the Go tests for the "git/gitattr" package as
appropriate, and also add two tests each to the t/t-migrate-fixup.sh
and t/t-migrate-import.sh test suites.  All four of these new
shell tests fail without the changes in this commit.  In particular,
several of these tests make sure to run the "git lfs migrate" commands
outside of any shell pipeline so the test will fail if the command
panics and produces no output, even if no output is the expected
condition for a successful execution of the command.
chrisd8088 added a commit to chrisd8088/git-lfs that referenced this pull request Jun 1, 2023
Support for Git macro attributes was added in a series of commits
in PR git-lfs#3391, including commit 9d3e52d,
where the Line structure of the "git/gitattr" package was updated
to include a Macro element which would be set non-nil for macro
definition lines in a .gitattributes file file, while the existing
Pattern element would be set non-nil for all other lines.

The "git lfs track" command, among others, was then adjusted to
create a MacroProcessor structure (from the same "git/gitattr" package)
and call its ProcessLines() method to resolve any macro references
and thus convert the "raw" parsed Line structures into a set for
which the Pattern element was always non-nil, and no Macro elements
appeared.

Later, the "git lfs fsck" command gained the ability to process
macro definitions in .gitattributes files, in PR git-lfs#4525.

However, the "git lfs migrate import" command was not adjusted,
specifically in the implementation of its "--fixup" option, which
initializes a Tree structure (also of the "git/gitattr" package) for
the root tree of each commit in a repository's history using the
package's New() function.  This function traverses all the trees
in the hierarchy and finds and parses all the .gitattributes files
in them.  Then, when the command visits each file within the
commit's tree using the Rewrite() method of the Rewriter structure
in the "git/githistory" package, it calls the (*Tree).Applied()
method to match the file's path against any applicable Git attributes,
to see if the file should be treated as a Git LFS object.

This lack of support for macro attributes in the "git lfs migrate
import --fixup" command was then propagated to the "git lfs migrate
info --fixup" command in commit 4800c5e
of PR git-lfs#4501, when the "git lfs migrate info" command was updated to
respect the --fixup option.

As a result, both of these commands (when used with the --fixup option)
panic if they encounter a .gitattributes file with any macro
definitions, as they call the (*Tree).Applied() method and it
attempts to access the nil Pattern element of the lines with
non-nil Macro elements.  (Prior to the changes in commit
c374d1f of PR git-lfs#5375 the "git lfs migrate
import --fixup" command would then stall indefinitely, but it now
also exits after the panic condition.)  These problems were reported
in issue git-lfs#5332.

To resolve this problem and avoid similar ones in the future, we
refactor the Line structure into a Line interface, which only provides
an Attrs() method to retrieve a slice of Attr attributes, and no other
methods.  We then also define two additional interfaces, each of which
embeds the Line interface, PatternLine and MacroLine, with corresponding
getter methods for their respective elements.

The ParseLine() function of the "git/gitattr" package now returns a
slice of generic Line types, each of which is either a PatternLine
or a MacroLine, but never both.  Callers like the Applied() method of
the Tree structure therefore need to perform type assertions or
switches to determine which type of Line they are handling, which
ensures they always access the line's data through safe methods.

We then update the Go tests for the "git/gitattr" package as
appropriate, and also add two tests each to the t/t-migrate-fixup.sh
and t/t-migrate-import.sh test suites.  All four of these new
shell tests fail without the changes in this commit.  In particular,
several of these tests make sure to run the "git lfs migrate" commands
outside of any shell pipeline so the test will fail if the command
panics and produces no output, even if no output is the expected
condition for a successful execution of the command.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants