Skip to content

Conversation

@chrisd8088
Copy link
Member

For the past several months our CI test suite has experienced intermittent failures when running on Windows, which are caused when no data is returned from the simulated /dev/urandom special file in Git Bash.

While the cause of this regression in the Git Bash/MSYS2 environment is unclear, to avoid this and any similar issues in the future, we introduce a new test helper program named lfstest-genrandom, and then use it to generate the contents of all the test files we need throughout our test suite.

Further, we make use of this new utility to generate random filenames and directory names, which permits us to replace calls to mktemp(1) and uuidgen(1) (which doesn't exist in Git Bash) in a number of our tests.

Each commit in this PR has a detailed description; however, it should also be possible to review this PR as a single patch diff.

CI Failures on Windows

In the most recent CI job run from the merge of PR #5866 earlier this week, we can see the consequences when /dev/urandom does not return any data.

As one example among many other test failures, the clean stdin test in our t/t-clean.sh test script does not succeed. The test creates two files filled with data from /dev/urandom and calculates their Git LFS OIDs. In the error log from the failed test, we can see these OIDs were both e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, which is the SHA-256 hash of an empty string. When the test then runs git lfs clean on the files, that command produces no output (which is expected in the case of zero-byte input files), so when the test compares the OIDs to the output of git lfs clean, they don't match and the test fails.

stderr log
# -- stderr --
#     + set -e
#     + reponame=clean-over-stdin
#     + git init clean-over-stdin
#     + cd clean-over-stdin
#     + base64
#     + head -c 1024
#     + base64
#     + head -c 2048
#     ++ calc_oid_file small.dat
#     ++ sha256sum small.dat
#     ++ cut -f 1 -d ' '
#     + expected_small=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
#     ++ calc_oid_file large.dat
#     ++ sha256sum large.dat
#     ++ cut -f 1 -d ' '
#     + expected_large=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
#     ++ git lfs clean
#     ++ grep oid
#     ++ cut -d : -f 2
#     + actual_small=
#     ++ git lfs clean
#     ++ grep oid
#     ++ cut -d : -f 2
#     + actual_large=
#     + '[' e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 '!=' '' ']'
#     + echo 'fatal: expected small OID of: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, got: '
#     fatal: expected small OID of: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855, got: 
#     + exit 1
#     + test_status=1

New Test Helper Program

Our new lfstest-genrandom program takes an optional command-line argument specifying the number of random bytes to return, as well as optional --base64 or --base64url flag arguments.

When no size argument is provided, the program returns an effectively endless stream of random data, following the model of the test-genrandom helper program in the Git test suite. Otherwise the program returns exactly the specified number of bytes of random data.

When the --base64 argument is given, the output is encoded in regular Base64, as defined by RFC 4648, including / and + characters. If the --base64url argument is used instead, the output is encoded in the alternate Base64 encoding format defined in RFC 4648, which uses the - and _ characters. In both cases, no Base64 padding is added, and if a size argument was provided, the output is truncated to the specified size after encoding, not before.

The program follows the example of several of our other test helper programs, such as lfstest-realpath and lfstest-testutils, in using distinct exit codes (specifically, the values 2 and above) for each error condition.

With this helper program available, we can then replace all our shell pipelines of the form base64 < /dev/urandom | head -c 120 >foo.txt with equivalents likelfstest-genrandom --base64 120 >foo.txt.

We can also rewrite a handful of calls to mktemp(1) and uuidgen(1), simplifying them and also ensuring we always use data of the expected length and type.

In particular, the prune verify large numbers of refs test in our t/t-prune.sh script calls uuidgen(1) to create random Git LFS object data; however, that command is not available in the Git Bash environment on Windows, so in practice the test uses empty data. While that test's success is not conditional on having valid Git LFS object data, having invalid data is not an ideal practice, and we can easily correct this issue now using the lfstest-genrandom program.

Temporary Directory Names

Both our t/t-migrate-export.sh and t/t-migrate-import.sh test scripts make calls to the mktemp(1) command to create temporary directories. However, they do not remove these after the tests complete, in part because the directories are created outside of the location specified by our GIT_LFS_TEST_DIR variable (but see the notes below on this point).

We can replace these calls to mktemp(1) with the use of our lfstest-genrandom program and thus ensure the directories we create have paths which begin with the path stored in the GIT_LFS_TEST_DIR variable, so at least they will be removed if and when that directory is deleted. Note, though, that due to the fact that we do not export that variable, and because each test script runs in its own shell session, each script creates another temporary directory for its own private GIT_LFS_TEST_DIR variable. Moreover, we do not consistently remove these directories after the test suites are run, as we never check or make use of the yes value we store in the RM_GIT_LFS_TEST_DIR variable. We expect to address both issues in subsequent PRs.

For now, one additional change we do make is to fix the path generated on macOS by our sole remaining call to mktemp(1). On macOS, the BSD version of mktemp(1) treats the X characters in our directory name template as literals, leading to names like git-lfs_TEMP.XXXXXX.y3IwCMwm instead of git-lfs_TEMP.y3IwCMwm. (This version of the command treats the argument following the -t option as a prefix rather than a template, unlike the GNU version.) So we simply set our TEMPDIR_PREFIX to git-lfs_TEMP on macOS, which ensures we are returned directory names of the form we intend, without spurious X characters.

Expected and Unrelated CI Failures

Note that we expect our Build with specific Go CI job to fail at the moment, due to the recent release of Go version 1.23. We install the latest version of the goimports package, and the installation fails in the Build with specific Go CI job because the x/tools module requires Go 1.22 as of commit golang/tools@70f5626, and we are still using Go 1.21 for that CI job. We will address this issue in a subsequent PR.

We use several techniques in our test suite to acquire random data
and filenames, of which the most prevalent one is to read from the
/dev/urandom special file.  Recently, this appears to no longer
work reliably when our CI jobs are run on Windows by our GitHub Actions
workflows, possibly due to an upstream change in Git Bash, MSYS2,
Cygwin, or for some other reason.

We also use the uuidgen(1) command in one test in our t/t-prune.sh
test script to populate Git LFS object content.  However, that command
is not installed with Git Bash, so on Windows this data is simply empty;
while the test still passes, this is less than ideal.

We already have a number of helper programs written in Go that provide
platform-agnostic utilities to our test suite, and Go's crypto/rand
package can be used to generate random data on all the platforms
we support.  On Windows in particular it uses the ProcessPrng API,
since /dev/urandom does not exist on that platform.  (We are only able
to make use of /dev/urandom on Windows because a simulation of it is
provided by the MSYS2 environment, and Git Bash, which we use to run
our test suite, is built on MSYS2.)

Therefore we add a new lfstest-genrandom program for our test suite
which uses the crypto/rand Go package to return strings of random data.
If a size is provided as a command-line argument, it returns that
number of random bytes; otherwise, it returns an effectively endless
stream of data.  This parallels the design of the Git test suite's
test-genrandom helper:

  https://github.com/git/git/blob/a7dae3bdc8b516d36f630b12bb01e853a667e0d9/t/helper/test-genrandom.c

When the --base64 command-line flag is specified, our new program returns
data encoded using the regular Base64 specification from RFC 4648.  When
the --base64url flag is specified instead, the returned data is encoded
with the alternate Base64 specification from RFC 4648 that is suitable
for use in filenames and URLs.  In this case, slash and plus characters
are avoided in favour of underscore and dash characters.  With either
flag, the encoded data has no Base64 padding, and if a size was provided
on the command line, the data will be truncated to exactly that number
of bytes after encoding (rather than being truncated before encoding,
which would produce extra output bytes).

While we could choose to only produce the URL-compatible form of Base64
data, we would have to rewrite a number of tests which expect to be
able to read the random data we've written into a test file and then
pass it to our calc_oid() shell function, which calls the Bash builtin
command "printf" on the data.  If the data happens to begin with a dash
character, "printf" would interpret it as a flag argument and might fail.
It is therefore more convenient in these cases to use the regular form
of Base64 encoding.

Conversely, if we only generated data encoded in the regular form of
Base64, we would not be able to use it in the several places where
we generate random directory names without also passing the data through
something like the sha256sum(1) program, as we do now.  By making
the URL-compatible encoding available directly from our new utility
program, we can simplify these portions of our test suite when, in a
subsequent commit in this PR, we refactor our test's shell functions
to make use of our new program.

Note that we follow the example of several of our other test helper
programs, such as lfstest-realpath and lfstest-testutils, in using
distinct exit codes (specifically, the values 2 and above) for each
error condition.
In a previous commit in this PR we introduced a new utility program
for our test suite, lfstest-genrandom, which reliably produces random
data on all the platforms we support, either as raw binary data or
using one of several forms of Base64 encoding.

We now make use of that program throughout our test suite, in place
of reading from /dev/urandom directly.  (Note that because the
lfstest-genrandom program uses the Go crypto/rand package to collect
the random data, on macOS and Linux systems the ultimate source of the
data will still be /dev/urandom, since the Go package reads from
that special file on those platforms.  On Windows, however, the Go
package makes use of the native Windows ProcessPrng API, so we will
no longer be dependent on the Git Bash environment to provide a
simulated /dev/urandom on Windows.)

In most cases, we pass the --base64 flag to our lfstest-genrandom
program, which allows us to eliminate many calls to the base64(1)
command.  We can also pass the exact size of the returned data
we desire as a command-line argument, which eliminates the need
to pipe the data through the head(1) command.

In two instances, in the remove_and_create_local_repo() and
remove_and_create_remote_repo() functions in our t/fixtures/migrate.sh
file, we use the --base64url flag so that we can use the returned
strings as directory name suffixes without further processing.
Previously, we had to pipe the data through base64(1), head(1),
sha256sum(1), and cut(1) in order to ensure we had a random string
safe for use in a directory name.

And in the setup_local_branch_with_special_character_files() function
in our t/fixtures/migrate.sh file we don't pass any flags to the
lfstest-genrandom program as we want raw binary data for the test
files created by that function.
In previous commits in this PR we introduced a new utility program
for our test suite, lfstest-genrandom, which reliably produces random
data on all the platforms we support (in several forms of Base64
encoding, or as raw binary data), and then converted much of our test
suite to make use of this new program.

As noted in one of those commits, one test in our t/t-prune.sh test
script makes use of the uuidgen(1) command to populate some Git LFS
objects with arbitrary text, but as this command does not exist in
the Git Bash environment on Windows, the objects actually have empty
data on that platform.  The test still passes because it doesn't depend
on the length of the random data, and because the non-zero exit code
caused by the missing uuidgen(1) command is ignored as it appears earlier
in a pipeline whose final command (our lfstest-testutils program)
always succeeds.

We therefore revise this "prune verify large numbers of refs" test
(which was added in commit dd19c1b of
PR git-lfs#1861) to call our new lfstest-genrandom program instead of the
uuidgen(1) command.  We pass the --base64 flag to our program and request
a slightly larger size for the output Base64-encoded string than the
length of a UUID, just to ensure we are using a similar amount of random
data as before, although this doesn't actually affect the success or
failure of the test.

We also adjust the test so that it does not create the Git LFS objects
with sizes that do not match the length of their arbitrary content.
(Again, the test passed despite this inconsistency in its test data.)
In commit 96775c6 of PR git-lfs#2929 the
"migrate import (--object-map)" test was added to the test suite for
the "git lfs migrate import" command to validate the --object-map option
for that command, which was introduced in the same PR.  This test
makes use of the mktemp(1) shell command to generate a unique temporary
directory, into which the test then writes several files including
the Git LFS migration's object map file.

Likewise, in commit 7632a20 of PR git-lfs#3084,
a similar "migrate export (--object-map)" test was added to the test
suite for the "git lfs migrate export" command, which was introduced
in that PR, and this test also uses the mktemp(1) shell command to
create a unique temporary directory.

However, neither test removes these temporary directories after
completion, so they remain in place indefinitely in whatever system
location is used by the mktemp(1) command.  This is in addition to the
temporary directory created by the mktemp(1) command which is used
throughout all our test suites, whose path we store in the
GIT_LFS_TEST_DIR variable so it can be reused by all test suites after
the first one to run.  (At least, that is the intent of the variable,
but in practice, because we do not export it, and because each test
script runs in its own shell session, each script creates another
temporary directory.  We will address this issue in a subsequent PR.)

In a previous commit in this PR we introduced a new lfstest-genrandom
utility program for our test suite, so we can now make use of it to
replace both instances where our migration tests make an extra call
to the mktemp(1) shell command (beyond the one used to populate the
GIT_LFS_TEST_DIR variable).

In both tests, we now create a unique directory whose path begins
with the path in the GIT_LFS_TEST_DIR variable, followed by a directory
name partially generated by our lfstest-genrandom program.  We pass
the --base64url flag to the program ensure it returns only characters
safe to use in a file or directory name on any OS platform.
In commit cc41ca5 we changed the
template used for our call to mktemp(1) to add a suffix of "XXXXXX"
in order to comply with the requirements of the GNU version of that
command, which expects at least 3 "X" characters in a template
for a temporary file or directory name.

However, on macOS, the BSD version of mktemp(1) treats these
as literal "X" characters if the -t option is used, because in
this case the following argument is interpreted as a prefix,
not a template.  Thus we actually create temporary directories
on macOS of the form /var/folders/.../git-lfs_TEMP.XXXXXX.<random>,
with 8 random characters appended by the command.

Since the "X" characters are unnecessary in this case, we adjust
our definition of the TEMPDIR_PREFIX variable to leave them off
when our tests are running on macOS.
@chrisd8088 chrisd8088 merged commit 2c3dbea into git-lfs:main Sep 27, 2024
@chrisd8088 chrisd8088 deleted the test-updates-genrandom branch September 27, 2024 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants