@bk2204 bk2204 commented Apr 27, 2022

Right now, we provide signed SHA-256 hashes for our releases. This is fine and sufficient, and also cryptographically secure. However, many distributors use other algorithms, and it would be convenient if we could provide easy access to those hashes as well. For example, NetBSD uses SHA-512 and BLAKE2s.

Let's add an additional file, hashes.asc, which contains a general set of hashes in the BSD format. The advantage of the BSD format over the traditional GNU format is that each line names its hash algorithm, which allows us to distinguish between hashes of the same length, such as SHA-256, SHA-512/256, and SHA3-256. It is generated by shasum, sha*sum, sha3sum, and b2sum with the --tag option, and all of these programs accept it for verification with no problems.
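To make the difference between the two formats concrete, here is a minimal Python sketch (not the script added in this PR). The asset name `asset.tar.gz` and the helper names are hypothetical, and the algorithm labels are assumed to match what the `--tag` option of the tools above prints.

```python
import hashlib


def gnu_line(hexdigest, filename):
    # Traditional GNU format: digest, two spaces, filename.
    # Nothing on the line says which algorithm produced the digest.
    return f"{hexdigest}  {filename}"


def bsd_line(label, hexdigest, filename):
    # BSD (tagged) format: the algorithm label comes first.
    return f"{label} ({filename}) = {hexdigest}"


data = b"example release asset\n"  # stand-in for a real asset's contents
sha256 = hashlib.sha256(data).hexdigest()
sha3_256 = hashlib.sha3_256(data).hexdigest()

# Both digests are 64 hex characters, so the GNU lines look alike:
print(gnu_line(sha256, "asset.tar.gz"))
print(gnu_line(sha3_256, "asset.tar.gz"))

# The BSD lines carry the label (assumed spelling), so they can share one file:
print(bsd_line("SHA256", sha256, "asset.tar.gz"))
print(bsd_line("SHA3-256", sha3_256, "asset.tar.gz"))
```

Because every line is self-describing, one file can hold entries for several algorithms without ambiguity.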

Using the BSD format means that we need only provide one additional file with all the additional algorithms. There is therefore no need to add multiple new files, and if we desire to add additional algorithms in the future, that's easily done without adding more files.

I don't love the name hashes and would prefer something different, but I am, as usual, bad at naming things, so suggestions are welcome.

Fixes #4937

bk2204 added 3 commits April 27, 2022 20:10
Right now, we provide signed SHA-256 hashes for our releases.  This is
fine and sufficient, and also cryptographically secure.  However, many
distributors use other algorithms, and it would be convenient if we
could provide easy access to those hashes as well.  For example, NetBSD
uses SHA-512 and BLAKE2s.

Let's add a script to hash files with various algorithms and output them
in the BSD format.  The advantage of the BSD format over the traditional
GNU format is that it includes the hash algorithm, which allows us to
distinguish between hashes of the same length, such as SHA-256,
SHA-512/256, and SHA3-256.  It is generated by shasum, sha*sum, sha3sum,
and b2sum with the --tag option, and all of these programs accept it for
verification with no problems.

Using the BSD format means that we need only provide one additional file
with all the additional algorithms.  There is therefore no need to add
multiple new files, and if we desire to add additional algorithms in the
future, that's easily done without adding more files.

For aesthetics, we sort first by hash name and then by filename in the
output.  Unlike sorting with `sort`, this keeps the SHA-2 and SHA-3
algorithms separate instead of interspersing them, which aids in
reading.  Add some comments because the algorithm, while logical,
is somewhat subtle.
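
The ordering described above can be sketched as follows. This is an illustrative Python version, not the script from this commit; the labels, the family ranking, and the function names are assumptions made for the example.

```python
def family_rank(label):
    # Assumed grouping: SHA-2 first, then SHA-3, then everything else.
    # The "SHA3-" check must come first, since those labels also start with "SHA".
    if label.startswith("SHA3-"):
        return 1
    if label.startswith("SHA"):
        return 0
    return 2


def sort_entries(entries):
    # entries are (label, filename) pairs; sort by family, label, then filename.
    return sorted(entries, key=lambda e: (family_rank(e[0]), e[0], e[1]))


labels = ["SHA512", "SHA3-256", "SHA256", "SHA3-512"]

# A plain lexicographic sort intersperses the two families:
print(sorted(labels))

# Sorting by family first keeps SHA-2 and SHA-3 apart:
print([label for label, _ in sort_entries([(l, "asset") for l in labels])])
```

A byte-wise sort compares `SHA3-256` against `SHA512` character by character, so `3` sorts before `5` and the SHA-3 entries land between the SHA-2 ones; ranking by family avoids that.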
Right now, we provide signed SHA-256 hashes for our releases.  This is
fine and sufficient, and also cryptographically secure.  However, many
distributors use other algorithms, and it would be convenient if we
could provide easy access to those hashes as well.  For example, NetBSD
uses SHA-512 and BLAKE2s.

Let's add an additional file, hashes.asc, which contains a general set
of hashes in the BSD format. The advantage of the BSD format over the
traditional GNU format is that it includes the hash algorithm, which
allows us to distinguish between hashes of the same length, such as
SHA-256, SHA-512/256, and SHA3-256.  It is generated by shasum, sha*sum,
sha3sum, and b2sum with the --tag option, and all of these programs
accept it for verification with no problems.

Using the BSD format means that we need only provide one additional file
with all the additional algorithms.  There is therefore no need to add
multiple new files, and if we desire to add additional algorithms in the
future, that's easily done without adding more files.

If the user has sha3sum (which comes from Perl's Digest::SHA3) or b2sum
(part of GNU coreutils), then we use them to verify our hashes.  There
are no known commands available on a typical Linux system to verify
BLAKE2s, but we assume that if OpenSSL and our Ruby script correctly
generated the SHA-2 entries, then they also generated the other hashes
correctly.
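
For readers who want to check entries that no common command-line tool covers, such as BLAKE2s, a tagged line can also be verified with Python's standard `hashlib`. This is a rough sketch and not part of this change; the label-to-algorithm mapping, the regular expression, and the function names are assumptions.

```python
import hashlib
import re

# Assumed mapping from BSD-format labels to hashlib constructor names.
HASHLIB_NAMES = {
    "SHA256": "sha256",
    "SHA512": "sha512",
    "SHA3-256": "sha3_256",
    "SHA3-512": "sha3_512",
    "BLAKE2b": "blake2b",
    "BLAKE2s": "blake2s",
}

TAG_LINE = re.compile(r"^(?P<label>\S+) \((?P<name>.+)\) = (?P<digest>[0-9a-f]+)$")


def verify_line(line, read_file):
    """Return True or False for a known algorithm, None for an unknown label.

    read_file is a callable that maps a filename to its bytes.
    """
    match = TAG_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a BSD-format hash line: {line!r}")
    algorithm = HASHLIB_NAMES.get(match["label"])
    if algorithm is None:
        return None  # label not handled by this sketch
    actual = getattr(hashlib, algorithm)(read_file(match["name"])).hexdigest()
    return actual == match["digest"]
```

Passing the file reader in as a callable keeps the sketch independent of where the release assets are stored.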

Since we must now run a script that lives inside the repository, we need
to know where that script is located.  We therefore use git to find the
root of the repository, which means the release script must now be run
from within the repository.  Since this script is only run by Git LFS
core team members or the CI system when doing releases, this is not
expected to be an issue.
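
Locating the repository root with git can be sketched like this in Python. `git rev-parse --show-toplevel` is a standard git command; the function name and the injectable `run` parameter are illustrative choices, not taken from the release script.

```python
import subprocess
from pathlib import Path


def repo_root(run=subprocess.run):
    # Ask git for the top-level directory of the current working tree.
    # With check=True this raises CalledProcessError outside a repository.
    result = run(
        ["git", "rev-parse", "--show-toplevel"],
        check=True,
        capture_output=True,
        text=True,
    )
    return Path(result.stdout.strip())
```

A helper script can then be addressed relative to the returned path, for example `repo_root() / "script" / "some-helper"`, where the trailing path is a hypothetical placeholder.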
Indicate in the README that additional hashes are available in another
file.
@bk2204 bk2204 marked this pull request as ready for review May 4, 2022 19:16
@bk2204 bk2204 requested a review from a team as a code owner May 4, 2022 19:16
@chrisd8088 chrisd8088 changed the title Multple hash support Multiple hash support May 4, 2022
@chrisd8088 chrisd8088 left a comment

LGTM!

@bk2204 bk2204 merged commit 6613e65 into git-lfs:main May 5, 2022
@bk2204 bk2204 deleted the multihashes branch May 5, 2022 13:11

Development

Successfully merging this pull request may close these issues.

Include a larger set of hashes with release assets
