-
Notifications
You must be signed in to change notification settings - Fork 38.8k
build: Use git archive as source tarball #18331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Making SOURCEDIST deterministic is needless since a git archive is used as the source tarball.
131ad5a to
f05f8a4
Compare
|
Thank you for picking this up. It's nice that this mostly removes lines, it's a cleanup too. |
| mkdir -p temp | ||
| pushd temp | ||
| tar -xf ../$SOURCEDIST | ||
| find bitcoin-* | sort | tar --mtime="$REFERENCE_DATETIME" --no-recursion --mode='u+rw,go+r-w,a+X' --owner=0 --group=0 -c -T - | gzip -9n > ../$SOURCEDIST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is not needed for the git archive output?
That's nice!
Oh, you mention this in the OP. I cannot find anything about this in the documentation, but looking at the output of git archive:
- Files are sorted in alphabetic order
- Files are owned by root/root
- Permissions seem to be consistent:
-rw-rw-r-- for normal files
drwxrwxr-x for directories
-rwxrwxr-x for executable scripts
- The timestamp seems to be the same for all the files, but I don't know where it comes from. Oh it's the "CommitDate" of the last commit!
-rw-rw-r-- root/root 4705 2020-03-11 16:45 .appveyor.yml$ git show --pretty=fuller HEAD
commit 24a727e23e4143bcc4e5dfac536bae8d8261d32a (HEAD -> master)
Merge: 3b21a7863411b4b0391313e765f23b2664d4540a faae5a9a356d821f0cbdea32030b0ce356351a1d
Author: Wladimir J. van der Laan <[email protected]>
AuthorDate: Wed Mar 11 16:32:59 2020 +0100
Commit: Wladimir J. van der Laan <[email protected]>
CommitDate: Wed Mar 11 16:45:47 2020 +0100So yes, that looks deterministic to me (assuming the timezone doesn't have a part in it, but tar stores UNIX timestamps so probably not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The timestamp seems to be the same for all the files, but I don't know where it comes from. Oh it's the "CommitDate" of the last commit!
Yes. From git archive docs:
git archivebehaves differently when given a tree ID versus when given a commit ID or tag ID. In the first case the current time is used as the modification time of each file in the archive. In the latter case the commit time as recorded in the referenced commit object is used instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this is the same for the mtime in the global header.
It looks like the generated Oh, this is also mentioned in the manpage:
|
Yes. From
EDIT: Sorry, slow typing... |
|
ACK 86d9fae 🤞 Show signature and timestampSignature: Timestamp of file with hash |
f05f8a4 to
86d9fae
Compare
|
Updated f05f8a4 -> 86d9fae (pr18331.01 -> pr18331.02, compare):
|
Gitian builds
|
86d9fae to
c3c8d3c
Compare
|
Updated 86d9fae -> c3c8d3c (pr18331.02 -> pr18331.03, compare):
|
Some EXTRA_DIST content is needless since a git archive is used as the source tarball.
c3c8d3c to
e4d3667
Compare
|
Updated c3c8d3c -> e4d3667 (pr18331.03 -> pr18331.04, diff; diff 18331.02..pr18331.04):
|
This seems like a real concern. Does this PR now require running autogen when it wasn't required before? If so, it seems like it should have release notes and documentation updates. |
For a major version bump they'd need to adjust the build dependencies anyway (e.g. remove openssl and boost chrono, ...). I don't really see how adding autotools is going to be painful. Though, if it really is that painful, we could run it in the gitian vm and append the resulting configure script to the archive.
We don't really have any documentation about our release tarball, so there is nothing to update I think. I am happy to leave a note in the release notes. |
I probably should familiarize myself with documentation more. I wrote my comment above assuming this PR required a new install step that was optional before, so installation instructions would be updated. I don't think we should hesitate to write release notes for user visible changes. PR descriptions are often unhelpful because they focus on implementation details, while release notes force you to summarzie what the actual effect of a change is (it's still unclear here what files this PR removes from the tarball). |
I'm not complaining about forcing downstreams to update. I'm complaining about the contents of that update. They're going to have to add a pile of new build-time dependencies (autotools dev packages), which may or may not pose further complications. |
|
Release notes added in #18439. |
|
@ryanofsky Our only instruction of building are here: https://github.com/bitcoin/bitcoin/blob/master/doc/build-unix.md#to-build . The first step for them is ./autogen.sh (so nothing changed for them). There is also a source code download link on https://bitcoincore.org/en/download/ . This is what is going to change after this pull request. None of our own packaging scripts should change as an effect of that. They either use the static binaries or the git archive already https://github.com/bitcoin-core/packaging/blob/2c1d6a4cc67411beb3e035d126b56bef3cf7e1f1/debian/rules#L30 I can't comment on how hard it will be for other third party package maintainers to run autogen, but at least the debian one has a full copy of the git repo: https://sources.debian.org/src/bitcoin/0.18.1%7Edfsg-1/ . So I am not sure if there is anyone who depends on this targz and depends on this targz having the configure script included. My opinion is to wait for complaints before changing things again... |
Are there particular operating systems / distros you're concerned about here? I haven't had the most autotools experience in the world, but I haven't personally run into cases where earlier Or maybe there are systems where autotools and m4 are flat out broken or unavailable? |
|
I'm not aware of any. Practically, this will likely simply mean that maintainers add deps to a list and move on. But at a higher level, we can offer a pre-processed source tarball which eases the burden on the builder (as well as adding improving determinism by pegging build tools at tag time) or we could not. After this change, our source tarball is simply a shortcut to a git clone and checkout. Maybe that's ok, but it just doesn't seem worth the hassle of finding out to me. But sure, now that it's in, we can just wait and see if it causes any real-world issues at release time. |
|
We couldn't keep up listing all needed files (or even establish guidelines about what files are needed) reliably, so this was the primary reason to change the source tarball. If the missing configure script is a burden or causes issues otherwise, what about appending it to the git archive after running autogen in the gitian vm? This seems to solve both concerns? |
This is exactly what gitian did before this change. It's more than just 'configure' though, 'make dist' appends all bootstrapped files to the source tarball. It sounds like you would just want an additional step of adding missing git files to that tarball. Edit: I think we're saying the same thing now, we just have different ideas of how to accomplish it. |
e4d3667 build: Drop needless EXTRA_DIST content (Hennadii Stepanov) 6c4da59 build: Drop SOURCEDIST reordering (Hennadii Stepanov) 5e6b8b3 build: Use git archive as source tarball (Hennadii Stepanov) Pull request description: This PR: - is an alternative to bitcoin#17104 - closes bitcoin#16734 - closes bitcoin#6753 The idea is clear described by some developers: - [MarcoFalke](bitcoin#17097 (comment)): > This whole concept of explicitly listing each and every file manually (or with a fragile wildcard) is an obvious sisyphean task. I'd say all we need to do is run git archive and be done with it forever, see bitcoin#16734, bitcoin#6753, bitcoin#11530 ... - [laanwj](bitcoin#17097 (comment)): > I agree, I've never been a fan of it. I don't think we have any files in the git repository we don't want to ship in the source tarball. --- The suggested changes have a downside which is pointed by [**luke-jr**](bitcoin#17104 (comment)): > ... but the distfile needs to include autogen-generated files. This means that a user is not able to run `./configure && make` right away. One must run `./autogen.sh` at first. Here are opinions about mandatory use of `./autogen.sh`: - [ryanofsky](bitcoin#16734 (comment)): > It's probably ok to require autogen. I think historically configure scripts were supposed to work on obscure unix systems that would just have a generic shell + make tool + c compiler, and not necessarily need gnu packages like m4 which are needed for autogen. - [laanwj](bitcoin#16734 (comment)): > I also think it's fine to require autogen. What is one dependency more, if you're building from source. --- ~Also this PR provides Windows users with ZIP archives of the sources. Additionally the commit ID is stored in these ZIP files as a file comment:~ --- Note for reviewers: please verify is `git archive` output deterministic? ACKs for top commit: MarcoFalke: re-ACK e4d3667, only change is adding two dots in a the path 🛳 laanwj: ACK e4d3667 Tree-SHA512: d1153d3ca4a580696019b92be3555ab004d197d9a2146aacff9d3150eb7093b7d40eebd6eea12d861d93ff62d62b68706e04e64dbe5ea796ff6757486e462193
2aa48ed refactor: Drop unused ${WRAP_DIR}/${HOST} directory (Hennadii Stepanov) 1362be0 build: Drop make dist in gitian builds (Hennadii Stepanov) Pull request description: After the merge of #18331, the packaged source tarball is created by `git archive`, but the binaries are built from another one which is made by `make dist`. With this PR the only source tarball, created by `git archive`, is used both for binaries building and for packaging to users. Close #16588. Close #18547. As a good side-effect, #18349 becomes redundant. **Change in behavior** The following variables https://github.com/bitcoin/bitcoin/blob/1b151e3ffce7c1a2ee46bf280cc1d96775d1f91e/configure.ac#L2-L6 are no longer used for naming of directories and tarballs. Instead of them the gitian descriptors use a git tag (if available) or a commit hash. --- Also a small refactor commit picked from #18404. ACKs for top commit: dongcarl: ACK 2aa48ed MarcoFalke: ACK 2aa48ed fanquake: ACK 2aa48ed - I've had a quick look over this, and don't want to block merging if this actually gets as closer to finally having this all sorted out. Obviously we've still got #18741, and after speaking to Carl this morning, there will likely be even more changes after that (not Guix specific). Tree-SHA512: d3b16f87e48d1790a3264940c28acd5d881bfd10f3ce94fb0c8a6af76d8039289d01e0cd4972adac49ae24362857251f6c1e5e09e3e9fbf636c10708b4015a7c
2aa48ed refactor: Drop unused ${WRAP_DIR}/${HOST} directory (Hennadii Stepanov) 1362be0 build: Drop make dist in gitian builds (Hennadii Stepanov) Pull request description: After the merge of bitcoin#18331, the packaged source tarball is created by `git archive`, but the binaries are built from another one which is made by `make dist`. With this PR the only source tarball, created by `git archive`, is used both for binaries building and for packaging to users. Close bitcoin#16588. Close bitcoin#18547. As a good side-effect, bitcoin#18349 becomes redundant. **Change in behavior** The following variables https://github.com/bitcoin/bitcoin/blob/1b151e3ffce7c1a2ee46bf280cc1d96775d1f91e/configure.ac#L2-L6 are no longer used for naming of directories and tarballs. Instead of them the gitian descriptors use a git tag (if available) or a commit hash. --- Also a small refactor commit picked from bitcoin#18404. ACKs for top commit: dongcarl: ACK 2aa48ed MarcoFalke: ACK 2aa48ed fanquake: ACK 2aa48ed - I've had a quick look over this, and don't want to block merging if this actually gets as closer to finally having this all sorted out. Obviously we've still got bitcoin#18741, and after speaking to Carl this morning, there will likely be even more changes after that (not Guix specific). Tree-SHA512: d3b16f87e48d1790a3264940c28acd5d881bfd10f3ce94fb0c8a6af76d8039289d01e0cd4972adac49ae24362857251f6c1e5e09e3e9fbf636c10708b4015a7c
f1f63ac doc: Remove outdated comment (Hennadii Stepanov) Pull request description: The removed commit has been wrong [since v0.20.0](bitcoin/bitcoin#18331). ACKs for top commit: fanquake: ACK - f1f63ac Tree-SHA512: ef6191fef389fa0ee5e6cf224e4990a1804aeefd1c3e9d9a4870cf46e1833fbb8701c379b6ce4e13caa02ae2f4f86778fa2c1e994c89392c08fcf01701482d7a
e4d3667 build: Drop needless EXTRA_DIST content (Hennadii Stepanov) 6c4da59 build: Drop SOURCEDIST reordering (Hennadii Stepanov) 5e6b8b3 build: Use git archive as source tarball (Hennadii Stepanov) Pull request description: This PR: - is an alternative to bitcoin#17104 - closes bitcoin#16734 - closes dashpay#6753 The idea is clear described by some developers: - [MarcoFalke](bitcoin#17097 (comment)): > This whole concept of explicitly listing each and every file manually (or with a fragile wildcard) is an obvious sisyphean task. I'd say all we need to do is run git archive and be done with it forever, see bitcoin#16734, dashpay#6753, bitcoin#11530 ... - [laanwj](bitcoin#17097 (comment)): > I agree, I've never been a fan of it. I don't think we have any files in the git repository we don't want to ship in the source tarball. --- The suggested changes have a downside which is pointed by [**luke-jr**](bitcoin#17104 (comment)): > ... but the distfile needs to include autogen-generated files. This means that a user is not able to run `./configure && make` right away. One must run `./autogen.sh` at first. Here are opinions about mandatory use of `./autogen.sh`: - [ryanofsky](bitcoin#16734 (comment)): > It's probably ok to require autogen. I think historically configure scripts were supposed to work on obscure unix systems that would just have a generic shell + make tool + c compiler, and not necessarily need gnu packages like m4 which are needed for autogen. - [laanwj](bitcoin#16734 (comment)): > I also think it's fine to require autogen. What is one dependency more, if you're building from source. --- ~Also this PR provides Windows users with ZIP archives of the sources. Additionally the commit ID is stored in these ZIP files as a file comment:~ --- Note for reviewers: please verify is `git archive` output deterministic? ACKs for top commit: MarcoFalke: re-ACK e4d3667, only change is adding two dots in a the path 🛳 laanwj: ACK e4d3667 Tree-SHA512: d1153d3ca4a580696019b92be3555ab004d197d9a2146aacff9d3150eb7093b7d40eebd6eea12d861d93ff62d62b68706e04e64dbe5ea796ff6757486e462193
2aa48ed refactor: Drop unused ${WRAP_DIR}/${HOST} directory (Hennadii Stepanov) 1362be0 build: Drop make dist in gitian builds (Hennadii Stepanov) Pull request description: After the merge of bitcoin#18331, the packaged source tarball is created by `git archive`, but the binaries are built from another one which is made by `make dist`. With this PR the only source tarball, created by `git archive`, is used both for binaries building and for packaging to users. Close bitcoin#16588. Close bitcoin#18547. As a good side-effect, bitcoin#18349 becomes redundant. **Change in behavior** The following variables https://github.com/bitcoin/bitcoin/blob/1b151e3ffce7c1a2ee46bf280cc1d96775d1f91e/configure.ac#L2-L6 are no longer used for naming of directories and tarballs. Instead of them the gitian descriptors use a git tag (if available) or a commit hash. --- Also a small refactor commit picked from bitcoin#18404. ACKs for top commit: dongcarl: ACK 2aa48ed MarcoFalke: ACK 2aa48ed fanquake: ACK 2aa48ed - I've had a quick look over this, and don't want to block merging if this actually gets as closer to finally having this all sorted out. Obviously we've still got bitcoin#18741, and after speaking to Carl this morning, there will likely be even more changes after that (not Guix specific). Tree-SHA512: d3b16f87e48d1790a3264940c28acd5d881bfd10f3ce94fb0c8a6af76d8039289d01e0cd4972adac49ae24362857251f6c1e5e09e3e9fbf636c10708b4015a7c
e4d3667 build: Drop needless EXTRA_DIST content (Hennadii Stepanov) 6c4da59 build: Drop SOURCEDIST reordering (Hennadii Stepanov) 5e6b8b3 build: Use git archive as source tarball (Hennadii Stepanov) Pull request description: This PR: - is an alternative to bitcoin#17104 - closes bitcoin#16734 - closes dashpay#6753 The idea is clear described by some developers: - [MarcoFalke](bitcoin#17097 (comment)): > This whole concept of explicitly listing each and every file manually (or with a fragile wildcard) is an obvious sisyphean task. I'd say all we need to do is run git archive and be done with it forever, see bitcoin#16734, dashpay#6753, bitcoin#11530 ... - [laanwj](bitcoin#17097 (comment)): > I agree, I've never been a fan of it. I don't think we have any files in the git repository we don't want to ship in the source tarball. --- The suggested changes have a downside which is pointed by [**luke-jr**](bitcoin#17104 (comment)): > ... but the distfile needs to include autogen-generated files. This means that a user is not able to run `./configure && make` right away. One must run `./autogen.sh` at first. Here are opinions about mandatory use of `./autogen.sh`: - [ryanofsky](bitcoin#16734 (comment)): > It's probably ok to require autogen. I think historically configure scripts were supposed to work on obscure unix systems that would just have a generic shell + make tool + c compiler, and not necessarily need gnu packages like m4 which are needed for autogen. - [laanwj](bitcoin#16734 (comment)): > I also think it's fine to require autogen. What is one dependency more, if you're building from source. --- ~Also this PR provides Windows users with ZIP archives of the sources. Additionally the commit ID is stored in these ZIP files as a file comment:~ --- Note for reviewers: please verify is `git archive` output deterministic? ACKs for top commit: MarcoFalke: re-ACK e4d3667, only change is adding two dots in a the path 🛳 laanwj: ACK e4d3667 Tree-SHA512: d1153d3ca4a580696019b92be3555ab004d197d9a2146aacff9d3150eb7093b7d40eebd6eea12d861d93ff62d62b68706e04e64dbe5ea796ff6757486e462193
2aa48ed refactor: Drop unused ${WRAP_DIR}/${HOST} directory (Hennadii Stepanov) 1362be0 build: Drop make dist in gitian builds (Hennadii Stepanov) Pull request description: After the merge of bitcoin#18331, the packaged source tarball is created by `git archive`, but the binaries are built from another one which is made by `make dist`. With this PR the only source tarball, created by `git archive`, is used both for binaries building and for packaging to users. Close bitcoin#16588. Close bitcoin#18547. As a good side-effect, bitcoin#18349 becomes redundant. **Change in behavior** The following variables https://github.com/bitcoin/bitcoin/blob/1b151e3ffce7c1a2ee46bf280cc1d96775d1f91e/configure.ac#L2-L6 are no longer used for naming of directories and tarballs. Instead of them the gitian descriptors use a git tag (if available) or a commit hash. --- Also a small refactor commit picked from bitcoin#18404. ACKs for top commit: dongcarl: ACK 2aa48ed MarcoFalke: ACK 2aa48ed fanquake: ACK 2aa48ed - I've had a quick look over this, and don't want to block merging if this actually gets as closer to finally having this all sorted out. Obviously we've still got bitcoin#18741, and after speaking to Carl this morning, there will likely be even more changes after that (not Guix specific). Tree-SHA512: d3b16f87e48d1790a3264940c28acd5d881bfd10f3ce94fb0c8a6af76d8039289d01e0cd4972adac49ae24362857251f6c1e5e09e3e9fbf636c10708b4015a7c
This PR:
make distusesgit archive#17104The idea is clear described by some developers:
The suggested changes have a downside which is pointed by luke-jr:
This means that a user is not able to run
./configure && makeright away. One must run./autogen.shat first.Here are opinions about mandatory use of
./autogen.sh:Also this PR provides Windows users with ZIP archives of the sources. Additionally the commit ID is stored in these ZIP files as a file comment:Note for reviewers: please verify is
git archiveoutput deterministic?