-
Notifications
You must be signed in to change notification settings - Fork 38.6k
doc: Clarify comments about endianness after #30526 #31596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31596. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. |
|
ACK 13464c0 |
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving these!
Since we can’t usually change the behavior, it’s crucial to share tribal knowledge.
I have a few suggestions for reducing duplication, simplifying sentence structure, and applying these changes to similar places.
However, I understand if you prefer to limit the scope
src/rpc/mining.cpp
Outdated
| {RPCResult::Type::STR_HEX, "data", "transaction data encoded in hexadecimal (byte-for-byte)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "txid hash (excluding witness data) with hash bytes reversed, encoded as hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documenting txid as "it's the txid hash" is a bit redundant and confusing.
In https://github.com/bitcoin/bitcoin/blob/master/src/rpc/mempool.cpp#L135 (and a few other places) the description is "The transaction hash in hex".
Could we unify the rest as well to something simple like
| {RPCResult::Type::STR_HEX, "txid", "txid hash (excluding witness data) with hash bytes reversed, encoded as hex"}, | |
| {RPCResult::Type::STR_HEX, "txid", "transaction id hash (without witness) shown in byte-reversed hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Thanks, I like "shown in byte-reversed hex" as a shorter alternative, and tried to incorporate your other ideas while not sounding ambiguous. If reviewers are happy with the new phrasing I'll add new commit changing other RPCs to use it. The other RPCs do seem ok as-is because they don't mention "little-endian hexadecimal", but it's good to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Could we unify the rest as well
To follow up, I looked at occurrences of "txid", "wtxid", and "hash" (quoted strings) in the RPC code and found dozens of fields that could be updated to be consistent. It seems more challenging than I expected to update all the documentation, and now I'm also wondering if might actually be harmful to mention reversed bytes in some cases but not others, because this might give the wrong impression that bytes are not always reversed for every hash field.
I wonder if maybe a better approach to provide more consistency could be to introduce new RPC types like STR_TXID, and STR_WTXID and use them instead of the more generic STR_HEX type, so specific documentation about the meanings of these hashes could be incorporated into the type documentation and not repeated in field descriptions.
This is a documentation-only change following up on suggestions made in the bitcoin#30526 review. Motivation for this change is that I was recently reviewing bitcoin#31583, which reminded me how confusing the arithmetic blob code was and made me want to write better comments.
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 13464c0 -> 3e0a992 (pr/docblob.1 -> pr/docblob.2, compare) with suggestions
Thanks for the reviews!
src/rpc/mining.cpp
Outdated
| {RPCResult::Type::STR_HEX, "data", "transaction data encoded in hexadecimal (byte-for-byte)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "txid hash (excluding witness data) with hash bytes reversed, encoded as hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Thanks, I like "shown in byte-reversed hex" as a shorter alternative, and tried to incorporate your other ideas while not sounding ambiguous. If reviewers are happy with the new phrasing I'll add new commit changing other RPCs to use it. The other RPCs do seem ok as-is because they don't mention "little-endian hexadecimal", but it's good to be consistent.
| /** Get the hash of these transactions' wtxids, concatenated in lexicographical order (treating the | ||
| * wtxids as little endian encoded uint256, smallest to largest). */ | ||
| /** Get the hash of the concatenated wtxids of transactions, with wtxids | ||
| * treated as a little-endian numbers and sorted in ascending numeric order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * treated as a little-endian numbers and sorted in ascending numeric order. | |
| * treated as little-endian numbers and sorted in ascending numeric order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Thanks, fixed in a followup branch.
sedited
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 3e0a992
| {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "transaction hash excluding witness data, shown in byte-reversed hex"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "transaction hash including witness data, shown in byte-reversed hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Call it hexadecimal instead of hex like done in the "data" field for better consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Nit: Call it
hexadecimalinstead ofhexlike done in the"data"field for better consistency?
I would probably prefer to go the other way since the term "hex" occurs more frequently than "hexadecimal" in RPC code and documentation (115 vs 11 times). I'm hoping we might be able to improve consistency by moving this information from descriptions of RPC fields to descriptions of RPC types and left another comment about this.
Sjors
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 3e0a992
BrandonOdiwuor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ACK 3e0a992
|
ACK 3e0a992 |
|
There's still a typo there, could we fix it before merging? Otherwise I'd ACK as well. |
Followups from bitcoin#31596 - Fix typo in packages.h that makes comment harder to understand - Refer explicitly to "wtxid" in description of relevant hash field so readers can look up the term, and update "txid" description as well to be consistent. - Uses dashes consistently when writing "little-endian", "most-significant", etc.
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews. Since there are more followups that could be made here, I created a branch to keep track of them. I don't want to create too much churn with documentation PRs so I won't open another one right away, but I can implement any suggestions made here, open a pr if requested, or wait for a related PR to be opened to attach these changes to.
| /** Get the hash of these transactions' wtxids, concatenated in lexicographical order (treating the | ||
| * wtxids as little endian encoded uint256, smallest to largest). */ | ||
| /** Get the hash of the concatenated wtxids of transactions, with wtxids | ||
| * treated as a little-endian numbers and sorted in ascending numeric order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Thanks, fixed in a followup branch.
| {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "transaction hash excluding witness data, shown in byte-reversed hex"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "transaction hash including witness data, shown in byte-reversed hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Nit: Call it
hexadecimalinstead ofhexlike done in the"data"field for better consistency?
I would probably prefer to go the other way since the term "hex" occurs more frequently than "hexadecimal" in RPC code and documentation (115 vs 11 times). I'm hoping we might be able to improve consistency by moving this information from descriptions of RPC fields to descriptions of RPC types and left another comment about this.
src/rpc/mining.cpp
Outdated
| {RPCResult::Type::STR_HEX, "data", "transaction data encoded in hexadecimal (byte-for-byte)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"}, | ||
| {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"}, | ||
| {RPCResult::Type::STR_HEX, "txid", "txid hash (excluding witness data) with hash bytes reversed, encoded as hex"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #31596 (comment)
Could we unify the rest as well
To follow up, I looked at occurrences of "txid", "wtxid", and "hash" (quoted strings) in the RPC code and found dozens of fields that could be updated to be consistent. It seems more challenging than I expected to update all the documentation, and now I'm also wondering if might actually be harmful to mention reversed bytes in some cases but not others, because this might give the wrong impression that bytes are not always reversed for every hash field.
I wonder if maybe a better approach to provide more consistency could be to introduce new RPC types like STR_TXID, and STR_WTXID and use them instead of the more generic STR_HEX type, so specific documentation about the meanings of these hashes could be incorporated into the type documentation and not repeated in field descriptions.
This is a documentation-only change following up on suggestions made in the #30526 review.
Motivation for this change is that I was recently reviewing #31583, which reminded me how confusing the arithmetic blob code was and made me want to write better comments.