-
Notifications
You must be signed in to change notification settings - Fork 400
doc: add document describing Taproot sighashes #933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doc: add document describing Taproot sighashes #933
Conversation
b4792d3 to
a7de102
Compare
doc/taproot-sighash.mediawiki
Outdated
| *** ''sha_single_output'' (32): the SHA256 of the corresponding output in <code>CTxOut</code> format. | ||
| *** '''''NEW''''' ''sha_single_output_witness'' (32): the SHA256 of the serialization of the corresponding output witnesses (rangeproof and surjection proof) in <code>CTxWitness</code> format. | ||
|
|
||
| The total length of ''SigMsg()'' is at most ''491'' bytes. Note that this does not include the size of sub-hashes such as ''sha_prevouts'', which may be cached across signatures of the same transaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general point for discussion. (Fixed length sighash with positions for fields)
- We observed that having a fixed length sighash with a fixed location for which field occurs where is super helpful in creating complex scripts easily.
- Even though we will new opcodes for getting these directly, it might still be useful to have a structured sighash.
- The only fields with variable length are
assetIssuanceandnValue(assetIssuanceinternally uses confidential value serialization). So, everything can be addressed if we fix the serialization of explicit innValue.
Note that we don't need to change the way serialization is done for these fields in p2p network, but only when we do sighash computation.
Downsides:
- more code complexity
- more divergence from core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add zeros to the hash after nNonce nAmount and nAsset such that they'd always total out to 33 bytes. And same for the fields in the issuance.
I don't think this is too much added complexity and it would give you, as you say, fixed-offset fields in the sighash. I can certainly imagine that being useful for a covenant which is trying to fix one piece of the sighash while not touching anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would still have a variable-length encoding based on the sighash flag, and I don't think there's any reasonable way to avoid that (the different flags mean you're hashing different data). But it's easy to fix/check this flag since it appears on the siganture itself, not only inside the sighash.
Do you think we should 0-pad so that the issuance data field is always 130 bytes even if there is no issuance? This adds two full sha2 compressions (at about 200ns/shot on a desktop CPU) and will rarely be nonzero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should compare this overhead to normal signature validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we still want to maintain length index prefixing, we could change the order in which we do the sighash. We could all the input specific things at the end of the sighash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normal signature validation is 50us. So an added .5us is a 1% performance hit.
I'd rather not change the order of the sighash too much to minimize the cognitive load for people coming from Bitcoin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also variable length introduced by the presence/absence of the annex. If you are interested in resolving that, then you should pad the annex hash in the case that there is no annex.
doc/taproot-sighash.mediawiki
Outdated
| *** '''''NEW''''' ''nValue'' (9--33): (possibly confidential) amount of the previous output spent by this input. | ||
| *** ''scriptPubKey'' (35): ''scriptPubKey'' of the previous output spent by this input, serialized as script inside <code>CTxOut</code>. Its size is always 35 bytes. | ||
| *** ''nSequence'' (4): ''nSequence'' of this input. | ||
| *** '''''NEW''''' ''asset_issuance'' (0 or 74--130): if ''outpoint_flag & 0x80 == 0x80'': asset issuance data of this input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be (82--130) instead of (74--130)? The minimum size is 2 explicit values + 2 32 bytes = 18 + 62 = 82
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inflationKeys field can be null, in which case it is one byte.
sanket1729
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support sidechains to sidechains? I was not aware of this. If not, what is parent_pegin_asset and how is that supposed to be used?
|
Yes, we support sidechains to sidechains, and I believe some of our functional tests depend on this fact. |
|
@sanket1729 updated doc to make confidential fields all fixed-length, including issuance which is now always 130 bytes. I don't think we care about 1% perf hit when validating ANYONECANPAY signatures. If we can get a taproot sig in a blockheader, that saves us 10 signatures' worth of validation which means that even if the block has 100 such inputs (unlikely, nobody uses ACP :)) Taproot will still be a net win. |
|
the |
|
Like, between the |
|
Good point. |
3b960ba to
169af2d
Compare
|
Rebased. What needs to be done to get this in? |
|
ACK 169af2d |
|
Removed the input |
d49aab3 to
f18fa5c
Compare
| ** ''nVersion'' (4): the ''nVersion'' of the transaction. | ||
| ** ''nLockTime'' (4): the ''nLockTime'' of the transaction. | ||
| ** If the ''hash_type & 0x80'' does not equal <code>SIGHASH_ANYONECANPAY</code>: | ||
| *** '''''NEW''''' ''sha_outpoint_flags'' (32): the SHA256 of the serialization of the concatenation of one byte per input of the input's outpoint flags shifted right by 24 bits. (The byte for an pegin input would be 0x40. The byte for an issuance input would be 0x80. The byte for both a pegin and issuance would be 0xc0.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should replace this "shifted by 24 bits" language with an explicit list of cases. In both Elements Core and rust-elements, the prevout stored internally does not actually have these bits set; they are extracted during deserialization and recorded separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However you want is fine. I phrased it this way to be forward compatible with new flags that might be added in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that sounds reasonable. Let's keep the existing text.
doc/taproot-sighash.mediawiki
Outdated
| The function ''SigMsg(hash_type, ext_flag)'' computes the message being signed as a byte array. It is implicitly also a function of the spending transaction and the outputs it spends, but these are not listed to keep notation simple. | ||
|
|
||
| The parameter ''hash_type'' is an 8-bit unsigned value. The <code>SIGHASH</code> encodings from the legacy script system are reused, including <code>SIGHASH_ALL</code>, <code>SIGHASH_NONE</code>, <code>SIGHASH_SINGLE</code>, and <code>SIGHASH_ANYONECANPAY</code>, plus the default ''hash_type'' value ''0x00'' which results in signing over the whole transaction just as for <code>SIGHASH_ALL</code>. The following restrictions apply, which cause validation failure if violated: | ||
| * Using any undefined ''hash_type'' (not ''0x00'', ''0x01'', ''0x02'', ''0x03'', ''0x81'', ''0x82'', or ''0x83''<ref>'''Why reject unknown ''hash_type'' values?''' By doing so, it is easier to reason about the worst case amount of signature hashing an implementation with adequate caching must perform.</ref>). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can remove the <ref>...<ref> here and below. Alternative we could add a reference section below, but I don't think that is preferable.
| * Using any undefined ''hash_type'' (not ''0x00'', ''0x01'', ''0x02'', ''0x03'', ''0x81'', ''0x82'', or ''0x83''<ref>'''Why reject unknown ''hash_type'' values?''' By doing so, it is easier to reason about the worst case amount of signature hashing an implementation with adequate caching must perform.</ref>). | ||
| * Using <code>SIGHASH_SINGLE</code> without a "corresponding output" (an output with the same index as the input being verified). | ||
|
|
||
| '''''NEW''''' If the input under consideration is a pegin input, the fields ''nAsset'', ''nValue'' and ''scriptPubKey'', where they appear, are taken from the pegin witness data. The asset used is the asset ID on the sidechain, not that on the parent chain (if any). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is preferable for all these NEW statements to be their own paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'm sure that was my intent but I guess I didn't look at the rendered version too carefully.
doc/taproot-sighash.mediawiki
Outdated
| ** ''nLockTime'' (4): the ''nLockTime'' of the transaction. | ||
| ** If the ''hash_type & 0x80'' does not equal <code>SIGHASH_ANYONECANPAY</code>: | ||
| *** '''''NEW''''' ''sha_outpoint_flags'' (32): the SHA256 of the serialization of the concatenation of one byte per input of the input's outpoint flags shifted right by 24 bits. (The byte for an pegin input would be 0x40. The byte for an issuance input would be 0x80. The byte for both a pegin and issuance would be 0xc0.) | ||
| *** ''sha_prevouts'' (32): the SHA256 of the serialization of all input outpoints. Each prevout output index is the masked version containing outpoint flags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
masked version excluding outpoint flags.
doc/taproot-sighash.mediawiki
Outdated
| ** ''spend_type'' (1): equal to ''(ext_flag * 2) + annex_present'', where ''annex_present'' is 0 if no annex is present, or 1 otherwise (the original witness stack has two or more witness elements, and the first byte of the last element is ''0x50'') | ||
| ** If ''hash_type & 0x80'' equals <code>SIGHASH_ANYONECANPAY</code>: | ||
| *** '''''NEW''''' ''outpoint_flag'' (1): the input's outpoint flags shifted right by 24 bits. (Compare ''sha_outpoint_flags'' above.) | ||
| *** ''outpoint'' (36): the <code>COutPoint</code> of this input (32-byte hash + 4-byte little-endian) where the output index includes the outpoint flags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
output index excludes the outpoint flags.
doc/taproot-sighash.mediawiki
Outdated
| *** ''sha_annex'' (32): the SHA256 of ''(compact_size(size of annex) || annex)'', where ''annex'' includes the mandatory ''0x50'' prefix. | ||
| * Data about this output: | ||
| ** If ''hash_type & 3'' equals <code>SIGHASH_SINGLE</code>: | ||
| *** ''sha_single_output'' (32): the SHA256 of the corresponding output in extended <code>CTxOut</code> format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs a (NEW) before the word extended.
doc/taproot-sighash.mediawiki
Outdated
| *** ''sha_sequences'' (32): the SHA256 of the serialization of all input ''nSequence''. | ||
| *** '''''NEW''''' ''sha_issuances'' (32): the SHA256 of the serialization of the concatenation of all input ''assetIssuance'' or 130 '0x00' bytes for inputs with no issuance | ||
| ** If ''hash_type & 3'' does not equal <code>SIGHASH_NONE</code> or <code>SIGHASH_SINGLE</code>: | ||
| *** ''sha_outputs'' (32): the SHA256 of the serialization of all outputs in extended <code>CTxOut</code> format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs a (NEW) before the word extended.
doc/taproot-sighash.mediawiki
Outdated
|
|
||
| '''''NEW''''' If the input under consideration is a pegin input, the fields ''nAsset'', ''nValue'' and ''scriptPubKey'', where they appear, are taken from the pegin witness data. The asset used is the asset ID on the sidechain, not that on the parent chain (if any). | ||
| '''''NEW''''' The epoch field prepended before signature hash is completely dropped. If there are new updates to the taproot signature hashes in elements, they will use new tagged hashes instead of incrementing epochs | ||
| '''''NEW''''' The fields ''nNonce'', ''nAsset'' and ''nValue'' are serialized in a fixed-length format, which consists of their ordinary 1/9/33 byte encoding followed by sufficiently many 0 to pad the length out to 33. When we refer to ''extended <code>CTxOut</code> format'' below, we mean the fields ''nAsset'', ''nValue'', ''nNonce'' serialized in that order in fixed-length format, followed by the ordinary length-prefixed ''scriptPubKey''. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might prefer a term like "padded CTxOut format" or something rather than "extended". To me "extended" conveys that it has more fields. However it isn't a big deal.
|
Added a commit to address all of @roconnor-blockstream comments |
doc/taproot-sighash.mediawiki
Outdated
| ** If ''hash_type & 0x80'' equals <code>SIGHASH_ANYONECANPAY</code>: | ||
| *** '''''NEW''''' ''outpoint_flag'' (1): the input's outpoint flags shifted right by 24 bits. (Compare ''sha_outpoint_flags'' above.) | ||
| *** ''outpoint'' (36): the <code>COutPoint</code> of this input (32-byte hash + 4-byte little-endian) where the output index excludes the outpoint flags. | ||
| *** '''''NEW''''' ''nAsset'' (33): (possibly confidential) assetID of the previous output spent by this input, in fixed-length format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slight preference for this to read ", in padded format". And below too.
doc/taproot-sighash.mediawiki
Outdated
| *** ''sha_single_output'' (32): the SHA256 of the corresponding output in '''NEW''' padded <code>CTxOut</code> format. | ||
| *** '''''NEW''''' ''sha_single_output_witness'' (32): the SHA256 of the serialization of the corresponding output witnesses (rangeproof and surjection proof) in <code>CTxOutWitness</code> format. | ||
|
|
||
| The total length of ''SigMsg()'' is ''443'' bytes for <code>ANYONECANPAY</code> sighashes, ''366'' bytes for non-<code>ANYONECANPAY</code> sighashes, and both numbers are reduced by 64 bytes for <code>SIGHASH_NONE</code> sighashes. Note that this does not include the size of sub-hashes such as ''sha_prevouts'', which may be cached across signatures of the same transaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I counted 442 bytes. Anyone want to count again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not updated with the latest numbers after we removed nNonce. Recalculating the numbers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's 410/334 for ACP/non-ACP according to my calculations/implementation. Are you taking into consideration the 32-byte annex too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was including the annex.
|
As per offline conversation, added two commits to
I haven't updated the total length arithmetic. Would probably be easiest to do this after it's implemented by poking at the Python implementation. @sanket1729 ? |
|
(fuzz test failure in PSBT module which will be shortly replaced; just kicked the CI job) |
ec4b426 to
cfe5ebb
Compare
cfe5ebb to
7ca7983
Compare
|
Rebased. CI should pass now. cc @roconnor-blockstream can I have an ack? |
@apoelstra |
|
Fixed the doc, but @sanket1729 I notice that your Python (and C++) impl completely ignore |
is the same as |
|
LGTM |
sanket1729
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK cca8224
Thanks to @roconnor-blockstream for writing the first draft of this document.