BIP 181, 182, 183: BIPs for Utreexo #1923

kcalvinalvin · 2025-08-10T06:56:50Z

These are the 3 BIPs that describe Utreexo, a consensus-compatible (non-soft fork) way to send and verify transactions without storing the full UTXO set.

The 3 BIPs are for:

The specification of the Utreexo accumulator.
The specification of Bitcoin block and tx validation using the Utreexo accumulator.
The peer to peer networking changes required to enable Utreexo nodes.

Mailing list post: https://groups.google.com/g/bitcoindev/c/W1lxBraKG_E

jmoik

some typos

utreexo-p2p-bip.md

jonatack

Thank you for proposing these drafts. They already look quite complete with respect to the editorial requirements (BIPs 2 and 3). I've done a cursory first pass. No immediate conceptual feedback. A few editorial comments follow; feel free to ignore them during conceptual review until they are applicable.

utreexo-p2p-bip.md

utreexo-validation-bip.md

utreexo-accumulator-bip.md

petertodd · 2025-08-12T15:52:53Z

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

ghost · 2025-08-12T18:29:31Z

I strongly recommend replacing SHA-256 with SHAKE256 (from the SHA-3 standard) for the following reasons:

1. Security Advantages

🔒 Provides built-in protection against length-extension attacks
📏 Offers flexible output lengths (supports 128-bit and 256-bit security levels)
⚙️ Based on Keccak sponge construction (NIST FIPS 202 standard)
🌐 Aligns with post-quantum cryptography standards

2. Comparative Analysis: SHA-256 vs SHAKE256

| Characteristic | SHA-256 | SHAKE256 |
| ------------------- | ------------------------------ | ----------------------------- |
| Algorithm Family | SHA-2 | SHA-3 (Keccak) |
| Output Flexibility | Fixed 256-bit | Arbitrary length |
| Security Properties | Vulnerable to length-extension | Resistant to length-extension |
| Internal Structure | Merkle-Damgård | Sponge function |
| Standardization | NIST FIPS 180-4 | NIST FIPS 202 |

3. Functional Example

Input: Bitcoin

SHAKE256 (512-bit output):
6beb0661ba1fa7289bf359fbb81550bd9641cf5abc62a14d466c421c8a86e528e027632ec0e7ceb994650566f3c8258af2240333b6d0e9186766fd2c1ebb763a

SHAKE256 (256-bit output):
6beb0661ba1fa7289bf359fbb81550bd9641cf5abc62a14d466c421c8a86e528

4. Implementation Benefits

✅ Maintains 256-bit output compatibility where needed
✅ Future-proofs against emerging cryptographic vulnerabilities
✅ Reduces potential attack vectors through improved design
✅ Supports Bitcoin's security evolution while maintaining performance

5. Technical Reference

For detailed cryptographic differences:
Cryptographic Comparison: SHA-2 vs SHA-3

kcalvinalvin · 2025-08-18T11:06:29Z

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.

But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

kcalvinalvin · 2025-08-18T11:10:24Z

I strongly recommend replacing SHA-256 with SHAKE256 (from the SHA-3 standard) for the following reasons:

SHAKE256 is not used in Bitcoin and introduces a new hash which increases the trust-assumption. We do not want to do this.

jonatack · 2025-08-18T14:35:55Z

Some friendly moderation to keep the discussion focused on technical review -- thanks.

kcalvinalvin · 2025-08-18T14:46:13Z

The reliance of Bitcoin on SHA-2—a legacy hash function designed by the National Security Agency (NSA)—introduces non-trivial security risks, particularly when considering the often-dismissed threat posed by quantum adversaries.

SHA256 and SHA512 are quantum resistant.

Migrating to SHAKE256 (a variant of SHA-3) would represent a meaningful improvement, though such a change merely delays the inevitable: Bitcoin must eventually transition to a quantum-resistant cryptographic framework. When this occurs—and it will, regardless of opposition—SHA-2, along with ECDSA private keys, public keys, and signatures, will become obsolete.
See: Length extension attack (Bitcoin is vulnerable because it's using SHA-256)

Ok but this has nothing to do with this BIP.

murchandamus · 2025-08-18T22:15:07Z

@1BitcoinBoWP1FZ4xwTNkq6XksKidmgYYw, please cut out the LLM generated comments. If any of us were interested in seeing an LLM’s prediction of what might be said about a topic, we could prompt one ourselves.

petertodd · 2025-08-18T22:18:29Z

On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.

But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

No part of the Bitcoin consensus protocol uses SHA512.

kcalvinalvin · 2025-08-19T06:17:17Z

On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.

But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

No part of the Bitcoin consensus protocol uses SHA512.

Ok but you've stated in your previous comment "You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol". Would be very helpful to see what type of justifications the other protocols have made.

Second, I don't think it matters if SHA512 wasn't used in the Bitcoin consensus protocol. SHA512 is used in BIP32 and the argument that SHA512 is safe for generating private keys but not safe for Bitcoin consensus isn't sound.

I think our original justification (better performance with SHA512/256) mentioned in the BIP is sound. Happy to provide the benchmarks, they're being worked on at the moment.

lucad70 · 2025-08-21T19:13:46Z

utreexo-validation-bip.md

+| Name              | Type                     | Description                               |
+| ----------------- | ------------------------ | ----------------------------------------- |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |


For clarification, is the Utreexo_Tag_V1 really used twice in preimage to the hash?

My guess would be that this duplication is unintended.

Suggested change

| Name | Type | Description |

| ----------------- | ------------------------ | ----------------------------------------- |

| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |

| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |

| Name | Type | Description |

| ----------------- | ------------------------ | ----------------------------------------- |

| Utreexo_Tag_V1 | 64 byte array | The version tag to be prepended to the leafhash. |

Oh no the duplication is intended.

Since we use SHA512/256 as the hash function, each chunk is 128 bytes. Since the version tag is only 64 bytes, we need two of them.

petertodd · 2025-08-24T13:48:55Z

On Mon, Aug 18, 2025 at 04:06:51AM -0700, Calvin Kim wrote:

You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol. Right now you just link to a paper from 2011. But that paper is out of date now that hardware support for SHA-256 has become common.

Sure we can update the accumulator BIP with benchmarks for SHA512/256 vs SHA256.

But could you link to the aforementioned justifications for the other parts of the Bitcoin protocol that use SHA512?

No part of the Bitcoin consensus protocol uses SHA512.

Ok but you've stated in your previous comment "You need to justify why you're using SHA-512/256 rather than SHA-256, like the rest of the Bitcoin protocol". Would be very helpful to see what type of justifications the other protocols have made.

Second, I don't think it matters if SHA512 wasn't used in the Bitcoin consensus protocol. SHA512 is used in BIP32 and the argument that SHA512 is safe for generating private keys but not safe for Bitcoin consensus isn't sound.

I think our original justification (better performance with SHA512/256) mentioned in the BIP is sound. Happy to provide the benchmarks, they're being worked on at the moment.

The question is 1) why are we adding a new dependency to consensus implementations, and 2) is this actually a performance increase, given that dedicated SHA256 hardware is becoming common?

Length-extension attacks are not relevant for this use-case as we are only committing to public data.

kcalvinalvin

All of the review comments are addressed, and rationales for BIPs 182 and 183 were added.

BIP-0183 was also edited in the following ways:

1: Images updated with captions.
2: Images now have transparent backgrounds and adjusted colors so they can be read in dark mode.
3: Changed the layout of the images and the paragraphs to be more legible.

utreexo-p2p-bip.md

kcalvinalvin · 2025-08-29T08:53:36Z

utreexo-accumulator-bip.md

+To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,
+where N is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient.


Technically the current Utreexo design is O(log2(N)) in the number of all TXOs ever added, since the forest doesn't shrink on a deletion. We just move the leaf up so it has the same effect as shrinking the forest.

kcalvinalvin · 2025-08-29T08:58:51Z

utreexo-validation-bip.md

+| Name              | Type                     | Description                               |
+| ----------------- | ------------------------ | ----------------------------------------- |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |
+| Utreexo_Tag_V1    | 64 byte array            | The version tag to be prepended to the leafhash. |


Oh no the duplication is intended.

Since we use SHA512/256 as the hash function, each chunk is 128 bytes. Since the version tag is only 64 bytes, we need two of them.
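A minimal sketch of the block-size argument above, in the same spirit as BIP 340 tagged hashes (where a 32-byte tag is repeated to fill one 64-byte SHA-256 block). The tag value and the `tagged_prefix` helper are hypothetical placeholders, not the constants from the BIP, and `hashlib.sha512` is used only to read the 128-byte block size that SHA-512/256 shares with SHA-512.

```python
import hashlib

# Hypothetical placeholder: any 64-byte value serves for this illustration.
# This is NOT the real Utreexo_Tag_V1 constant.
PLACEHOLDER_TAG = hashlib.sha512(b"hypothetical-utreexo-tag").digest()

# SHA-512 and SHA-512/256 process input in the same 128-byte blocks.
BLOCK_SIZE = hashlib.sha512().block_size


def tagged_prefix(tag: bytes) -> bytes:
    """Repeat a half-block tag so that it fills exactly one hash block."""
    if len(tag) * 2 != BLOCK_SIZE:
        raise ValueError("tag must be exactly half a block long")
    return tag + tag


prefix = tagged_prefix(PLACEHOLDER_TAG)
```

Because the prefix occupies a whole block, an implementation can hash it once and reuse that intermediate state for every leaf.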

kcalvinalvin · 2025-08-29T09:07:08Z

utreexo-accumulator-bip.md

+
+The following utility functions are required for performing accumulator operations:
+
+**parent_hash(left, right):** Returns the hash of the concatenation of two child hashes (`left` and `right`).


Does this ambiguity regarding the depth of the leaf in the tree not introduce similar weaknesses as the original Merkle tree construction?

Not quite sure which weakness you're referring to here. Is it CVE-2012-2459 (one from calculating the Bitcoin block header commitment)? Since we don't duplicate hashes, it's not vulnerable to that particular attack.

Why would we float up leaf-hashes rather than create a tagged hash at each level?

Since we float up the leaf hashes, we can save on the proofs being sent over for the sibling later on.

On a tree like so, proof for 01 is 00, 09, 13.

14
|---------------\
12              13
|-------\       |-------\
08      09      10      11
|---\   |---\   |---\   |---\
00  01  02  03  04  05  06  07

If we delete 00, then 01 moves up to 08. The proof for 01 is now 09 and 13. The proof got shorter.

14
|---------------\
12              13
|-------\       |-------\
01      09      10      11
|---\   |---\   |---\   |---\
        02  03  04  05  06  07
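The two proofs above can be reproduced with a small sketch. It assumes a single perfect tree labeled as in the diagrams (leaves first, then each higher row), and the helper names `parent` and `proof_positions` are illustrative rather than taken from the BIP.

```python
def parent(pos: int, total_rows: int) -> int:
    """Position of the parent node under the row-by-row labeling shown above."""
    return (pos >> 1) | (1 << total_rows)


def proof_positions(pos: int, total_rows: int) -> list[int]:
    """Sibling positions needed to hash from `pos` up to the root.

    Assumes one perfect tree with 2**total_rows leaf slots.
    """
    root = (1 << (total_rows + 1)) - 2
    siblings = []
    while pos != root:
        siblings.append(pos ^ 1)  # the sibling differs only in the lowest bit
        pos = parent(pos, total_rows)
    return siblings


# Leaf 01 at its original position 1 in the 8-leaf tree.
before_delete = proof_positions(1, 3)
# After 00 is deleted, the hash of 01 sits at position 8.
after_delete = proof_positions(8, 3)
```

Here `before_delete` lists positions 00, 09, 13 and `after_delete` lists 09, 13, matching the diagrams.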

kcalvinalvin · 2025-08-29T09:26:23Z

utreexo-accumulator-bip.md

+    return sha512_256(left + right)
+```
+
+**treerows(numleaves):** Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the largest tree in the forest. Returns `0` if `numleaves` is `0`.


Ah it's because we wanted treerows to return the index of the largest tree not the length.
For the below tree, numleaves = 4 but we want treerows to return 2 not 3.

row 2: 06
       |-------\
row 1: 04      05
       |---\   |---\
row 0: 00  01  02  03

If we just took the minimum number of bits to represent numleaves = 4, we'd get 3. So to account for this, we take the minimum number of bits needed to represent numleaves-1. This off-by-one happens when numleaves is a power of two.
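A minimal sketch of that definition, relying on Python's `int.bit_length()`; the function body is an illustration and not necessarily the code in the BIP.

```python
def treerows(numleaves: int) -> int:
    """Bits needed to represent numleaves - 1, or 0 when numleaves is 0."""
    if numleaves == 0:
        return 0
    return (numleaves - 1).bit_length()


rows_for_four = treerows(4)        # 4 - 1 = 0b11, two bits
naive_for_four = (4).bit_length()  # 4 = 0b100, three bits: the off-by-one
```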

@adiabat did talk about wanting to make treerows return the length and not the index a while back so last chance to speak up? :)

I've added the explanation in the bip as well.

kcalvinalvin · 2025-08-29T10:08:22Z

utreexo-validation-bip.md

+proofs. Each of the positions in (1) refer to the UTXO hash preimage in the same
+index.


For some reason I had thought that the accumulator proof was a Merkle branch, but now reading this, it makes me think that the proofs are built-up from the leaf preimages. Which of the two is correct, and could you perhaps check whether some more clarification should be added here to make it unambiguous?

You are right, there are the Merkle branches themselves, and the leaf preimages are entirely separate data apart from that.

I'll read it over again and make clarifications where needed.

kcalvinalvin · 2025-08-29T11:31:31Z

utreexo-p2p-bip.md

+CSNs have the goal of minimizing data storage and download while performing block validation.
+Archive and bridge nodes store more data and provide this data to CSNs.
+
+Bridge nodes are nodes that can add inclusion proofs to mempool transactions, support the same set of messages as CSNs, and should in fact be indistinguishable from CSNs on the network.


It’s not clear to me how "bridge nodes should in fact be indistinguishable from CSNs on the network". By whom are they indistinguishable. In what regard are they indistinguishable?

They're indistinguishable as we don't explicitly specify which nodes are bridges. The sentence was an attempt at clarifying a common misconception that a CSN must connect to bridge nodes.

Shouldn’t they, e.g., be frequently the first peer to notify about new transactions appearing in the mempool and blocks having been found as they act as the translation layer and therefore the initial source of data for the Utreexo-portion of the node network?

Yes this is true. They usually should be the first to notify Utreexo peers about new txs and blocks.

kcalvinalvin · 2025-08-29T11:40:37Z

utreexo-p2p-bip.md

+The node will have the block and the TTLs for the outputs of the given block which it can then use to cache parts of the inclusion proof and only request the needed parts of an inclusion proof for future blocks.
+
+We note that it is feasible for a node to receive incorrect TTL values from malicious nodes and this can negatively impact the bandwidth savings.
+Nodes can mitigate this by not downloading TTL values too far into the future or by checking if the `TTL` message received was included in the accumulator hard-coded into the binary.


Oh I should clarify this.

Since nothing commits to the TTL messages, a node can simply lie about the values in the message. To guard against this, the node should either:

1: Avoid downloading TTL values too far into the future, since the potential damage from false values grows the further ahead it downloads.
2: Rely on the pre-committed (aka "hard coded into the binary") TTL accumulator in the node software. The TTL accumulator has the TTLs for each of the accumulated blocks. With this accumulator, the node can check whether a received TTL is valid by checking for its existence in the TTL accumulator.

kcalvinalvin · 2025-09-07T12:55:34Z

utreexo-p2p-bip.md

+
+## Abstract
+
+Utreexo creates a compact representation of the UTXO set that only takes a couple of kilobytes.


It's essentially still a kilobyte, but since we can support leaf counts up to the maximum of uint64, we can have up to 64 roots, which is 64*32 = 2048 bytes. So 2KB max.
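A quick sketch of that arithmetic, assuming one 32-byte root per set bit of `numleaves` (one perfect tree per power of two); `num_roots` and `roots_size_bytes` are hypothetical helper names.

```python
HASH_SIZE = 32  # bytes per root hash


def num_roots(numleaves: int) -> int:
    """Number of trees in the forest: one per set bit in numleaves."""
    return bin(numleaves).count("1")


def roots_size_bytes(numleaves: int) -> int:
    """Bytes needed to store only the roots of the accumulator."""
    return num_roots(numleaves) * HASH_SIZE


worst_case = roots_size_bytes(2**64 - 1)  # all 64 bits set
example = roots_size_bytes(21)            # 16 + 4 + 1 leaves, three roots
```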

murchandamus

Thanks for the update.

I gave the diff a quick skim:

murchandamus · 2025-09-16T20:19:29Z

bip-0182.md

+The UTXO proof has 2 elements: the accumulator proof and the leaf data. The
+leaf data provides the necessary UTXO data for block validation that would be
+stored locally for non-Utreexo nodes. Non-Utreexo nodes store this data (under "chainstate/" for Bitcoin Core)
+but since utreexo nodes don't this data, it must be provided.


Missing a word.

Suggested change

but since utreexo nodes don't this data, it must be provided.

but since utreexo nodes don't <missing word> this data, it must be provided.

murchandamus · 2025-09-16T20:24:35Z

bip-0182.md

+
+**Why use the Utreexo accumulator to keep track of UTXOs instead of a key-value database like leveldb?**
+
+There's two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:


Suggested change

There's two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:

There are two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:

murchandamus · 2025-09-16T20:26:50Z

bip-0182.md

+There's two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:
+
+ 1. Puts a cap on the UTXO set growth.
+ 3. Performance gains with the elimination of random reads/writes.


Renders right of course, but still:

Suggested change

3. Performance gains with the elimination of random reads/writes.

2. Performance gains with the elimination of random reads/writes.

murchandamus · 2025-09-16T20:28:35Z

bip-0182.md

+Currently, the UTXO set size is $O(log(N))$ where $N$ is the number of UTXOs.
+By utilizing the Utreexo accumulator, we're able to cap the UTXO set growth at $O(log_2(N))$.


Given that you don’t store the UTXO set, but an accumulator that commits to the UTXO set, perhaps these two sentences should be amended?

murchandamus · 2025-09-16T20:51:52Z

It would perhaps be good if one or two other people gave it also a read, but either way, it seems pretty complete to me. What’s the status on your end? Do you still have planned work, or are waiting for people to finish reviews?

vostrnad

It would perhaps be good if one or two other people gave it also a read

Here's my read. I've suggested mainly formatting and capitalization changes, but at least two suggestions are quite important: the distinction between "varint" and "compact size", and the broken cross-BIP links.

vostrnad · 2025-09-17T00:08:48Z

bip-0181.md

+The Utreexo accumulator is based on an append-only Merkle tree design introduced in [^1],
+which provides logarithmic-sized inclusion proofs. Utreexo extends this design to support dynamic updates,
+specifically enabling deletions from the set—a requirement for tracking UTXO spends in Bitcoin.
+To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,


Specifying the logarithm base is redundant in big O notation, as changing the base is equivalent to multiplying by a constant factor.

Suggested change

To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,

To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log(N))$,
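
For reference, the change-of-base identity behind this remark is $\log_2(N) = \frac{\log(N)}{\log(2)}$, so $O(\log_2(N))$ and $O(\log(N))$ differ only by the constant factor $\frac{1}{\log(2)}$.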

vostrnad · 2025-09-17T00:08:51Z

bip-0181.md

+a 16-element tree ($2^4$), a 4-element tree ($2^2$), and a 1-element tree ($2^0$), with gaps at the 8-element ($2^3$)
+and 2-element ($2^1$) positions.
+
+Each of the hashes in the forest can be referred by an integer label. This labeling is a convention we find easiest


Suggested change

Each of the hashes in the forest can be referred by an integer label. This labeling is a convention we find easiest

Each of the hashes in the forest can be referred to by an integer label. This labeling is a convention we find easiest

vostrnad · 2025-09-17T00:08:54Z

bip-0181.md

+
+**treerows(numleaves):** Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the largest tree in the forest. Returns `0` if `numleaves` is `0`.
+
+The reason for taking the minimum number of bits required for `numleaves-1` and not `numleaves` is because when `numleaves` is a power of two, we'd get an off-by-one error.


It would be nice to have consistent spacing around the minus sign, there's both numleaves - 1 and numleaves-1. Same goes for the equals sign below.

vostrnad · 2025-09-17T00:08:58Z

bip-0181.md

+The calculate roots algorithm is defined as `CalculateRoots(numleaves, []hash, proof) -> calculated_roots`:
+
+- Check if length of `proof.targets` is equal to the length of `[]hash`. Return early if they're not equal.
+- map `proof.targets` to their hash.


Suggested change

- map `proof.targets` to their hash.

- Map `proof.targets` to their hash.

vostrnad · 2025-09-17T00:09:01Z

bip-0181.md

+  - Map parent hash to the parent position.
+- Return calculated_roots
+
+The algorithm implemented in python:


Suggested change

The algorithm implemented in python:

The algorithm implemented in Python:

vostrnad · 2025-09-17T00:10:33Z

bip-0183.md

+| length of the proof hashes | varint                       | The length of the proof hashes                                                                                                                                                                                  |
+| proof hashes               | vector of 32 byte hashes     | The vector of the requested Utreexo summaries                                                                                                                                                                   |
+| length of the leafdatas    | varint                       | The length of the leafdatas                                                                                                                                                                                     |
+| leafdatas                  | vector of compact leafdatas  | The preimage of the leafdatas referenced in the bitcoin transaction. MUST be in the order of the referenced inputs. Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details |


Suggested change

| leafdatas | vector of compact leafdatas | The preimage of the leafdatas referenced in the bitcoin transaction. MUST be in the order of the referenced inputs. Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details |

| leafdatas | vector of compact leafdatas | The preimage of the leafdatas referenced in the Bitcoin transaction. MUST be in the order of the referenced inputs. Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details |

vostrnad · 2025-09-17T00:10:35Z

bip-0183.md

+
+#### MSG_UTREEXO_ROOT
+
+`MSG_UTREEXO_ROOT` is the utreexo accumulator state at a given height with a proof to a utreexo accumulator of the utreexo roots.


Suggested change

`MSG_UTREEXO_ROOT` is the utreexo accumulator state at a given height with a proof to a utreexo accumulator of the utreexo roots.

`MSG_UTREEXO_ROOT` is the Utreexo accumulator state at a given height with a proof to a Utreexo accumulator of the Utreexo roots.

There are also several instances of lowercase "utreexo" in the table below.

vostrnad · 2025-09-17T00:10:37Z

bip-0183.md

+For example, a computer could divide the task of validating 800,000 blocks into 100 tasks of 8,000 blocks each: blocks 1 through 800, 800 through 1600, 1600 through 2400, and so on.
+
+In order start the 1600 through 2400 IBD task, however, the node should know what the state of the utxo set is at block 1600, so that it can validate and modify the accumulator.


Suggested change

For example, a computer could divide the task of validating 800,000 blocks into 100 tasks of 8,000 blocks each: blocks 1 through 800, 800 through 1600, 1600 through 2400, and so on.

In order start the 1600 through 2400 IBD task, however, the node should know what the state of the utxo set is at block 1600, so that it can validate and modify the accumulator.

For example, a computer could divide the task of validating 800,000 blocks into 100 tasks of 8,000 blocks each: blocks 1 through 800, 801 through 1600, 1601 through 2400, and so on.

In order start the 1601 through 2400 IBD task, however, the node should know what the state of the UTXO set is at block 1600, so that it can validate and modify the accumulator.
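
A small sketch of the non-overlapping split this suggestion describes. The helper name `split_tasks` and the task size of 800 blocks are assumptions, chosen only to match the boundaries listed above. Each task also records the height whose accumulator state it needs before it can start.

```python
def split_tasks(last_block: int, task_size: int) -> list[tuple[int, int, int]]:
    """Split blocks 1..last_block into consecutive, non-overlapping tasks.

    Returns (first_block, final_block, state_height) tuples, where
    state_height is the block whose accumulator state the task starts from.
    """
    tasks = []
    first = 1
    while first <= last_block:
        final = min(first + task_size - 1, last_block)
        tasks.append((first, final, first - 1))
        first = final + 1
    return tasks


tasks = split_tasks(2400, 800)
```

With these inputs the third task covers blocks 1601 through 2400 and needs the accumulator state at block 1600.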

vostrnad · 2025-09-17T00:10:40Z

bip-0183.md

+
+These hints are statements of fact that are hard-coded into the program itself, and if they are false all bets are off about the program.
+
+Archive nodes create a forest of Linkup hints, so that they can prove, with respect to the Linkup forest roots in a node performing IBD, what their binary has claimed the utxo accumulator state to be at any block height.


Suggested change

Archive nodes create a forest of Linkup hints, so that they can prove, with respect to the Linkup forest roots in a node performing IBD, what their binary has claimed the utxo accumulator state to be at any block height.

Archive nodes create a forest of Linkup hints, so that they can prove, with respect to the Linkup forest roots in a node performing IBD, what their binary has claimed the UTXO accumulator state to be at any block height.

vostrnad · 2025-09-17T00:10:42Z

bip-0183.md

+
+#### MSG_GET_UTREEXO_ROOT
+
+`MSG_GET_UTREEXO_ROOT` is used to request a utreexo accumulator state at a given height.


Suggested change

`MSG_GET_UTREEXO_ROOT` is used to request a utreexo accumulator state at a given height.

`MSG_GET_UTREEXO_ROOT` is used to request a Utreexo accumulator state at a given height.

luisschwab · 2025-09-19T22:58:02Z

Some test vectors are in order as well.

luisschwab · 2025-09-19T22:48:40Z

bip-0183.md

+
+![Utreexo TX relay multiple Utreexo proof hash vectors](bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png)
+
+It's possible to have an inv message with multiple txs as well.


Suggested change

It's possible to have an inv message with multiple txs as well.

It's possible to have an `inv` message with multiple transactions as well.

luisschwab · 2025-09-19T22:49:20Z

bip-0183.md

+
+### Block Propagation
+
+Legacy block propagation without Compact Blocks comprises of three steps:


Suggested change

Legacy block propagation without Compact Blocks comprises of three steps:

Legacy block propagation without Compact Blocks is comprised of three steps:

luisschwab · 2025-09-19T22:50:00Z

bip-0183.md

+1. Node A sends an inv message or a block header to Node B.
+2. Node B makes a getdata request for the block.


Suggested change

1. Node A sends an inv message or a block header to Node B.

2. Node B makes a getdata request for the block.

1. Node A sends an `inv` message or a block header to Node B.

2. Node B makes a `getdata` request for the block.

luisschwab · 2025-09-19T22:50:22Z

bip-0183.md

+2. Node B makes a getdata request for the block.
+3. Node A sends the block data to Node B.
+
+Below image illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.


Suggested change

Below image illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.

The image below illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.

luisschwab · 2025-09-19T22:50:53Z

bip-0183.md

+1. Node A sends an inv message or a block header to Node B.
+2. Node B makes a getdata request for the block.
+3. Node B makes a getutreexoproof request for the block.


Suggested change

1. Node A sends an inv message or a block header to Node B.

2. Node B makes a getdata request for the block.

3. Node B makes a getutreexoproof request for the block.

1. Node A sends an `inv` message or a block header to Node B.

2. Node B makes a `getdata` request for the block.

3. Node B makes a `getutreexoproof` request for the block.

luisschwab · 2025-09-19T23:18:02Z

bip-0183.md

+
+### Commitment scheme for TTL messages
+
+We choose an arbitrary height `X` and go through each of `TTL info` in all the the `Utreexo TTL` values up until that height.


Suggested change

We choose an arbitrary height `X` and go through each of `TTL info` in all the the `Utreexo TTL` values up until that height.

We choose an arbitrary height `X` and go through each of `TTL Info`s in all of the `Utreexo TTL` values up until that height.

luisschwab · 2025-09-19T23:18:21Z

bip-0183.md

+
+We choose an arbitrary height `X` and go through each of `TTL info` in all the the `Utreexo TTL` values up until that height.
+
+If the TTL in the `TTL info` is greater than the [numleaves](bip-0181.md#Definitions) value of the Utreexo accumulator at the chosen height `X`, we reset the `death position` and the `TTL` values to their default of 0.


Suggested change

If the TTL in the `TTL info` is greater than the [numleaves](bip-0181.md#Definitions) value of the Utreexo accumulator at the chosen height `X`, we reset the `death position` and the `TTL` values to their default of 0.

If the TTL in the `TTL Info` is greater than the [numleaves](bip-0181.md#Definitions) value of the Utreexo accumulator at the chosen height `X`, we reset the `death position` and the `TTL` values to their default of 0.

luisschwab · 2025-09-19T23:19:09Z

bip-0183.md

+
+**Why is there a separate NODE_UTREEXO_ARCHIVE service bit from the NODE_UTREEXO service bit?**
+
+For archive nodes, we wanted the ability for a node to keep just the historical Utreexo proofs since the historical blocks can be served by any archival nodes.


Suggested change

For archive nodes, we wanted the ability for a node to keep just the historical Utreexo proofs since the historical blocks can be served by any archival nodes.

For archival nodes, we wanted the ability for a node to keep just the historical Utreexo proofs since the historical blocks can be served by any archival node.

luisschwab · 2025-09-19T23:20:00Z

bip-0183.md

+
+We decided to communicate the positions in the Utreexo merkle forest by inventory vectors instead of a separate message to avoid an extra round trip during the transaction propagation.
+
+As mentioned above in [Transaction Relay](#transaction-relay), non-Utreexo nodes propagate a transaction in these 3 steps:


Suggested change

As mentioned above in [Transaction Relay](#transaction-relay), non-Utreexo nodes propagate a transaction in these 3 steps:

As mentioned above in [Transaction Relay](#transaction-relay), non-Utreexo nodes propagate a transaction in 3 steps:

luisschwab · 2025-09-19T23:21:00Z

bip-0183.md

+  2. Send a message to get the positions in the Utreexo merkle forest for the transaction.
+  3. Receive the positions in the Utreexo merkle forest.


Suggested change

2. Send a message to get the positions in the Utreexo merkle forest for the transaction.

3. Receive the positions in the Utreexo merkle forest.

2. Send a message to get the positions in the Utreexo Merkle forest for the transaction.

3. Receive the positions in the Utreexo Merkle forest.

luisschwab · 2025-09-20T00:01:21Z

bip-0183.md

+|----------------------------|-------------------------|------------------------------------------------------------------------------------------------------------------|
+| blockhash                  | 32 byte vector          | The hash of the block that the requested utreexo root message is for                                             |
+
+### New Inventory Types


For all inventory types: be explicit about what needs to be provided and in what format (e.g., blockhash, leaf positions, etc.).

ismaelsadeeq · 2025-10-02T08:16:18Z

bip-0181.md

+of the UTXO set. Since it can grow indefinitely, bounded only by block size, it represents a
+long-term scalability concern.
+
+Utreexo is a dynamic accumulator that enables the UTXO set to be represented in just a few kilobytes,


Coming from https://github.com/cryptography-camp/workbook

The defined accumulator in BIP 181 is positive because it supports membership proofs.

Suggested change

Utreexo is a dynamic accumulator that enables the UTXO set to be represented in just a few kilobytes,

Utreexo is a dynamic positive accumulator that enables the UTXO set to be represented in just a few kilobytes

ismaelsadeeq · 2025-10-02T08:37:45Z

bip-0181.md

+The Utreexo accumulator is based on an append-only Merkle tree design introduced in [^1],
+which provides logarithmic-sized inclusion proofs. Utreexo extends this design to support dynamic updates,
+specifically enabling deletions from the set—a requirement for tracking UTXO spends in Bitcoin.


The linked paper claims that the accumulator defined there is sound and strong.

With the extension here to make that accumulator dynamic, I suppose it is still correct, sound and strong?
Perhaps link to some resource where that was explicitly studied.

ismaelsadeeq · 2025-10-02T08:51:31Z

I think our original justification (better performance with SHA512/256) mentioned in the BIP is sound. Happy to provide the benchmarks, they're being worked on at the moment.

This point should also be added to the rationale, along with the benchmarks when available.

murchandamus

It looks like there is still work in progress here. Please let me know when the review feedback has been resolved.

jonatack · 2025-11-26T19:37:21Z

@kcalvinalvin there doesn't seem to be any response since late August -- do you plan to address the review and update here?

kcalvinalvin · 2025-11-28T12:50:17Z

@kcalvinalvin there doesn't seem to be any response since late August -- do you plan to address the review and update here?

Yes. There have been multiple protocol changes to BIP183 since then and I've been busy with the implementations for those changes. I'll get to the reviews and update the BIP.

ajtowns · 2025-12-10T02:31:52Z

bip-0183.md

+`MSG_UTREEXO_PROOF` is all the data required for a CSN or archive node using the Utreexo accumulators to validate a Bitcoin block.
+
+Its `cmdString` for P2PV1 is `uproof`.
+Its [BIP324 P2PV2](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki#user-content-v2_Bitcoin_P2P_message_structure) message type is `29`.


I think these values should be reserved in bip 324 before being added to other bips, to help avoid conflicts. (At worst, perhaps bip-324 could be updated in this PR)

That's a good point! I was also told ages ago that we should make a PR to Core reserving the service bits we are using. @kcalvinalvin I think we should also do that?

rustaceanrob · 2025-12-29T14:00:28Z

bip-0183.md

+To save that bandwidth, we only send a Compact Leaf Data, that contains all missing information for the receiving peer to reconstruct the full leaf data.
+A compact leaf data is defined as:
+
+| Field        | type                         | Description     |


The leaf data is quite similar to CTxUndo in Bitcoin Core (de-serialized as a Coin). I suggest this data be sent as a separate message, as non-Utreexo clients could also make use of this data, and verify its integrity to varying degrees:

The SwiftSync protocol requires this data to perform full validation and enable parallel block downloads. The integrity of the data is verified at a pre-determined block height. A client cannot be misled into accepting an invalid state.

~~BIP-157 clients may audit the construction of a compact block filter when two peers disagree on the hash of a filter for a particular block. This assumes the client is not eclipsed.~~ I realized after the fact this is not accurate. One could change the block undo data for trivially spendable scripts and still produce a valid block and undo data combination.

BIP-157 clients do not have a way to reasonably estimate fee rates without a third party oracle. This data would allow for block-level fee rate analysis by light clients, given the client audits the undo data indeed corresponds to the block filter they have received.

There is no protocol-native way for clients to scan for silent payments. This would allow a "light" client to download both the block undo data and compact block filter to check for potential payments by computing partial secrets locally.

If the current serialization format from Bitcoin Core is used, nodes serving this data can simply read the bytes from disk and send them directly over the wire without a de-serialization step. The trade-off here is of course the TTLs. To compensate, along with the length of the message, a header section may be included that specifies a height filter for coins that are assumed to already be in the client's cache. So a client requests "I would like all coins spent in this block, but I have the coins that were created in the last N blocks already"; however, this would require interpreting the undo data from disk.

When I checked in October, undo data was approximately 90GB on disk. Is this even significant compared to the proofs?
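The height filter described in the comment above can be sketched roughly as follows. This is only an illustration: `SpentCoin`, `coins_to_send`, and `cached_depth` are hypothetical names, and the exact boundary (whether a coin created exactly N blocks back counts as cached) is an assumption that an actual proposal would have to pin down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpentCoin:
    # Hypothetical stand-in for one undo-data entry (a spent coin).
    height: int           # height of the block that created the coin
    value: int            # amount in satoshis
    script_pubkey: bytes  # locking script of the spent output


def coins_to_send(spent, block_height, cached_depth):
    """Return the spent coins the client does not claim to have cached.

    Assumption: the client caches every coin created in the
    `cached_depth` blocks before `block_height`, so only coins created
    strictly earlier than that window are sent.
    """
    cutoff = block_height - cached_depth
    return [coin for coin in spent if coin.height < cutoff]
```

With `cached_depth = 0` the server sends every coin created before the block, which matches the unfiltered request.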

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch 2 times, most recently from 9b3eafb to a94f643 Compare August 10, 2025 07:09

jonatack added the New BIP label Aug 10, 2025

jmoik reviewed Aug 11, 2025

jonatack reviewed Aug 11, 2025

luisschwab reviewed Aug 11, 2025

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch 2 times, most recently from cb2993c to d1d0342 Compare August 12, 2025 06:23

luisschwab mentioned this pull request Aug 14, 2025

Socratic Seminar 44 (August 2025) Bitcoin-Grove/miamibitdevs.org#23

Closed

lucad70 mentioned this pull request Aug 14, 2025

Agosto 2025 ClubeBitcoinUnB/bitdevs.bsb.br#25

Closed

bitcoin deleted a comment Aug 18, 2025

bitcoin deleted a comment from kcalvinalvin Aug 18, 2025

bitcoin deleted a comment from kcalvinalvin Aug 18, 2025

bitcoin deleted a comment Aug 18, 2025

lucad70 reviewed Aug 21, 2025

diogo-ck mentioned this pull request Aug 23, 2025

Tópicos 2025/08 curitibabitdevs/curitibabitdevs.org#24

Closed

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from d67f429 to 253c739 Compare September 7, 2025 12:20

BIP181: Add the Utreexo accumulator BIP

d89952d

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 253c739 to 091afe1 Compare September 7, 2025 12:29

BIP182: Add the Utreexo validation BIP

4aa26f3

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 091afe1 to 260f2c9 Compare September 7, 2025 12:43

kcalvinalvin added 2 commits September 7, 2025 21:52

BIP183: Add the Utreexo P2P BIP

68da366

Update README table to include BIPs: 181, 182, 183

bd1e242

kcalvinalvin force-pushed the 2025-08-10-utreexo-bips branch from 260f2c9 to bd1e242 Compare September 7, 2025 12:52

kcalvinalvin changed the title ~~BIP draft: BIPs for Utreexo~~ BIP 181, 182, 183: BIPs for Utreexo Sep 7, 2025

kcalvinalvin commented Sep 7, 2025

jaoleal mentioned this pull request Sep 8, 2025

Mention utreexo bips on readme getfloresta/Floresta#629

Draft

18 tasks

murchandamus reviewed Sep 16, 2025

vostrnad reviewed Sep 17, 2025

luisschwab reviewed Sep 19, 2025

luisschwab reviewed Sep 20, 2025

ismaelsadeeq reviewed Oct 2, 2025

murchandamus added the "PR Author action required" label Oct 6, 2025

murchandamus reviewed Oct 6, 2025

storopoli mentioned this pull request Nov 2, 2025

feat(p2p): implement BIP-0183 (Utreexo Peer Services) rust-bitcoin/rust-bitcoin#5009

Draft

2 tasks

murchandamus mentioned this pull request Nov 10, 2025

BIP 110: Reduced Data Temporary Softfork #2017

Open

kcalvinalvin marked this pull request as draft November 30, 2025 17:34

ajtowns reviewed Dec 10, 2025

rustaceanrob mentioned this pull request Dec 15, 2025

Implementation of SwiftSync bitcoin/bitcoin#34004

Draft

rustaceanrob reviewed Dec 29, 2025

murchandamus mentioned this pull request Jan 9, 2026

BIP Draft: Optimal Batch Proofs for Utreexo #2079

Closed

		To accommodate this, Utreexo changes the storage requirement from the accumulator design in [^1] to $O(log_2(N))$,
		where N is the number of elements ever added to the set, while still keeping proof sizes small and verification efficient.
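To make the $O(log_2(N))$ storage bound concrete, here is a minimal Python sketch. It assumes the forest consists of perfect binary trees with one root per set bit of the leaf count, as in the Utreexo paper; the function names `num_roots` and `max_roots` are illustrative and not taken from the BIP.

```python
def num_roots(numleaves: int) -> int:
    """Number of tree roots in a forest holding `numleaves` leaves.

    Assumption: one perfect binary tree per set bit of `numleaves`.
    """
    if numleaves < 0:
        raise ValueError("numleaves must be non-negative")
    return bin(numleaves).count("1")


def max_roots(numleaves: int) -> int:
    """Upper bound on the root count: the bit length of `numleaves`."""
    return numleaves.bit_length()
```

Under that assumption a node stores at most `numleaves.bit_length()` roots, so even billions of additions require only a few dozen hashes.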


		The following utility functions are required for performing accumulator operations:

		parent_hash(left, right): Returns the hash of the concatenation of two child hashes (`left` and `right`).
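As a rough illustration of `parent_hash`, the sketch below hashes the concatenation of the two child hashes. The `hash_name` parameter is hypothetical and exists only for this example; its default reflects the SHA-512/256 choice discussed in this thread and works only if the local OpenSSL build exposes that algorithm to `hashlib`.

```python
import hashlib


def parent_hash(left: bytes, right: bytes, hash_name: str = "sha512_256") -> bytes:
    """Return the hash of `left` concatenated with `right`.

    `hash_name` is an assumption of this sketch, not part of the BIP.
    """
    return hashlib.new(hash_name, left + right).digest()
```

Because the inputs are concatenated in order, swapping `left` and `right` yields a different parent.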

		proofs. Each of the positions in (1) refer to the UTXO hash preimage in the same
		index.


		## Abstract

		Utreexo creates a compact representation of the UTXO set that only takes a couple of kilobytes.

	but since utreexo nodes don't this data, it must be provided.
	but since utreexo nodes don't <missing word> this data, it must be provided.


		Why use the Utreexo accumulator to keep track of UTXOs instead of a key-value database like leveldb?

		There's two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:

	There's two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:
	There are two main advantages to using the Utreexo accumulator instead of a key-value database like leveldb:

	3. Performance gains with the elimination of random reads/writes.
	2. Performance gains with the elimination of random reads/writes.

		Currently, the UTXO set size is $O(log(N))$ where $N$ is the number of UTXOs.
		By utilizing the Utreexo accumulator, we're able to cap the UTXO set growth at $O(log_2(N))$.

	Each of the hashes in the forest can be referred by an integer label. This labeling is a convention we find easiest
	Each of the hashes in the forest can be referred to by an integer label. This labeling is a convention we find easiest


		treerows(numleaves): Returns the minimum number of bits required to represent `numleaves - 1`. This corresponds to the height of the largest tree in the forest. Returns `0` if `numleaves` is `0`.

		The reason for taking the minimum number of bits required for `numleaves-1` and not `numleaves` is because when `numleaves` is a power of two, we'd get an off-by-one error.
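The definition above maps directly onto Python's `int.bit_length`. The following is a small sketch of that definition, not the BIP's reference code:

```python
def treerows(numleaves: int) -> int:
    """Minimum number of bits needed to represent `numleaves - 1`.

    Returns 0 when `numleaves` is 0, as the definition requires.
    """
    if numleaves == 0:
        return 0
    return (numleaves - 1).bit_length()
```

For a power of two such as 8, `treerows(8)` is 3, whereas `(8).bit_length()` is 4, which is the off-by-one the text refers to.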

	- map `proof.targets` to their hash.
	- Map `proof.targets` to their hash.

	The algorithm implemented in python:
	The algorithm implemented in Python:

	\| leafdatas \| vector of compact leafdatas \| The preimage of the leafdatas referenced in the bitcoin transaction. MUST be in the order of the referenced inputs. Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details \|
	\| leafdatas \| vector of compact leafdatas \| The preimage of the leafdatas referenced in the Bitcoin transaction. MUST be in the order of the referenced inputs. Unconfirmed inputs do not have a corresponding leaf data. See compact leaf data for details \|


		#### MSG_UTREEXO_ROOT

		`MSG_UTREEXO_ROOT` is the utreexo accumulator state at a given height with a proof to a utreexo accumulator of the utreexo roots.

		For example, a computer could divide the task of validating 800,000 blocks into 100 tasks of 8,000 blocks each: blocks 1 through 800, 800 through 1600, 1600 through 2400, and so on.

		In order start the 1600 through 2400 IBD task, however, the node should know what the state of the utxo set is at block 1600, so that it can validate and modify the accumulator.
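The partitioning idea can be sketched as below. The helper name `split_block_range` and the use of half-open ranges are choices made for this example only; each task other than the first would additionally need the accumulator state at its starting height.

```python
def split_block_range(start: int, end: int, tasks: int) -> list[tuple[int, int]]:
    """Split blocks [start, end) into `tasks` contiguous half-open ranges."""
    if tasks <= 0 or end < start:
        raise ValueError("need tasks > 0 and end >= start")
    base, extra = divmod(end - start, tasks)
    ranges = []
    lo = start
    for i in range(tasks):
        # The first `extra` ranges absorb the remainder, one block each.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges
```

Calling `split_block_range(0, 800_000, 100)` produces 100 ranges of 8,000 blocks each.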


		These hints are statements of fact that are hard-coded into the program itself, and if they are false all bets are off about the program.

		Archive nodes create a forest of Linkup hints, so that they can prove, with respect to the Linkup forest roots in a node performing IBD, what their binary has claimed the utxo accumulator state to be at any block height.


		#### MSG_GET_UTREEXO_ROOT

		`MSG_GET_UTREEXO_ROOT` is used to request a utreexo accumulator state at a given height.


		![Utreexo TX relay multiple Utreexo proof hash vectors](bip-0183/utreexo-tx-relay-with-multiple-proofhash-inventory-vectors.png)

		It's possible to have an inv message with multiple txs as well.

	It's possible to have an inv message with multiple txs as well.
	It's possible to have an `inv` message with multiple transactions as well.


		### Block Propagation

		Legacy block propagation without Compact Blocks comprises of three steps:

		1. Node A sends an inv message or a block header to Node B.
		2. Node B makes a getdata request for the block.

	Below image illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.
	The image below illustrates how a non-Utreexo node would relay blocks without using Compact Blocks.


		### Commitment scheme for TTL messages

		We choose an arbitrary height `X` and go through each of `TTL info` in all the the `Utreexo TTL` values up until that height.


		We choose an arbitrary height `X` and go through each of `TTL info` in all the the `Utreexo TTL` values up until that height.

		If the TTL in the `TTL info` is greater than the [numleaves](bip-0181.md#Definitions) value of the Utreexo accumulator at the chosen height `X`, we reset the `death position` and the `TTL` values to their default of 0.
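The reset rule quoted above can be sketched as follows. The `TTLInfo` class and its field names are hypothetical stand-ins, and the comparison is transcribed literally from the quoted sentence rather than from the full BIP text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TTLInfo:
    # Hypothetical field names for this sketch.
    ttl: int
    death_position: int


def normalize_for_commitment(info: TTLInfo, numleaves_at_x: int) -> TTLInfo:
    """Reset `ttl` and `death_position` to 0 when the TTL exceeds
    the accumulator's numleaves at the chosen height X."""
    if info.ttl > numleaves_at_x:
        return replace(info, ttl=0, death_position=0)
    return info
```

Entries that do not exceed the bound pass through unchanged, so the result depends only on the chosen height `X` and the entries up to it.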


		Why is there a separate NODE_UTREEXO_ARCHIVE service bit from the NODE_UTREEXO service bit?

		For archive nodes, we wanted the ability for a node to keep just the historical Utreexo proofs since the historical blocks can be served by any archival nodes.
