-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Implementation of SwiftSync #34004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Implementation of SwiftSync #34004
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34004. ReviewsSee the guideline for information on the review process. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. LLM Linter (✨ experimental)Possible typos and grammar issues:
2025-12-17 |
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
408e314 to
652d99d
Compare
652d99d to
fcdce16
Compare
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the change once to get a feel for it, left a ton of questions and suggestions.
Many of my suggestions apply to many places of the PR - I didn't comment on each occurrence. Some are very low-level, some very high - please put them into context.
As far as I'm concerned there are a few blockers here - it should indeed be a draft for now.
First, the storage assumes sparse blocks throughout the mainchain history - which is unlikely to be the case, I think we could achieve better storage if we used a mixed storage mechanism based on which is cheaper: an index-to-boolean map for each unspend tx or a bitset of the complete block. Given how easy it is to implement a compressed sparse bitset, I think we should investigate that. Could you plot the blockchain by height vs unspent tx for us?
It also bothers me greatly that this can only seed pruned nodes - it limits its use a lot, it's hard to argue what this provides over current AssumeUTXO.
Is it possible to carve out features from this change and progressively introduce chunks into the codebase - e.g. introduce hints file but use it to optimize the current storage to warm the cache based on whether we expect those to end up on disk or not. It would be a pure optimization without any validation change - in worst case it would just make the code slower. And it would already get our foot in the door - I don't expect this change to be merged soon, we should expect pushback.
In its current state even I have strong objections against it: the logic is spread around the validation code, I want to be able to review this in isolation instead of sprinkled around consensus code. As @sedited also mentioned in a deleted comment: could we separate the gist into a coins cache like @andrewtoth did in #31132 to encapsulate its effect? We should expect some refactoring changes before this PR.
| @@ -981,6 +983,10 @@ bool AppInitParameterInteraction(const ArgsManager& args) | |||
| if (args.GetBoolArg("-reindex-chainstate", false)) { | |||
| return InitError(_("Prune mode is incompatible with -reindex-chainstate. Use full -reindex instead.")); | |||
| } | |||
| } else { | |||
| if (args.IsArgSet("-utxohints")) { | |||
| return InitError(_("UTXO hints cannot be used without pruned mode.")); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a very serious limitation currently, I don't see a huge difference between this and just loading the final AssumeUTXO state. If this is targeted to very low-memory environments, AssumeUTXO would be even better. I'm not a fan of AssumeUTXO, but it still seems better than just pruned SwiftSync.
| m_file << preallocate; | ||
| for (uint32_t height = 1; height <= preallocate; height++) { | ||
| m_file << height; | ||
| m_file << dummy_file_pos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a dummy during file writing?
src/test/swiftsync_tests.cpp
Outdated
| FILE* file{fsbridge::fopen(temppath, "wb")}; | ||
| AutoFile afile{file}; | ||
| swiftsync::HintsfileWriter writer{afile, 4}; | ||
| BOOST_CHECK(writer.WriteNextUnspents(unspent_block_1, 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is still the setup phase of the test, it could be a BOOST_REQUIRE
src/swiftsync.cpp
Outdated
| std::vector<uint64_t> offsets{}; | ||
| offsets.reserve(num_unspents); | ||
| for (uint64_t i = 0; i < num_unspents; i++) { | ||
| offsets.push_back(ReadCompactSize(m_file)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're currently "decompressing" the data during read, instead of during access - it should be possible to read the whole file data into the vector without interpreting it and decipher them when the data is needed.
Please see the way we've implemented it in l0rinc@8398cbc#diff-c754d7c547c3523356e9a430b5448ba384fe135cf3f83508a7d0384d30c9d2d0R45-R54
|
Thanks a lot for the thorough review. I will respond both high level and to review comments over the coming days. |
brunoerg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've ran a mutation analysis on this PR. Unkilled mutants should be killed (covered) by an unit or functional test (if make sense). However, I noticed that there are part of the code that doesn't even have test coverage (corecheck.dev would have show it but did not generated the report for this PR) and should have it. As soon as more tests are added to this PR, I can re-run the analysis again.
| if (nSigOpsCost > MAX_BLOCK_SIGOPS_COST) { | ||
| state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-blk-sigops", "too many sigops"); | ||
| break; | ||
| if (!swiftsync_active) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unkilled mutant:
diff --git a/src/validation.cpp b/muts-pr-34004-validation-cpp/validation.mutant.16.cpp
index 3d84576e7d..efc5037570 100644
--- a/src/validation.cpp
+++ b/muts-pr-34004-validation-cpp/validation.mutant.16.cpp
@@ -2613,7 +2613,7 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
// * legacy (always)
// * p2sh (when P2SH enabled in flags and excludes coinbase)
// * witness (when witness enabled in flags and excludes coinbase)
- if (!swiftsync_active) {
+ if (1==0) {
nSigOpsCost += GetTransactionSigOpCost(tx, view, flags);
if (nSigOpsCost > MAX_BLOCK_SIGOPS_COST) {
state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-blk-sigops", "too many sigops");
@@ -6571,4 +6571,4 @@ std::pair<int, int> ChainstateManager::GetPruneRange(const Chainstate& chainstat
int prune_end = std::min(last_height_can_prune, max_prune);
return {prune_start, prune_end};
| std::optional<swiftsync::HintsfileReader> utxo_hints; | ||
| if (args.IsArgSet("-utxohints")) { | ||
| fs::path path = fs::absolute(args.GetPathArg("-utxohints")); | ||
| if (!fs::exists(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could exercise these errors (file doesn't exist and open failure) in the functional tests. An (obvious) unkilled mutant:
diff --git a/src/init.cpp b/muts-pr-34004-init-cpp/init.mutant.1.cpp
index fb7336f9ae..59de0f6b7a 100644
--- a/src/init.cpp
+++ b/muts-pr-34004-init-cpp/init.mutant.1.cpp
@@ -1812,7 +1812,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
AutoFile afile{file};
if (afile.IsNull()) {
LogError("Failed to open UTXO hint file.");
- return false;
+ return true;
}
| std::unique_ptr<CBlock> pblock = std::make_unique<CBlock>(); | ||
| bool read = node.chainman->m_blockman.ReadBlock(*pblock, file_pos, curr->GetBlockHash()); | ||
| if (!read) { | ||
| throw JSONRPCError(RPC_DATABASE_ERROR, "Block could not be read from disk."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error isn't covered by any test and could be addressed in feature_swiftsync. Perhaps by deleting a block file before calling the generatetxohints RPC.
Eunovo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this SwiftSync implementation uses assumevalid assumptions, it makes sense to reason about attacks that apply to assumevalid, here as well.
What happens if an attack gives a SwiftSync node a malicious hint file and eclipses the node so that it is denied an honest chain? Assumevalid uses:
- the minimumchain work check to ensure that scripts are verified when the node is denied access to "any chain at least as good as the expected chain"
- the equivalent time check to force the attacker to generate 2 weeks' worth of blocks to bury the invalid block.
SeeLines 2377 to 2397 in d5c8199
} else if (m_chainman.m_best_header->nChainWork < m_chainman.MinimumChainWork()) { script_check_reason = "best header chainwork below minimumchainwork"; } else if (GetBlockProofEquivalentTime(*m_chainman.m_best_header, *pindex, *m_chainman.m_best_header, params.GetConsensus()) <= TWO_WEEKS_IN_SECONDS) { script_check_reason = "block too recent relative to best header"; } else { // This block is a member of the assumed verified chain and an ancestor of the best header. // Script verification is skipped when connecting blocks under the // assumevalid block. Assuming the assumevalid block is valid this // is safe because block merkle hashes are still computed and checked, // Of course, if an assumed valid block is invalid due to false scriptSigs // this optimization would allow an invalid chain to be accepted. // The equivalent time check discourages hash power from extorting the network via DOS attack // into accepting an invalid block through telling users they must manually set assumevalid. // Requiring a software change or burying the invalid block, regardless of the setting, makes // it hard to hide the implication of the demand. This also avoids having release candidates // that are hardly doing any signature verification at all in testing without having to // artificially set the default assumed verified block further back. // The test against the minimum chain work prevents the skipping when denied access to any chain at // least as good as the expected chain. script_check_reason = nullptr; }
I think we need to include these checks as well when doing SwiftSync with assumevalid assumptions. Unless, for some reason, SwiftSync with assumevalid is not vulnerable to the same attacks.
fcdce16 to
6853985
Compare
53c887b to
83cb4ab
Compare
This turned out to not be true. Using bit-packing as shown in the current state of the PR, the encoding was far worse. |
83cb4ab to
8abdfec
Compare
…valuated at comptime 5ac3579 refactor: Add compile-time-checked hex txid (rustaceanrob) Pull request description: Suggested by l0rinc as a comment in #34004. There are tests that utilize `FromHex` that will only fail during runtime if malformed. Adds a compile time constructor that can be caught by LSPs. ACKs for top commit: l0rinc: ACK 5ac3579 maflcko: review ACK 5ac3579 🦎 rkrux: crACK 5ac3579 Tree-SHA512: b0bae2bf0b8cd8c9a90765a14c46146313cf8b224a29d58a253e65ca95c4205c0beddea9c49ae58901e72c8c5202b91695d074ffb1c48e448d2e5606eb1bd5b4
This class allows for the comparison of two sets `A` and `B`. Elements of `A` are added to a wrapping integer, and elements of `B` are also added to a wrapping integer. If the two integers are equivalent, then sets `A` and `B` are equivalent. In this case we say they are "balanced". The client-side salt is required to prevent the generalized birthday attack, whereby an attacker may choose a targeted hash collision and reduce the search space by a square root of the original security. Co-authored-by: l0rinc <[email protected]>
8abdfec to
b125d65
Compare
Introduce a trivial example of creating and spending outpoints in arbitrary order.
The set of all inputs ever created and the set of all outputs are not equivalent sets. Naturally, some outputs will be unspent. To update the aggregate correctly, we require "hints" as to what outputs will remain in the UTXO set at a particular height in the chain. These hints allow a client to verify `inputs = outputs - UTXOs` for a terminal block height. The hints are encoded using a bitset, where a `0` represents an output that will be spent, and a `1` represents an output that will be unspent. A header section is added so multiple blocks may be encoded concurrently. The header section denotes the block height and file position.
Assert that writing elements to file in an arbitrary order and reading them back in an arbitrary order does not fail.
Generate hints for the location of UTXOs in each block. This RPC allows for a rollback of the chainstate to an arbitary height. Currently, the hintfile is made in sequential order, but this may be done in parallel as a future improvement.
This combines the aggregate and hints into a single class for consumers to interact with. The SwiftSync protocol is only possible if: the entire block history is being downloaded from genesis, the protocol is not already completed, and there are hints avaiable to use. If all of these conditions are met within the internal context, SwiftSync is possible.
Allow a user to pass a file path to UTXO hints, failing for any errors encountered. Namely, undo-data cannot be written during this implementation of SwiftSync. A partial compension for this is to only allow hints when using `-prune`, as undo-data will be deleted anyway. A SwiftSync context is applied to the active chainstate. If a hints file is present, it is passed to the active chainstate for use later.
During the SwiftSync protocol, inputs are not fetch-able from the coins view. This means the following cannot be checked until the terminal state may be verified as valid: 1. inputs do not attempt to spend more coins than available 2. script validity 3. sigops 4. subsidy + fees = coinbase At the terminal SwiftSync height, we verify the history of transaction inputs and outputs that have been fetched indeed correspond to the resulting UTXO set. In the case of success, IBD continues as usual. In the case of a failure, the UTXO set present does _not_ correspond to the block history received. In this situation, the program should crash and indicate to the user that they may recover the correct UTXO set with a `-reindex`. Of note, writing undo-data is not possible until the UTXO set present on disk is verified as correct.
The following test asserts that a client may sync to the network tip using a hints file and may reorganize thereafter. Furthermore, it tests a faulty hints file cannot be used to leave a client in an invalid state, and any client given a bad file may recover with a reindex.
b125d65 to
86d009d
Compare
Are you intending to work on this P2P message? Is this part of any specification? |
| } | ||
| BOOST_CHECK(writer.WriteNextUnspents(block_one, 1)); | ||
| swiftsync::BlockHintsWriter block_three{}; | ||
| for (const bool hint : unspent_block_3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
77b2518: nit: We could avoid this duplicated loop (we basically have the same thing for each unspent block) by creating an "util" function that does it.
| result = full_node.generatetxohints(GOOD_FILE) | ||
| hints_path = result["path"] | ||
| self.log.info(f"Created hints file at {hints_path}") | ||
| assert_equal(full_node.getblockcount(), NUM_SWIFTSYNC_BLOCKS + BASE_BLOCKS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
86d009d: You could also assert that the block count is the same as result["height"].
| argsman.AddArg("-alertnotify=<cmd>", "Execute command when an alert is raised (%s in cmd is replaced by message)", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS); | ||
| #endif | ||
| argsman.AddArg("-assumevalid=<hex>", strprintf("If this block is in the chain assume that it and its ancestors are valid and potentially skip their script verification (0 to verify all, default: %s, testnet3: %s, testnet4: %s, signet: %s)", defaultChainParams->GetConsensus().defaultAssumeValid.GetHex(), testnetChainParams->GetConsensus().defaultAssumeValid.GetHex(), testnet4ChainParams->GetConsensus().defaultAssumeValid.GetHex(), signetChainParams->GetConsensus().defaultAssumeValid.GetHex()), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS); | ||
| argsman.AddArg("-utxohints=<path>", "Accelerate initial block download with the assistance of a UTXO hint file.", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d019f9a: We could point here that it cannot be used without pruned mode.
|
|
||
| BOOST_FIXTURE_TEST_SUITE(swiftsync_tests, BasicTestingSetup); | ||
|
|
||
| BOOST_AUTO_TEST_CASE(swiftsync_aggregate_test) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1ff59a1: Besides a unit test for the aggregate, I think a fuzz testing would be nice, e.g.:
// Copyright (c) 2025-present The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#include <random.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/fuzz.h>
#include <test/fuzz/util.h>
#include <test/util/setup_common.h>
#include <swiftsync.h>
FUZZ_TARGET(swiftsync_aggregate_test)
{
SeedRandomStateForTest(SeedRand::ZEROS);
FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
swiftsync::Aggregate agg{};
bool good_data{true};
LIMITED_WHILE(good_data && fuzzed_data_provider.ConsumeBool(), 500) {
const std::optional<COutPoint> out_point{ConsumeDeserializable<COutPoint>(fuzzed_data_provider)};
if (!out_point) {
good_data = false;
return;
}
if (fuzzed_data_provider.ConsumeBool()) {
agg.Spend(*out_point);
} else {
agg.Create(*out_point);
}
}
(void)agg.IsBalanced();
}
Yeah I plan on creating a patch to add this, either here or separately. Notably, the UTREEXO BIPs specify serving undo data, however it is also grouped with proofs that SwiftSync would not require. I will suggest as a comment that these messages may be appropriate to split into undo-data and proofs, as there are a number of use cases for a network message for this data. The easiest approach would be to use the current compressed serialization format and send that directly over the wire. |
Suggested by @l0rinc in bitcoin#34004 Message by @l0rinc: This adds a consteval constructor to transaction_identifier (Txid/Wtxid) to allow parsing hex strings at compile-time. This replaces runtime FromHex checks in tests, ensuring that malformed hardcoded hashes cause build failures rather than runtime test failures. Test variables are explicitly marked constexpr. This is required to workaround a regression in GCC 14 (Bug 117501) where the compiler incorrectly flags consteval initialization of non-constexpr variables as "statements with no effect". GCC Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117501 Reproducer: https://godbolt.org/z/xb5TMaPs6 Co-authored-by: l0rinc <[email protected]>
Possible issue with
|
That block value is quite low and would take a long time to roll back to. I suggest trying with a number more like |
|
Given the current impact of this approach, which is somewhat limited in scope, I am going to take some time to prototype the fully-validating and parallel design. I'm hoping to gather some benchmarks and plan for necessary refactors by doing so. |
Link to the original protocol writeup found here. See the Delving Bitcoin post for initial works and discussion.
Motivation
While many coins are available in cache when spent, every cache miss requires a trip to the disk. Memory can be scarce for single board machines, so cache misses are expected to be far more frequent. A trip to disk is all-the-more severe on low end hardware, so any minimization of disk interactions can speed up initial block download for limited resource devices. One such example may be a block template provider serving a home miner.
Intuition
The vast majority of coins created are later spent. The cost of this observation is many coins will be written to disk, only to be removed later. The SwiftSync protocol alleviates the problem by recording what coins have been added and removed with a small, in-memory aggregate. This allows for the removal of intermediate coin states until SwiftSync has completed. The state of the aggregate is verified at a predetermined chain height, and IBD continues as usual thereafter.
For any chain height, we know
all outputs - all inputs = UTXO set, which is equivalentlyall outputs - UTXO set = all inputs. The goal is to verify the setall outputs - UTXO setis equal toall inputs. Addition and subtraction modulo a field offers a way to do so. For any set, if we add all elements of the set and subtract them, regardless of ordering, we will arrive at zero. In the current implementation, the set elements being compared are hashed outpoints.The SwiftSync protocol is comprised of two structures. First, the aggregate that will record the state of what outpoints have been added (created) or subtracted (spent). Next, notice that the UTXO set must be reported in order to verify
all outputs - UTXO set = all inputs. SwiftSync introduces a hints file, which may be provided to the client at startup. This hints file is nothing more than a pointer to each unspent output by recording the block height and the indexes that will remain unspent. By using this file, a client may omit outpoints that will be in the UTXO set when updating the aggregate. Note this file is not trusted. If a malicious actor provides a faulty UTXO set, the set comparison above will fail, and the client may begin a reindex.Implementation
Aggregate
The hash aggregate takes outpoints, hashes them with a salted hasher, and updates 4, 64-bit limbs with wrapping modulo arithmetic. The choice of 4, 64-bit limbs with no carries between limbs was made to: 1. minimize code complexity within the aggregate 2. conform to existing APIs, namely
uint256. After adding and subtracting some arbitrary list of outpoints, we may poll if the sets were equivalent withIsZero.Hintsfile
The hints file is a map of block height to the indexes of the outputs in that block that will be in the UTXO set. To construct this file for a particular UTXO set state, each block must be read and queried to find the outputs in that block that remain unspent. Because this is a read-only operation, this may be done on parallel threads in the future. To allow for parallelization, the hints file contains a "contents" section, which denotes the height and corresponding file position to find the hints for that block.
To save space when representing the indexes that remain unspent in a block, an easy choice is to take the difference between the last coin that was unspent and the next one in the block. AFICT this is a version of run-length-encoding. These indexes are encoded using
compactSizeto further save space, the run-lengths should normally fit into at least 16 bits, often 8 bits.Parameter Interaction
A side effect of omitting intermediate UTXO states while performing IBD is the "undo data" cannot be derived. There are already configurations that do not have full undo data history, namely pruned nodes. To avoid additional complexity in this PR, this version of accelerated IBD is only made available to pruned nodes, as the undo data would have been removed regardless of the this setting.
Other parameter interactions, not yet implemented:
assumeutxo: Startup should fail if both are configured. A hints file will have no effect, as the UTXO set is already present.assumevalid: A natural choice is to only allow SwiftSync when usingassumevalid, and to only allow a hints file that commits to theassumevalidblock. Similar toassumeutxo, a hash of the hints file may be committed to in the binary, which would be resistant to griefing attacks.Testing [Linux example]
In your
bitcoindirectory, download the hints from the test server:Verify the hash of the file:
Build the branch:
Remove existing chain state:
rm -rf ~/.bitcoin/signet/Provide the file at startup:
Other networks:
Testnet4: endpointhttp://utxohints.store/hints/testnet4, expected checksum7acbb25b651ffc5ebc59c46fd065a510c149f00e testnet4.hintsOpen Questions
Future Iteration
This is the first iteration of the SwiftSync proposal, however more optimizations may be made. Namely, because set elements may be added or subtracted from the aggregate in any order, blocks may be downloaded in parallel from multiple peers. If you are interested in benchmarking the performance of a fully-parallel version, see the Rust implementation.