Stricter & More Performant Invariant Checking in CheckBlock #14837
Conversation
- Fix hash computations to get unique bits
- Use less error-prone code for bitset
- revert tx_verify.* to master
- Add probabalistic duplicate detector to CheckBlock
- Clean up the implementation of fast deduplication checking
- Small clean up of comment & code layout for Duplicate Input checking
- Make duplicate checker also check for longchain validity
- Cleanup the DuplicateInputs interface to CheckInputInvariants and remove some redundant checks
- Change test reasons to match new behavior
- Switch to PCG generator
The argument that the CVE fix was a performance regression is based on a misunderstanding of the system's current operation: block validation is only very rarely on the critical path for block propagation. This wasn't the case when the duplicate-check skipping was added, but it is the case now. I can't imagine that the PR to skip the "redundant" duplicate check would have gone through if it wasn't on the block propagation critical path then, so I can't see a change that is 10x+ more complicated being adopted now that it's off the critical path. This adds hundreds of lines of code and a homebrew cryptographic hash that AFAICT isn't particularly cryptographic but, if broken, turns into a DoS attack (and no, XORing a seed does not obviously produce pairwise independence, some approximation of which is required to achieve the claimed property that a bad input for one user would be okay for others) -- and it looks like it doesn't actually result in an observable benefit except on microbenchmarks. NAK.
The main benefit I'm emphasizing here is that it checks stricter properties. As noted, the stricter check need not introduce a 'DoS attack' -- it can revert to the existing runtime easily. In any case, our goal isn't really to validate a maliciously created block quickly; it is to validate an honestly created block as quickly as possible and a maliciously created block in tolerable time -- I figured that the O(N / log(N)) speedup to switch back to the set algorithm upon collision wasn't worth the added complexity there, but it can certainly be done. The PCG I am using is not homebrew (except in implementation). If you prefer, we could add a dependency on the standard PCG library, which contains a similar function. Thus the unstudied portion is mostly limited to the inputs to the function. My understanding of PCG is that the xor'd seed should produce pairwise independence, although I grant you that it may be better to use two different seeds k1_1 and k1_2. Perhaps there are other efficiently computable PRFs with pairwise independence that would be suited for this purpose -- I previously used SIPHASH for this.
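For context, here is a minimal sketch of the fallback idea being discussed. The names (`Outpoint`, `PcgHash`, `NoDuplicateInputs`) and the simplified outpoint type are hypothetical, and this is not the PR's actual code; the output function only mirrors the PCG-XSH-RR shape quoted in the diff below. The point is that a clear bit in the table proves an outpoint has not been seen, while a set bit only means "maybe seen", so any possible collision degrades to the existing set-based check rather than to a rejection, and a crafted collision costs at most the old runtime:

```cpp
#include <bitset>
#include <cstdint>
#include <memory>
#include <set>
#include <tuple>
#include <vector>

// Simplified stand-in for COutPoint: one 64-bit word of the txid plus the output index.
struct Outpoint {
    uint64_t txid_word;
    uint32_t index;
    bool operator<(const Outpoint& o) const {
        return std::tie(txid_word, index) < std::tie(o.txid_word, o.index);
    }
};

// PCG-XSH-RR style output function; the per-node seed is folded into the increment,
// so a block that collides for one peer need not collide for others.
static uint32_t PcgHash(uint64_t start, uint64_t inc)
{
    uint64_t nxt = (inc | 1) + (start * 6364136223846793005ULL); // wraps intentionally
    uint32_t x = static_cast<uint32_t>((nxt ^ (nxt >> 18)) >> 27);
    uint32_t r = static_cast<uint32_t>(nxt >> 59);
    return (x >> r) | (x << ((32 - r) & 31));
}

// Returns true iff no outpoint appears twice. Fast path: a 2^21-bit table of hash bits.
// On any set bit (a possible collision) we rerun the exact O(n log n) set-based check.
bool NoDuplicateInputs(const std::vector<Outpoint>& inputs, uint64_t seed)
{
    auto table = std::make_unique<std::bitset<1 << 21>>();
    for (const Outpoint& in : inputs) {
        uint32_t bit = PcgHash(in.txid_word, in.index ^ seed) & ((1u << 21) - 1);
        if (table->test(bit)) {
            std::set<Outpoint> exact;
            for (const Outpoint& dup : inputs) {
                if (!exact.insert(dup).second) return false; // true duplicate
            }
            return true; // the bit collision was a false positive
        }
        table->set(bit);
    }
    return true; // every bit was fresh, so every outpoint was unique
}
```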
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
What's the list of properties that this PR checks for? "Invalid longchain order" seems to mean that "outputs created by this transaction have not been spent by an earlier transaction." Are there other checks, too?
Found the list here: The actual implementation of this change is short, clean, and not hard to understand. This change doesn't add "hundreds of lines of code", though it does add a lot of comments and analysis. If it's true that "Block validation is only very rarely on the critical path for block propagation", then making this change by itself to try to help with performance and complexity is probably not worth the risks. But I am curious about:
This is an interesting idea, even though it seems like it would require a lock-free CCoinsViewCache to improve performance. It does seem conceptually like adding an "Invalid longchain order" invariant could make future optimizations possible, so maybe this is worth thinking about more.
};

std::unique_ptr<std::bitset<1 << 21>> pTable = MakeUnique<std::bitset<1 << 21>>();
auto& table = *pTable.get();
This .get() is redundant, right?
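For reference, `std::unique_ptr::operator*` dereferences the managed pointer directly, so the two spellings below are equivalent (illustrative snippet, not the PR's code):

```cpp
auto& table_a = *pTable.get(); // works, but .get() adds nothing
auto& table_b = *pTable;       // equivalent and simpler
```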
const auto pcg = [](uint64_t start, uint64_t inc) {
uint64_t nxt = (inc | 1) + (start * 6364136223846793005ULL);
Make it explicit that the unsigned integer wraparound that will take place here at run-time is intentional? Or alternatively rewrite so that no integer wraparound takes place at run-time? Verify with -fsanitize=integer.
How do I make the wraparound explicit?
Why would that be preferable to the sanitizer suppressions in ./test/?
Yeah I'm not sure about it. Unsigned integer overflow isn't a bug, it's a feature... I would also like a suppression that operates at the statement level (e.g., like rust's wrapping_mul) rather than marking the entire function with an attribute. What if there is an unsigned wraparound issue elsewhere in the code?
@MarcoFalke In contrast to the sanitizer suppressions it documents if the wraparound is intentional or not.
What do you think about using say INTENTIONAL_WRAPAROUND for the intentional wraparounds and the file based sanitizer suppressions to cover the unintentional wraparounds?
@JeremyRubin I assume you're talking about intentional wraparounds as features? The problem isn't intentional wraparounds – they are totally fine by definition. The problem is unintentional wraparounds.
I share your wish for suppressions that operated at the statement level, but they don't exist yet AFAIK :-)
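For what it's worth, something close to statement-level marking can be approximated today with tiny identity wrappers. This is only a sketch (the names `WrappingMul`/`WrappingAdd` are hypothetical, not an existing Bitcoin Core facility), but it documents intent at the call site and gives a single place to hang a Clang `no_sanitize` attribute:

```cpp
#include <cstdint>

// Unsigned arithmetic wraps modulo 2^64 by definition, so these wrappers do not
// change behaviour; they exist to mark the wraparound as intentional and to carry
// the sanitizer suppression for -fsanitize=integer builds.
#if defined(__clang__)
__attribute__((no_sanitize("unsigned-integer-overflow")))
#endif
inline uint64_t WrappingMul(uint64_t a, uint64_t b) { return a * b; }

#if defined(__clang__)
__attribute__((no_sanitize("unsigned-integer-overflow")))
#endif
inline uint64_t WrappingAdd(uint64_t a, uint64_t b) { return a + b; }

// The state-update line quoted below could then be written as:
//   uint64_t nxt = WrappingAdd(inc | 1, WrappingMul(start, 6364136223846793005ULL));
```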
uint64_t nxt = (inc | 1) + (start * 6364136223846793005ULL);
uint32_t x = (nxt ^ (nxt >> 18)) >> 27;
uint32_t r = nxt >> 59;
return (x >> r) | (x << (31 & (-r)));
Make it explicit that the unsigned integer wraparound that will take place here at run-time is intentional? Or alternatively rewrite so that no integer wraparound takes place at run-time? Verify with -fsanitize=integer.
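One of the things `-fsanitize=integer` will complain about in the rotate is the unary minus applied to the unsigned `r`. A possible rewrite (sketch only; it is equivalent for every `r` in [0, 31] since `(32 - r) & 31 == (-r) & 31`) avoids that operation entirely:

```cpp
// Right-rotate of a 32-bit value without negating an unsigned quantity.
uint32_t rotr32(uint32_t x, uint32_t r)
{
    return (x >> r) | (x << ((32 - r) & 31));
}
```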
return true;
}

/* CheckInputInvariants checks for three criticial invariants for the inputs in a block:
Critical :-)
};
};

std::unique_ptr<std::bitset<1 << 21>> pTable = MakeUnique<std::bitset<1 << 21>>();
#include <bitset> to make appveyor happy.
I agree with @gmaxwell. Adding complexity to the consensus code (or, as I often argue, changing it at all) should be something we do for only very good reasons, both because of the high review burden consensus code changes incur on the project, and because of the maintenance burden and cognitive load we put on future developers as well. I really don't think we should consider making consensus code changes that are designed to have no effect but prep the way for unspecified future code changes (which is why I often chime in with a Concept NACK on consensus refactoring PRs). These incur all the review and testing costs of a consensus change, but by design have no benefit to the network. This project is way too busy for this to be a good use of developer effort IMO.

In my view, if we're going to make changes to the consensus code for performance reasons, then (a) those performance numbers should be demonstrably and meaningfully better for the network, and (b) we should generally discuss all the designs that might achieve the same performance benefit, and have a very good reason for not choosing the simplest such design. In the case of this change, the performance benefits could likely be realized by far simpler changes, as has been pointed out in the review comments on other versions of this PR.

I do think that it would be a useful discussion to figure out exactly what would be the simplest code design for where the duplicate inputs check should live -- I've had some offline conversations, and in my view it's not obvious whether it should naturally be considered a context-free check we perform on transactions, or whether the check should reside instead at the utxo database layer. If we care to improve the underlying issue here, I think we would be best served by engaging in that design discussion to come up with the simplest way of reasoning about things in the future.

Finally -- I think we should remember that we're not just writing code for ourselves, but for hopefully a much larger set of future engineers. If we are going to burden people with complicated reasoning about a consensus topic, it should be because it's really important.
If there were a PR with some massive validation speedup that needed the new behaviour this provides, then it might be worth considering on the basis of that improvement. But as conjectural preparatory work I just don't see the gain. To the extent that the argument that belt-and-suspendering the consistency check would make the code safer holds -- I'm not sure about that, but I think it's at least plausible -- that could be accomplished with a simpler and clearer check.
@sdaftuar Thanks for the review and feedback.
@gmaxwell, will let you know when I have something ready. I do think that belt-and-suspendering the check has merit; I will include the simplest version possible in that version, to the extent it isn't worse than master. Do you have a falsifiable framework for testing this sort of change? I'd like to produce a testable change whose effect I can demonstrate in an environment that you're happy with. Would it be -reindex-chainstate with assumevalid set to one month prior? Is there something simpler that would work and be convincing & permit more rapid iteration?
A reindex (-chainstate is fine) would be the standard benchmark; you could set assumevalid to whatever you think will highlight the improvement the most -- both extremes of the AV setting fairly characterize different but important aspects of sync performance (the performance on older vs. more recent history). Similarly, default dbcache or maximized dbcache are both defensible benchmarking configurations (the performance on standard vs. high-performance hosts).
Needs rebase
There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.
This PR supersedes #14387 and is an alternative to #14397.
Summary
Previously, "no duplicate inputs" was checked on a per transaction basis, with this PR, they are checked block wide. This PR also checks that long chains are in valid order, which was previously implicitly checked only. The resulting code is also more performant.
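As a reference point only (not the PR's code; the simplified types and the function name `CheckBlockInputInvariants` are hypothetical, and coinbase handling is omitted), here is the straightforward set-based formulation of the two invariants described above: no outpoint may be spent twice anywhere in the block, and no input may reference an output created at its own or a later position in the block.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified transaction model for illustration.
struct TxIn { std::string prev_txid; uint32_t prev_index; };
struct Tx   { std::string txid; std::vector<TxIn> inputs; };

// Returns true iff (a) no outpoint is spent more than once in the block and
// (b) every in-block spend references a transaction at a strictly earlier
// position ("valid longchain order").
bool CheckBlockInputInvariants(const std::vector<Tx>& block)
{
    std::map<std::string, size_t> position; // txid -> index within the block
    for (size_t i = 0; i < block.size(); ++i) position[block[i].txid] = i;

    std::set<std::pair<std::string, uint32_t>> spent; // every outpoint spent so far
    for (size_t i = 0; i < block.size(); ++i) {
        for (const TxIn& in : block[i].inputs) {
            // Block-wide duplicate-input check (previously only per transaction).
            if (!spent.insert({in.prev_txid, in.prev_index}).second) return false;
            // Longchain-order check: an output created in this block may only be
            // spent by a strictly later transaction.
            auto it = position.find(in.prev_txid);
            if (it != position.end() && it->second >= i) return false;
        }
    }
    return true;
}
```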
Benefits
Before this PR:
After this PR:
Potential Risks & Redresses
In this section I'll try to exhaustively go through the various risks of this PR.