-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Prepare block connection logic for headers-first #3370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I've synced both the testnet and mainnet chain with this patch applied (from the network), I've had no problems, and no apparent difference in speed. |
|
@laanwj It's not being able to sync I'm worried about. It must however also deal well with invalid blocks and orphans. I'll do a local pulltester run next weekend to see what goes wrong. |
|
Rebased, and fixed two bugs that PullTester found (thanks, @TheBlueMatt!):
|
|
Can I have some reviews? @gavinandresen @laanwj @gmaxwell @jgarzik |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're dereferencing a pointer and putting it into a reference, making it impossible to check for NULL, which can be returned from State(). Could this ever be an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main-specific node state is created when a CNode is created, and destroyed when it is deleted (called from constructor and destructor). This means that for an existing CNode (pto in this case), state will always exist. When using a stored NodeId, the chance exists that it has been deleted in the mean time.
|
Update: mapBlockSource now also stores source information for orphan blocks. That means that we can potentially assign them a DoS score if their block is found to be invalid later on. It also means we can potentially send them a "reject" message for it. |
|
Reverted to an older version... let's see if pulltester likes it again :( |
|
Are you sure it's not pull testers fault? |
|
I think it is indeterminism. I've seen report 1, 6 and 8 blocks which differ. Maybe I was just lucky the first time that it worked as expected. |
|
doesn't mean it's not pull testers not broken... It certainly doesn't take kindly to changing the download algorithm. |
|
Right the now, the rule this branch should be following is that in case of multiple valid equal-work chains, the one where the tip block was received first is chosen. |
|
I'm not sure that that's the same as master in all cases? though I doubt pull tester would identify the difference. |
|
I'm not sure that's the same as master either, though I can't immediately comeup with a scenario where it differs. The question is first: is this rule acceptable (imho, if it differs with the current behaviour, i prefer this clearer rule). If so, can we have pulltester tolerate it? :) |
|
If the rule is different on any more than non existent cases it's a different network consensus which could be problematic. Ideally pull tester should identify any difference in network consensus unless we can come up with a proof that it can never matter. |
|
Right, agree. Though it feels like we're not actually understanding well enough what the behaviour of the current code is, in non-trivial reorganization cases. My intuition says that as long as you select one of the valid highest-pow chains, and there is no way to make a node switch to an equal-work chain when it already has one, we should always converge - but perhaps we need to think harder about that. In any case, I like the rule "the earliest seen of all highest-pow valid chains". Perhaps this needs to be discussed outside of this pullreq though. |
|
Yes, no question this needs simulation and study. Until then, I'm not comfortable with any, even subtle, changes to the best chain selection. |
|
Does headers-first download need a behavior change there, or is it just by accident and can it be corrected? |
|
@laanwj I'm pretty sure it's possible to recreate the old behavior in a headers-first compatible implementation, but I'd rather not. I like having a clear rule about what the intended behavior is, and have that implemented. After having thought a bit about this, this is a case where the old and new implementation differ, IIRC:
I still like the "the earliest received of all valid maximal-PoW chains" rule, as it's easy to implement and stateless. The current rules depend on what the current best chain is. |
|
Yes, actually I believe this (and similar cases) are what pull tester is tickling. The latest run tickles a few cases:
Only b14 is invalid so pull tester expects to select b13 but this pull differs somehow (its not entirely clear from the logs what the difference is, and this one is very likely to be pull-tester responding incorrectly to getblocks/getheaders, confusing the download process to be confused) |
|
Seems there was a bug in my code - the Comparator for CBlockIndex objects compared pb->nSequenceId with itself rather than with pa->nSequenceId. I remember fixing this bug before - perhaps I lost some git commit. Let's see. |
|
Ha... |
|
Works for me |
|
Merged two commits, and added some comments. Also some earlier changes, which got lost because they were commented on by @mikehearn in commits that have since been rebased:
|
This changes the block processing logic from "try to atomically switch to a new block" to a continuous "(dis)connect a block, aiming for the assumed best chain". This means the smallest atomic operations on the chainstate become individual block connections or disconnections, instead of entire reorganizations. It may mean that we try to reorganize to one block, fail, and rereorganize again to the old block. This is slower, but doesn't require unbounded RAM. It also means that a ConnectBlock which fails may be no longer called from the ProcessBlock which knows which node sent it. To deal with that, a mapBlockSource is kept, and invalid blocks cause asynchronous "reject" messages and banning (if necessary).
|
Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/75f51f2a63e0ebe34ab290c2b7141dd240b98c3b for binaries and test log. |
|
ACK. |
Prepare block connection logic for headers-first
This was always quite spammy and so far never useful in debugging.
0bd882b refactor: remove RecursiveMutex cs_nBlockSequenceId (Sebastian Falbesoner) Pull request description: The RecursiveMutex `cs_nBlockSequenceId` is only used at one place in `CChainState::ReceivedBlockTransactions()` to atomically read-and-increment the nBlockSequenceId member: https://github.com/bitcoin/bitcoin/blob/83daf47898f8a79cb20d20316c64becd564cf54c/src/validation.cpp#L2973-L2976 ~~For this simple use-case, we can make the member `std::atomic` instead to achieve the same result (see https://en.cppreference.com/w/cpp/atomic/atomic/operator_arith).~~ ~~This is related to #19303. As suggested in the issue, I first planned to change the `RecursiveMutex` to `Mutex` (still possible if the change doesn't get Concept ACKs), but using a Mutex for this simple operation seems to be overkill. Note that at the time when this mutex was introduced (PR #3370, commit 75f51f2) `std::atomic` were not used in the codebase yet -- according to `git log -S std::atomic` they have first appeared in 2016 (commit 7e908c7), probably also because the compilers didn't support them properly earlier.~~ At this point, the cs_main lock is set, hence we can use a plain int for the member and mark it as guarded by cs_main. ACKs for top commit: Zero-1729: ACK 0bd882b promag: Code review ACK 0bd882b. hebasto: ACK 0bd882b Tree-SHA512: 435271ac8f877074099ddb31436665b500e555f7cab899e5c8414af299b154d1249996be500e8fdeff64e4639bcaf7386e12510b738ec6f20e415e7e35afaea9
0bd882b refactor: remove RecursiveMutex cs_nBlockSequenceId (Sebastian Falbesoner) Pull request description: The RecursiveMutex `cs_nBlockSequenceId` is only used at one place in `CChainState::ReceivedBlockTransactions()` to atomically read-and-increment the nBlockSequenceId member: https://github.com/bitcoin/bitcoin/blob/83daf47898f8a79cb20d20316c64becd564cf54c/src/validation.cpp#L2973-L2976 ~~For this simple use-case, we can make the member `std::atomic` instead to achieve the same result (see https://en.cppreference.com/w/cpp/atomic/atomic/operator_arith).~~ ~~This is related to bitcoin#19303. As suggested in the issue, I first planned to change the `RecursiveMutex` to `Mutex` (still possible if the change doesn't get Concept ACKs), but using a Mutex for this simple operation seems to be overkill. Note that at the time when this mutex was introduced (PR bitcoin#3370, commit 75f51f2) `std::atomic` were not used in the codebase yet -- according to `git log -S std::atomic` they have first appeared in 2016 (commit 7e908c7), probably also because the compilers didn't support them properly earlier.~~ At this point, the cs_main lock is set, hence we can use a plain int for the member and mark it as guarded by cs_main. ACKs for top commit: Zero-1729: ACK 0bd882b promag: Code review ACK 0bd882b. hebasto: ACK 0bd882b Tree-SHA512: 435271ac8f877074099ddb31436665b500e555f7cab899e5c8414af299b154d1249996be500e8fdeff64e4639bcaf7386e12510b738ec6f20e415e7e35afaea9
This is a preparation for headers-first, which is on itself useful, but doesn't change the downloading or verification semantics yet.
It does change how the block connection logic works: it always makes sure the oldest valid chain with most work is connected. Instead of doing a "connect the received block now" atomically during process of that block, there is just a generic loop that disconnects and connects block to aim for the assumed best chain - reverting if necessary.
Let's see if pulltester likes this...