-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Validation: Move CheckBlock() mutation guard to AcceptBlock() #17601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validation: Move CheckBlock() mutation guard to AcceptBlock() #17601
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
ariard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
Maybe hold for follow-up but did you look to remove CheckBlockHeader from CheckBlock given it's called now in AcceptBlockHeader which is call in AcceptBlock ? I think that's okay if we add a CheckBlockHeader in TestBlockValidity.
On the DoS-side we now have AcceptBlockHeader called before CheckBlock, I think that's better given block header checks are cheaper than block ones.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you keep CheckBlock as it is can we keep this to inform future refactors ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should either:
- be a comment in
CheckBlocksaying thatcs_mainshould usually be held when callingCheckBlock - add an
AssertLockHeld(cs_main)toCheckBlock(and also add that toFillBlockand update the callers in bench to hold the lock) - remove
fCheckedso thatCheckBlockdoesn't need to holdcs_main
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said here than not caching failure is okay because CheckBlock() isn't very expensive but at same time we use fChecked to return early to avoid reprocessing. It seems a bit an inconsistent position. If it's cheap enough we shouldn't bother with fChecked and lock tacking shouldn't cover CheckBlock? Or we could split CheckBlock between CheckBlockIntegrity and CheckBlockValidity and have a fDefinitelyInvalid to skip both if see block again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fChecked is a slightly different caching mechanism. It's stored on the CBlock object and prevents having to do the merkle root and POW checking for the same object more than once. The CBlock object is new each time we redownload a block, so this caching doesn't prevent us from re-checking an invalid block downloaded more than once. On the other hand, the BlockManager.m_block_index is what prevents us from checking an invalid block downloaded a second time.
We may actually be able to remove fChecked after this PR. Before this PR, we call CheckBlock three times on the same block:
- from
ProcessNewBlock() - from
AcceptBlock() - from
ConnectBlock()
(1) is removed by this PR and (3) is not on the critical path for compact block relay (since we relay the compact block as soon as we've done the merkle tree/pow checks the first time, and before we save to disk or connect the block).
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: keep struct construction outside of cs_main, that's still a concurrency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's still a concurrency
What you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @ariard means that there's no need to construct the BlockValidationState object within the cs_main scope. That's true, but constructing this object is very cheap, so I don't think it's a problem (and placing the variable declaration next to where it's used make this clearer).
I'm actually going to remove this commit from the PR, since I don't think it's necessary (and may make it more likely to introduce a bug, since the callers to ProcessNewBlock() in net_processing don't check the return code of the function before using the fNewBlock out param.
promag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's still a concurrency
What you mean?
We do not mark any blocks that fail CheckBlock() as BLOCK_FAILED_VALID since they could have been mutated and marking a valid-but-mutated block as invalid would prevent us from ever syncing to that chain. See https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html for full details. The current guard against marking CheckBlock() failed blocks as invalid is by calling CheckBlock() prior to AcceptBlock() in ProcessNewBlock(). That is brittle since AcceptBlock() has an implicit assumption that any block submitted has been checked for mutation. A future change to ProcessNewBlock() could overlook that implicit assumption and introduce a consensus failure. In this commit we move the mutation guard logic into AcceptBlock() and add comments to explain why we never mark CheckBlock() failed blocks as invalid.
c433cc4 to
eb3b20e
Compare
|
Rebased on latest master and removed the final commit.
As you note, if we did that, we'd need to add |
|
Closing this for now. I think it's the right change, but it's not high priority and there doesn't seem to be much interest. |
jonatack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
GitHub doesn't seem to parse its own link #17601 (
|
I empathise with this sentiment and have closed my own pull requests for the same reason, combined with the growing stack of open PRs needing attention and thinking I shouldn't have too many PRs open. That said, unless a squeaky wheel calls for grease, review attention seems to be flood-or-drought. Dormant, then suddenly, unexpectedly present. So I guess what matters most is if the PR author is still interested in their own PR. |
|
This is consensus-critical and the changing the ordering of checks here could potentially introduce very subtle consensus failures, and so this PR requires very careful review. I have 9 other PRs open and more branches that I haven't PRed and I'd prefer to focus review attention on those. |
We do not mark any blocks that fail
CheckBlock()asBLOCK_FAILED_VALIDsince they could have been mutated and marking a valid-but-mutated block
as invalid would prevent us from ever syncing to that chain. See
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-February/016697.html
for full details.
The current guard against marking
CheckBlock()failed blocks as invalidis by calling
CheckBlock()prior toAcceptBlock()inProcessNewBlock().That is brittle since
AcceptBlock()has an implicit assumption that anyblock submitted has been checked for mutation. A future change to
ProcessNewBlock()could overlook that implicit assumption and introducea consensus failure.
Move the mutation guard logic into
AcceptBlock()andadd comments to explain why we never mark
CheckBlock()failed blocks asinvalid.