-
Notifications
You must be signed in to change notification settings - Fork 38.6k
Set notifications m_tip_block in LoadChainTip() #31346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31346. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
This is incomplete. You'll have to update the docs as well: "If the tip was not connected on |
b6d5be1 to
57e15ce
Compare
|
I updated the @maflcko which if-guard do you mean? |
I think probably this one: // Wait for genesis block to be processed
if (WITH_LOCK(chainman.GetMutex(), return chainman.ActiveTip() == nullptr)) {Removing the |
|
Also, the ZERO workaround could be removed completely, ensuring that any interfaces will only be spun up after that point in init? |
57e15ce to
5c45425
Compare
|
I dropped the if guard.
It could, but I'm not sure if it's worth putting such a restriction on the interfaces. Similarly we have If on the other hand an external application is started at the same time, it would just have to keep trying to connect until we listen. Such a race would exist even if we opened the interface a bit earlier in the init process, though it would be shorter. |
Tend to agree for now. Having waitTipChanged wait for a tip to be connected seems like a natural thing to implement and might avoid the need for clients to poll the node on startup. But I think this check could be dropped in BlockTemplate::waitNext in #31283 (maybe replaced with an assert there) since it should not be possible to get a blocktemplate reference before the tip is connected. After #31283 it might make sense to drop the |
|
Several reindex related tests are broken, investigating...
I think it's generically useful though. There are various types of applications that need to know when a new block is added to the chain (and have no need for a template). |
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
|
Specifically It stops after the second block: Putting the if-guard back "fixes" it. When I put a log statement before Turns out the only critical thin is that the |
5c45425 to
3317123
Compare
src/init.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that this will only ever actually wait in three cases:
1.) first startup with an empty datadir
2.) reindex
3.) reindex-chainstate
where the genesis block is being connected by the initload thread, whereas otherwise the m_tip_block_cv wait condition should be satisfied because m_tip_block should have been set earlier by the same thread. So removing the if guard is not necessary but just a matter of taste.
After this PR, it will be harder to understand how the folllowing code block refers to the genesis block, so maybe the comment could be expanded to mention some of the above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I drafted a comment, but I think it's wrong so I haven't pushed it yet:
/*
* Wait for genesis block to be processed. Typically kernel_notifications.m_tip_block
* has already been set, except in three cases where the genesis block is
* being connected by the initload thread:
*
* 1. first startup with an empty datadir
* 2. reindex
* 3. reindex-chainstate
*/I don't think the initload thread ever sets kernel_notifications.m_tip_block. The thread only deals with indexes derived from BaseIndex, which doesn't include CBlockIndex.
I'm still confused under what circumstance it would actually be unset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's clearly doing something though, because the following breaks several functional tests:
--- a/src/init.cpp
+++ b/src/init.cpp
@@ -1792,13 +1792,14 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
});
// Wait for genesis block to be processed
{
WAIT_LOCK(kernel_notifications.m_tip_block_mutex, lock);
- kernel_notifications.m_tip_block_cv.wait(lock, [&]() EXCLUSIVE_LOCKS_REQUIRED(kernel_notifications.m_tip_block_mutex) {
- return !kernel_notifications.m_tip_block.IsNull() || ShutdownRequested(node);
- });
+ Assert(!kernel_notifications.m_tip_block.IsNull());
+ // kernel_notifications.m_tip_block_cv.wait(lock, [&]() EXCLUSIVE_LOCKS_REQUIRED(kernel_notifications.m_tip_block_mutex) {
+ // return !kernel_notifications.m_tip_block.IsNull() || ShutdownRequested(node);
+ // });
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With some manual testing I indeed get all three conditions from your comment to hit that assert. I'm just confused why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside the context of a snapshot, we only call LoadChainTip() if:
bitcoin/src/node/chainstate.cpp
Lines 147 to 150 in 2638fdb
| if (!is_coinsview_empty(chainstate)) { | |
| // LoadChainTip initializes the chain based on CoinsTip()'s best block | |
| if (!chainstate->LoadChainTip()) { | |
| return {ChainstateLoadStatus::FAILURE, _("Error initializing block database")}; |
That explains why these three scenarios are different. With an empty datadir or when we -reindex(-chainstate) the UTXO set is empty[0].
Still looking into where it is set.
[0] more accurately, doesn't point to a tip or we asked to wipe it. The name is misleading, it doesn't matter if by some magic every UTXO ends up spent.
bitcoin/src/node/chainstate.cpp
Lines 260 to 262 in 2638fdb
| auto is_coinsview_empty = [&](Chainstate* chainstate) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) { | |
| return options.wipe_chainstate_db || chainstate->CoinsTip().GetBestBlock().IsNull(); | |
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, so is the scenario specifically the case when a node started with -reindex is stopped before the final ActivateBestChain call? If it is stopped during the final ActivateBestChain call it seems like it would be picking up work where it left off when ActivateBestChain was called again, due to LoadChainTip setting the previous tip.
we will call already
ActivateBestChainearly, when the best header is somewhere mid-chain, connect a first batch of blocks to the chainstate, then reindex the remaining block files, and then connect a second batch.
This is referring to the ActivateBestChain call in LoadExternalBlockFile call you linked to earlier? And removing this call would fix the weirdness and assumevalid issue? It does seem like it would be nice to remove this call. I'm not sure what the point of connecting the genesis block early but not connecting any other blocks is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, so is the scenario specifically the case when a node started with
-reindexis stopped before the final ActivateBestChain call?
Yes.
This is referring to the ActivateBestChain call in LoadExternalBlockFile call you linked to earlier?
Yes!
And removing this call would fix the weirdness and assumevalid issue? It does seem like it would be nice to remove this call. I'm not sure what the point of connecting the genesis block early but not connecting any other blocks is.
I don't think we'd want to remove it completely, because in case of a -reindex it allows init to complete normal startup without having to wait for the import thread to finish reindexing block files (which can take a long time for mainnet). I think that @l0rinc is looking into possible fixes, but without having tested anything my first idea was to only call ActivateBestChain there it if the chain is empty, so it only doesn't get called in the restart scenario where we already loaded Genesis during LoadChainTip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in my backlog, but I would be very happy if someone who already understands this fixes it instead :)
I just noticed it during Core Dev that my perf logs are full of script validations when trying out different scenarios and cancelling a reindex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in my backlog, but I would be very happy if someone who already understands this fixes it instead :)
Ok thanks, I'll try out my suggestion from above and will open a PR if it's as straightforward as I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3317123 to
a6ee3e1
Compare
|
I added a check to ensure we're not the background chainstate when sending the notification. |
a6ee3e1 to
be78eaa
Compare
|
I added a comment based on #31346 (comment) to clarify under which circumstances I find it quite confusing that |
It is explained in the code comment above the line that does it. It was added as a fix for #2239 |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mzumsande
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just a request to change one comment, will A CK after.
src/node/blockstorage.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment change is not correct and should just be dropped: m_blockfiles_indexed is true (its default value) in case of an empty datadir, false in case of a reindex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I'm confused about what causes case (1): #31346 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With an empty datadir, we don't have the genesis connected at this point. But we don't enter the reindex if clause here (the comment only refers to that), we just skip this and the loadblock block and go to the ActivateBestChain call below which connects Genesis and allows the init thread to continue.
Ensure KernelNotifications
m_tip_blockis set even if no new block arrives.Suggested in #31297 (comment)