Skip to content

Conversation

@Eunovo
Copy link
Contributor

@Eunovo Eunovo commented Nov 11, 2025

During a reindex, the assumevalid setting may be ignored if the chainwork of the best header is below minimumchainwork. This happens when the previous IBD was stopped before connecting enough blocks to reach the required minimumchainwork. The assumevalid block will also be missing from the block index because the assumevalid block is set to match the minimumchainwork. See Issue #31494.

As a result, reindex can take significantly longer to complete than necessary.

This PR addresses the issue by modifying reindex behaviour: if the chainwork of the best_header is below minimumchainwork, the node skips the ActivateBestChain() step and proceeds directly to synchronisation.

This PR takes a different approach from PR #31615 to keep the validation path for reindex the same as the regular IBD validation path.

Developers benchmarking IBD with -reindex may find this behaviour undesirable. In such cases, it’s recommended to use -reindex-chainstate instead or ensure that the available blocks have sufficient chainwork.

libbitcoinkernel exposes the ImportBlocks function and expects reindex to connect blocks, so an attempt was made to preserve the existing behaviour for libbitcoinkernel.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 11, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33854.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK l0rinc

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34253 (validation: cache tip recency for lock-free IsInitialBlockDownload() by l0rinc)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • LoadExternalBlockFile will immediately call ABC when it finds -> LoadExternalBlockFile will immediately call ActivateBestChains() when it finds ... [“ABC” is unclear/incomplete; spell out the function name and finish the sentence so the comment is comprehensible]
  • there is high enough work chain -> there is a chain with high enough work (or a high-enough-work chain) [grammar: missing article/poor word order makes the condition hard to parse]

Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

  • AcceptBlock(pblockrecursive, dummy, nullptr, true, &it->second, nullptr, true) in src/validation.cpp

2026-01-09

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task test each commit: https://github.com/bitcoin/bitcoin/actions/runs/19268555940/job/55090462335
LLM reason (✨ experimental): CTest failure: Bitcoin Kernel Test Suite fails due to an invalid block (invalid PoW/time) causing no chain entry at the requested height.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried this out on signet or mainnet?
I haven't yet, but just from looking at the code I would suspect that we now

  1. do the -reindex part without connecting any blocks (except genesis here)
  2. omplete headers-sync with a peer until minchainwork
  3. Download the first block we don't have on disk from a peer and only then call ActivateBestChain()
  4. try to connect that block, which will result in the entire chain up to that block being connected by the msghand thread. This could block the thread hours, in which no other peer would be processed, so existing peers will get disconnected (ping timeout), while net thread will still continually make new connections (that never complete the version handshake because we are too busy connecting blocks).

Also, attempting to reindex while offline wouldn't work anymore if we are below minchainwork.

@hodlinator
Copy link
Contributor

Thanks for trying this out @Eunovo!

Also, attempting to reindex while offline wouldn't work anymore if we are below minchainwork.

Would it be acceptable to emit an error and halt, saying that -reindex for cases where nodes don't have sufficient block data also requires decreasing -minimumchainwork= to whatever level we have in the block data?
Could also explain that this is in order to keep logic consistent regardless of -reindex, and also say that running with lowered -minimumchainwork long-term is not advisable.

Not the cleanest perhaps. But we have 2 main usages of -reindex AFAIK:

  • Regular node-runners which run into some kind of disk corruption issue - they should typically have peers available to sync headers from.
  • Developers doing benchmarks/other work who either have sufficient block data or are okay with lowering -minimumchainwork temporarily.

@l0rinc
Copy link
Contributor

l0rinc commented Nov 12, 2025

Developers doing benchmarks/other work who either have sufficient block data or are okay with lowering -minimumchainwork temporarily

I don't know anyone who does -reindex benchmarks anymore - -reindex-chainstate is usually enough, -reindex isn't representative of IBD so we have stopped relying on it.

reindex while offline wouldn't work anymore

I haven't checked the code yet (am planning on doing that later), but would it be possible to just issue a warning if we're offline and attempt the reindex anyway (especially now that we have proper script verification logging to show if script verification was enabled)?

@Eunovo Eunovo force-pushed the reindex-assumevalid-take2 branch 3 times, most recently from 8128a5e to 86c3dc7 Compare November 12, 2025 22:35
@Eunovo
Copy link
Contributor Author

Eunovo commented Nov 12, 2025

Also, attempting to reindex while offline wouldn't work anymore if we are below minchainwork.

I don't think it is particularly important that reindex be able to complete offline. As @hodlinator pointed out:

Not the cleanest perhaps. But we have 2 main usages of -reindex AFAIK:

  • Regular node-runners which run into some kind of disk corruption issue - they should typically have peers available to sync headers from.
  • Developers doing benchmarks/other work who either have sufficient block data or are okay with lowering -minimumchainwork temporarily.

In both use cases, network access is not an issue.

Would it be acceptable to emit an error and halt, saying that -reindex for cases where nodes don't have sufficient block data also requires decreasing -minimumchainwork= to whatever level we have in the block data? Could also explain that this is in order to keep logic consistent regardless of -reindex, and also say that running with lowered -minimumchainwork long-term is not advisable.

It is possible to do this, but will this be considered a better solution than automatically doing headers sync?

I haven't checked the code yet (am planning on doing that later), but would it be possible to just issue a warning if we're offline and attempt the reindex anyway (especially now that we have proper script verification logging to show if script verification was enabled)?

IBD does not work offline anyway. I'm not sure this will be better to implement, unless there is a use-case that requires -reindex to work offline with a low-work chain.

@Eunovo
Copy link
Contributor Author

Eunovo commented Nov 21, 2025

4. try to connect that block, which will result in the entire chain up to that block being connected by the msghand thread. This could block the thread hours, in which no other peer would be processed, so existing peers will get disconnected (ping timeout), while net thread will still continually make new connections (that never complete the version handshake because we are too busy connecting blocks)

I've pushed a solution to this. In the current implementation, the background init thread waits for enough headers to be indexed. When best_header.nChainwork becomes greater than or equal to MinimumChainWork(), the outstanding blocks will be connected on the background init thread; the p2p message processing thread will reject any new blocks while the outstanding blocks are being connected.

@Eunovo Eunovo marked this pull request as ready for review November 21, 2025 20:51
@hodlinator
Copy link
Contributor

Would it be acceptable to emit an error and halt, saying that -reindex for cases where nodes don't have sufficient block data also requires decreasing -minimumchainwork= to whatever level we have in the block data? Could also explain that this is in order to keep logic consistent regardless of -reindex, and also say that running with lowered -minimumchainwork long-term is not advisable.

It is possible to do this, but will this be considered a better solution than automatically doing headers sync?

This suggestion was meant for the case where we can't do headers sync due to being offline (using zero peers after a certain timeout or something like that, I'm not sure how to best check whether we are offline). I guess the error message could also point out that going online should resolve the situation.

@Eunovo
Copy link
Contributor Author

Eunovo commented Nov 24, 2025

This suggestion was meant for the case where we can't do headers sync due to being offline (using zero peers after a certain timeout or something like that, I'm not sure how to best check whether we are offline). I guess the error message could also point out that going online should resolve the situation.

I see. A message that indicates that the node is waiting for headers sync to finish reindex will be valuable. We probably don't need to check that we are offline at all.

@Eunovo Eunovo changed the title validation: fix assumevalid is ignored during reindex fix assumevalid is ignored during reindex Nov 24, 2025
@l0rinc
Copy link
Contributor

l0rinc commented Nov 24, 2025

I have checked if it fixes the problem I have noticed a year ago and now that #33336 landed it's even simpler to do a reproducer:

for COMMIT in 509dc91db143fe2ebb4e910680aca97ba62233b9 b59ca7602bf9efaad80b3465129685ee1d09a028; do
  (echo ""; git fetch -q origin $COMMIT >/dev/null 2>&1 && git checkout -q $COMMIT && git log -1 --pretty='%h %s' || exit 1) && \
  cmake -B build >/dev/null 2>&1 && cmake --build build -j$(nproc) >/dev/null 2>&1 && \
  killall -9 bitcoind >/dev/null 2>&1; rm -rf demo >/dev/null 2>&1; mkdir -p demo >/dev/null 2>&1 && \
  build/bin/bitcoind -datadir=demo -stopatheight=10 >/dev/null 2>&1 && \
  build/bin/bitcoind -datadir=demo -stopatheight=10 -reindex >/dev/null 2>&1 && \
  build/bin/bitcoind -datadir=demo -stopatheight=10 -reindex-chainstate | grep "script verification"
done

Which correctly prints:

509dc91db1 Merge bitcoin/bitcoin#33026: test, refactor: Embedded ASMap [1/3]: Selected minor preparatory work
2025-11-24T12:54:28Z Enabling script verification at block #1 (00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048): assumevalid hash not in headers.

b59ca7602b test/reindex: only connect minchainwork chain
2025-11-24T12:59:32Z Disabling script verification at block #1 (00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048).

nit: maybe this style of checking via the logs could simplify the testing of the PR
nit2: the LLM Linter has 2 valid comments

Concept ACK

@Eunovo Eunovo force-pushed the reindex-assumevalid-take2 branch from b59ca76 to 729bf4d Compare November 25, 2025 16:04
@Eunovo
Copy link
Contributor Author

Eunovo commented Nov 26, 2025

Thanks for the reviews.

@l0rinc I added your test as an automated test in 729bf4d
@hodlinator I added a useful log message in 729bf4d

@Eunovo Eunovo requested a review from mzumsande November 26, 2025 16:06
@gmaxwell
Copy link
Contributor

I think minimumchainwork being configurable may violate the underlying security assumption that made me feel comfortable writing assumevalid in the first place.

The concern is that attackers (or developers operating under coercion) could target victims with false statements that due to some "network incident" they must start nodes with -assumevalid= (and use DOS attacks to prevent syncing an honest chain). AV security assumption there is that the bad chain needs to have two weeks of work at a contemporary difficulty on top of it. If an attacker is able to do that then there are other at-least-as-serious problems.

The idea is that the attacker must either control enough hashpower to cause serious reorgs OR convince the users to actually patch/replace the software and not just reconfigure it. (And of course, if they can convince the user of that-- it's game over as they can do anything).

This security assumption is documented in the source code:

                // The equivalent time check discourages hash power from extorting the network via DOS attack
                //  into accepting an invalid block through telling users they must manually set assumevalid.
                //  Requiring a software change or burying the invalid block, regardless of the setting, makes
                //  it hard to hide the implication of the demand. 

So I think advising anywhere to change that value is not a good idea and that it was a mistake to make this value configurable on mainnet without clamping it to the hardcoded value... though #10357 doesn't really explain the motivation outside of test harnesses enough for me to say for certain. As it is, however, it does break my security assumptions in a way that concerns me.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 8, 2026

🚧 At least one of the CI tasks failed.
Task iwyu: https://github.com/bitcoin/bitcoin/actions/runs/20816372050/job/59792983458
LLM reason (✨ experimental): IWYU failure: include-what-you-use step edited files and returned non-zero, causing the CI to fail.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

Eunovo added 8 commits January 9, 2026 11:08
This commit sets the stage for future commits that will change the behavior of ImportBlocks to conditionally ActivateBestChain after reindex
This change will allow headers sync to discover a long enough chain before connecting blocks.
`force_activation` is required to preserve `ImportBlocks` behavior in libbitcoinkernel.
The background init thread will wait on this mutex for header tip updates to determin whether to ActivateBestChain  in future commits..
This commit removes:
- LoadingBlocks() guard that prevent HEADERS sync during reindex, and
- adds a LoadingBlocks() guard that prevents requesting blocks during reindex

Blocks must not be requested while LoadingBlocks() is `true` because they will be
rejected when received and this might lead to the peer being disconnected because
the requested block will never be removed from vBlocksInFlight.

Note that we cannot connect Blocks while LoadingBlocks() is `true` because this can
block the p2p process message thread for a long time if a long chain is being connected
in the background.
Future commits will change the LoadingBlocks stage to only end after best_header has at least minchainwork.
Mining a block at the tip will no longer be enough to trigger the "unrequested block" scenario.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants