Skip to content

Conversation

@marcofleon
Copy link
Contributor

@marcofleon marcofleon commented Jan 13, 2025

The headers presync logic (only downloading headers that lead to a chain with sufficient work, implemented in #25717) should be enough to prevent memory DoS using low-work headers. Therefore, we no longer have any use for checkpoints.

All checkpoints and checkpoint logic are removed in a single commit, to make it easy to revert if necessary.

Some previous discussion can be found in #25725. The conclusion at the time was that more testing of the presync logic was needed. Now that we have unit, functional, and fuzz tests for this logic, it seems safe to move forward with checkpoint removal.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 13, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31649.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Sjors, dergoegge, instagibbs
Concept ACK darosior, BrandonOdiwuor, ariard, glozow
Stale ACK sipa, maflcko, l0rinc, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #31991 (RFC: Add operator""_uint256 compile-time user-defined literal by l0rinc)
  • #31981 (Add checkBlock() to Mining interface by Sjors)
  • #31859 (test: Rename send_message to send_without_ping by maflcko)
  • #31260 (scripted-diff: Type-safe settings retrieval by ryanofsky)
  • #31250 (wallet: Disable creating and loading legacy wallets by achow101)
  • #30595 (kernel: Introduce initial C header API by TheCharlatan)
  • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
  • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
  • #28710 (Remove the legacy wallet and BDB dependency by achow101)
  • #17783 (common: Disallow calling IsArgSet() on ALLOW_LIST options by ryanofsky)
  • #17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
  • #17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
  • #17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sipa
Copy link
Member

sipa commented Jan 13, 2025

If we go through with this, perhaps it's worth keeping the -checkpoints option around, just to trigger a startup warning if provided (as opposed to an error if someone still has it in the config).

@dergoegge
Copy link
Member

Concept ACK

@Sjors
Copy link
Member

Sjors commented Jan 13, 2025

You can delete blockheader_testnet3.hex as it's no longer used and testnet3 may get deprecated anyway. #31583 introduces something similar based on mainnet.

@marcofleon marcofleon force-pushed the 2024/11/checkpoint-removal branch from 5417243 to a694e1d Compare January 13, 2025 18:01
@marcofleon
Copy link
Contributor Author

marcofleon commented Jan 13, 2025

Updated the -checkpoints option and added a warning (similar to -upnp). Also removed blockheader_testnet3.hex. Thanks!

@darosior
Copy link
Member

Concept ACK.

Comment on lines 4140 to 4159
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems worth keeping, at least for a few versions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed seems better to keep this. Perhaps mention the last mainnet checkpoint height, since MinimumChainWork() is far above that.

Copy link
Contributor Author

@marcofleon marcofleon Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that if there were code to keep for now it would be this. I ultimately decided to remove it because the purpose it serves no longer seems to apply if we remove checkpoints. With the headers sync logic, iiuc, it would be fine for our node in IBD to respond to GETHEADERS because the only headers we could have would belong to a chain with enough work. There isn't a risk anymore of sending a bogus chain to a peer and getting disconnected.

This behavior was first introduced in #6172 and then changed to MinimumChainWork() in #24178. Some more discussion can be found in #21106 (comment) and #21106 (comment).

Let me know if there's something I'm missing here, wrt to the logic of why it can be safely removed. If we do keep it, then I should at least update the comment (to explain why the check is still there?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way I can see this still being a problem is:

  • Someone started a node with 23.x or earlier (before p2p: Implement anti-DoS headers sync #25717)
  • Started being fed a low-work headers chain.
  • Upgraded to 30.x(?) or later (with this PR)
  • Never made any headers sync progress since.

That seems pretty unlikely, so I think it's fine to remove this if/when we get rid of checkpoints.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made sense to me without being familiar with how the anti-DoS sync works in details, asked on IRC, Suhas points out it's still possible for a node to have checkpoint-violating headers to serve:

11:00 darosior: that's not quite true, because an adversary can feed you the honest chain during pre-sync, and then a bogus one during redownload
11:00 darosior: so you'd detect it quickly and disconnect but be left with headers that are possibly checkpoint-violating (if your node isn't enforcing checkpoints)

Copy link
Member

@Sjors Sjors Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we respond to GETHEADERS as soon as we're listening. It seems like a good idea in general to not answer until we have MinimumChainWork worth of headers. (using an empty headers message to ensure they don't hang up on us)

Even without a malicious peer involved, we might be only part way in obtaining headers from an honest peer. If another peer then wants headers from us, IIUC we'd be wasting their time because we can't send them enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some more discussion on IRC and IRL.

For this to have an effect, it means an attacker needs to get us to accept blocks on an attack chain that has more work than minchainwork (at the time it is accepted), which should be very expensive. So I don't think it's a necessity to keep the check for the purpose of preventing being disconnected by checkpoint-enabled nodes.

On the other hand, there seems to be little reason not to do the check. It's cheap, and it doesn't seem unreasonable to avoid offering peers to sync from us before we're at a point where we can offer enough that we'd accept it ourselves, like @Sjors points out.

I'd vote to keep it, but update the comments.

Copy link
Contributor

@mzumsande mzumsande Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason to keep this check might be the following:
If we are in IBD and our chain is below minchainwork, we don't send headers to inbound peers, so they don't get to know our chain and won't attempt to download blocks from us (e.g. if they are even further behind in IBD).
I think that while in IBD it makes sense to use the bandwidth for downloading the chain instead of serving others. Though admittedly this isn't a very realistic scenario because we don't self-advertise during IBD and are unlikely to get inbound peers anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I kept the check and adjusted the comment in a separate commit. Let me know if it makes sense or can be improved in any way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving the check in makes sense to me and can be removed later with little cost if desired.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK
The remaining "checkpoint" references in the code seem unrelated.

Copy link
Contributor

@l0rinc l0rinc Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To guard against BLOCK_HEADER_LOW_WORK changing its numeric value now (to 8 instead of previous 9), we could hard-code the previous values for the remaining ones

enum class BlockValidationResult {
    BLOCK_RESULT_UNSET    = 0, //!< initial value. Block has not yet been rejected
    BLOCK_CONSENSUS       = 1, //!< invalid by consensus rules (excluding any below reasons)
    BLOCK_CACHED_INVALID  = 2, //!< this block was cached as being invalid and we didn't store the reason why
    BLOCK_INVALID_HEADER  = 3, //!< invalid proof of work or time too old
    BLOCK_MUTATED         = 4, //!< the block's data didn't match the data committed to by the PoW
    BLOCK_MISSING_PREV    = 5, //!< We don't have the previous block the checked one is built on
    BLOCK_INVALID_PREV    = 6, //!< A block this one builds on is invalid
    BLOCK_TIME_FUTURE     = 7, //!< block timestamp was > 2 hours in the future (or our clock is bad)
 // BLOCK_CHECKPOINT      = 8, //!< Deprecated and removed
    BLOCK_HEADER_LOW_WORK = 9  //!< the block header may be on a too-little-work chain
};

Copy link
Member

@Sjors Sjors Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if these are stored to disk as part of the block index, or exposed via RPC. Otherwise changing their numeric values shouldn't matter.

Copy link
Member

@Sjors Sjors Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging from the change to ipc_test.cpp below it seems that eventually (#10102) these values will be sent over IPC. At that point we have to fix their integer values. Might as well do it now? cc @ryanofsky

Should be a separate commit, since we would not want to revert that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these values will be sent over IPC. At that point we have to fix their integer values. Might as well do it now?

Why would they need to be fixed? IPC is only an internal interface. It is currently experimental and in draft and there are no plans for the interface to support mismatching server/client binaries. I'd say the code should be left as-is until there are plans (with support) to even support that. Until then, changing the code for that reason seems like pointless churn to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no plans for the interface to support mismatching server/client binaries

That tends to happen without a plan, but indeed we can wait.

part of the block index, or exposed via RPC

^ this still need to be clarified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part of the block index, or exposed via RPC

I don't believe either of these to be the case.

src/init.cpp Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement. I've been running my node with -nocheckpoints=1 for years, so now it will just throw a warning rather than abort launch. (which I tested)

@Sjors
Copy link
Member

Sjors commented Jan 14, 2025

Concept ACK

If we go ahead with this, right after the v29 branch-off would seem like a good time.

I plan to expand the alternate mainnet chain in #31583 to 11111+ blocks and have ~p2p_dos_header_tree.py` use that instead. That would make it easier to revert the checkpoints even after testnet3 is completely dropped.

(update: actually that test is hard to replace without also replaying the real mainnet)

@marcofleon marcofleon force-pushed the 2024/11/checkpoint-removal branch from a694e1d to 9e91e53 Compare January 14, 2025 15:10
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #25725 the approach was to remove just the actual checkpoints, but leave the supporting code to be removed in the next / a later release (#25725 (comment)). Did you consider this, instead of doing it all in one commit?

Copy link
Member

@dergoegge dergoegge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9e91e536a45e3fe802eb8b38ccc893ef894e72e7

The "new" headers pre-sync (shipped in v24.0) protects us against the same denial-of-service issues that the checkpoints were solving in the past. Therefore it is reasonable to remove them, especially considering the testing the pre-sync has undergone and the amount of time that has passed.

Even under the assumption that there is a bug in the headers pre-sync logic (i.e. it's somehow still possible to submit low work headers), the checkpoints only raise the cost of attack slightly to an ever shrinking fraction of the cost of mining a block at the tip (see https://bitcoincore.org/en/2024/09/18/disclose-headers-oom/).

If we are not comfortable with removing the checkpoints, then more of them should be added, as the protections they provide will converge towards zero otherwise.

Philosophically it is nice to get rid of the checkpoints, as they are one of the only things in this code base that dictate what the canonical chain is, which isn't a position we want to be in (this concern was also part of the design considerations for features such as assumevalid and assumeutxo).

I think we could reasonably ship this removal in v29.0, as I don't see what additional assurances waiting even longer would give us.

@DrahtBot DrahtBot requested review from Sjors, darosior and l0rinc January 14, 2025 15:50
@Sjors
Copy link
Member

Sjors commented Jan 14, 2025

I don't see what additional assurances waiting even longer would give us.

Merged pull requests tend to get additional attention, e.g. in the Optech Newsletter. Perhaps this will jolt someones memory, and there wouldn't be a rush to revert close to a release.

Perhaps someone is sitting on an exploit and wants to collect a bug bounty for it. For it to be a bug, it needs to be merged. That said, there's no such bounty and we probably don't want to incentivise that type of behaviour.