Conversation
@taratorio taratorio commented Jul 11, 2025

closes #15877
closes #15879
relates to https://github.com/erigontech/erigon-qa/issues/74#issuecomment-2956232796

This change is introduced with a feature flag --el.block.downloader.v2 (off by default) until sufficient testing has been done. The old flow will be used until we enable it by default. This should minimise any breaking changes.

Problems addressed

  1. We weren't able to join any of the perfnet shadow forks
    • the problem is that when we send out "get header" requests over devp2p via sentry we always send them to the same 5 peers (due to a max limit of 5 peers when sending the message out), which causes issues for shadow forks
    • in a shadow fork environment we have very few peers that are on the shadow fork (e.g. 5 bootnodes) and the majority of the other peers we are connected to are mainnet peers (the shadow fork uses the same networkid=1 as mainnet)
    • what ended up happening is that we would always hit the same 5 peers that were on mainnet instead of the shadow fork when trying to backward sync using the EL block downloader - leading to us being stuck indefinitely
    • the same issue happened for the bodies download - there it was even worse because we only send body requests to at most 1 peer (and we would always hit the same peer)
    • my workaround/short-term fix to unblock the perfnet work on the performance branch was to always send all messages to all peers (which is bad for the network, so we can't merge that temp fix into main, hence we need a better solution which this PR is aspiring to be)
  2. extreme chaindata growth for large backward chain downloads due to the EL downloader not taking into account --sync.loop.block.limit batched execution (in batches of 5,000 blocks) for E3
  3. extreme chaindata growth for large backward chain downloads due to a long-lived chaindata RoTx
    • the long-lived RoTx is used in membatchwithdb.NewMemoryBatchWithCustomDB(tx, tmpDb, tmpTx) after the headers have been downloaded into an ETL
    • even after point 2. above was fixed and we called execution every 5,000 blocks (in order for the pruning to kick in) the chaindata was still growing uncontrollably - this is because there is a long-lived RoTx holding the data (even though it has been pruned by the execution loop)
    • this RoTx lives for the entirety of the backward chain download
  4. issues with the EL block downloader getting stuck for gnosis as per: https://github.com/erigontech/erigon-qa/issues/74#issuecomment-2956232796

Solutions
To address problem 1:

  • We devise a new backward fetch algorithm which leverages the devp2p API I wrote for Astrid (a convenient wrapper on top of Sentry). It is a much better choice than the old HeaderDownload and BodyDownload approach because it gives the user more control over which peers to fetch from (e.g. it allows you to list peers and make smarter selections about which peers to download from), which makes a lot of the code simpler. It also allows us to do cool things like parallel body download from many peers, which we weren't able to do in an easy way previously. Another convenience of the API is that it is blocking every time you send a request to a peer, so you don't have to worry about the async nature of devp2p. Doing this will also allow us to remove HeaderDownload, BodyDownload and the headers and bodies stages so that we can finally unify our devp2p code paths between Polygon and Ethereum
  • The algo looks like this (a rough Go sketch follows the list):
1. List all peers
2. Try to download the initial header (the hash we need to start the backward download from) from all peers so that we know which peers have this data and are available for future requests.
3. Send devp2p backward header requests in batches of 500 headers (configurable per request) to 1 peer at a time
  - Send every new batch to the next available peer (which has the initial header) in a circular fashion to distribute the requests
  - For every batch of headers that we receive, we traverse the headers backwards (in descending number) and check if the parent hash is available in our DB - if it is, we have found a connection point and we stop; otherwise we continue fetching headers backwards
  - We accumulate the headers in an ETL collector
  - if any of the available peers fail to respond to us we mark them as "exhausted" so we don't try to use them again
4. Once we have a connection point, we start doing a forward body download in batches of 500 (configurable per request) by traversing through the sorted ETL (without needing any temporary MDBX)
   - here we do some parallelisation (because we already know all the hashes) - depending on how many non-exhausted peers we have we split the batch of 500 headers into N smaller batches - where N is the number of non-exhausted peers
   - we fetch the bodies of these N batches in parallel (we have a configurable max workers property - currently set to 10)
   - if any of the peers fail to serve us the bodies we again mark them as exhausted so we don't try to use them again and retry the batch with the next available peer in a circular fashion
   - once we've downloaded the batch of 500 blocks we feed it to a `ResultFeed` where the caller can consume it (paging-like API)
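
For illustration, here is a rough, self-contained Go sketch of steps 1-3 (peer rotation, exhausted-peer tracking, backward walk to a connection point). Everything in it (Hash, PeerId, Header, fetchHeadersBackwards, the haveLocally and collect callbacks) is a hypothetical stand-in rather than the actual bbd.BackwardBlockDownloader API:

```go
package sketch

import (
	"context"
	"errors"
)

// Hypothetical stand-ins for the types used by the real downloader.
type Hash [32]byte
type PeerId string

type Header struct {
	Number     uint64
	ParentHash Hash
}

// fetchHeadersBackwards is a placeholder for the blocking devp2p request that
// returns up to `limit` headers ending at `from`, in descending number order.
func fetchHeadersBackwards(ctx context.Context, peer PeerId, from Hash, limit int) ([]*Header, error) {
	return nil, errors.New("placeholder: real code sends a header request via the p2p service")
}

// downloadHeadersBackwards sketches steps 1-3: rotate through peers in a circular
// fashion, mark failing peers as exhausted, and accumulate headers until we find
// a parent hash that we already have locally (the connection point).
func downloadHeadersBackwards(
	ctx context.Context,
	start Hash,
	peers []PeerId, // peers that responded with the initial header in step 2
	haveLocally func(Hash) bool, // does our DB already contain this header?
	collect func(*Header) error, // e.g. write the header into an ETL collector
) error {
	exhausted := map[PeerId]bool{}
	next := 0 // index of the next peer to try
	from := start

	for {
		// pick the next non-exhausted peer in a circular fashion
		peer, ok := PeerId(""), false
		for i := 0; i < len(peers); i++ {
			p := peers[(next+i)%len(peers)]
			if !exhausted[p] {
				peer, next, ok = p, (next+i+1)%len(peers), true
				break
			}
		}
		if !ok {
			return errors.New("all peers exhausted before finding a connection point")
		}

		// one batch of up to 500 headers, walking backwards from `from`
		headers, err := fetchHeadersBackwards(ctx, peer, from, 500)
		if err != nil || len(headers) == 0 {
			exhausted[peer] = true // don't try this peer again
			continue
		}

		// headers arrive in descending number; look for a connection point
		for _, h := range headers {
			if err := collect(h); err != nil {
				return err
			}
			if haveLocally(h.ParentHash) {
				return nil // connection point found
			}
			from = h.ParentHash // the next batch continues further back from here
		}
	}
}
```

The real downloader additionally caps how far back it is allowed to go during NewPayload, as described in the solution to problem 2 below.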

To address problem 2:

  • we change the logic in the EngineBlockDownloader in the following way
  • if, during NewPayload, the backward downloaded header chain is > dbg.MaxReorgDepth (512 blocks), we abandon the download request
  • the algo in the solution to problem 1 has an option to abandon the backward header download (step 3) early if the backward chain length exceeds a configurable number (512 in this case)
  • the CL will eventually have to send a ForkChoiceUpdated request for that large backward chain if it is the canonical one - within a FCU we allow long backward downloads
  • when doing these long backward downloads with FCU - we add logic to take into account --sync.loop.block.limit - so we insert blocks into chaindata (in batches of 500 blocks as currently done), then when we reach 5,000 blocks inserted we call ValidateChain+UpdateForkChoice to trigger execution + pruning (to keep chaindata size low)
  • note this works quite well with the ResultFeed paging approach in the solution for problem 1 (see the sketch below)
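
For illustration, a minimal Go sketch of that batching logic, assuming the downloader's ResultFeed can be consumed as a channel of ~500-block pages; Block, insertBlocks and validateChainAndUpdateForkChoice are hypothetical stand-ins, not the actual EngineBlockDownloader code:

```go
package sketch

import "context"

// Illustrative stand-ins: the real code inserts blocks via the execution
// module and calls its ValidateChain / UpdateForkChoice APIs.
type Block struct{ Number uint64 }

func insertBlocks(blocks []Block) error { return nil } // persist one page to chaindata

func validateChainAndUpdateForkChoice(ctx context.Context) error { return nil } // triggers execution + pruning

// consumeAndExecute drains pages of ~500 blocks from the downloader's result feed
// and triggers execution + pruning every time blockLimit (--sync.loop.block.limit,
// i.e. 5,000) blocks have been inserted, so chaindata stays small.
func consumeAndExecute(ctx context.Context, results <-chan []Block, blockLimit int) error {
	inserted := 0
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case page, ok := <-results:
			if !ok {
				return nil // feed closed: download finished
			}
			if err := insertBlocks(page); err != nil {
				return err
			}
			inserted += len(page)
			if inserted >= blockLimit {
				if err := validateChainAndUpdateForkChoice(ctx); err != nil {
					return err
				}
				inserted = 0 // start counting towards the next execution batch
			}
		}
	}
}
```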

To address problem 3:

  • thanks to the solution to problem 1, we no longer need to use membatchwithdb.NewMemoryBatchWithCustomDB(tx, tmpDb, tmpTx) and a long-lived chaindata RoTx - we only need an ETL collector, which can be used without a tx/db (see the sketch below)
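
For illustration, a tiny sketch of the tx-free accumulation: each header is keyed by its big-endian block number so that the collector's sorted iteration later yields headers in ascending order for the forward body download. The headerCollector interface and encodeHeader below are hypothetical stand-ins for the real erigon-lib ETL collector:

```go
package sketch

import "encoding/binary"

// Hypothetical minimal view of an ETL collector: it buffers k/v pairs, spills
// them to sorted temp files, and replays them in ascending key order. No MDBX
// tx or temporary DB is needed anywhere.
type headerCollector interface {
	Collect(k, v []byte) error
	Load(fn func(k, v []byte) error) error // iterate all pairs sorted by key
}

type Header struct {
	Number     uint64
	ParentHash [32]byte
}

// encodeHeader is a stand-in for serialising (e.g. RLP-encoding) the header.
func encodeHeader(h *Header) ([]byte, error) { return nil, nil }

// collectHeader stores a downloaded header under a big-endian block-number key,
// so a later Load walks headers from the connection point forwards.
func collectHeader(c headerCollector, h *Header) error {
	k := make([]byte, 8)
	binary.BigEndian.PutUint64(k, h.Number)
	v, err := encodeHeader(h)
	if err != nil {
		return err
	}
	return c.Collect(k, v)
}
```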

To address problem 4:

  • solved by the solution to problem 1 as well

Future follow-ups

  • We should move the shared code from polygon/p2p to the p2p package
  • We should use the new bbd.BackwardBlockDownloader in Astrid too (there is 1 situation in the tip-sync mode there that may occasionally need to do long backward block downloads when handling long periods of no milestones)
  • Remove the old code path and --el.block.downloader.v2 feature flag in EngineBlockDownloader

@taratorio taratorio merged commit ee48d8a into main Jul 17, 2025
15 checks passed
@taratorio taratorio deleted the el-block-downloader-v2 branch July 17, 2025 11:41
taratorio added a commit that referenced this pull request Jul 18, 2025
taratorio added a commit that referenced this pull request Jul 18, 2025

This reverts commit ee48d8a.


due to issues with hive - need more time to resolve - unblocking the release
taratorio added a commit that referenced this pull request Jul 24, 2025
taratorio added a commit that referenced this pull request Aug 1, 2025
adding back #16073 (had to revert it because it broke hive tests)

taratorio added a commit that referenced this pull request Sep 15, 2025
addressing the 1st follow-up from #16073: now that we use the new backward block downloader in the Ethereum EL Engine API server, we should move the p2p code for it from polygon/p2p to execution/p2p (the core pkg) so that the Ethereum EL Engine API server and Astrid can import and use the same code from execution/p2p
taratorio added a commit that referenced this pull request Sep 22, 2025
the 2nd follow-up from #16073 is to have Astrid use the same flow for backward block downloads as Ethereum - this unites the 2

benefits of this are:
- the new backward block downloader abstraction matures more - I found 2 small issues with it while doing this work that are now fixed - also its interface is more flexible to cater for both use cases now
- there will be performance gains in Astrid when handling milestone mismatches - the new downloader is much quicker to figure out which peers it can backward download the milestone from and gives a fail/success answer much quicker (no 60 sec stalling when we hit a peer that doesn't have the milestone, as previously) - also when getting a new block hash event we can directly backward download to a connection point in the canonical chain builder (instead of first downloading 1 block from a peer and then realising that we need to download more because there is a gap and sending more requests)
NazariiDenha pushed a commit that referenced this pull request Oct 24, 2025
NazariiDenha pushed a commit that referenced this pull request Oct 24, 2025