-
Notifications
You must be signed in to change notification settings - Fork 957
Refactor get_validator_blocks_v3 fallback #8186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
macladson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a small nit
It's a shame we can't detect the type of error first:
- Connection refused (node offline)
- SSZ not supported
Then we if get 2 we can immediately try JSON rather than moving onto the second BN.
But we would have to parse the error type from the API which probably isn't doable.
At this point, pretty much all BNs should be supporting SSZ by default so this should be fine
macladson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a slight edit for clarity
Co-authored-by: Mac L <[email protected]>
Co-authored-by: Mac L <[email protected]>
|
Just needs a |
sorry I overlook the checks, fixed now |
Merge Queue Status
🚫 The pull request has left the queue (rule: This pull request spent 1 hour 4 minutes 56 seconds in the queue, including 31 minutes 52 seconds waiting for CI. Required conditions to merge
ReasonPull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.) HintYou should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it. |
Merge Queue Status
🚫 The pull request has left the queue (rule: This pull request spent 37 minutes 49 seconds in the queue, including 36 minutes 15 seconds waiting for CI. Required conditions to merge
ReasonPull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.) HintYou should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it. |
|
@mergify requeue |
✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically |
Merge Queue Status
🚫 The pull request has left the queue (rule: This pull request spent 33 minutes 30 seconds in the queue, including 31 minutes 46 seconds waiting for CI. Required conditions to merge
ReasonPull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.) HintYou should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it. |
|
@mergify requeue |
✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically |
Merge Queue Status
✅ The pull request has been merged This pull request spent 33 minutes in the queue, including 31 minutes 39 seconds waiting for CI. Required conditions to merge
|
Issue Addressed
#7727 introduced a bug in the logging, where as long as the node failed the SSZ
get_validator_blocks_v3endpoint, it would log asBeacon node does not support.... However, the failure can be due to other reasons, such as a timed out error as found by @jimmygchen:WARN Beacon node does not support SSZ in block production, falling back to JSON slot: 5283379, error: HttpClient(url: https://ho-h-bn-cowl.spesi.io:15052/, kind: timeout, detail: operation timed outProposed Changes
This PR made the error log more generic, so there is less confusion.
Additionally, suggested by @michaelsproul, this PR refactors the
get_validator_blocks_v3calls by trying all beacon nodes using the SSZ endpoint first, and if all beacon node fails the SSZ endpoint, only then fallback to JSON.It changes the logic from:
"SSZ -> JSON for primary beacon node, followed by SSZ -> JSON for second beacon node and so on" to
"SSZ for all beacon nodes -> JSON for all beacon nodes"
This has the advantage that if the primary beacon node is having issues and failed the SSZ, we avoid retrying the primary beacon node again on JSON (as it could be that the primary beacon node fail again); rather, we switch to the second beacon node.
Additional Info
As the calling of
get_validator_blocks_v3(and it's SSZ counterpart) is shifted to another function, the remaining part in the functionget_validator_blockis a bit of dangling in the air, so I moved them all to the now renamedget_validator_block_and_publish_blockfunction and delete the originalget_validator_blockfunction.