Skip to content

Conversation

@chong-he
Copy link
Member

Issue Addressed

#7727 introduced a bug in the logging, where as long as the node failed the SSZ get_validator_blocks_v3 endpoint, it would log as Beacon node does not support.... However, the failure can be due to other reasons, such as a timed out error as found by @jimmygchen:

WARN Beacon node does not support SSZ in block production, falling back to JSON slot: 5283379, error: HttpClient(url: https://ho-h-bn-cowl.spesi.io:15052/, kind: timeout, detail: operation timed out

Proposed Changes

This PR made the error log more generic, so there is less confusion.

Additionally, suggested by @michaelsproul, this PR refactors the get_validator_blocks_v3 calls by trying all beacon nodes using the SSZ endpoint first, and if all beacon node fails the SSZ endpoint, only then fallback to JSON.

It changes the logic from:
"SSZ -> JSON for primary beacon node, followed by SSZ -> JSON for second beacon node and so on" to
"SSZ for all beacon nodes -> JSON for all beacon nodes"

This has the advantage that if the primary beacon node is having issues and failed the SSZ, we avoid retrying the primary beacon node again on JSON (as it could be that the primary beacon node fail again); rather, we switch to the second beacon node.

Additional Info

As the calling of get_validator_blocks_v3 (and it's SSZ counterpart) is shifted to another function, the remaining part in the function get_validator_block is a bit of dangling in the air, so I moved them all to the now renamed get_validator_block_and_publish_block function and delete the original get_validator_block function.

@chong-he chong-he added val-client Relates to the validator client binary ready-for-review The code is ready for review labels Oct 10, 2025
@michaelsproul michaelsproul added the v8.1.0 Post-Fulu release label Oct 29, 2025
Copy link
Member

@macladson macladson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a small nit

It's a shame we can't detect the type of error first:

  1. Connection refused (node offline)
  2. SSZ not supported
    Then we if get 2 we can immediately try JSON rather than moving onto the second BN.
    But we would have to parse the error type from the API which probably isn't doable.

At this point, pretty much all BNs should be supporting SSZ by default so this should be fine

@macladson macladson added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Nov 13, 2025
@chong-he chong-he added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Nov 14, 2025
Copy link
Member

@macladson macladson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a slight edit for clarity

@macladson macladson removed the ready-for-review The code is ready for review label Nov 14, 2025
@macladson
Copy link
Member

Just needs a cargo fmt for some issue with whitespaces, otherwise looks pretty good to me!

@chong-he
Copy link
Member Author

Just needs a cargo fmt for some issue with whitespaces, otherwise looks pretty good to me!

sorry I overlook the checks, fixed now

@michaelsproul michaelsproul added the ready-for-review The code is ready for review label Nov 27, 2025
@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Nov 27, 2025
@mergify
Copy link

mergify bot commented Nov 27, 2025

Merge Queue Status Beta

🚫 The pull request has left the queue (rule: default)

This pull request spent 1 hour 4 minutes 56 seconds in the queue, including 31 minutes 52 seconds waiting for CI.
The checks were run on draft #8476.

Required conditions to merge
  • check-success=local-testnet-success
  • check-success=test-suite-success

Reason

Pull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.)

Hint

You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it.
If you do update this pull request, it will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue instead, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@mergify mergify bot added the queued label Nov 27, 2025
mergify bot added a commit that referenced this pull request Nov 27, 2025
mergify bot added a commit that referenced this pull request Nov 27, 2025
mergify bot added a commit that referenced this pull request Nov 27, 2025
@mergify mergify bot added dequeued and removed queued labels Dec 1, 2025
@mergify mergify bot removed the dequeued label Dec 1, 2025
@mergify
Copy link

mergify bot commented Dec 1, 2025

Merge Queue Status Beta

🚫 The pull request has left the queue (rule: default)

This pull request spent 37 minutes 49 seconds in the queue, including 36 minutes 15 seconds waiting for CI.
The checks were run on draft #8504.

Required conditions to merge
  • check-success=local-testnet-success
  • check-success=test-suite-success

Reason

Pull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.)

Hint

You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it.
If you do update this pull request, it will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue instead, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@mergify mergify bot added the queued label Dec 1, 2025
mergify bot added a commit that referenced this pull request Dec 1, 2025
@mergify mergify bot added dequeued and removed queued labels Dec 1, 2025
@michaelsproul
Copy link
Member

@mergify requeue

@mergify
Copy link

mergify bot commented Dec 1, 2025

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@mergify
Copy link

mergify bot commented Dec 1, 2025

Merge Queue Status Beta

🚫 The pull request has left the queue (rule: default)

This pull request spent 33 minutes 30 seconds in the queue, including 31 minutes 46 seconds waiting for CI.
The checks were run on draft #8506.

Required conditions to merge
  • check-success=local-testnet-success
  • check-success=test-suite-success

Reason

Pull request #8186 has been dequeued. The pull request could not be merged. This could be related to an activated branch protection or ruleset rule that prevents us from merging. (details: 1 review requesting changes and 1 approving review by reviewers with write access.)

Hint

You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it.
If you do update this pull request, it will automatically be requeued once the queue conditions match again.
If you think this was a flaky issue instead, you can requeue the pull request, without updating it, by posting a @mergifyio requeue comment.

@mergify mergify bot added queued and removed dequeued labels Dec 1, 2025
mergify bot added a commit that referenced this pull request Dec 1, 2025
@mergify mergify bot added dequeued and removed queued labels Dec 1, 2025
@michaelsproul
Copy link
Member

@mergify requeue

@mergify
Copy link

mergify bot commented Dec 1, 2025

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@mergify
Copy link

mergify bot commented Dec 1, 2025

Merge Queue Status Beta

✅ The pull request has been merged

This pull request spent 33 minutes in the queue, including 31 minutes 39 seconds waiting for CI.
The checks were run on draft #8507.

Required conditions to merge
  • check-success=local-testnet-success
  • check-success=test-suite-success

@mergify mergify bot added queued and removed dequeued labels Dec 1, 2025
mergify bot added a commit that referenced this pull request Dec 1, 2025
@mergify mergify bot merged commit 90dd5bb into sigp:unstable Dec 1, 2025
36 checks passed
@mergify mergify bot removed the queued label Dec 1, 2025
@chong-he chong-he deleted the bn-ssz-error branch December 17, 2025 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-merge This PR is ready to merge. v8.1.0 Post-Fulu release val-client Relates to the validator client binary

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants