Fix bug in fork calculation at fork boundaries #8121

pawanjay176 · 2025-09-26T20:25:08Z

Issue Addressed

N/A

Proposed Changes

In #8101 , when we modified the logic to get the proposer index post fulu, we seem to have missed advancing the state at the fork boundaries to get the right Fork for signature verification.
This led to lighthouse failing all gossip verification right after transitioning to fulu that was observed on the holesky shadow fork

Sep 26 14:24:00.088 DEBUG Rejected gossip block                         error: "InvalidSignature(ProposerSignature)", graffiti: "grandine-geth-super-1", slot: 640 
Sep 26 14:24:00.099 WARN  Could not verify block for gossip. Rejecting the block  error: InvalidSignature(ProposerSignature)

I'm not completely sure this is the correct fix, but this fixes the issue with InvalidProposerSignature on the holesky shadow fork.

Thanks to @eserilev for helping debug this

pawanjay176 · 2025-09-26T20:30:50Z

Kurtosis config I was using to test the change:

participants_matrix:
  el:
    - el_type: besu
      el_image: ethpandaops/besu:main
  cl:
    - cl_type: grandine
      cl_image: ethpandaops/grandine:develop
      supernode: true
      cl_log_level: debug
      count: 3
    - cl_type: lighthouse
      cl_image: lighthouse-local:latest
      supernode: true
      cl_log_level: debug
    - cl_type: lighthouse
      cl_image: ethpandaops/lighthouse:unstable
      supernode: true
      cl_log_level: debug


network_params:
  fulu_fork_epoch: 1

additional_services:
  - dora
  - spamoor

global_log_level: debug
spamoor_params:
  image: ethpandaops/spamoor:master
  max_mem: 4000
  spammers:
    - scenario: eoatx
      config:
        throughput: 200
    - scenario: blobs
      config:
        throughput: 20

The unstable lighthouse node stalls at slot 32 (fulu slot) while this branch progresses fine

eserilev · 2025-09-26T21:04:53Z

I was also able to repro the bug on current unstable, with pawans fix we are able to progress the chain just fine

jimmygchen · 2025-09-26T21:58:29Z

Nice find! I'll have to think about the fix a bit more.

Just to understand this scenario, this didn't happen on other paths (blob / column), because we state_provider function returns the advanced state to slot instead of parent state here:

lighthouse/beacon_node/beacon_chain/src/block_verification.rs

Lines 957 to 977 in a08fd93

    
           let proposer = chain.with_proposer_cache::<_, BlockError>( 
        
               proposer_shuffling_decision_block, 
        
               block_epoch, 
        
               |proposers| proposers.get_slot::<T::EthSpec>(block_slot), 
        
               || { 
        
                   // The proposer index was *not* cached and we must load the parent in order to 
        
                   // determine the proposer index. 
        
                   let (mut parent, _) = load_parent(block.clone(), chain)?; 
        
                   let parent_state_root = if let Some(state_root) = parent.beacon_state_root { 
        
                       state_root 
        
                   } else { 
        
                       // This is potentially a little inefficient, although we are likely to need 
        
                       // the state's hash eventually (if the block is valid), and we are also likely 
        
                       // to already have the hash cached (if fetched from the state cache). 
        
                       parent.pre_state.canonical_root()? 
        
                   }; 
        
                   let parent_state = parent.pre_state.clone(); 
        
                   opt_parent = Some(parent); 
        
                   Ok((parent_state_root, parent_state)) 
        
               }, 
        
           )?;

In gossip block verification > with_proposer_cache > ensure_state_can_determine_proposers_for_epoch didn't try to advance the electra parent state to the same slot as target_epoch (fulu epoch) because the state should have sufficient lookahead.

So does the failure here mean the proposer indices for fork (epoch N) calculated from the parent state (epoch N-1) is different to the proposer_lookahead computed during fulu fork transition? And we can't rely on the lookahead computed from the electra parent state?

jimmygchen · 2025-09-26T22:01:47Z

beacon_node/beacon_chain/src/beacon_proposer_cache.rs

+
+            // Advance the state into the same epoch as the block. Use the "partial" method since state
+            // roots are not important for proposer/attester shuffling.
+            partial_state_advance(state, Some(state_root), target_slot, spec)


I'm not sure if this is required in all cases as we have 2 epochs of proposer lookahead from Fulu.

this might only be required at fork boundaries, like when the target_epoch fork > state.current_fork

or maybe only at the fulu fork transition, i'm not fully across the fulu changes to proposer shuffling yet tbh

IMO it should only be required at the boundary, but I am fine with the "safer" fix or just always doing the state advance. We can come back to optimise it later

The optimised version would look something like setting minimum epoch to max(minimum_epoch, fulu_fork_epoch) when target_epoch >= fulu_fork_eppch.

Pawan mentioned that the proposer index was correct but the fork used for signing was incorrect (electra).

Can we simply return self.spec.fork_at_epoch(proposal_epoch) here instead of state.fork()?

lighthouse/beacon_node/beacon_chain/src/beacon_chain.rs

Lines 6609 to 6614 in 4ceccd2

let proposers = state.get_beacon_proposer_indices(proposal_epoch, &self.spec)?;

Ok::<_, E>(EpochBlockProposers::new(

proposal_epoch,

state.fork(),

proposers,

))

Yeah I think that would work. Do we actually have a fork_at_epoch method that takes into account multiple forks activating at the same epoch?

Yeah I think advancing the state is the more conservative approach. We can optimise further later

I believe there's a few other places where we are inserting into the beacon proposer cache which might also potentially need to use the fork_at_epoch method if we don't advance the state

Also to confirm, we don't need to build the committee cache after advancing the state like here right?

lighthouse/beacon_node/beacon_chain/src/block_verification.rs

Lines 2065 to 2066 in c754234

state.build_committee_cache(RelativeEpoch::Previous, spec)?;

state.build_committee_cache(RelativeEpoch::Current, spec)?;

Also to confirm, we don't need to build the committee cache after advancing the state like here right?

Hmm good point. These caches are not required in general for the proposer cache, but we may need them in block verification. It wouldn't hurt to add them back here (I removed them by not using cheap_state_advance):

lighthouse/beacon_node/beacon_chain/src/block_verification.rs

Lines 962 to 975 in c754234

// The proposer index was *not* cached and we must load the parent in order to

// determine the proposer index.

let (mut parent, _) = load_parent(block.clone(), chain)?;

let parent_state_root = if let Some(state_root) = parent.beacon_state_root {

state_root

} else {

// This is potentially a little inefficient, although we are likely to need

// the state's hash eventually (if the block is valid), and we are also likely

// to already have the hash cached (if fetched from the state cache).

parent.pre_state.canonical_root()?

};

let parent_state = parent.pre_state.clone();

opt_parent = Some(parent);

Ok((parent_state_root, parent_state))

Although I think we end up doing it here anyway:

lighthouse/beacon_node/beacon_chain/src/block_verification.rs

Lines 1187 to 1192 in c754234

let state = cheap_state_advance_to_obtain_committees::<_, BlockError>(

&mut parent.pre_state,

parent.beacon_state_root,

block.slot(),

&chain.spec,

)?;

michaelsproul · 2025-09-26T22:06:34Z

Good catch! I think another fix might be just to compute what the fork "will be" at some future epoch? So use something like state.fork_at_epoch(epoch) rather than state.fork()?

But yeah maybe just advancing the state fully into the new epoch is safer and clearer. We probably don't need the optimisation where we use the lookahead to advance less

jimmygchen · 2025-09-26T22:30:49Z

So does the failure here mean the proposer indices for fork (epoch N) calculated from the parent state (epoch N-1) is different to the proposer_lookahead computed during fulu fork transition? And we can't rely on the lookahead computed from the electra parent state?

@pawanjay176 pointed out that the proposer lookahead was correct, and it was to do with the proposer.fork

the proposer index that was calculated is correct.
the Fork for the signature verification that we pushed in the beacon proposers cache is incorrect
we retrieved it here
https://github.com/sigp/lighthouse/blob/unstable/beacon_node/beacon_chain/src/block_verification.rs#L979

jimmygchen

Thanks @pawanjay176 @eserilev
Great turnaround 👏 as discussed let’s merge this safer fix and optimise later!

Follow-up to the bug fixed in: - #8121 This fixes the root cause of that bug, which was introduced by me in: - #8101 Lion identified the issue here: - #8101 (comment) In the methods that compute the proposer shuffling decision root, ensure we don't use lookahead for the Fulu fork epoch itself. This is accomplished by checking if Fulu is enabled at `epoch - 1`, i.e. if `epoch > fulu_fork_epoch`. I haven't updated the methods that _compute_ shufflings to use these new corrected bounds (e.g. `BeaconState::compute_proposer_indices`), although we could make this change in future. The `get_beacon_proposer_indices` method already gracefully handles the Fulu boundary case by using the `proposer_lookahead` field (if initialised). Co-Authored-By: Michael Sproul <[email protected]>

@eserilev

N/A In sigp#8101 , when we modified the logic to get the proposer index post fulu, we seem to have missed advancing the state at the fork boundaries to get the right `Fork` for signature verification. This led to lighthouse failing all gossip verification right after transitioning to fulu that was observed on the holesky shadow fork ``` Sep 26 14:24:00.088 DEBUG Rejected gossip block error: "InvalidSignature(ProposerSignature)", graffiti: "grandine-geth-super-1", slot: 640 Sep 26 14:24:00.099 WARN Could not verify block for gossip. Rejecting the block error: InvalidSignature(ProposerSignature) ``` I'm not completely sure this is the correct fix, but this fixes the issue with `InvalidProposerSignature` on the holesky shadow fork. Thanks to @eserilev for helping debug this Co-Authored-By: Pawan Dhananjay <[email protected]>

Follow-up to the bug fixed in: - sigp#8121 This fixes the root cause of that bug, which was introduced by me in: - sigp#8101 Lion identified the issue here: - sigp#8101 (comment) In the methods that compute the proposer shuffling decision root, ensure we don't use lookahead for the Fulu fork epoch itself. This is accomplished by checking if Fulu is enabled at `epoch - 1`, i.e. if `epoch > fulu_fork_epoch`. I haven't updated the methods that _compute_ shufflings to use these new corrected bounds (e.g. `BeaconState::compute_proposer_indices`), although we could make this change in future. The `get_beacon_proposer_indices` method already gracefully handles the Fulu boundary case by using the `proposer_lookahead` field (if initialised). Co-Authored-By: Michael Sproul <[email protected]>

pawanjay176 added 2 commits September 26, 2025 13:19

Advance the state to get correct fork

cbbcc14

Remove debug log

6e49091

pawanjay176 added ready-for-review The code is ready for review v8.0.0-rc.0 Q3 2025 release for Fusaka on Holesky labels Sep 26, 2025

eserilev approved these changes Sep 26, 2025

View reviewed changes

jimmygchen reviewed Sep 26, 2025

View reviewed changes

jimmygchen approved these changes Sep 28, 2025

View reviewed changes

jimmygchen added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Sep 28, 2025

mergify bot added the queued label Sep 28, 2025

mergify bot merged commit edcfee6 into sigp:unstable Sep 28, 2025
37 checks passed

mergify bot removed the queued label Sep 28, 2025

michaelsproul mentioned this pull request Sep 28, 2025

Fix proposer shuffling decision slot at boundary #8128

Merged

2 tasks

michaelsproul mentioned this pull request Sep 29, 2025

More proposer shuffling cleanup #8130

Merged

2 tasks

michaelsproul added a commit to michaelsproul/lighthouse that referenced this pull request Sep 30, 2025

Start reverting the extra advance added in sigp#8121

ffe1c82

	let proposers = state.get_beacon_proposer_indices(proposal_epoch, &self.spec)?;
	Ok::<_, E>(EpochBlockProposers::new(
	proposal_epoch,
	state.fork(),
	proposers,
	))

	state.build_committee_cache(RelativeEpoch::Previous, spec)?;
	state.build_committee_cache(RelativeEpoch::Current, spec)?;

	// The proposer index was not cached and we must load the parent in order to
	// determine the proposer index.
	let (mut parent, _) = load_parent(block.clone(), chain)?;
	let parent_state_root = if let Some(state_root) = parent.beacon_state_root {
	state_root
	} else {
	// This is potentially a little inefficient, although we are likely to need
	// the state's hash eventually (if the block is valid), and we are also likely
	// to already have the hash cached (if fetched from the state cache).
	parent.pre_state.canonical_root()?
	};
	let parent_state = parent.pre_state.clone();
	opt_parent = Some(parent);
	Ok((parent_state_root, parent_state))

	let state = cheap_state_advance_to_obtain_committees::<_, BlockError>(
	&mut parent.pre_state,
	parent.beacon_state_root,
	block.slot(),
	&chain.spec,
	)?;

Fix bug in fork calculation at fork boundaries #8121

Fix bug in fork calculation at fork boundaries #8121

Uh oh!

Conversation

pawanjay176 commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue Addressed

Proposed Changes

Uh oh!

pawanjay176 commented Sep 26, 2025

Uh oh!

eserilev commented Sep 26, 2025

Uh oh!

jimmygchen commented Sep 26, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

eserilev Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

michaelsproul commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jimmygchen commented Sep 26, 2025

Uh oh!

jimmygchen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pawanjay176 commented Sep 26, 2025 •

edited

Loading

eserilev Sep 26, 2025 •

edited

Loading

michaelsproul commented Sep 26, 2025 •

edited

Loading