-
Notifications
You must be signed in to change notification settings - Fork 958
Block proposal optimisations #8156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block proposal optimisations #8156
Conversation
|
Combined into one PR for ease of testing (and I've already been testing the state advance as part of testing the proposer shuffling cleanup). Can separate into 2 PRs if desired. |
jimmygchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good to me, just a few comments and a question to help me understand this better.
| |_, validator| validator.exit_epoch <= head_state.finalized_checkpoint().epoch, | ||
| head_state, | ||
| |_, validator| { | ||
| validator.slashed || validator.exit_epoch <= finalized_state.current_epoch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch here
| // Protect against advancing a state more than a single slot. | ||
| // | ||
| // Advancing more than one slot without storing the intermediate state would corrupt the | ||
| // database. Future works might store intermediate states inside this function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand this comment about corrupting the database, could you explain this and why is this no longer an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I think "corruption" is not quite the right word here (even though I wrote this comment long ago 🤣). Our database pruning used to only be capable of pruning states that were ancestors of some head state. This means that advanced states that would later have blocks applied to them (e.g. advanced state for slot 35 before it has block 35 applied) would not be considered by the pruning algorithm. So they would stay on disk forever taking up space.
Now our pruning constructs a dag based on the state summaries, so it sees these abandoned advanced states and deletes them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pruning seems to be working well and keeping the number of states on disk bounded:
Oct 07 03:07:13.232 DEBUG Extra pruning information new_finalized_checkpoint: Checkpoint { epoch: Epoch(166396), root: 0xd4883165eb3c4517578e3cef0090d7d9539841363c89b8e535a72b835db71840 }, new_finalized_state_root: 0x066c326a4d9f1ce3f9aab95799cab7accb3c07be00ff27391ec95d4db0a46f44, split_prior_to_migration: Split { slot: Slot(5324640), state_root: 0x79d2d1444290075d7346f1b5751df60a1fffef517dfc309bfdad09da86b00c1a, block_root: 0x20f2824c49d82d7735de9ec1d31805c8db8eb796c8ceaa4b17644bb466b2b559 }, newly_finalized_blocks: 33, newly_finalized_state_roots: 33, newly_finalized_states_min_slot: 5324640, required_finalized_diff_state_slots: [Slot(5324672), Slot(5324288), Slot(5322752), Slot(5316608), Slot(5308416), Slot(5242880), Slot(4194304)], kept_summaries_for_hdiff: [(0x202c4218875820e2b660bb5b4006cd70675c941ff95aa0ead165412e3bc0db2b, Slot(4194304)), (0x1d18004bf5fbf21e0d36d78f73242e12f9a4a8377ce14f83f4735101eb0dcfbc, Slot(5242880)), (0x6c96c8f0cd07df7abd7c6de27246a4b473542307f1e996df8d7f99d3bdc570d2, Slot(5308416)), (0xf9a30950d10706cd6f6e473372c0ecd54c11de457823c8baf88efa5e0de47b42, Slot(5316608)), (0xb6326f522e323dea61109403950277853b70cad81c35102fc41d71b9679b563b, Slot(5322752)), (0x60d6523dc58f09bde2507762b384ad8417f32225d121d9164d60c49968b3f744, Slot(5324288))], state_summaries_count: 176, state_summaries_dag_roots: [(0xb6326f522e323dea61109403950277853b70cad81c35102fc41d71b9679b563b, DAGStateSummary { slot: Slot(5322752), latest_block_root: 0x4aed3a69a36b7c8a3c126422b0757b70f530cf412fa996d416bacc0f374fb6af, latest_block_slot: Slot(5322750), previous_state_root: 0x21d1041bb81ba8f287448bf0e8d2e79f0b7d330f067928c70765f37c00e943d3 }), (0x1d18004bf5fbf21e0d36d78f73242e12f9a4a8377ce14f83f4735101eb0dcfbc, DAGStateSummary { slot: Slot(5242880), latest_block_root: 0xb1861ab5afce868a483d5bded463b0b485a29340dbaa2e2c4112ec2a2e33190b, latest_block_slot: Slot(5242880), previous_state_root: 0x1e5b11982292867f894c4186fbea86d2d6f9a3ead7de0beb19c8f846e3fe0283 }), (0x202c4218875820e2b660bb5b4006cd70675c941ff95aa0ead165412e3bc0db2b, DAGStateSummary { slot: Slot(4194304), latest_block_root: 0xedb6e6a0f982ad53c7a343f939fb0bfecdef0e466dd526be7905b24b583e28ce, latest_block_slot: Slot(4194302), previous_state_root: 0x8d7edcd77a1e4b13358624aa2a725f7e0125998f5f1d897fce273a03f96a8edc }), (0x6c96c8f0cd07df7abd7c6de27246a4b473542307f1e996df8d7f99d3bdc570d2, DAGStateSummary { slot: Slot(5308416), latest_block_root: 0x41c6db8c7ec97303c7f3a47d5e28d06205d0021334735721a0fe113567da045c, latest_block_slot: Slot(5308416), previous_state_root: 0xec3e4406f735f395187b92cd669809b4a0b1086e14c053afa30a30b897d3db77 }), (0x79d2d1444290075d7346f1b5751df60a1fffef517dfc309bfdad09da86b00c1a, DAGStateSummary { slot: Slot(5324640), latest_block_root: 0x20f2824c49d82d7735de9ec1d31805c8db8eb796c8ceaa4b17644bb466b2b559, latest_block_slot: Slot(5324640), previous_state_root: 0x78eaf0323a2c6bcc24c86df213478753d8ddccf0e3ee7a6b28ab2496aae1dc2e }), (0xf9a30950d10706cd6f6e473372c0ecd54c11de457823c8baf88efa5e0de47b42, DAGStateSummary { slot: Slot(5316608), latest_block_root: 0xf40285e8ca23f92bac183c5b0a7a9d2f445fc0f971b106e48a5f6e59bebbe53f, latest_block_slot: Slot(5316607), previous_state_root: 0xdfa0919e52a05aba02cfe3c302db82ab15f464d590f17e10c28971072096f9dc }), (0x60d6523dc58f09bde2507762b384ad8417f32225d121d9164d60c49968b3f744, DAGStateSummary { slot: Slot(5324288), latest_block_root: 0xcc7d8ebf20ab724e39618afff3408b130e71da7c467465961094896ed244c696, latest_block_slot: Slot(5324287), previous_state_root: 0xe88783c0e1e742daaf12240c0d37d5daf0f82ebaa79a6498d21c7b9b287d76ef })], finalized_and_descendant_state_roots_of_finalized_checkpoint: 112, blocks_to_prune: 0, states_to_prune: 58
We are pruning down to 112 states here, which is about the right number (around ~2 per slot for the 2 epochs of unfinalized chain). We get ~2 states because of the advanced and unadvanced versions. The majority of these are tiny because they are just stored using the block-reply strategy - the only data on disk for them is the block itself and the state summary.
jimmygchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic looks correct to me and the block production metrics on testnet nodes look pretty good 🎉
Closes: - sigp#4412 This should reduce Lighthouse's block proposal times on Holesky and prevent us getting reorged. - [x] Allow the head state to be advanced further than 1 slot. This lets us avoid epoch processing on hot paths including block production, by having new epoch boundaries pre-computed and available in the state cache. - [x] Use the finalized state to prune the op pool. We were previously using the head state and trying to infer slashing/exit relevance based on `exit_epoch`. However some exit epochs are far in the future, despite occurring recently. Co-Authored-By: Michael Sproul <[email protected]>
Issue Addressed
Closes:
This should reduce Lighthouse's block proposal times on Holesky and prevent us getting reorged.
Proposed Changes
exit_epoch. However some exit epochs are far in the future, despite occurring recently.