HDiff in the hot DB #39
Conversation
Force-pushed from 3025158 to 8eaee8d
```rust
state_root,
state_slot: Some(summary.slot),
// Delete the diff as descendants of this state will never be used.
prune_hot_diff: true,
```
We need to change this to not delete the diff ancestors of the finalized state (use `closest_layer_slots` somewhere here).

We need to know the ancestor diffs of the new finalized state, so one approach would be (a minimal sketch follows this list):
- Reorder `migrate_database` (the hot to freezer migration) prior to pruning heads
- Keep hot states which are either after the split, or before the split and ancestors of it according to the diff hierarchy
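A minimal sketch of the keep predicate this ordering implies. `Slot`, `Hash256`, `keep_hot_state`, and `diff_ancestors_of_split` are hypothetical stand-ins; the ancestor set is assumed to come from something like `closest_layer_slots`:

```rust
use std::collections::HashSet;

type Slot = u64;
type Hash256 = [u8; 32];

/// Keep a hot state if it is at or after the split, or if it is a
/// pre-split state that the split state's diff hierarchy depends on.
fn keep_hot_state(
    state_root: &Hash256,
    state_slot: Slot,
    split_slot: Slot,
    diff_ancestors_of_split: &HashSet<Hash256>,
) -> bool {
    state_slot >= split_slot || diff_ancestors_of_split.contains(state_root)
}

fn main() {
    let ancestors: HashSet<Hash256> = [[1u8; 32]].into_iter().collect();
    // A pre-split diff ancestor survives pruning.
    assert!(keep_hot_state(&[1u8; 32], 10, 64, &ancestors));
    // An unrelated pre-split state does not.
    assert!(!keep_hot_state(&[2u8; 32], 10, 64, &ancestors));
}
```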
We may want to rewrite the prune_abandoned_forks code, but if we don't want to do that now, we can take the approach above.
Regarding state summaries stored during state advance (infamous …):
- Option 1 (current strategy) is to commit eagerly.
- Option 2 (old strategy from ages ago) is to commit on import.
- Option 3: don't store intermediate states at all.
Force-pushed from dcce765 to 0a230ec
```rust
.viable_heads::<T::EthSpec>(head_slot)
.iter()
.map(|node| (node.root, node.slot))
.collect()
```
Could use this to implement the DB downgrade (recreate the head tracker after it has been deleted).
```rust
// TODO(hdiff): Is it necessary to do this read tx now? Also why is it necessary to
// check that the summary exists at all? Are double writes common? Can this txn
// lock deadlock with the `do_atomically` call?
let txn_lock = chain.store.hot_db.begin_rw_transaction();
```
The reason for this lock is a race with the write in BeaconChain::import_block.

If we were to write a non-temporary state in import_block in between setting state_already_exists (false) and the write of the temporary state in this function, we could corrupt the DB:
- there is a state that is not temporary (required by some fully-imported block),
- but it is marked temporary due to the race condition here,
- and temporary states risk being deleted by pruning -> invalid DB due to deletion of a canonical state.

The lock prevents this case by preventing the interleaving of the read in this function with the write in the import function. However, this is a bad abstraction forced upon us by LevelDB, which lacks proper ACID transactions. If we move away from LevelDB eventually we can maybe have proper transactions, see:
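A schematic of the pattern, using a plain `Mutex` as a stand-in for the store's RW-transaction lock (all names here are hypothetical): the existence check and the temporary-state write must happen under one guard so import_block's non-temporary write cannot interleave between them.

```rust
use std::sync::Mutex;

/// Stand-in for the hot DB with the transaction lock that guards the
/// read-then-write sequence described above.
struct HotDb {
    txn_lock: Mutex<()>,
}

impl HotDb {
    /// State-advance path: check existence and write the *temporary* state
    /// atomically with respect to `import_block_write`. Without the guard,
    /// a non-temporary write could land between the check and the write,
    /// leaving a canonical state wrongly flagged as temporary (and thus
    /// eligible for pruning).
    fn write_temporary_state(&self, state_exists: impl Fn() -> bool, write_temporary: impl FnOnce()) {
        let _guard = self.txn_lock.lock().unwrap();
        if !state_exists() {
            write_temporary();
        }
        // `_guard` drops here; import_block's write may now proceed.
    }

    /// Block-import path: takes the same lock before writing the
    /// non-temporary state, so the two sequences cannot interleave.
    fn import_block_write(&self, write_non_temporary: impl FnOnce()) {
        let _guard = self.txn_lock.lock().unwrap();
        write_non_temporary();
    }
}

fn main() {
    let db = HotDb { txn_lock: Mutex::new(()) };
    db.import_block_write(|| println!("non-temporary state written"));
    // The state now exists, so the temporary write is skipped.
    db.write_temporary_state(|| true, || unreachable!("state already exists"));
}
```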
```rust
// Persist the new finalized checkpoint as the pruning checkpoint.
batch.push(StoreOp::KeyValueOp(
    store.pruning_checkpoint_store_op(new_finalized_checkpoint),
```
Let's delete the pruning_checkpoint while we're here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in c8d82f0
```rust
    StateSummariesDAG::new(state_summaries, parent_block_roots)
};

// From the DAG compute the list of roots that ascend from finalized root up to the split
```
```diff
-// From the DAG compute the list of roots that ascend from finalized root up to the split
+// From the DAG compute the list of roots that descend from finalized root up to the split
```
Fixed with 4fa32cc
```rust
};

// From the DAG compute the list of roots that ascend from finalized root up to the split
// slot. And run `migrate_database` with it
```
```diff
-// slot. And run `migrate_database` with it
+// slot.
```
Fixed with 4fa32cc
```rust
// keep those on the finalized canonical chain. Note that there may be lingering
// forks.
```
```diff
-// keep those on the finalized canonical chain. Note that there may be lingering
-// forks.
+// keep those on the finalized canonical chain. Checking the state root ensures we avoid lingering forks.
```
Fixed with 4fa32cc
```rust
// Abandoned forks will never be used, so we can safely delete the diffs
prune_hot_diff: true,
}]
```
Noting this now includes (a small classification sketch follows):
- Abandoned forks (not descended from finalized)
- Old stuff that wasn't pruned previously but is getting pruned now (older than the previous finalization)
- Finalized states from the canonical chain which are not required by the HDiff mechanism
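For illustration, a hypothetical classification mirroring those three cases; the enum, function, and flags are invented names, with the flags assumed to be computed elsewhere from the state summaries DAG:

```rust
/// Why a hot state summary is being pruned, per the three cases above.
#[derive(Debug)]
enum PruneReason {
    /// Not descended from the finalized state.
    AbandonedFork,
    /// Older than the previous finalization and missed by earlier pruning.
    StalePreFinalization,
    /// On the finalized canonical chain but not required as an HDiff layer.
    UnrequiredFinalizedState,
}

/// Decide whether (and why) to prune a summary.
fn prune_reason(
    descends_from_finalized: bool,
    older_than_prev_finalization: bool,
    required_by_hdiff: bool,
    on_canonical_finalized_chain: bool,
) -> Option<PruneReason> {
    if older_than_prev_finalization {
        Some(PruneReason::StalePreFinalization)
    } else if !descends_from_finalized && !on_canonical_finalized_chain {
        Some(PruneReason::AbandonedFork)
    } else if on_canonical_finalized_chain && !required_by_hdiff {
        Some(PruneReason::UnrequiredFinalizedState)
    } else {
        None // keep this summary
    }
}

fn main() {
    println!("{:?}", prune_reason(false, false, false, false)); // Some(AbandonedFork)
    println!("{:?}", prune_reason(true, false, true, true)); // None: kept
}
```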
Updated the comments
```rust
// Persist the head in case the process is killed or crashes here. This prevents
// the head tracker reverting after our mutation above.
let persisted_head = PersistedBeaconChain {
```
We could remove this write, and probably delete the PersistedBeaconChain altogether.
Fixed with 9ce4b33, but to fully remove PersistedBeaconChain we need a way to compute the genesis state root and genesis block root when booting from an existing DB. The issue is that while the state is in the DB, it is keyed by a root which we don't know.
```rust
pub fn hot_hdiff_start_slot(&self) -> Slot {
    // TODO(hdiff): read the start slot from somewhere
    todo!();
```
I was thinking maybe we could recycle the anchor_slot for use here, as it is fairly useless and not used for much. The only load-bearing use of the anchor_slot that I can find is in try_prune_execution_payloads, where it is used to halt the pruning process. However, this is not necessary as we could either:
- Keep iterating back to Bellatrix, or
- Stop iterating back once we find a payload that is missing, or
- Use the `oldest_block_slot` to determine when to stop iterating back. This is my preferred option, as it is also compatible with storing execution payloads older than the `anchor_slot`, i.e. "Store execution payloads during backfill if `--prune-payloads false`" (sigp/lighthouse#6510). A sketch of this stopping rule follows.
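A minimal sketch of the preferred stopping rule; `oldest_block_slot` is the backfill boundary mentioned above, but the iterator and function here are hypothetical stand-ins:

```rust
type Slot = u64;

/// Collect the slots whose execution payloads should be pruned, iterating
/// backwards from the finalized block and halting at `oldest_block_slot`
/// rather than at `anchor_slot`. This stays correct even when payloads
/// older than the anchor are stored (backfill with --prune-payloads false).
fn payload_slots_to_prune(
    slots_desc: impl Iterator<Item = Slot>,
    oldest_block_slot: Slot,
) -> Vec<Slot> {
    slots_desc
        .take_while(|slot| *slot >= oldest_block_slot)
        .collect()
}

fn main() {
    // Finalized at slot 100, node has backfilled down to slot 97.
    let pruned = payload_slots_to_prune((0..=100).rev(), 97);
    assert_eq!(pruned, vec![100, 99, 98, 97]);
}
```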
Conclusion: we can recycle the anchor_slot field and set it to the epoch-aligned slot of the snapshot state (not the slot of the checkpoint block, which might be older due to skipped slots).
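For concreteness, a sketch of the epoch alignment mentioned here, assuming mainnet's 32 slots per epoch; the helper is hypothetical, not Lighthouse's actual API:

```rust
type Slot = u64;
const SLOTS_PER_EPOCH: Slot = 32; // mainnet preset

/// Round a slot down to the first slot of its epoch. The checkpoint
/// *block* may sit at an older slot due to skipped slots, which is why
/// the anchor should use the snapshot state's epoch-aligned slot instead.
fn epoch_aligned_slot(slot: Slot) -> Slot {
    (slot / SLOTS_PER_EPOCH) * SLOTS_PER_EPOCH
}

fn main() {
    assert_eq!(epoch_aligned_slot(6403), 6400);
    assert_eq!(epoch_aligned_slot(6400), 6400);
}
```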
Fixed in 5ecf9db
Force-pushed from 06f6177 to f4d44e4
Branch with the commit history https://github.com/dapplion/lighthouse/pull/new/tree-states-hot-backup, squashed and rebased on top of sigp/unstable.
Force-pushed from 1860064 to fbc62f2
Force-pushed from 007727e to ff69ee0
Force-pushed from e033b3b to ee13a1f
I have done manual testing and this branch works on mainnet:

All of them were broken originally; there are a lot of commits in https://github.com/dapplion/lighthouse/pull/new/tree-states-hot-backup to make them work. I have also rebased to unstable, which was quite easy. For 3. I had to add a new column to store the new hot state summaries. Otherwise, when diffing, it would try to read a V22 summary and break.
@michaelsproul If we stop updating the head_tracker in the persisted head, how can we downgrade safely? If you run an old version of Lighthouse it will read the persisted head, which contains an empty head tracker.
```rust
///
/// This column is populated after DB schema version 23 superseding `BeaconStateSummary`. The
/// new column is necessary to have a safe migration without data loss.
#[strum(serialize = "bs3")]
```
```diff
-#[strum(serialize = "bs3")]
+#[strum(serialize = "hss")]
```
RE pruning routine: iterate summaries from the finalized state, jumping by diff base root, and keep only those. `required_finalized_diff_state_slots` = the diff path of the finalized state. A sketch of this walk follows.
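A sketch of that walk over hypothetical summaries, where each summary records the root of the diff it is based on (`Summary` and `diff_base_root` are assumed names, not the real types):

```rust
use std::collections::{HashMap, HashSet};

type Hash256 = u64; // stand-in for a 32-byte root

/// Hypothetical hot state summary carrying its diff-base link.
struct Summary {
    diff_base_root: Hash256,
}

/// Starting from the finalized state, jump through `diff_base_root` links
/// and collect every root on the diff path; these are the summaries the
/// pruning routine must keep.
fn required_finalized_diff_roots(
    summaries: &HashMap<Hash256, Summary>,
    finalized_root: Hash256,
) -> HashSet<Hash256> {
    let mut keep = HashSet::new();
    let mut current = finalized_root;
    // `insert` returning false means we've seen this root: stop (cycle guard).
    while keep.insert(current) {
        match summaries.get(&current) {
            // A self-referencing base marks the bottom of the hierarchy.
            Some(s) if s.diff_base_root != current => current = s.diff_base_root,
            _ => break,
        }
    }
    keep
}

fn main() {
    let summaries = HashMap::from([
        (3, Summary { diff_base_root: 2 }),
        (2, Summary { diff_base_root: 1 }),
        (1, Summary { diff_base_root: 1 }), // base layer
    ]);
    let keep = required_finalized_diff_roots(&summaries, 3);
    assert_eq!(keep, HashSet::from([3, 2, 1]));
}
```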
Loop on each block and state to prune and log on separate lines, so there's not a single giant dump of potentially megabytes.
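For illustration, the shape of such a loop, using `println!` as a stand-in for the store's logger:

```rust
/// Log each pruned block and state on its own line instead of dumping the
/// whole collection in one giant log entry.
fn log_pruned(block_roots: &[&str], state_roots: &[&str]) {
    for root in block_roots {
        println!("pruning block: {root}");
    }
    for root in state_roots {
        println!("pruning state: {root}");
    }
}

fn main() {
    log_pruned(&["0xaa…", "0xbb…"], &["0xcc…"]);
}
```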
Change this to "not copying state to freezer DB".
The head tracker is a persisted piece of state that must be kept in sync with fork choice. It has been a source of pruning issues in the past, so we want to remove it (see #1785). When implementing tree-states in the hot DB we have to change the pruning routine (more details below), so we want to do those changes first in isolation (see #6580). If you want to see the full feature of tree-states hot, see dapplion#39.

Closes #1785

**Current DB migration routine**
- Locate abandoned heads with the head tracker
- Use a roots iterator to collect the ancestors of those heads that can be pruned
- Delete those abandoned blocks / states
- Migrate the newly finalized chain to the freezer

In summary, it computes what it has to delete and keeps the rest, then migrates data to the freezer. If the abandoned-forks routine has a bug, it can break the freezer migration.

**Proposed migration routine (this PR)**
- Migrate the newly finalized chain to the freezer
- Load all state summaries from disk
- From those, knowing just the head and finalized block, compute two sets: (1) descendants of finalized, (2) the newly finalized chain
- Iterate all summaries; if a summary does not belong to set (1) or (2), delete it

This strategy is more sound as it just checks what's in the hot DB, computes what it has to keep, and deletes the rest. Because it does not rely on third-party pieces of data, we can drop the head tracker and the pruning checkpoint. Since the DB migration now happens **first**, as long as the computation of the sets to keep is correct we won't have pruning issues. A sketch of the proposed set computation follows.
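A minimal, self-contained sketch of the proposed set computation; `Summary`, `previous_state_root`, and `roots_to_delete` are hypothetical stand-ins for the real DAG types:

```rust
use std::collections::{HashMap, HashSet};

type Hash256 = u64; // stand-in for a 32-byte root

/// Minimal stand-in for a hot state summary: just the parent link we need
/// to reconstruct the DAG.
struct Summary {
    previous_state_root: Hash256,
}

/// Compute the set of summaries to keep (descendants of finalized, plus
/// the newly finalized chain) and return everything else for deletion.
fn roots_to_delete(
    summaries: &HashMap<Hash256, Summary>,
    finalized_root: Hash256,
    newly_finalized_chain: &HashSet<Hash256>,
) -> Vec<Hash256> {
    // (1) Descendants of the finalized state: a root is kept if walking its
    // parent links reaches the finalized root. O(n^2) in the worst case,
    // which is fine for a sketch.
    let descends = |mut root: Hash256| -> bool {
        let mut seen = HashSet::new();
        while seen.insert(root) {
            if root == finalized_root {
                return true;
            }
            match summaries.get(&root) {
                Some(s) => root = s.previous_state_root,
                None => return false,
            }
        }
        false
    };

    summaries
        .keys()
        .copied()
        // (2) Keep the newly finalized chain as well; delete the rest.
        .filter(|root| !descends(*root) && !newly_finalized_chain.contains(root))
        .collect()
}

fn main() {
    let summaries = HashMap::from([
        (2, Summary { previous_state_root: 1 }), // finalized
        (3, Summary { previous_state_root: 2 }), // descendant: kept
        (4, Summary { previous_state_root: 1 }), // abandoned fork: deleted
    ]);
    let newly_finalized: HashSet<Hash256> = HashSet::from([2]);
    assert_eq!(roots_to_delete(&summaries, 2, &newly_finalized), vec![4]);
}
```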
Partial implementation of https://hackmd.io/6WI3idBfR2q2AQyMUfYMyg