Skip to content

Add support for preprocessing ZooKeeper operations in clickhouse-keeper#37036

Merged
alesapin merged 33 commits intomasterfrom
keeper-preprocess-operations
May 20, 2022
Merged

Add support for preprocessing ZooKeeper operations in clickhouse-keeper#37036
alesapin merged 33 commits intomasterfrom
keeper-preprocess-operations

Conversation

@antonio2368
Copy link
Copy Markdown
Member

@antonio2368 antonio2368 commented May 9, 2022

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Currently, Keeper applies the operation only after the commit.
To support real-time digest check of operation results, we need to preprocess the operation at least on the Leader node so each Raft log can be replicated with it afterwards.
This is only the first step for adding support for digests.

Multiple operations can be batched for replication so we cannot calculate the digest by analysing possible changes a single operation will make, we need to process those operations in some way without committing them.

Additionally, the goal was to minimise load on the leader, a possible solution would be to process the requests twice on the leader, first time applying it to some in-memory store, and second time to the real storage but that would require some complex interaction with NuRaft because KeeperStorage would need to be aware of the current role of the node it's running on.

Because of the reasons above, I implemented preprocessing for all the operation in a such way that they minimise the commit itself.
In short, preprocessing transforms the ZooKeeper request in set of deltas.
Those deltas have 2 roles:

  • efficient way of getting the latest state of some ZooKeeper node with minimal copying
  • efficient way of committing the request by just applying those same deltas to our storage

Deltas are kept entirely in-memory, so applying them should be fast enough, while the data copying is minimal.

Those deltas also play nicely with the NuRaft interface (precommit -> transform into deltas, rollback -> simply delete deltas). If some node becomes a leader, it already has everything preprocessed and it can instantly continue processing the requests without any buildup beforehand.

Things left to do:

  • Session management should support preprocessing (ZooKeeper's SessionTracker) - should be okay because fetching the SessionID is a blocking operation, i.e. nothing else can be done until the new session is committed
  • Add support for rollback

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@robot-ch-test-poll robot-ch-test-poll added the pr-not-for-changelog This PR should not be mentioned in the changelog label May 9, 2022
@antonio2368 antonio2368 force-pushed the keeper-preprocess-operations branch from a0b63b6 to a988cfa Compare May 9, 2022 13:23
@antonio2368 antonio2368 marked this pull request as ready for review May 11, 2022 12:29
Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

@alesapin
Copy link
Copy Markdown
Member

Failures unrelated, but let's check one more time.

@alesapin
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented May 19, 2022

update

✅ Branch has been successfully updated

@alesapin
Copy link
Copy Markdown
Member

00991_system_parts_race_condition_long -- flaky in master :(

@alesapin
Copy link
Copy Markdown
Member

Failures are unrelated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants