rgw: add per-zone sharded datalog backends #67456
Open
Signed-off-by: Seena Fallah <[email protected]>
RGW multisite sync currently uses a single shared datalog that all peer zones read from. When one zone trims entries it has already consumed, those entries become unavailable to other zones that may still need them. Recovery and trim operations must coordinate across all peers, creating unnecessary coupling.
This PR introduces per-zone datalogs: each consuming zone gets its own independent datalog stream on the source zone. The source zone writes entries to the appropriate per-zone backend, and each consumer reads and trims only its own stream without affecting others.
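The difference between the two models can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the RGW implementation: the class names, method names, and zone names are hypothetical, and it assumes for simplicity that every consuming zone receives every entry, whereas the PR only says entries go to the "appropriate" per-zone backend.

```python
class SharedDatalog:
    """Legacy model: a single log that every peer zone reads from."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)

    def trim(self, count):
        # Dropping the oldest entries removes them for ALL consumers.
        del self.entries[:count]


class PerZoneDatalog:
    """Per-zone model: one independent stream per consuming zone."""

    def __init__(self, consumer_zones):
        self.streams = {zone: [] for zone in consumer_zones}

    def append(self, entry):
        # Assumption: every consumer gets every entry (simplification).
        for stream in self.streams.values():
            stream.append(entry)

    def entries_for(self, zone):
        return list(self.streams[zone])

    def trim(self, zone, count):
        # Only the named zone's stream shrinks; other zones are untouched.
        del self.streams[zone][:count]


shared = SharedDatalog()
per_zone = PerZoneDatalog(["zone-b", "zone-c"])
for entry in ["obj1", "obj2", "obj3"]:
    shared.append(entry)
    per_zone.append(entry)

# zone-b has consumed two entries and trims them.
shared.trim(2)
per_zone.trim("zone-b", 2)

print(shared.entries)                   # ['obj3'] (zone-c lost obj1 and obj2)
print(per_zone.entries_for("zone-c"))   # ['obj1', 'obj2', 'obj3']
```

In the shared model a trim by the faster zone removes entries the slower zone has not read yet; in the per-zone model the slower zone's stream is unaffected.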
The feature is controlled by the `per_zone_datalog` zone feature flag, which is enabled by default on new zonegroups. When enabled, writes go exclusively to per-zone backends. When disabled, the legacy shared datalog is used unchanged. There is no dual-write phase: the switch is immediate based on the feature flag, keeping the write path simple and avoiding doubled RADOS I/O.
On the sync consumer side, each zone now passes its own `zone-id` in HTTP requests to the source zone. The source zone's REST handlers route to the per-zone backend if one exists for that zone, and fall through to legacy otherwise. The trim path was extended with a per-zone variant that trims each zone's datalog based on that specific zone's sync progress rather than the minimum across all peers.
The `radosgw-admin` CLI gained a `--log-zone` flag that accepts a zone name for per-zone datalog operations (list, status, trim, etc.).
Fixed a pre-existing bug in the period POST handler where `create_period()` internally calls `update_latest_epoch()`, making the subsequent explicit call in the handler always return `-EEXIST`. This caused the handler to skip the realm reload notification, preventing gateways from discovering newly added zones after a period commit.
Added a safety guard in the coroutine framework against operating on completed stacks, and fixed a shutdown crash in neorados caused by circular ownership between Notifier and LingerOp objects.
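To make the request routing and the trim change described above more concrete, here is a rough Python sketch. It is not taken from the PR: the function names are hypothetical, backends are plain strings, and sync positions are modelled as integers (real datalog markers are opaque strings), so it illustrates the decision logic only.

```python
def select_backend(per_zone_backends, legacy_backend, zone_id=None):
    """Route to the per-zone backend for zone_id if one exists, else legacy."""
    if zone_id is not None and zone_id in per_zone_backends:
        return per_zone_backends[zone_id]
    return legacy_backend


def shared_trim_position(sync_progress):
    """Legacy rule: a shared log can only be trimmed up to the slowest peer."""
    return min(sync_progress.values())


def per_zone_trim_positions(sync_progress):
    """Per-zone rule: each stream is trimmed up to its own consumer's position."""
    return dict(sync_progress)


# Hypothetical backends and sync positions, for illustration only.
backends = {"zone-b": "datalog.zone-b"}
progress = {"zone-b": 120, "zone-c": 45}

print(select_backend(backends, "datalog.legacy", "zone-b"))  # datalog.zone-b
print(select_backend(backends, "datalog.legacy", "zone-c"))  # datalog.legacy
print(select_backend(backends, "datalog.legacy"))            # datalog.legacy
print(shared_trim_position(progress))                        # 45
print(per_zone_trim_positions(progress))                     # {'zone-b': 120, 'zone-c': 45}
```

Under the legacy rule the slow peer (`zone-c`) holds back trimming for everyone; under the per-zone rule `zone-b`'s stream can be trimmed to 120 regardless of how far `zone-c` has progressed.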