rgw: add multisite configuration for cross-zonegroup replication by cbodley · Pull Request #61862 · ceph/ceph

cbodley · 2025-02-17T20:28:07Z

realm default and zonegroup overrides

defines the configuration for cross-zonegroup replication at the realm and zonegroup level. the default behavior is set by the realm, and individual zonegroups can override this to enable/disable cross-zonegroup replication to/from other zonegroups

opt-in vs opt-out

the realm configuration supports three models:

opt-out model where cross-zonegroup replication is "enabled" in the realm but "forbidden" for specific zonegroup pairs
opt-in model where replication is "allowed" in the realm but only "enabled" for specific zonegroup pairs
no cross-zonegroup replication when "forbidden" by the realm

the realm uses a tristate enum CanSync to represent the default, while the zonegroup uses two sets ("enable" and "forbid") to represent the overrides

import vs export

zonegroups can override the realm behavior both for import and export. while this can duplicate information (that is, disabling A's import from B is the same as disabling B's export to A), one form will generally be simpler than the other. for example, consider a realm with zonegroups A, B, C with cross-zonegroup replication disabled to/from C. to represent this with imports only:

realm:
  "cross_zonegroup": "enabled"
zonegroup A:
  "cross_zonegroup_import": { "forbid": {"C"} }
zonegroup B:
  "cross_zonegroup_import": { "forbid": {"C"} }
zonegroup C:
  "cross_zonegroup_import": { "forbid": {"A", "B"} }

using both imports and exports, the configuration can be localized to zonegroup C:

realm:
  "cross_zonegroup": "enabled"
zonegroup C:
  "cross_zonegroup_export": { "forbid": {"A", "B"} },
  "cross_zonegroup_import": { "forbid": {"A", "B"} }

wildcards

the wildcard string "*" can be used to match all peers, allowing the example above to be simplified as "forbid": {"*"}. but more importantly, wildcards adapt as zonegroups are added/removed from the realm. so if another zonegroup D is added to the realm above, that use of wildcards would automatically disable its replication to/from C

TODO:

add similar configuration for same_zonegroup replication policy
radosgw-admin commands to modify this configuration

Show available Jenkins commands

jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e

clwluvw

Looks Great, Thanks Casey! 😍

clwluvw · 2025-02-18T23:29:12Z

src/rgw/driver/rados/rgw_zone.h

+                      const RGWZoneGroup& source);
+
+/// Test whether a destination bucket should sync from the given source bucket.
+bool should_sync_from(const SiteConfig& site,


I’ve been reflecting on where this should be invoked. In #59911, the approach I took was to mark bilog entries with the zones responsible for processing (log_zones), and I initially thought zones should unconditionally process those entries as the criteria (in terms of filters) had already been validated by the source zone when the entry was logged. This would cover most scenarios, aside from two edge cases—bucket deletion during replication or changes in permissions.

That approach was based on zones, as at the moment, sync pipes can only keep zones, not zonegroups. Based on your insights from #61140 (comment), I believe we need to shift focus from zones to zonegroups. The master can change over time, and we want to avoid missing any processing entries.

What I’m envisioning is this: during the logging phase, we still mark the relevant zonegroups based on the matching pipes, rather than individual zones. And if there were no matching pipes, we should log when there is more than one zone per zonegroup with empty log_zonegroups set. Then, during the processing phase, when we fetch entries, we would check if the zonegroup of the requested zone (rgwx-zonegroup) matches ourselves (destination). If it does, we don’t apply any additional filtering on log_zonegroups and return all entries. and if it doesn't we just return the ones having rgwx-zonegroup in the log_zonegroups set.

This would also require modifying the sync pipes to store a bucket’s zonegroup hint, instead of its zone, as constantly looking up the zonegroup for each request wouldn’t be efficient. Buckets don’t frequently move, and in cases of bucket recreation, we can update the policies as needed, maintaining compatibility with AWS—where even if a bucket is moved mid-replication, it still replicates.

So, my question is: do we need to perform this validation on the pull side, or would it be sufficient to handle it during the logging phase, relying on the log_zonegroups field? perhaps just ANDing the case where replication is possible from my zonegorup to the destination bucket's zonegroup. So we change this to should_sync_to rather than from to promote the case on which side this needs to be considered. although if it's on the logging side this would lose the immediate effect if an admin wants to stop replication from zonegroup A to B. So maybe both can be considered anyway.

What I’m envisioning is this: during the logging phase, we still mark the relevant zonegroups based on the matching pipes, rather than individual zones. And if there were no matching pipes, we should log when there is more than one zone per zonegroup with empty log_zonegroups set. Then, during the processing phase, when we fetch entries, we would check if the zonegroup of the requested zone (rgwx-zonegroup) matches ourselves (destination). If it does, we don’t apply any additional filtering on log_zonegroups and return all entries. and if it doesn't we just return the ones having rgwx-zonegroup in the log_zonegroups set.

that sounds right to me.

This would also require modifying the sync pipes to store a bucket’s zonegroup hint, instead of its zone, as constantly looking up the zonegroup for each request wouldn’t be efficient. Buckets don’t frequently move, and in cases of bucket recreation, we can update the policies as needed, maintaining compatibility with AWS—where even if a bucket is moved mid-replication, it still replicates.

sorry not clear, could you clarify what moving a bucket mean?

looks like we will need to introduce the concept of master zone in data sync.

sorry not clear, could you clarify what moving a bucket mean?

So if you have a policy on your source bucket pointing to a destination bucket in Region A and then you delete the destination bucket and create it in Region B, AWS will continue replicating from the source bucket to the newly created bucket in Region B without any modification needed by the user.

So, my question is: do we need to perform this validation on the pull side, or would it be sufficient to handle it during the logging phase, relying on the log_zonegroups field?

for the RGWBucketInfo overload of should_sync_from(), i was thinking we'd only use that on PutBucketReplication to reject any cross-zonegroup policy that's disabled by the admin. though i guess this would open a loophole if the destination bucket was deleted and recreated on a zonegroup that wasn't supposed to be enabled

cbodley · 2025-03-12T20:16:35Z

this is what i'm thinking for the associated radosgw-admin commands:

radosgw-admin realm cross-zonegroup <enable|allow|forbid> --rgw-realm myrealm
radosgw-admin realm same-zonegroup <enable|allow|forbid> --rgw-realm myrealm
radosgw-admin zonegroup cross-zonegroup <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>
radosgw-admin zonegroup cross-zonegroup rm <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>
radosgw-admin zonegroup same-zonegroup <enable|allow|forbid> --rgw-zonegroup myzonegroup

edit: zonegroup needs to track cross-zonegroup imports and exports separately:

radosgw-admin zonegroup import <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>
radosgw-admin zonegroup import rm <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>
radosgw-admin zonegroup export <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>
radosgw-admin zonegroup export rm <enable|forbid> --rgw-zonegroup myzonegroup --peer-zonegroup-id <uuid>

github-actions · 2025-05-05T16:08:45Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Signed-off-by: Casey Bodley <[email protected]>

cbodley · 2025-05-06T16:33:51Z

squashed/rebased over conflicts from #62398

github-actions · 2025-06-18T17:22:41Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

github-actions · 2025-08-17T19:01:33Z

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

github-actions · 2025-10-17T19:01:38Z

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

github-actions · 2025-11-16T20:01:59Z

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

github-actions · 2026-01-20T19:19:19Z

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

github-actions · 2026-02-19T21:06:33Z

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

github-actions bot added the rgw label Feb 17, 2025

cbodley requested review from clwluvw and smanjara February 18, 2025 19:42

clwluvw approved these changes Feb 18, 2025

View reviewed changes

This was referenced Feb 18, 2025

rgw: add support bucket replication between zonegroups #59911

Closed

rgw: add support replication actions in policy #61962

Merged

cbodley force-pushed the wip-rgw-cross-zonegroup-config branch from e6ee79a to 3d7b349 Compare March 12, 2025 20:01

github-actions bot added the tests label Mar 12, 2025

cbodley force-pushed the wip-rgw-cross-zonegroup-config branch from 8c740ff to e54323f Compare March 17, 2025 14:14

github-actions bot added the needs-rebase label May 5, 2025

cbodley added 4 commits May 6, 2025 11:27

rgw: RGWRealm stores default cross-zonegroup configuration

91fb8b7

Signed-off-by: Casey Bodley <[email protected]>

rgw: RGWZoneGroup stores overrides to cross-zonegroup configuration

c02fbf2

Signed-off-by: Casey Bodley <[email protected]>

rgw: should_sync_from() applies cross-zonegroup configuration

b9996b8

Signed-off-by: Casey Bodley <[email protected]>

radosgw-admin: realm/zonegroup commands for replication policy

d1e9c45

Signed-off-by: Casey Bodley <[email protected]>

cbodley force-pushed the wip-rgw-cross-zonegroup-config branch from 59f752f to d1e9c45 Compare May 6, 2025 16:33

github-actions bot removed the needs-rebase label May 6, 2025

github-actions bot added the needs-rebase label Jun 18, 2025

github-actions bot added the stale label Aug 17, 2025

clwluvw removed the stale label Aug 18, 2025

github-actions bot added the stale label Oct 17, 2025

github-actions bot closed this Nov 16, 2025

clwluvw reopened this Nov 21, 2025

github-actions bot removed the stale label Nov 21, 2025

github-actions bot added the stale label Jan 20, 2026

github-actions bot closed this Feb 19, 2026

Comments

Conversation

cbodley commented Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

realm default and zonegroup overrides

opt-in vs opt-out

import vs export

wildcards

TODO:

Uh oh!

clwluvw left a comment

Choose a reason for hiding this comment

Uh oh!

clwluvw Feb 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

smanjara Feb 19, 2025

Choose a reason for hiding this comment

Uh oh!

smanjara Feb 19, 2025

Choose a reason for hiding this comment

Uh oh!

clwluvw Feb 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cbodley Feb 19, 2025

Choose a reason for hiding this comment

Uh oh!

cbodley commented Mar 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented May 5, 2025

Uh oh!

cbodley commented May 6, 2025

Uh oh!

github-actions bot commented Jun 18, 2025

Uh oh!

github-actions bot commented Aug 17, 2025

Uh oh!

github-actions bot commented Oct 17, 2025

Uh oh!

github-actions bot commented Nov 16, 2025

Uh oh!

github-actions bot commented Jan 20, 2026

Uh oh!

github-actions bot commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

cbodley commented Feb 17, 2025 •

edited

Loading

clwluvw Feb 18, 2025 •

edited

Loading

clwluvw Feb 19, 2025 •

edited

Loading

cbodley commented Mar 12, 2025 •

edited

Loading