rgw: add multisite configuration for cross-zonegroup replication #61862
Conversation
```cpp
/// Test whether a destination bucket should sync from the given source bucket.
bool should_sync_from(const SiteConfig& site,
                      const RGWZoneGroup& source);
```
I’ve been reflecting on where this should be invoked. In #59911, the approach I took was to mark bilog entries with the zones responsible for processing (log_zones), and I initially thought zones should unconditionally process those entries as the criteria (in terms of filters) had already been validated by the source zone when the entry was logged. This would cover most scenarios, aside from two edge cases—bucket deletion during replication or changes in permissions.
That approach was based on zones, as at the moment, sync pipes can only keep zones, not zonegroups. Based on your insights from #61140 (comment), I believe we need to shift focus from zones to zonegroups. The master can change over time, and we want to avoid missing any processing entries.
What I’m envisioning is this: during the logging phase, we still mark the relevant zonegroups based on the matching pipes, rather than individual zones. And if there were no matching pipes, we should still log with an empty log_zonegroups set whenever a zonegroup has more than one zone. Then, during the processing phase, when we fetch entries, we would check whether the zonegroup of the requesting zone (rgwx-zonegroup) matches our own (the destination). If it does, we don’t apply any additional filtering on log_zonegroups and return all entries; if it doesn’t, we only return the entries that have rgwx-zonegroup in their log_zonegroups set.
This would also require modifying the sync pipes to store a bucket’s zonegroup hint, instead of its zone, as constantly looking up the zonegroup for each request wouldn’t be efficient. Buckets don’t frequently move, and in cases of bucket recreation, we can update the policies as needed, maintaining compatibility with AWS—where even if a bucket is moved mid-replication, it still replicates.
So, my question is: do we need to perform this validation on the pull side, or would it be sufficient to handle it during the logging phase, relying on the log_zonegroups field? Perhaps just ANDing in the condition that replication is possible from my zonegroup to the destination bucket's zonegroup. We could then rename this to should_sync_to rather than should_sync_from, to make clear which side needs to consider it. Although, if it's only on the logging side, we would lose the immediate effect when an admin wants to stop replication from zonegroup A to B. So maybe both sides should be considered anyway.
What I’m envisioning is this: during the logging phase, we still mark the relevant zonegroups based on the matching pipes, rather than individual zones. And if there were no matching pipes, we should log when there is more than one zone per zonegroup with an empty `log_zonegroups` set. Then, during the processing phase, when we fetch entries, we would check if the zonegroup of the requested zone (`rgwx-zonegroup`) matches ourselves (destination). If it does, we don’t apply any additional filtering on `log_zonegroups` and return all entries, and if it doesn't we just return the ones having `rgwx-zonegroup` in the `log_zonegroups` set.
that sounds right to me.
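A minimal sketch of that fetch-side filtering, with all names here (`LogEntry`, `log_zonegroups`, `filter_entries`) assumed for illustration rather than taken from the actual patch:

```cpp
#include <set>
#include <string>
#include <vector>

struct LogEntry {
  std::string key;
  // zonegroups marked at logging time from the matching sync pipes
  std::set<std::string> log_zonegroups;
};

// Filter bilog entries for a fetch request. 'requester_zonegroup' would come
// from the rgwx-zonegroup header; 'my_zonegroup' is the zonegroup serving
// the request (the one that logged the entries).
std::vector<LogEntry> filter_entries(const std::vector<LogEntry>& entries,
                                     const std::string& requester_zonegroup,
                                     const std::string& my_zonegroup) {
  // same zonegroup: no extra filtering, return all entries
  if (requester_zonegroup == my_zonegroup) {
    return entries;
  }
  // cross-zonegroup request: only return entries whose log_zonegroups
  // set contains the requester's zonegroup
  std::vector<LogEntry> out;
  for (const auto& e : entries) {
    if (e.log_zonegroups.count(requester_zonegroup)) {
      out.push_back(e);
    }
  }
  return out;
}
```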
This would also require modifying the sync pipes to store a bucket’s zonegroup hint, instead of its zone, as constantly looking up the zonegroup for each request wouldn’t be efficient. Buckets don’t frequently move, and in cases of bucket recreation, we can update the policies as needed, maintaining compatibility with AWS—where even if a bucket is moved mid-replication, it still replicates.
sorry, not clear, could you clarify what moving a bucket means?
looks like we will need to introduce the concept of master zone in data sync.
sorry, not clear, could you clarify what moving a bucket means?
So if you have a policy on your source bucket pointing to a destination bucket in Region A and then you delete the destination bucket and create it in Region B, AWS will continue replicating from the source bucket to the newly created bucket in Region B without any modification needed by the user.
So, my question is: do we need to perform this validation on the pull side, or would it be sufficient to handle it during the logging phase, relying on the `log_zonegroups` field?
for the RGWBucketInfo overload of should_sync_from(), i was thinking we'd only use that on PutBucketReplication to reject any cross-zonegroup policy that's disabled by the admin. though i guess this would open a loophole if the destination bucket was deleted and recreated on a zonegroup that wasn't supposed to be enabled
force-pushed from e6ee79a to 3d7b349
this is what i'm thinking for the associated radosgw-admin commands:

edit: zonegroup needs to track cross-zonegroup imports and exports separately:
force-pushed from 8c740ff to e54323f
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
Signed-off-by: Casey Bodley <[email protected]>
force-pushed from 59f752f to d1e9c45
squashed/rebased over conflicts from #62398

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!
realm default and zonegroup overrides
defines the configuration for cross-zonegroup replication at the realm and zonegroup level. the default behavior is set by the realm, and individual zonegroups can override this to enable/disable cross-zonegroup replication to/from other zonegroups
opt-in vs opt-out
the realm configuration supports three models:
the realm uses a tristate enum `CanSync` to represent the default, while the zonegroup uses two sets ("enable" and "forbid") to represent the overrides

import vs export
zonegroups can override the realm behavior both for import and export. while this can duplicate information (that is, disabling A's import from B is the same as disabling B's export to A), one form will generally be simpler than the other. for example, consider a realm with zonegroups A, B, C with cross-zonegroup replication disabled to/from C. to represent this with imports only:
using both imports and exports, the configuration can be localized to zonegroup C:
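To make the two shapes concrete, here is a hypothetical sketch of the A/B/C example; the field names and JSON layout are assumptions for illustration, not the actual format from this PR. Imports only, where every zonegroup must carry an override:

```json
[
  {"zonegroup": "A", "import": {"forbid": ["C"]}},
  {"zonegroup": "B", "import": {"forbid": ["C"]}},
  {"zonegroup": "C", "import": {"forbid": ["A", "B"]}}
]
```

With both imports and exports, localized to zonegroup C:

```json
[
  {"zonegroup": "C", "import": {"forbid": ["A", "B"]}, "export": {"forbid": ["A", "B"]}}
]
```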
wildcards
the wildcard string "*" can be used to match all peers, allowing the example above to be simplified as `"forbid": {"*"}`. but more importantly, wildcards adapt as zonegroups are added/removed from the realm. so if another zonegroup D is added to the realm above, that use of wildcards would automatically disable its replication to/from C

TODO: