rgw: add per-zone sharded datalog backends#67456

Open
clwluvw wants to merge 9 commits into ceph:main from clwluvw:per-zone-datalog

@clwluvw (Member) commented Feb 22, 2026

RGW multisite sync currently uses a single shared datalog that all peer zones read from. When one zone trims entries it has already consumed, those entries become unavailable to other zones that may still need them. Recovery and trim operations must coordinate across all peers, creating unnecessary coupling.

This PR introduces per-zone datalogs: each consuming zone gets its own independent datalog stream on the source zone. The source zone writes entries to the appropriate per-zone backend, and each consumer reads and trims only its own stream without affecting others.
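To make the isolation property concrete, here is a minimal in-memory sketch. It is not the RGW implementation: the class `PerZoneDatalogModel`, its method names, and the integer entry ids are all hypothetical, and the model assumes every registered consumer zone receives every entry, which simplifies "the appropriate per-zone backend".

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical entry type; real datalog entries carry more fields.
struct Entry {
  uint64_t id;
  std::string key;
};

// Toy model: one independent stream per consuming zone.
class PerZoneDatalogModel {
 public:
  // Register a consuming zone so it gets its own stream.
  void add_consumer(const std::string& zone_id) { streams_[zone_id]; }

  // Source-side write: append the entry to every consumer's stream
  // (assumption: all consumers are interested in all entries).
  void append(const std::string& key) {
    const uint64_t id = ++next_id_;
    for (auto& kv : streams_) {
      kv.second.push_back(Entry{id, key});
    }
  }

  // Consumer-side read of one zone's stream only.
  std::vector<Entry> list(const std::string& zone_id) const {
    auto it = streams_.find(zone_id);
    if (it == streams_.end()) return {};
    return std::vector<Entry>(it->second.begin(), it->second.end());
  }

  // Trim one zone's stream up to and including up_to_id.
  // Other zones' streams are never touched.
  void trim(const std::string& zone_id, uint64_t up_to_id) {
    auto it = streams_.find(zone_id);
    if (it == streams_.end()) return;
    auto& stream = it->second;
    while (!stream.empty() && stream.front().id <= up_to_id) {
      stream.pop_front();
    }
  }

 private:
  std::map<std::string, std::deque<Entry>> streams_;
  uint64_t next_id_ = 0;
};
```

In this model, trimming the stream for one zone leaves every other zone's entries in place, which is the decoupling the PR is after.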

The feature is controlled by the per_zone_datalog zone feature flag, which is enabled by default on new zonegroups. When enabled, writes go exclusively to per-zone backends. When disabled, the legacy shared datalog is used unchanged. There is no dual-write phase: the switch is immediate based on the feature flag, which keeps the write path simple and avoids doubled RADOS I/O.
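The either/or nature of the write path can be sketched as follows. This is an illustration only: `Backend`, `WriteTarget`, and `select_write_targets` are hypothetical names, and the flag is modelled as a plain `bool` rather than a real zone feature lookup.

```cpp
#include <string>
#include <vector>

// Hypothetical backend kinds for the sketch.
enum class Backend { LegacyShared, PerZone };

struct WriteTarget {
  Backend backend;
  std::string zone_id;  // empty for the legacy shared datalog
};

// Choose where a datalog entry is written. The result is either the single
// legacy target or a set of per-zone targets, never a mix of both.
std::vector<WriteTarget> select_write_targets(
    bool per_zone_enabled,
    const std::vector<std::string>& consumer_zone_ids) {
  std::vector<WriteTarget> targets;
  if (!per_zone_enabled) {
    targets.push_back(WriteTarget{Backend::LegacyShared, ""});
    return targets;
  }
  for (const auto& zone_id : consumer_zone_ids) {
    targets.push_back(WriteTarget{Backend::PerZone, zone_id});
  }
  return targets;
}
```

Because the two branches are mutually exclusive, no entry is ever written to both the legacy and the per-zone logs in this sketch.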

On the sync consumer side, each zone now passes its own zone-id in HTTP requests to the source zone. The source zone's REST handlers route to the per-zone backend if one exists for that zone, and fall through to legacy otherwise. The trim path was extended with a per-zone variant that trims each zone's datalog based on that specific zone's sync progress rather than the minimum across all peers.
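The routing fallback and the difference between the two trim policies can be sketched as below. All names (`Route`, `route_request`, `legacy_trim_marker`, `per_zone_trim_marker`) are hypothetical, and sync progress is modelled as a single integer per zone, whereas real datalog markers are per-shard strings.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <set>
#include <string>

enum class Route { PerZone, Legacy };

// REST-side routing: use the per-zone backend if one exists for the
// requesting zone id, otherwise fall back to the legacy shared datalog.
Route route_request(const std::set<std::string>& per_zone_backends,
                    const std::string& requester_zone_id) {
  return per_zone_backends.count(requester_zone_id) ? Route::PerZone
                                                    : Route::Legacy;
}

// zone id -> how far that zone has synced (simplified to an integer).
using Progress = std::map<std::string, uint64_t>;

// Shared-log policy: trim only up to the slowest peer's position.
uint64_t legacy_trim_marker(const Progress& progress) {
  if (progress.empty()) return 0;
  uint64_t marker = progress.begin()->second;
  for (const auto& kv : progress) {
    marker = std::min(marker, kv.second);
  }
  return marker;
}

// Per-zone policy: trim a zone's own log up to that zone's position.
// Returns 0 (trim nothing) when the zone has no recorded progress.
uint64_t per_zone_trim_marker(const Progress& progress,
                              const std::string& zone_id) {
  auto it = progress.find(zone_id);
  return it == progress.end() ? 0 : it->second;
}
```

With progress `{zone-b: 100, zone-c: 40}`, the shared policy can trim only to 40, while the per-zone policy lets zone-b's log be trimmed to 100 without affecting zone-c.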

The radosgw-admin CLI gained a --log-zone flag that accepts a zone name for per-zone datalog operations (list, status, trim, etc.).

Fixed a pre-existing bug in the period POST handler where create_period() internally calls update_latest_epoch(), making the subsequent explicit call in the handler always return -EEXIST. This caused the handler to skip the realm reload notification, preventing gateways from discovering newly added zones after a period commit.
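The control-flow problem can be reproduced in a small model. The function names `create_period` and `update_latest_epoch` mirror those mentioned above, but their bodies, the `LatestEpochStore` type, and both handler functions are hypothetical stand-ins; in particular, the "fixed" handler shows only one possible shape of a fix, not necessarily the one in this PR.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical store for the latest epoch of a period.
struct LatestEpochStore {
  uint64_t latest = 0;

  // Advances the latest epoch; returns -EEXIST if it is already >= epoch.
  int update_latest_epoch(uint64_t epoch) {
    if (epoch <= latest) return -EEXIST;
    latest = epoch;
    return 0;
  }
};

// Stand-in for create_period(): it already updates the latest epoch itself.
int create_period(LatestEpochStore& store, uint64_t epoch) {
  return store.update_latest_epoch(epoch);
}

// Buggy handler: the explicit second update always sees -EEXIST, so the
// realm reload notification is skipped. Returns true if it would notify.
bool post_period_buggy(LatestEpochStore& store, uint64_t epoch) {
  if (create_period(store, epoch) < 0) return false;
  const int r = store.update_latest_epoch(epoch);  // always -EEXIST here
  if (r < 0) return false;                         // notification skipped
  return true;
}

// One possible fix (assumption): rely on create_period() alone.
bool post_period_fixed(LatestEpochStore& store, uint64_t epoch) {
  if (create_period(store, epoch) < 0) return false;
  return true;  // notification is sent
}
```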

Added a safety guard in the coroutine framework against operating on completed stacks, and fixed a shutdown crash in neorados caused by circular ownership between Notifier and LingerOp objects.
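For readers unfamiliar with the circular-ownership hazard, the sketch below shows it in generic form. It is not neorados code: the types are hypothetical, and replacing one strong reference with `std::weak_ptr` is a common remedy assumed here for illustration, not a description of how this PR resolves the Notifier/LingerOp cycle.

```cpp
#include <memory>

// Strong cycle: each object owns the other, so neither is ever destroyed.
struct StrongOp;
struct StrongNotifier {
  std::shared_ptr<StrongOp> op;
  int* destroyed;
  explicit StrongNotifier(int* d) : destroyed(d) {}
  ~StrongNotifier() { ++*destroyed; }
};
struct StrongOp {
  std::shared_ptr<StrongNotifier> notifier;
};

// Returns how many notifiers were destroyed when the owners went away.
int destroyed_with_strong_cycle() {
  int destroyed = 0;
  std::weak_ptr<StrongNotifier> watch;
  {
    auto notifier = std::make_shared<StrongNotifier>(&destroyed);
    auto op = std::make_shared<StrongOp>();
    notifier->op = op;
    op->notifier = notifier;
    watch = notifier;
  }
  const int observed = destroyed;  // still 0: the cycle keeps both alive
  // Break the cycle by hand so the sketch itself does not leak.
  if (auto notifier = watch.lock()) notifier->op.reset();
  return observed;
}

// Weak back-reference: the op observes the notifier without owning it.
struct WeakOp;
struct OwningNotifier {
  std::shared_ptr<WeakOp> op;
  int* destroyed;
  explicit OwningNotifier(int* d) : destroyed(d) {}
  ~OwningNotifier() { ++*destroyed; }
};
struct WeakOp {
  std::weak_ptr<OwningNotifier> notifier;
};

int destroyed_with_weak_backref() {
  int destroyed = 0;
  {
    auto notifier = std::make_shared<OwningNotifier>(&destroyed);
    auto op = std::make_shared<WeakOp>();
    notifier->op = op;
    op->notifier = notifier;
  }
  return destroyed;  // 1: the notifier was released on scope exit
}
```

With the strong cycle, destruction never happens at the expected point, which is the kind of lifetime confusion that can surface as a crash during shutdown.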

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)