rgw: fetch_remote_obj() need to re-encrypt sse-s3 encrypted objects#62272

Open
clwluvw wants to merge 11 commits intoceph:mainfrom
clwluvw:sse-s3-fetch

Conversation

@clwluvw
Member

@clwluvw clwluvw commented Mar 13, 2025

As SSE-S3 encrypted objects are tied to the source bucket's encryption key, they must be re-encrypted using the destination bucket's key to eliminate any dependencies on the source bucket's key.
The same principle applies to replication.

Fixes: https://tracker.ceph.com/issues/70446

@clwluvw clwluvw marked this pull request as ready for review March 19, 2025 16:59
@clwluvw clwluvw requested a review from a team as a code owner March 19, 2025 16:59
Comment on lines 703 to 706
auto crypt_mode = attrs.find(RGW_ATTR_CRYPT_MODE);
if (crypt_mode != attrs.end()) {
if (crypt_mode->second.to_str() != "AES256") {
// AES256 needs to be decrypted so that the target zone can re-encrypt with destination bucket's key
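Read on its own, the quoted condition amounts to the following minimal sketch (the attribute name, types, and helper function are simplified stand-ins for illustration, not the actual RGW code, where attrs maps attribute names to bufferlists):

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative stand-in for the real RGW attribute name.
static const char* RGW_ATTR_CRYPT_MODE = "user.rgw.crypt-mode";

// True when the object should be decrypted for replication: only
// SSE-S3 ("AES256") objects are tied to the bucket's key and need
// re-encryption with the destination bucket's key on the other side.
bool should_decrypt_for_replication(
    const std::map<std::string, std::string>& attrs) {
  auto crypt_mode = attrs.find(RGW_ATTR_CRYPT_MODE);
  if (crypt_mode == attrs.end())
    return false;  // unencrypted object: nothing to do
  return crypt_mode->second == "AES256";  // SSE-S3 only
}
```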
Contributor

with rgw's default replication for DR, the source bucket and destination bucket are always the same, where i assume we'd never need to re-encrypt the objects. can we find a way to make this specific to bucket replication policy?

Member Author

Agree, the simplest approach would be to consider the rgwx-zonegroup argument if we had a clear distinction between zone replication and zonegroup replication.
However, for now, we could take the skip-encryption argument more seriously and always skip encryption when this argument is present. This way, the destination zone can decide whether encryption needs to be skipped by comparing the source and destination buckets.
If the argument is not present, the source zone would make the decision based on the source object’s encryption mode. In this case, we might need to check whether the request is for replication, which may require introducing another argument (the existing ones are implicit for that i think).

Alternatively, we could introduce a new argument, skip-enc-sse-s3, which the destination zone would send when the source and destination buckets are different. This would allow the source zone to decide based on either skip-encryption or skip-enc-sse-s3. What do you think?
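The two-flag scheme proposed here could be sketched roughly like this (flag and function names are illustrative, and skip-enc-sse-s3 is only a proposal, not an existing argument):

```cpp
#include <cassert>
#include <string>

// skip_encryption is the existing argument; skip_enc_sse_s3 is the
// proposed one, sent by the destination when src and dest buckets
// differ.  crypt_mode is the object's stored crypt mode ("" when
// unencrypted).  The source serves plaintext only when the destination
// asked for SSE-S3 re-encryption and the object is actually SSE-S3.
bool source_should_decrypt(bool skip_encryption,
                           bool skip_enc_sse_s3,
                           const std::string& crypt_mode) {
  if (skip_encryption)
    return false;  // destination wants the ciphertext as-is
  if (skip_enc_sse_s3)
    return crypt_mode == "AES256";  // decrypt SSE-S3 objects only
  return false;  // neither flag: keep the current behavior
}
```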

Contributor

it's hard to make changes here that preserve compatibility between ceph versions

maybe we should leave the server side unchanged (always skip decrypt), and rely on fetch_remote_obj() to fetch both source and destination bucket keys so it can decrypt/reencrypt the body itself when necessary

multisite really shouldn't transfer objects in plaintext when they're supposed to be encrypted on both sites
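As a rough illustration of this suggestion (a toy XOR transform stands in for the real SSE-S3 cipher, and the function names are invented): the fetching side decrypts with the source bucket's key and re-encrypts with the destination bucket's key, so the wire and at-rest copies stay encrypted.

```cpp
#include <cassert>
#include <string>

// Toy cipher for illustration only: XOR with a repeating key is its
// own inverse, which lets the round trip below be checked directly.
std::string xor_transform(std::string data, const std::string& key) {
  for (size_t i = 0; i < data.size(); ++i)
    data[i] ^= key[i % key.size()];
  return data;
}

// fetch_remote_obj()-style client-side re-encryption: decrypt the
// fetched body with the source bucket's key, then re-encrypt it with
// the destination bucket's key, without the server side changing.
std::string reencrypt(const std::string& body,
                      const std::string& src_key,
                      const std::string& dest_key) {
  return xor_transform(xor_transform(body, src_key), dest_key);
}
```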

Member Author

@clwluvw clwluvw Mar 19, 2025

it's hard to make changes here that preserve compatibility between ceph versions

With the introduction of skip-enc-sse-s3: if the destination requests it from a source that hasn't been upgraded yet, the object will only be retrieved in plain text if it's actually encrypted with SSE-S3 (that's the case where we don't send skip-encryption and only send skip-enc-sse-s3, which the source doesn't understand yet), and that is fine because the destination will create the keys and encrypt it again. Otherwise, the destination will send skip-encryption anyway, and the object will remain encrypted. The only scenario where this might not work is when both the source and destination are the same and we don't want to decrypt. However, this would only impact performance until the upgrade is complete, which I believe is negligible.

Vice-versa, the old version always requests skip-encryption, and the upgraded version will always respect that.

multisite really shouldn't transfer objects in plaintext when they're supposed to be encrypted on both sites

Is this due to any security constraints? if so isn't (enforcing) https enough for that?

Contributor

multisite really shouldn't transfer objects in plaintext when they're supposed to be encrypted on both sites

Is this due to any security constraints? if so isn't (enforcing) https enough for that?

tls helps, but what does enforcement mean? failing bucket replication for sse-s3-encrypted objects, i guess?

Member Author

client traffic isn't affected

i wasn't referring only to client traffic, but to the replication time and eventually the replication performance at scale.

i'm not sure it makes much difference?

But I'm not sure either. if we don't have a good way to do it the other way, i guess the best is to start with the approach you suggested, measure the performance, and see if this is really a problem, so we can improve it later (even in a less clean way) with a justification.

Member Author

but to decrypt and reencrypt perhaps we would need some implementations from #54543

Contributor

hmm, i find it a bit inefficient to decrypt and encrypt on the destination zone at the same time; i thought, if we could leave the decryption to the source zone, why not?

sorry, i thought we were talking about the cpu overhead of decryption. we either need to pay that cost on the server side or the client side, and either will count towards replication latency. am i missing something?

Member Author

@clwluvw clwluvw Mar 20, 2025

no, that is true. i just thought that if we distribute the load (decryption to the source and encryption to the destination), it could yield better resource utilisation, in the big picture across all clusters, than concentrating it all in one place.

Member Author

it's hard to make changes here that preserve compatibility between ceph versions

Another approach i could think of:
What if we base the decision to apply the encryption filter on a response header, such as x-rgw-decrypted?
This would allow us to determine whether the source zone we're interacting with has been upgraded and whether the content has actually been decrypted. If it has, we could then apply the encryption filter accordingly. If not, the sync would maintain its current behavior, ensuring backward compatibility.
We would need to introduce skip-decrypt-sse-s3 anyway, to hint the source not to decrypt sse-s3-encrypted objects when the replication is between the same buckets.
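A minimal sketch of this header-based negotiation, assuming the hypothetical x-rgw-decrypted response header and skip-decrypt-sse-s3 argument (all names here are illustrative):

```cpp
#include <cassert>
#include <string>

// Destination side: when src and dest buckets are the same, the keys
// match, so hint the source not to decrypt SSE-S3 objects.
bool send_skip_decrypt_sse_s3(const std::string& src_bucket,
                              const std::string& dest_bucket) {
  return src_bucket == dest_bucket;
}

// Destination side: apply the encryption filter only when the source
// confirmed via the x-rgw-decrypted response header that it actually
// served plaintext.  An old (non-upgraded) source never sets the
// header, so the sync keeps its current behavior.
bool apply_encryption_filter(bool x_rgw_decrypted_present) {
  return x_rgw_decrypted_present;
}
```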

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions

github-actions bot commented Jul 7, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved


@github-actions github-actions bot added the stale label Sep 12, 2025
@clwluvw clwluvw removed the stale label Sep 12, 2025

@github-actions github-actions bot added the stale label Nov 11, 2025
@clwluvw clwluvw removed the stale label Nov 12, 2025
The requesting user can be a non-bucket-owner, so picking the tenant can
result in different contexts. Picking the bucket owner ensures a
consistent context.

Signed-off-by: Seena Fallah <[email protected]>
As sse-s3 encryption is bound to the bucket's key, we should always
serve the objects decrypted so they can be re-encrypted with the
destination bucket's key id.

Signed-off-by: Seena Fallah <[email protected]>
When the source object is encrypted via sse-s3, fetch_remote_obj()
needs to pause the receive, generate a new sse-s3 key for the
destination bucket and object, and then continue receiving and
processing based on the new encryption filter.

Signed-off-by: Seena Fallah <[email protected]>
Passing decrypt-mode rather than a boolean skip-decrypt allows the
dest zone to pass "skip-except-sse-s3" when the src and dest
buckets are not the same, and "skip" when they are.
With that, the source zone can make the right call based on the string
provided by the decrypt-mode param.

Signed-off-by: Seena Fallah <[email protected]>
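The decrypt-mode dispatch described in this commit message could be sketched as follows (the function name and the handling of absent or unknown modes are assumptions, not the actual implementation):

```cpp
#include <cassert>
#include <string>

// decrypt_mode is the string sent by the destination zone; crypt_mode
// is the source object's stored crypt mode ("" when unencrypted).
bool source_decrypts(const std::string& decrypt_mode,
                     const std::string& crypt_mode) {
  if (decrypt_mode == "skip")
    return false;  // same src/dest bucket: keep the ciphertext
  if (decrypt_mode == "skip-except-sse-s3")
    return crypt_mode == "AES256";  // different buckets: decrypt SSE-S3
  return false;  // absent/unknown mode: keep the current behavior
}
```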
@github-actions

github-actions bot commented Feb 1, 2026

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions

github-actions bot commented Feb 3, 2026

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions github-actions bot removed the stale label Feb 3, 2026