Skip to content

Comments

rgw/cloud-restore [PART1] : Add Restore support from Glacier/Tape cloud endpoints#61745

Merged
thotz merged 3 commits intoceph:mainfrom
soumyakoduri:wip-skoduri-glacier-cli
Mar 21, 2025
Merged

rgw/cloud-restore [PART1] : Add Restore support from Glacier/Tape cloud endpoints#61745
thotz merged 3 commits intoceph:mainfrom
soumyakoduri:wip-skoduri-glacier-cli

Conversation

@soumyakoduri
Copy link
Contributor

@soumyakoduri soumyakoduri commented Feb 10, 2025

This PR is the first in a series of enhancements to the restore feature, adding support for S3 Glacier cloud services.

Design doc: https://docs.google.com/document/d/1rzLJAzHK6cLuzJswgoplgugNOFCKi8NrjPOGfgVrIFg/edit?tab=t.0#heading=h.sgrmb31roboc

Unlike regular S3 cloud services, restoring objects from S3/Tape or AWS Glacier services would require special handling. We need to first restore the object using Glacier RestoreObject API and then download it using GET.
https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html

This PR handles below support -

  1. Add new tier-type & options related to S3 Glacier
    A new cloud tier-type s3-glacier is added to handle S3 Glacier endpoints along with below tier-config options -
    glacier_restore_days - lifetime of the restored copy on the Glacier endpoint ; default: 1 day
    glacier_restore_tier_type - Retrieval tier at which the restore will be processed.
    Only "Standard" (default) and "Expedited" options are supported.

In addition, a new option restore_storage_class is added to configure the storage class the objects need to be restored to. Default value: STANDARD

  1. Add Restore support from Glacier/Tape cloud endpoints
    This PR handles "Expedited" tier retrieval type. That means the cases where in restore would be quick and the object can be downloaded soon.

TODO: "Standard" tier-type support. Need to handle the case where in restore from cloud endpoint could take a longer time and need to be monitored periodically in the background.

TODO: Update cloud-transition doc & add tests.

Note: The first commit is already part of #61569 and will be rebased once merged.

Fixes: https://tracker.ceph.com/issues/68012
Signed-off-by: Soumya Koduri [email protected]

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

}

bool is_tier_type_s3() {
return (tier_type == "cloud-s3" || tier_type == "cloud-s3-glacier");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO better to seperate them out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would still need one common routine for all s3-* style tier types as most of the S3 endpoint options and processing would be common. And there is already one defined for just "glacier" below to use when needed specific to glacier

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from 61d2bd5 to 4963de8 Compare March 1, 2025 17:44
@soumyakoduri soumyakoduri changed the title rgw/cloud-restore: Add new tier-type & options related to S3 Glacier rgw/cloud-restore [PART1] : Add Restore support from Glacier/Tape cloud endpoints Mar 1, 2025
@soumyakoduri soumyakoduri requested a review from adamemerson March 1, 2025 17:50

/** Get the type of this tier */
virtual const std::string& get_tier_type() = 0;
/** Is the type of this tier cloud-s3/clous-s3-glacier */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clous -> cloud

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


/* Read object or just head from remote endpoint.
*/
int rgw_cloud_tier_restore_object(RGWLCCloudTierCtx& tier_ctx, bool head,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

head is not used in this function, rgw_cloud_tier_get_object() with a boolean value directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

RGWRadosPutObj cb(dpp, cct, plugin, compressor, &processor, progress_cb, progress_data,
[&](map<string, bufferlist> obj_attrs) {
// XXX: do we need filter() like in fetch_remote_obj() cb
// XXX: do we need filter() lke in fetch_remote_obj() cb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unintented change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

* for permanent restore
*/
dest_placement.storage_class = RGW_STORAGE_CLASS_STANDARD;
// dest_placement.storage_class = RGW_STORAGE_CLASS_STANDARD;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will be the storage in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is now updated in Line:5281 above. A new option "restore_storage_class" is added (default:STANDARD) to configure restored object storage-class.

Copy link
Contributor

@thotz thotz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments overall LGTM, saw two swap files related to vim in PR

@soumyakoduri
Copy link
Contributor Author

Minor comments overall LGTM, saw two swap files related to vim in PR

Thanks. Addressed them.

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from 4963de8 to 05e4bbd Compare March 5, 2025 06:48
@github-actions
Copy link

github-actions bot commented Mar 6, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from 05e4bbd to a3d412a Compare March 6, 2025 16:58
@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from a3d412a to d4241c4 Compare March 7, 2025 08:51
@soumyakoduri
Copy link
Contributor Author

jenkins test make check

@soumyakoduri
Copy link
Contributor Author

jenkins test make check

Copy link
Member

@ivancich ivancich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever I see so many duplicates of hard-coded strings, I think that argues for a macro (old style) or (better) (static const or static constexpr). That avoids potential minor typo-issues and allows the compiler to do some checking.

A few that I noticed:

  • "cloud-s3-glacier"
  • "cloud-s3"
  • "glacier_restore_days"
  • "restore_storage_class"


if (!s.empty()){
const char *r_str = "ongoing-request=\"true\"";
char *found = strstr((char*)s.c_str(), r_str);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit -- why is there a cast to (char*)? strstr takes a const char* as a parameter and that's what c_str() returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

o.back()->tier_type = "cloud-s3";
o.back()->storage_class = "STANDARD";
o.back()->allow_read_through = false;
o.back()->restore_storage_class = "STANDARD";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you have a macro RGW_STORAGE_CLASS_STANDARD for this, maybe use it instead of hard-coding the string here and a few lines above. (I realize you're just duplicating the local coding style.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from d4241c4 to 6ffe699 Compare March 14, 2025 10:35
@soumyakoduri
Copy link
Contributor Author

Whenever I see so many duplicates of hard-coded strings, I think that argues for a macro (old style) or (better) (static const or static constexpr). That avoids potential minor typo-issues and allows the compiler to do some checking.

A few that I noticed:

* "cloud-s3-glacier"

* "cloud-s3"

* "glacier_restore_days"

* "restore_storage_class"

Thanks Eric. I have addressed string literals "cloud-s3-glacier" and "cloud-s3" as suggested above in c6559af . Please review.

Unlike regular S3 cloud services, restoring objects from S3/Tape
or AWS Glacier services would require special handling. We need to
first restore the object using Glacier `RestoreObject` API and then
download it using `GET`.
https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html

A new cloud tier-type `s3-glacier` is added to handle S3 Glacier
endpoints along with below tier-config options -

`glacier_restore_days` - lifetime of the restored copy on the Glacier
			 endpoint ; default: 1 day
`glacier_restore_tier_type` - Retrieval tier at which the restore will be processed.
			      Only "Standard" (default) and "Expedited" options
			      are supported.

In addition, a new option `restore_storage_class` is added to configure
the storage class the objects need to be restored to. Default value:
STANDARD

Design doc: https://docs.google.com/document/d/1rzLJAzHK6cLuzJswgoplgugNOFCKi8NrjPOGfgVrIFg/edit?tab=t.0#heading=h.sgrmb31roboc

Signed-off-by: Soumya Koduri <[email protected]>
Unlike regular S3 cloud services, restoring objects from S3/Tape or AWS Glacier services
would require special handling. We need to first restore the object using Glacier
RestoreObject API and then download it using GET.
https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html

This PR adds that support for "Expedited" tier retrieval type. That means the
restore would be quick and the object can be downloaded soon.

TODO: "Standard" tier-type support. Need to handle the case where in restore from
cloud endpoint could take a longer time and need to be monitored periodically
in the background.

Signed-off-by: Soumya Koduri <[email protected]>
@soumyakoduri soumyakoduri force-pushed the wip-skoduri-glacier-cli branch from 6ffe699 to 300f643 Compare March 15, 2025 16:16
@soumyakoduri
Copy link
Contributor Author

jenkins test api

1 similar comment
@thotz
Copy link
Contributor

thotz commented Mar 20, 2025

jenkins test api

@thotz thotz requested review from cbodley and ivancich March 20, 2025 06:37
@thotz
Copy link
Contributor

thotz commented Mar 20, 2025

jenkins test api

@thotz thotz merged commit 6332769 into ceph:main Mar 21, 2025
11 checks passed
rzarzynski added a commit to rzarzynski/ceph that referenced this pull request Mar 26, 2025
This commit fixes an undetected merge conflict between PRs ceph#61745
and ceph#60159. The dencoding problem has been introduced very recently,
it is straightforward and causes failures of the make check bot
everywhere, therefore -- if no objections -- I want to merge this
patch without the Teuthology testing.

Signed-off-by: Radoslaw Zarzynski <[email protected]>
Comment on lines -563 to +612
ENCODE_START(3, 1, bl);
ENCODE_START(4, 1, bl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumyakoduri @thotz please note the fix for this in #62515 (has this been backported?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This feature is not present in squid upstream release. (Yet to be backported into downstream though.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet to be backported into downstream though

okay, thanks - just wanted to make sure we don't miss the fix downstream

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. Thank you! We weren't aware of this issue till now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants