Skip to content

Comments

squid: rgw: implement S3 additional checksum support#62312

Open
adamemerson wants to merge 37 commits intoceph:squidfrom
adamemerson:wip-63951-squid
Open

squid: rgw: implement S3 additional checksum support#62312
adamemerson wants to merge 37 commits intoceph:squidfrom
adamemerson:wip-63951-squid

Conversation

@adamemerson
Copy link
Contributor

@adamemerson adamemerson commented Mar 14, 2025

Adds new Blake3 digest format (native), a concrete type to
represent digests, and static_visitor machinery to unify varying
checksum computations.

This framework, together with new trailing checksum header support,
is used to implement S3 additional checksum verification. Parts of
the AWS content checksum API work build on a prior contribution from
imtzw [email protected].
Thank you!

Fixes: https://tracker.ceph.com/issues/42080
Fixes: https://tracker.ceph.com/issues/63951

squashed commits:

rgw_cksum: add trival test vectors for sha-format digests
Computed digests match those produced by sha1sum, sha256sum,
and sha512sum utilities.
rgw_cksum: add test vectors for blake3
Tests the same input strings with digests validated by
b3sum (https://crates.io/crates/b3sum).
rgw_ckum: switch to accel crc32c
The internal Ceph convention appears to be to omit a final
xor where ceph_crc32c is used, but it's required for compatibility
with AWS implementations.
rgw_cksum: add XXH3 digest
rgw_cksum: write class encoder for rgw::digest::Cksum
rgw_cksum: also reverse crc32c (REBASEME)
Mark noticed that the crc32c output was being tested against a
byteswapped value (crc32c also needs byteswap on LE).
rgw_cksum: add digest::Cksum serde tests
rgw_cksum: fix main(...) linkage
(so we run our main unit and not the one in gmock
rgw_cksum: convenience extensions for integration with RGW/S3
introduce rgw_cksum unique_ptr factory
rgw_cksum: mark string transform accessors const
rgw_cksum: fixup unittest_rgw_chksum compilation--all existing tests pass
rgw_cksum: hook up put-object checksum workflow
tweaks to report on content checksum mismatch
rgw_cksum: match SDK as well as general checksum header
make it more efficient
initialize RGWPutObj_Cksum::digest
rgw_cksum: write parse_cksum_type w/const char* arg
initialize _type correctly; doing armored wrong
fix expected checksum header name, clean up verify
fix output on checksum verify fail, cleanup
introduce Cksum::to_armor(); all AWS cases pass
oops, extra 0-byte at end of to_armor() result

Fixes: https://tracker.ceph.com/issues/66837

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

@adamemerson
Copy link
Contributor Author

Need to backport dependencies…

@adamemerson adamemerson modified the milestones: v19.2.2, squid Mar 14, 2025
@adamemerson adamemerson marked this pull request as ready for review March 14, 2025 23:00
@adamemerson adamemerson modified the milestones: squid, v19.2.2 Mar 14, 2025
@cbodley cbodley changed the title rgw: implement S3 additional checksum suppor squid: rgw: implement S3 additional checksum support Mar 17, 2025
Copy link
Contributor

@cbodley cbodley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per @mattbenjamin, we shouldn't merge to squid without followup changes in #61878

@adamemerson adamemerson marked this pull request as draft April 2, 2025 19:03
Support uses conditionally compiled implementations from spdk
and ISA-L.

Validated against SPDK test vectors and one example generated and
passed via awscliv2 (AWS SDK) version 2.24.5.

Restored the ability of unittest_rgw_cksum to create a binary input
file for external checksum testing, but only when requested via
cmdline option.

Fixes: https://tracker.ceph.com/issues/70040

Signed-off-by: Matt Benjamin <[email protected]>
These constructions provide conforming implemenations of Amazon
S3 CRC32 (ISO hdlc), CRC32C (iscsi), and CRC64/NVME checksums,
in particular, linear combining of adjacent checksums, by
Mark Adler.

Source: https://github.com/madler/crcany
License: approximate BSD with "optional" advertising clause

Signed-off-by: Matt Benjamin <[email protected]>
* internally compensate for at-rest byteswapped crc64 representation (e.g., before combine step)
* generalize Cksum crc api for 32bit and 64bit crcs, other cleanup
* prototype abstract cksum::Combiner workflow
* add support for forward and backward handling of composite vs full object checksums
  by marking the composites and the update flag day
* clean up checksum formatting and checksum type reporting
* add unit tests for Combiner interface
* validate and track requested checksum type (i.e., composite or full), plus
  unit test fixture for combinations of full matrix of cksum types
* doh.  GET/HEAD checksums are in headers
* add crcany license to COPYING
* return ChecksumType as header in GET/HEAD

    Found by Casey in review

* avoid fmt of char* null when no checksum header supplied

    A cksum_hdr_t(nullptr, nullptr) results in this case, and is intended,
    but its components obviously can't be presented to std::format unless
    cast to a safe pointer type.

* fail checksum mismatch with BadDigest

    When uploading an S3 object with an invalid checksum, the return code
    should be BadDigest to mirror more closely the AWS S3 implementation.
    See: https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html

    Fixes: https://tracker.ceph.com/issues/70614

    Reported by Alex Wojno <[email protected]>

* fix comparison and display of composite checksums

A string comparision bounds error and a bitmask comparison
    error are fixed.

* fix build on centos9

On gcc(?13?) we can't declare rgw::cksum::FLAG_NONE
    and also rgw::cksum::Cksum::FlAG_NONE.

    SAD!!

* include <variant> invariantly

* fix checksum type return from complete-multipart

    This one is in the XML response

* aieee, don't leak Combiners

   Use unique_ptr to polymorphic type...correctly.

Signed-off-by: Matt Benjamin <[email protected]>
Signed-off-by: matt benjamin <[email protected]>
@mattbenjamin
Copy link
Contributor

with today's update (also pullup), this backport contains the complete S3 additional checksum feature

@adamemerson
Copy link
Contributor Author

@anrao19 Could you please pull this into a run on squid when you do one?

@cbodley
Copy link
Contributor

cbodley commented Jul 18, 2025

@anrao19 Could you please pull this into a run on squid when you do one?

@adamemerson this needs to be validated against the appropriate checksum s3-tests. someone will need to do that manually

@adamemerson
Copy link
Contributor Author

@anrao19 Could you please pull this into a run on squid when you do one?

@adamemerson this needs to be validated against the appropriate checksum s3-tests. someone will need to do that manually

@mattbenjamin Can you backport the Checksum tests to a branch based on squid s3tests? I can run the tests against that.

@mattbenjamin
Copy link
Contributor

@adamemerson I've created ceph/s3-tests#685

@cbodley
Copy link
Contributor

cbodley commented Sep 9, 2025

@adamemerson I've created ceph/s3-tests#685

@adamemerson did qa include these test cases?

@adamemerson
Copy link
Contributor Author

@adamemerson I've created ceph/s3-tests#685

@adamemerson did qa include these test cases?

No. I'll run an RGW suite against it with the s3tests branch specified.

@cbodley
Copy link
Contributor

cbodley commented Sep 9, 2025

No. I'll run an RGW suite against it with the s3tests branch specified.

thanks. and sorry, i know it's a huge pain :(

@adamemerson
Copy link
Contributor Author

@mattbenjamin Do you have a branch of backported s3-tests? I think I remember you mentioning making one, but I don't recall the name.

@adamemerson
Copy link
Contributor Author

@mattbenjamin Do you have a branch of backported s3-tests? I think I remember you mentioning making one, but I don't recall the name.

Never mind, I see it there. Sorry

@adamemerson
Copy link
Contributor Author

@adamemerson
Copy link
Contributor Author

@yuriw We'll QA this one ourselves since it has s3-tests requirements

@yuriw
Copy link
Contributor

yuriw commented Oct 2, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants