Skip to content

ceph: arrow 19 compilation fix with protobuf 30; snappy: restore of RTTI patch#427905

Merged
nh2 merged 2 commits intoNixOS:stagingfrom
benaryorg:ceph-arrow-19-fix
Jul 29, 2025
Merged

ceph: arrow 19 compilation fix with protobuf 30; snappy: restore of RTTI patch#427905
nh2 merged 2 commits intoNixOS:stagingfrom
benaryorg:ceph-arrow-19-fix

Conversation

@benaryorg
Copy link
Contributor

@benaryorg benaryorg commented Jul 23, 2025

It seems that Arrow 20 which broke compilation initially causes issues with our tests when the patch which was merged upstream is applied. At the same time the bundled Arrow 17 can't work with current Re2 due to its C++ version. The thing that broke the Arrow 19 build is because Arrow 19 is itself not compatible with Protobuf 30, which is pulled in somewhere along the dependency graph by something else causing linking issues. However, the patch for Protobuf 30 compat applies cleanly to our Arrow 19, which means we can dump the older version pin and instead use this patch to get Arrow 19 working again.


Resolves #426401
Supersedes #426609 (here's to hoping the next Ceph major version will have less issues with arrow)

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

@benaryorg benaryorg mentioned this pull request Jul 23, 2025
3 tasks
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. labels Jul 23, 2025
@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch from 7d7e7c7 to f0ce964 Compare July 24, 2025 11:24
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-linux: 1001-2500 This PR causes many rebuilds on Linux and should target the staging branches. 10.rebuild-darwin: 501-1000 This PR causes many rebuilds on Darwin and should normally target the staging branches. and removed 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. labels Jul 24, 2025
@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch 2 times, most recently from 86883ca to a30308e Compare July 26, 2025 15:36
@benaryorg benaryorg changed the base branch from master to staging July 26, 2025 15:36
@nixpkgs-ci nixpkgs-ci bot closed this Jul 26, 2025
@nixpkgs-ci nixpkgs-ci bot reopened this Jul 26, 2025
@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch from a30308e to a46c5ab Compare July 26, 2025 15:50
@benaryorg
Copy link
Contributor Author

benaryorg commented Jul 26, 2025

  • changed the target to staging due to the number of rebuilds
  • as discussed in the associated issue the snappy patch was reintroduced without a downgrade
  • bump said patch to the current sidestream patch version
  • I'll mark this as ready for review (and update the OP with the currently commented out "Resolves" and "Supersedes") statements after the Hydra tests have succeeded I have underestimated the resources required to build the staging based branch, I'll mark this as ready for review early

@benaryorg benaryorg marked this pull request as ready for review July 26, 2025 17:39
@nh2
Copy link
Contributor

nh2 commented Jul 26, 2025

  • as discussed in the associated issue the snappy patch was reintroduced without a downgrade

Direct link: #426401 (comment)

@nh2 nh2 changed the title ceph: arrow 19 compilation fix with protobuf 30 ceph: arrow 19 compilation fix with protobuf 30; snappy: restore of RTTI patch Jul 26, 2025
@nh2
Copy link
Contributor

nh2 commented Jul 26, 2025

@trofi @pbsds in this PR we revert the removal of the snappy RTTI patch from:

#406663 (review)

Please point out if for some reason there might a problem with that (even better, do you know why that was removed in the first place?).

@trofi
Copy link
Contributor

trofi commented Jul 27, 2025

@trofi @pbsds in this PR we revert the removal of the snappy RTTI patch from:

#406663 (review)

Please point out if for some reason there might a problem with that (even better, do you know why that was removed in the first place?).

To copy #406663 (comment) here:

I think (I don't remember' sorry) I removed the RTTI patch because it was rejected upstream google/snappy#144 and was failing to apply downstream.

@benaryorg
Copy link
Contributor Author

I think (I don't remember' sorry) I removed the RTTI patch because it was rejected upstream and was failing to apply downstream.

I'll add a comment for that then to avoid it happening in the future.
The referenced patch is revisioned and regularly updated but that's not readily apparent, but usually bumping it takes a bit of clicking around to get the right rev off the site and the new one should apply.

@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch from a46c5ab to ad66cba Compare July 27, 2025 08:15
@benaryorg benaryorg requested a review from nh2 July 27, 2025 08:16
@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch from ad66cba to 9e04fd9 Compare July 27, 2025 08:19
@benaryorg
Copy link
Contributor Author

@trofi seems I can't request a review since you're not a member of the GitHub group, so I'll ping you directly; could you please just give a nod on whether or not the comment in this version is sufficient information to convey the importance of the patch, and also to provide clear enough instructions to keep it updated in the future (the site feels a bit tedious to navigate on if you're looking for the current rev IMHO)?

@nh2 if you could have a glance at the PR in its current state, and double check the Ceph side of things that'd be great.

If both of you approve the PR I think we can merge it given that it targets staging anyway, so we'll see any failures on upstream Hydra, and we can fix those using commits to staging-next to avoid extra cycles and get that merged into master soon (enough).

@trofi
Copy link
Contributor

trofi commented Jul 27, 2025

@trofi seems I can't request a review since you're not a member of the GitHub group, so I'll ping you directly; could you please just give a nod on whether or not the comment in this version is sufficient information to convey the importance of the patch, and also to provide clear enough instructions to keep it updated in the future (the site feels a bit tedious to navigate on if you're looking for the current rev IMHO)?

Restoring the patch sounds fine to me.

But to address longer term maintainability health it would be nice if ceph would not require patched source of snappy.

@nixpkgs-ci nixpkgs-ci bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Jul 27, 2025
@benaryorg
Copy link
Contributor Author

But to address longer term maintainability health it would be nice if ceph would not require patched source of snappy.

As far as I understand (which may well be wrong), any application using dlopen but still requiring to extend a certain class does require that patch, and upstream has rejected that patch on grounds that they want to use the same compilation environment as Google Chrome regardless of breaking other software.
My impression is that Ceph's usage is legitimate and absolutely within what software normally does (given that Ceph uses a lot of Python the dlopen part does make sense).

grumble

I do not see a future where the patch is not needed, and also given that Google is a very small startup with limited resources and barely any involvement with build systems or CI I fully understand why they simply cannot afford the maintenance burden of a change which removes code actively messing with provided compile options.
If upstream provided those options at compile time externally (i.e. within their own build) instead of literally search and replacing them within cmake, none of this would be required in the first place but – again – such a small company sometimes has to cut corners lest they lose out to bigger competitors on the market.

@trofi
Copy link
Contributor

trofi commented Jul 27, 2025

Added the comment in google/snappy#189 (comment)

@nh2
Copy link
Contributor

nh2 commented Jul 27, 2025

I also think it's not a big thing to carry that CMake-only patch perpetually: Red Hat can have their dislike for static linking and Google can have a dislike for dlopen(); adjusting that on our side is OK in my opinion, if the changes stay indeed a 2-line CMakeLists.txt diff.

@benaryorg I think the added comment

    # While the patch was rejected upstream, it does not make it any less necessary to carry forward.
    # ==> lack of RTTI *breaks* Ceph (and others) <==

is very good and makes the story clear.

Copy link
Contributor

@nh2 nh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Let's ship it to staging.

@nh2
Copy link
Contributor

nh2 commented Jul 27, 2025

The ofborg failed with error: Nix daemon disconnected unexpectedly (maybe it crashed?)

@nixpkgs-ci nixpkgs-ci bot added 12.approvals: 2 This PR was reviewed and approved by two persons. and removed 12.approvals: 1 This PR was reviewed and approved by one person. labels Jul 27, 2025
@nh2
Copy link
Contributor

nh2 commented Jul 27, 2025

@benaryorg: One problem: When porting the PR onto 7fd36ee (origin/nixos-unstable) I get:

error: hash mismatch in fixed-output derivation '/nix/store/106kxr13p7105pbhwpxqdp4jmvm8k6rk-reenable-rtti.patch?rev=e3449869b466869fc6b8a03a1a528fa6.drv':
         specified: sha256-RMuM5yd6zP1eekN/+vfS54EyY4cFbGDVor1E1vj3134=
            got:    sha256-JhVhkHh7XPx1Bzf5xnOgWLgwh1oihX3O+emQWzE4Dho=

Copy link
Contributor

@nh2 nh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting back to "request changes" just to check where the hash mismatch comes from; beyond that it's still good to merge.

This is required for *Ceph* to compile, as the comment states, and was accidentally removed.
A similar patch is also present for *leveldb* for instance.
Also bumps the patch to the appropriate version.

partial revert of 5240499

Signed-off-by: benaryorg <[email protected]>
It seems that Arrow 20 which broke compilation initially causes issues with our tests when the patch which was merged upstream is applied.
At the same time the bundled Arrow 17 can't work with current Re2 due to its C++ version.
The thing that broke the Arrow 19 build is because Arrow 19 is itself not compatible with Protobuf 30, which is pulled in somewhere along the dependency graph by something else causing linking issues.
However, the patch for Protobuf 30 compat applies cleanly to our Arrow 19, which means we can dump the older version pin and instead use this patch to get Arrow 19 working again.

Signed-off-by: benaryorg <[email protected]>
@benaryorg benaryorg force-pushed the ceph-arrow-19-fix branch from 9e04fd9 to b727207 Compare July 27, 2025 23:24
@benaryorg
Copy link
Contributor Author

benaryorg commented Jul 27, 2025

Setting back to "request changes" just to check where the hash mismatch comes from; beyond that it's still good to merge.

While rebasing it seems I pulled in the hash for an older patch revision (https://build.opensuse.org/public/source/openSUSE:Factory/snappy/reenable-rtti.patch?rev=a759aa6fba405cd40025e3f0ab89941d).
So the mismatch is correct and yet another FOD fail on my end (I keep messing those up and I'm used to compilers yelling at me when I do).

Updated the hash.

@benaryorg
Copy link
Contributor Author

I just tested the branch rebased onto an older commit of master (1a689bc) and it built, however given the merge to staging this wants to rebuild everything starting at curl which is a whole damn lot.

@nh2
Copy link
Contributor

nh2 commented Jul 29, 2025

-A ceph.tests -A samba4Full builds for me, merging.

@nh2 nh2 merged commit ad7654a into NixOS:staging Jul 29, 2025
25 of 27 checks passed
@ccicnce113424 ccicnce113424 mentioned this pull request Jul 29, 2025
13 tasks
@nh2 nh2 mentioned this pull request Jul 29, 2025
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

10.rebuild-darwin: 501-1000 This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-darwin: 501+ This PR causes many rebuilds on Darwin and should normally target the staging branches. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-linux: 1001-2500 This PR causes many rebuilds on Linux and should target the staging branches. 12.approvals: 2 This PR was reviewed and approved by two persons.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants