
Increase max device size from 100TiB to 1PiB #64056

Open

dvanders wants to merge 2 commits into main from dvanders_large_crush_devices

Conversation

@dvanders
Contributor

@dvanders dvanders requested a review from a team as a code owner June 20, 2025 05:44
@github-actions github-actions bot added the core label Jun 20, 2025
@dvanders dvanders changed the title crush: Increase CRUSH_MAX_DEVICE_WEIGHT to 1000u crush: Increase CRUSH_MAX_DEVICE_WEIGHT to 1000u (i.e. increase max device size from 100TiB to 1PiB) Jun 23, 2025
@dvanders dvanders changed the title crush: Increase CRUSH_MAX_DEVICE_WEIGHT to 1000u (i.e. increase max device size from 100TiB to 1PiB) Increase max device size from 100TiB to 1PiB Jun 23, 2025
@kshtsk
Contributor

kshtsk commented Jun 24, 2025

jenkins test make check

@dvanders dvanders requested a review from athanatos June 24, 2025 19:33
@athanatos
Contributor

I'll get to this this week, apologies for the delay.

@athanatos
Contributor

ceph-workspace/main » git grep 'CRUSH_MAX_DEVICE_WEIGHT'
src/crush/CrushCompiler.cc:       if (weight > CRUSH_MAX_DEVICE_WEIGHT && itemid >= 0) {
src/crush/CrushCompiler.cc:         err << "device weight limited to " << CRUSH_MAX_DEVICE_WEIGHT / 0x10000 << std::endl;
src/crush/crush.h:#define CRUSH_MAX_DEVICE_WEIGHT (100u * 0x10000u)

AFAICT, this is truly only used in CrushCompiler.cc to check the parsed weight. I don't see it used in any other context. I'd guess it's actually possible to add a larger device through other crush commands already (@dvanders this is worth checking, there might be other checks like this which don't use this constant).

This should be safe.
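For context (a quick sketch, not part of the patch): CRUSH weights are stored as 16.16 fixed point, so `0x10000` encodes a weight of 1.0, which under the usual 1-TiB-per-weight-unit convention makes the old and new limits work out as follows:

```shell
# CRUSH weights are 16.16 fixed point: 0x10000 encodes weight 1.0
# (conventionally ~1 TiB of capacity).
echo $(( 100 * 0x10000 ))    # old CRUSH_MAX_DEVICE_WEIGHT: 6553600, i.e. weight 100 (~100 TiB)
echo $(( 1000 * 0x10000 ))   # new CRUSH_MAX_DEVICE_WEIGHT: 65536000, i.e. weight 1000 (~1 PiB)
```

The `CRUSH_MAX_DEVICE_WEIGHT / 0x10000` division in the CrushCompiler.cc error message is what turns the raw fixed-point limit back into the human-readable "100" (now "1000").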

@athanatos
Contributor

athanatos left a comment

@dvanders Please confirm that other crush commands already permit 1PiB devices; this change should only affect manually constructed or modified maps unless I'm missing something.

@shraddhaag
Contributor

@athanatos athanatos self-requested a review July 24, 2025 18:58
@athanatos
Contributor

athanatos left a comment

@dvanders Before we merge this, can you confirm that you've checked other commands? AFAICT, this will only affect manually compiling a crush map.

@SrinivasaBharath
Contributor

@dvanders - The validation of the PR has been successfully completed. As per the above comment, the review is currently pending. Once the review is done, feel free to merge the PR.

@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Sep 28, 2025
@kshtsk
Contributor

kshtsk commented Sep 28, 2025

jenkins test make check

@github-actions github-actions bot removed the stale label Sep 28, 2025
@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Nov 27, 2025
@kshtsk
Contributor

kshtsk commented Nov 27, 2025

@athanatos it still awaits your review.

@github-actions github-actions bot removed the stale label Nov 27, 2025
@github-actions

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 26, 2026
@github-actions

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Feb 25, 2026
@dvanders dvanders reopened this Feb 25, 2026
@dvanders dvanders force-pushed the dvanders_large_crush_devices branch from dca2cc4 to 8c7863f Compare February 25, 2026 16:48
@dvanders dvanders requested a review from a team February 25, 2026 16:49
@github-actions github-actions bot removed the stale label Feb 25, 2026
@dvanders
Contributor Author

@athanatos @kshtsk I've checked the OSDMonitor commands like crush add, crush set, crush create-or-move, also OSD::update_crush_location, crimson, and cephadm usage of all that. I didn't find any other device size limits.

@github-actions github-actions bot added the tests label Feb 25, 2026
@dvanders
Contributor Author

> ceph-workspace/main » git grep 'CRUSH_MAX_DEVICE_WEIGHT'
> src/crush/CrushCompiler.cc:       if (weight > CRUSH_MAX_DEVICE_WEIGHT && itemid >= 0) {
> src/crush/CrushCompiler.cc:         err << "device weight limited to " << CRUSH_MAX_DEVICE_WEIGHT / 0x10000 << std::endl;
> src/crush/crush.h:#define CRUSH_MAX_DEVICE_WEIGHT (100u * 0x10000u)
>
> AFAICT, this is truly only used in CrushCompiler.cc to check the parsed weight. I don't see it used in any other context. I'd guess it's actually possible to add a larger device through other crush commands already (@dvanders this is worth checking, there might be other checks like this which don't use this constant).
>
> This should be safe.

You're right, this applies to crushtool. I'm adding a crushtool test.
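For illustration (hypothetical bucket and device names; the actual test is in the PR), a decompiled map exercising the raised limit would contain an entry along these lines:

```
# fragment of a decompiled CRUSH map with a device above the old 100 TiB limit
device 0 osd.0 class ssd

host testhost {
	id -2
	alg straw2
	hash 0	# rjenkins1
	item osd.0 weight 420.00000
}
```

With the old `CRUSH_MAX_DEVICE_WEIGHT`, compiling this via `crushtool -c` would hit the "device weight limited to 100" error shown in CrushCompiler.cc above; with the new limit it should compile cleanly.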

@dvanders dvanders force-pushed the dvanders_large_crush_devices branch from 81961a5 to 9a84d70 Compare February 25, 2026 19:54
@dvanders
Contributor Author

jenkins retest this please

@dvanders
Contributor Author

testing from @JoshuaGabriel

root@ctest-node-01:/# ceph -s
  cluster:
    id:     701a5c32-128c-11f1-b214-bc24113f050c
    health: HEALTH_OK

  services:
    mon: 4 daemons, quorum ctest-node-01,ctest-node-03,ctest-node-04,ctest-node-02 (age 14s) [leader: ctest-node-01]
    mgr: ctest-node-01.otxeyv(active, since 2m), standbys: ctest-node-04.xtlkfj
    osd: 4 osds: 4 up (since 26s), 4 in (since 50s)

  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   108 MiB used, 400 GiB / 400 GiB avail
    pgs:     1 active+clean

root@ctest-node-01:/# ceph osd tree
ID  CLASS  WEIGHT      TYPE NAME               STATUS  REWEIGHT  PRI-AFF
-1         1680.00000  root default
-3          420.00000      host ctest-node-01
 0    ssd   420.00000          osd.0               up   1.00000  1.00000
-9          420.00000      host ctest-node-02
 2    ssd   420.00000          osd.2               up   1.00000  1.00000
-7          420.00000      host ctest-node-03
 1    ssd   420.00000          osd.1               up   1.00000  1.00000
-5          420.00000      host ctest-node-04
 3    ssd   420.00000          osd.3               up   1.00000  1.00000
root@ctest-node-01:/# ceph config dump
WHO     MASK                LEVEL     OPTION                                 VALUE                                                                                                 RO
global                      advanced  osd_crush_initial_weight               420.000000

@JoshuaGabriel
Contributor

crushtool seems happy

root@ctest-node-01:/# ceph osd tree
ID  CLASS  WEIGHT      TYPE NAME               STATUS  REWEIGHT  PRI-AFF
-1         1680.00000  root default
-9          420.00000      host ctest-node-01
 2    ssd   420.00000          osd.2               up   1.00000  1.00000
-7          420.00000      host ctest-node-02
 1    ssd   420.00000          osd.1               up   1.00000  1.00000
-3          420.00000      host ctest-node-03
 0    ssd   420.00000          osd.0               up   1.00000  1.00000
-5          420.00000      host ctest-node-04
 3    ssd   420.00000          osd.3               up   1.00000  1.00000
root@ctest-node-01:/# ceph osd getcrushmap -o crush.map
5
root@ctest-node-01:/# crushtool -d crush.map -o crush.txt
root@ctest-node-01:/# vi crush.txt
root@ctest-node-01:/# cat crush.txt  | grep osd.0
device 0 osd.0 class ssd
        item osd.0 weight 421.00000
root@ctest-node-01:/# crushtool -c crush.txt -o crush.map2
root@ctest-node-01:/#

Add a test which compiles a crush map with a device larger than
100TiB.

Signed-off-by: Dan van der Ster <[email protected]>
@dvanders dvanders force-pushed the dvanders_large_crush_devices branch from 9a84d70 to 382c4b5 Compare February 26, 2026 21:33