mgr/cephadm: autotune osd_memory_target#39550

Merged
liewegas merged 8 commits into ceph:master from liewegas:cephadm-memory
May 13, 2021

@liewegas liewegas commented Feb 18, 2021

  • report both memory limits and current/recent memory usage in the 'ceph orch ps' output
  • new osd_memory_target_autotune bool option (default: false) lets the mgr tune osd_memory_target
  • the tuning is rechecked every 10m

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@liewegas liewegas changed the title from "WIP: mgr/cephadm: set container memory limits" to "mgr/cephadm: autotune osd_memory_target" May 4, 2021
@github-actions github-actions bot added the tests label May 4, 2021

liewegas commented May 6, 2021

orch ps snapshot from one run:

2021-05-06T19:41:01.092 INFO:teuthology.orchestra.run.smithi016.stdout:NAME                            HOST       PORTS        STATUS         REFRESHED  AGE  MEM USED  TARGET  VERSION                IMAGE ID      CONTAINER ID
2021-05-06T19:41:01.092 INFO:teuthology.orchestra.run.smithi016.stdout:alertmanager.smithi016          smithi016  *:9093,9094  running (7m)      8s ago   8m     10.9M       -  0.20.0                 0881eb8f169f  be69c0279a25
2021-05-06T19:41:01.093 INFO:teuthology.orchestra.run.smithi016.stdout:cephfs-mirror.smithi016.xiylqc  smithi016               running (16s)     8s ago  16s     12.9M       -  17.0.0-3900-gff422198  3fd33d75a753  2cb5deb00482
2021-05-06T19:41:01.093 INFO:teuthology.orchestra.run.smithi016.stdout:cephfs-mirror.smithi023.mnlrbi  smithi023               running (14s)     9s ago  14s     13.0M       -  17.0.0-3900-gff422198  3fd33d75a753  12b38a1ab012
2021-05-06T19:41:01.093 INFO:teuthology.orchestra.run.smithi016.stdout:crash.smithi016                 smithi016               running (8m)      8s ago   8m     7312k       -  17.0.0-3900-gff422198  3fd33d75a753  66944c4232fe
2021-05-06T19:41:01.093 INFO:teuthology.orchestra.run.smithi016.stdout:crash.smithi023                 smithi023               running (8m)      9s ago   8m     9.83M       -  17.0.0-3900-gff422198  3fd33d75a753  028f8f1872c6
2021-05-06T19:41:01.094 INFO:teuthology.orchestra.run.smithi016.stdout:grafana.smithi016               smithi016  *:3000       running (7m)      8s ago   8m     23.9M       -  6.7.4                  ae5c36c3d3cd  c05d3c4e24e4
2021-05-06T19:41:01.094 INFO:teuthology.orchestra.run.smithi016.stdout:mgr.smithi016.bxbemg            smithi016  *:9283       running (10m)     8s ago  10m      421M       -  17.0.0-3900-gff422198  3fd33d75a753  ac679947d772
2021-05-06T19:41:01.094 INFO:teuthology.orchestra.run.smithi016.stdout:mgr.smithi023.vfzaup            smithi023  *:8443,9283  running (7m)      9s ago   7m      364M       -  17.0.0-3900-gff422198  3fd33d75a753  8bf0f5d2dfaa
2021-05-06T19:41:01.094 INFO:teuthology.orchestra.run.smithi016.stdout:mon.smithi016                   smithi016               running (10m)     8s ago  10m     49.5M   2048M  17.0.0-3900-gff422198  3fd33d75a753  8a1d0f1dfedf
2021-05-06T19:41:01.094 INFO:teuthology.orchestra.run.smithi016.stdout:mon.smithi023                   smithi023               running (7m)      9s ago   7m     37.7M   2048M  17.0.0-3900-gff422198  3fd33d75a753  d917a7b66ce0
2021-05-06T19:41:01.095 INFO:teuthology.orchestra.run.smithi016.stdout:node-exporter.smithi016         smithi016  *:9100       running (8m)      8s ago   8m     11.6M       -  0.18.1                 e5a616e4b9cf  411e58182ae4
2021-05-06T19:41:01.095 INFO:teuthology.orchestra.run.smithi016.stdout:node-exporter.smithi023         smithi023  *:9100       running (7m)      9s ago   7m     11.3M       -  0.18.1                 e5a616e4b9cf  e0686e2304b5
2021-05-06T19:41:01.095 INFO:teuthology.orchestra.run.smithi016.stdout:osd.0                           smithi016               running (7m)      8s ago   7m     37.2M   3265M  17.0.0-3900-gff422198  3fd33d75a753  fe5e9a5fde08
2021-05-06T19:41:01.095 INFO:teuthology.orchestra.run.smithi016.stdout:osd.1                           smithi016               running (6m)      8s ago   6m     39.3M   3265M  17.0.0-3900-gff422198  3fd33d75a753  77140ddb9679
2021-05-06T19:41:01.096 INFO:teuthology.orchestra.run.smithi016.stdout:osd.2                           smithi016               running (6m)      8s ago   6m     36.7M   3265M  17.0.0-3900-gff422198  3fd33d75a753  1a658f71acd9
2021-05-06T19:41:01.096 INFO:teuthology.orchestra.run.smithi016.stdout:osd.3                           smithi016               running (5m)      8s ago   5m     36.2M   3265M  17.0.0-3900-gff422198  3fd33d75a753  5e23bf9c85ae
2021-05-06T19:41:01.096 INFO:teuthology.orchestra.run.smithi016.stdout:osd.4                           smithi023               running (5m)      9s ago   5m     37.4M   4032M  17.0.0-3900-gff422198  3fd33d75a753  61fafce45aa2
2021-05-06T19:41:01.096 INFO:teuthology.orchestra.run.smithi016.stdout:osd.5                           smithi023               running (4m)      9s ago   4m     36.2M   4032M  17.0.0-3900-gff422198  3fd33d75a753  5ad337c877c2
2021-05-06T19:41:01.096 INFO:teuthology.orchestra.run.smithi016.stdout:osd.6                           smithi023               running (4m)      9s ago   4m     35.5M   4032M  17.0.0-3900-gff422198  3fd33d75a753  bf09684a951a
2021-05-06T19:41:01.097 INFO:teuthology.orchestra.run.smithi016.stdout:osd.7                           smithi023               running (3m)      9s ago   4m     36.2M   4032M  17.0.0-3900-gff422198  3fd33d75a753  e3c190706c82
2021-05-06T19:41:01.097 INFO:teuthology.orchestra.run.smithi016.stdout:prometheus.smithi016            smithi016  *:9095       running (7m)      8s ago   8m     29.8M       -  2.18.1                 de242295e225  92cb43f6caca
2021-05-06T19:41:01.097 INFO:teuthology.orchestra.run.smithi016.stdout:rbd-mirror.smithi016.ukrnqx     smithi016               running (26s)     8s ago  26s     12.9M       -  17.0.0-3900-gff422198  3fd33d75a753  1b671a7a2353
2021-05-06T19:41:01.097 INFO:teuthology.orchestra.run.smithi016.stdout:rbd-mirror.smithi023.jcdjbx     smithi023               running (30s)     9s ago  31s     12.8M       -  17.0.0-3900-gff422198  3fd33d75a753  db5cdbfd06da

@liewegas liewegas requested a review from sebastian-philipp May 6, 2021 21:00

@adk3798 adk3798 left a comment

LGTM

liewegas added 7 commits May 12, 2021 11:01
Fill in from {osd,mon}_memory_target if no container limit is set.

Signed-off-by: Sage Weil <[email protected]>
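
The fallback described in this commit message can be illustrated with a minimal Python sketch. It is not the code from the PR: the function name `displayed_memory_target` and its arguments are hypothetical, and the only behavior taken from the text is that the reported limit falls back to `osd_memory_target` or `mon_memory_target` when no container limit is set.

```python
from typing import Dict, Optional


def displayed_memory_target(daemon_type: str,
                            container_limit: Optional[int],
                            config: Dict[str, int]) -> Optional[int]:
    """Return the memory limit to report for a daemon, in bytes.

    Hypothetical helper: an explicit container limit wins; otherwise
    osd and mon daemons fall back to their *_memory_target option.
    """
    if container_limit is not None:
        return container_limit
    if daemon_type in ("osd", "mon"):
        # e.g. "osd_memory_target" or "mon_memory_target"
        return config.get(f"{daemon_type}_memory_target")
    # Other daemon types have no target to fall back to.
    return None
```

Under these assumptions, a daemon with neither a container limit nor a memory target option yields `None`, which would correspond to the `-` entries in the TARGET column of the snapshot above.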
- set osd_memory_target_autotune=true to enable
- tuning is periodic (check every 10m by default)
- tuned values are reflected by osd_memory_target config options scoped
  to the host
- only make a change if it appears that we will affect at least 1 of the
  relevant OSDs
- attempt to clean out conflicting options.  (This is imperfect, since any
  manner of weirdly-scoped config options could be responsible; we only
  attempt to clean out one scoped directly to the osd name.)

Signed-off-by: Sage Weil <[email protected]>
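
The decision rules listed in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `Plan`, `plan_autotune`, and the dictionaries passed in are hypothetical names, and how the new target value is computed is outside the sketch.

```python
from typing import Dict, List, NamedTuple, Optional


class Plan(NamedTuple):
    # Value to set for osd_memory_target at host scope, or None for no change.
    set_host_target: Optional[int]
    # OSD names whose own osd_memory_target setting should be removed.
    remove_overrides: List[str]


def plan_autotune(osds: List[str],
                  new_target: int,
                  effective: Dict[str, int],
                  per_osd_override: Dict[str, int]) -> Plan:
    """Decide what to change for the OSDs on one host (hypothetical sketch).

    effective:        current osd_memory_target seen by each OSD
    per_osd_override: options scoped directly to an OSD name
    """
    # Only act if at least one relevant OSD would end up with a new value.
    affected = [o for o in osds if effective.get(o) != new_target]
    if not affected:
        return Plan(None, [])
    # Clean out only options scoped directly to the OSD name; options set
    # through any other scope are not detected here.
    conflicts = [o for o in osds if o in per_osd_override]
    return Plan(new_target, conflicts)
```

A caller would run this once per host on each periodic check and apply the resulting plan through the config store. As the commit message notes, the cleanup is imperfect because an option scoped in some other way can still override the host-scoped value.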
@liewegas liewegas merged commit beb5767 into ceph:master May 13, 2021