Bug #67339
open
"mon.a (mon.0) 1144 : cluster [WRN] Health check failed: 1 OSD(s) experiencing slow operations in BlueStore (BLUESTORE_SLOW_OP_ALERT)" in cluster log
0%
Description
Seen with fs:workload: /a/vshankar-2024-08-01_09:58:56-fs-wip-vshankar-testing-20240801.064407-debug-testing-default-smithi/7829876
Updated by Radoslaw Zarzynski over 1 year ago
- Project changed from RADOS to bluestore
- Component(RADOS) deleted (BlueStore)
Updated by Milind Changire over 1 year ago
main: https://pulpito.ceph.com/mchangir-2024-08-02_07:51:06-fs-wip-mchangir-uninline-debug-distro-default-smithi/7831733
main: https://pulpito.ceph.com/mchangir-2024-08-02_07:51:06-fs-wip-mchangir-uninline-debug-distro-default-smithi/7832021
Updated by Igor Fedotov over 1 year ago
Having https://github.com/ceph/ceph/pull/59481 might be helpful to investigate the root cause.
For now it looks like the alert itself is valid, as the OSD did report slow op(s):
2024-08-03T14:16:14.705+0000 7fd3cd9ce640 0 bluestore(/var/lib/ceph/osd/ceph-4) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.122113705s, txc = 0x559d64adb8802
This could be caused, e.g., by a large transaction issued to the OSD. So we need more information at this point.
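To gather that information across several runs, something like the following could pull the slow-op reports out of OSD logs and tally them per OSD. This is only a rough sketch: the regular expression is modeled on the single log line quoted above and may not match other BlueStore versions, and the names `SLOW_OP_RE`, `parse_slow_op` and `count_by_osd` are made up for illustration, not part of any Ceph tooling.

```python
import re
from collections import Counter

# Assumption: pattern derived from the one log line quoted in this ticket;
# other releases may word or order the fields differently.
SLOW_OP_RE = re.compile(
    r"bluestore\((?P<path>[^)]+)\)\s+log_latency(?:_fn)?\s+"
    r"slow operation observed for (?P<op>\w+), "
    r"latency = (?P<latency>[0-9.]+)s"
)


def parse_slow_op(line):
    """Return a dict describing one slow-op report, or None if no match."""
    m = SLOW_OP_RE.search(line)
    if m is None:
        return None
    return {
        "osd_path": m.group("path"),
        "op": m.group("op"),
        "latency_s": float(m.group("latency")),
    }


def count_by_osd(lines):
    """Count slow-op reports per OSD data path over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        rec = parse_slow_op(line)
        if rec is not None:
            counts[rec["osd_path"]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "2024-08-03T14:16:14.705+0000 7fd3cd9ce640 0 "
        "bluestore(/var/lib/ceph/osd/ceph-4) log_latency_fn slow operation "
        "observed for _txc_committed_kv, latency = 5.122113705s, "
        "txc = 0x559d64adb8802"
    )
    print(parse_slow_op(sample))
    print(count_by_osd([sample, "unrelated line"]))
```

Feeding it the OSD logs from each failed job would show whether the reports keep coming from the same OSD and the same operation.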
Updated by Igor Fedotov over 1 year ago
Curiously, in all 3 cases it was osd.4 that reported the slow ops.
Updated by Milind Changire over 1 year ago
main: https://pulpito.ceph.com/mchangir-2024-08-30_08:30:37-fs-wip-mchangir-testing-20240828.085847-main-debug-distro-default-smithi/7881426
main: https://pulpito.ceph.com/mchangir-2024-08-30_08:30:37-fs-wip-mchangir-testing-20240828.085847-main-debug-distro-default-smithi/7881456
Updated by Patrick Donnelly over 1 year ago
- Related to Enhancement #68283: qa: ignore BLUESTORE_SLOW_OP_ALERT added
Updated by Igor Fedotov over 1 year ago
- Related to Bug #68337: OSD(s) experiencing slow operations in BlueStore (BLUESTORE_SLOW_OP_ALERT) in cluster log added
Updated by Milind Changire about 1 year ago
squid:
https://pulpito.ceph.com/vshankar-2024-11-25_18:02:39-fs-wip-vshankar-testing-20241125.064001-squid-debug-testing-default-smithi/8008351
https://pulpito.ceph.com/vshankar-2024-11-25_18:02:39-fs-wip-vshankar-testing-20241125.064001-squid-debug-testing-default-smithi/8008524
Updated by Milind Changire about 1 year ago
https://pulpito.ceph.com/vshankar-2024-11-23_10:32:29-fs-wip-vshankar-testing-20241122.180955-squid-debug-testing-default-smithi/8005625
https://pulpito.ceph.com/vshankar-2024-11-23_10:32:29-fs-wip-vshankar-testing-20241122.180955-squid-debug-testing-default-smithi/8005701
Updated by Milind Changire about 1 year ago
squid:
https://pulpito.ceph.com/vshankar-2025-01-05_17:50:50-fs-wip-vshankar-testing-20250105.135958-squid-debug-testing-default-smithi/8062503
https://pulpito.ceph.com/vshankar-2025-01-05_17:50:50-fs-wip-vshankar-testing-20250105.135958-squid-debug-testing-default-smithi/8062636