Bug #67339

open

"mon.a (mon.0) 1144 : cluster [WRN] Health check failed: 1 OSD(s) experiencing slow operations in BlueStore (BLUESTORE_SLOW_OP_ALERT)" in cluster log

Added by Venky Shankar over 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Q/A
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

Seen with fs:workload: /a/vshankar-2024-08-01_09:58:56-fs-wip-vshankar-testing-20240801.064407-debug-testing-default-smithi/7829876


Related issues 2 (2 open, 0 closed)

Related to CephFS - Enhancement #68283: qa: ignore BLUESTORE_SLOW_OP_ALERT | Pending Backport | Patrick Donnelly

Related to bluestore - Bug #68337: OSD(s) experiencing slow operations in BlueStore (BLUESTORE_SLOW_OP_ALERT) in cluster log | New | Igor Fedotov

#1

Updated by Radoslaw Zarzynski over 1 year ago

  • Project changed from RADOS to bluestore
  • Component(RADOS) deleted (BlueStore)
#3

Updated by Igor Fedotov over 1 year ago

Having https://github.com/ceph/ceph/pull/59481 in place might help to investigate the root cause.
For now it looks like the alert itself is valid, as the OSD did report slow op(s):

2024-08-03T14:16:14.705+0000 7fd3cd9ce640 0 bluestore(/var/lib/ceph/osd/ceph-4) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.122113705s, txc = 0x559d64adb8802

This could be caused, e.g., by a large transaction issued to the OSD, so we need more information at this point.
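
To collect such entries from an OSD log, something like the following could be used. This is only a rough sketch: the regular expression is modeled on the single log line quoted above, the function name `parse_slow_op` is made up for illustration, and other BlueStore messages may be formatted differently.

```python
import re

# Pattern modeled on the one log line quoted in this comment (assumption:
# other slow-op messages follow the same "observed for <op>, latency = <n>s" shape).
SLOW_OP_RE = re.compile(
    r"bluestore\((?P<path>[^)]+)\) log_latency\w* "
    r"slow operation observed for (?P<op>\w+), "
    r"latency = (?P<latency>[0-9.]+)s"
)

def parse_slow_op(line):
    """Return the OSD data path, operation name and latency in seconds, or None."""
    m = SLOW_OP_RE.search(line)
    if m is None:
        return None
    return {
        "path": m.group("path"),
        "op": m.group("op"),
        "latency_s": float(m.group("latency")),
    }
```

Applied to the line above, this should yield the operation `_txc_committed_kv` and its latency in seconds, which makes it easier to sort or filter the slow ops once more samples are available.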

#4

Updated by Igor Fedotov over 1 year ago

Curiously, in all 3 cases it was osd.4 that reported slow ops.
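
A quick way to check whether the reports keep concentrating on one OSD is to tally the slow-op lines per OSD across the collected logs. The sketch below is illustrative only: it assumes the OSD data path has the `/var/lib/ceph/osd/ceph-<id>` form seen in the log line from the earlier comment, and `count_slow_ops_by_osd` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: the OSD id can be taken from the data path in the bluestore(...) prefix.
OSD_SLOW_RE = re.compile(
    r"bluestore\(/var/lib/ceph/osd/ceph-(\d+)\).*slow operation observed"
)

def count_slow_ops_by_osd(lines):
    """Count slow-operation log lines per OSD, keyed as 'osd.<id>'."""
    counts = Counter()
    for line in lines:
        m = OSD_SLOW_RE.search(line)
        if m:
            counts["osd." + m.group(1)] += 1
    return counts
```

Feeding it the lines of all OSD logs from a run (for example via `open(path)` per log file) gives a per-OSD count that can be compared between runs.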

#6

Updated by Patrick Donnelly over 1 year ago

#7

Updated by Igor Fedotov over 1 year ago

  • Related to Bug #68337: OSD(s) experiencing slow operations in BlueStore (BLUESTORE_SLOW_OP_ALERT) in cluster log added