[RFC] crimson: understand performance impacts by skipping mempool in raw #52404
Interesting. I'm in favor of globally disabling mempool counters for crimson for now. IIRC, their usage is informational: they are a way of dumping the memory usage of different structures. I expect the issue is the use of atomics? If so, we'll probably want per-reactor counters that can be dynamically combined when queried.
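To make the "per-reactor counters combined when queried" idea concrete, here is a minimal sketch in plain C++. It is not Ceph's mempool API and does not use Seastar; `ShardedCounter`, `ShardSlot`, `kCacheLine` and `kMaxShards` are hypothetical names, and the 64-byte line size is an assumption. Each shard updates only its own padded slot, and a query sums all slots.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not Ceph's mempool API.
constexpr std::size_t kCacheLine = 64;  // assumed cache-line size
constexpr std::size_t kMaxShards = 8;   // assumed upper bound on reactors

// One slot per shard, aligned and padded so that two shards never
// write to the same cache line.
struct alignas(kCacheLine) ShardSlot {
  std::atomic<std::int64_t> bytes{0};
  std::atomic<std::int64_t> items{0};
};
static_assert(sizeof(ShardSlot) == kCacheLine, "one slot per cache line");

class ShardedCounter {
public:
  // Called on the hot path by the shard that allocates.
  void add(std::size_t shard, std::int64_t bytes) {
    assert(shard < kMaxShards);
    slots_[shard].bytes.fetch_add(bytes, std::memory_order_relaxed);
    slots_[shard].items.fetch_add(1, std::memory_order_relaxed);
  }

  // Called by whichever shard frees; a slot may go negative if a buffer
  // is freed on a different shard than it was allocated on.
  void sub(std::size_t shard, std::int64_t bytes) {
    assert(shard < kMaxShards);
    slots_[shard].bytes.fetch_sub(bytes, std::memory_order_relaxed);
    slots_[shard].items.fetch_sub(1, std::memory_order_relaxed);
  }

  // Cold path: combine all slots only when someone asks for a dump.
  std::int64_t total_bytes() const {
    std::int64_t sum = 0;
    for (const auto& s : slots_) {
      sum += s.bytes.load(std::memory_order_relaxed);
    }
    return sum;
  }

  std::int64_t total_items() const {
    std::int64_t sum = 0;
    for (const auto& s : slots_) {
      sum += s.items.load(std::memory_order_relaxed);
    }
    return sum;
  }

private:
  std::array<ShardSlot, kMaxShards> slots_;
};
```

The slots are still `std::atomic` so that a query from another thread is well defined, but since each slot has a single writer on its own cache line, the relaxed updates should not contend. Only the summed totals are meaningful when allocation and free happen on different shards.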
Yes, these counters are shared across cores. IIUC, this causes the CPUs to contend for the same cache line, so simply changing the atomics to non-atomics would not help in this case.
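The cache-line point can be illustrated with a small layout check. This is only a sketch under an assumed 64-byte line size; `PackedCounters`, `PaddedCounters` and `line_of` are hypothetical names, not anything in Ceph. Plain (non-atomic) counters packed next to each other still land on one line, so writes from different cores would keep invalidating each other's copy of that line; padding each counter to its own line avoids that.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLine = 64;  // assumed cache-line size

// Four plain counters packed together: 32 bytes, so they can all sit
// on a single 64-byte line even though none of them is atomic.
struct PackedCounters {
  std::int64_t per_core[4];
};

// Each counter aligned (and therefore padded) to a full line.
struct alignas(kLine) PaddedCounter {
  std::int64_t value;
};

struct PaddedCounters {
  PaddedCounter per_core[4];
};

// Index of the cache line that an address falls into.
inline std::uintptr_t line_of(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) / kLine;
}
```

Comparing `line_of` for two counters shows whether writers on different cores would touch the same line, regardless of whether the counter type is atomic.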
ceph/src/crimson/tools/perf_crimson_msgr.cc, lines 233 to 238 in c50c042
This is why the mempool counters are the only issue in the tests. When it comes to the OSD, things are more complicated because we will submit messages as well as their buffers across cores, and I think we currently construct and destruct buffers from different shards, which requires their ref-counters to be atomic and shared across cores in the hot path. So the problem becomes synthetic with the OSD, and I'm not sure whether enforcing construction and destruction in the same shard would be faster (i.e. atomic ref-counters vs. submitting buffer destructions to the original shard). It also does not seem to me to be a simple change.
Closing in favor of #53130.
The above results use perf_crimson_msgr (client mode) to stress perf_crimson_msgr (server mode) or perf_async_msgr, in order to get an apples-to-apples performance comparison by the number of CPU cores used.
Initially, crimson and async msgr have similar performance curves, but further analysis shows that they have different bottlenecks:
Async msgr should share the same buffer and mempool infrastructure, but its immediate bottleneck is not there, which implies that its scaling issue might be synthetic.
For crimson msgr, if the mempool counters are commented out as in this PR, the CPU scaling problem is greatly relieved; see the crimson-poc curve.
In short: