Added INFO LATENCYSTATS section: latency by percentile distribution / latency by cumulative distribution of latencies #9462
Conversation
oranagra left a comment
Code LGTM (with minor code style issues).
I've briefly read the discussion in the issue, and the compromises you made seem OK.
The one part that's not solved is the fact that the request and some of the discussion were around tracking the full command processing plus waiting time, and that's too complicated to handle.
Considering the low impact of this PR on both complexity and performance, I suppose we can take it even if the value is not as high as we wished.
Regarding the format, my only concern is what happens if we want to add more metrics in the future, like one that includes the blocked time for blocking commands.
i.e., for the normal command stats (total time) we can simply add more fields, but here if we want to do that we need to add more sections.
@filipecosta90 For the sake of completeness, can you run your perf test again with `-r` values greater than 100k? I'd like to make sure the benchmark holds up when basically everything isn't in some level of the CPU cache you probably tested on. I think this is really exciting!
madolson left a comment
Also looks like you're missing module init?
Co-authored-by: Madelyn Olson <[email protected]>
The size of `struct hdr_histogram` (redis/deps/hdr_histogram/hdr_histogram.h, lines 17 to 35 at d96f47c) is 104 bytes itself. The hdr_histogram count array is stored in `int64_t* counts;` and is of size `counts_len` when initialized.
Initializing a histogram with the values from aaac165 (see the allocation logic in redis/deps/hdr_histogram/hdr_histogram.c, lines 319 to 358 at d96f47c) would lead to the creation of a histogram of (11264 * 8) + 104 = 90216 bytes.
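As a sanity check on that arithmetic, the per-histogram footprint can be sketched as follows. This is a minimal illustration; the 104-byte struct size and the `counts_len` of 11264 are the values quoted above, not derived here.

```python
# Rough per-histogram memory footprint for the quoted hdr_histogram config.
STRUCT_SIZE_BYTES = 104   # sizeof(struct hdr_histogram), as measured above
INT64_SIZE_BYTES = 8      # sizeof(int64_t), one slot in the counts array

def histogram_footprint(counts_len: int) -> int:
    """Total bytes for one histogram: counts array plus the struct header."""
    return counts_len * INT64_SIZE_BYTES + STRUCT_SIZE_BYTES

print(histogram_footprint(11264))  # 90216, matching the figure above
```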
@filipecosta90 do you wanna pick this up?
@oranagra this weekend will focus on it. 👍
…/redis into commands.latency.histogram
…s configuration parameters
@madolson I've updated the benchmark results with the latest unstable vs this feature branch (with the feature enabled/disabled) and a larger keyspace length, as requested.

overhead test using redis-benchmark on unstable vs commands.latency.histogram
I've tested using ping and set commands (fast ones) to assess the largest possible overhead of histogram latency tracking.

results on unstable (commit hash = af0b50f)
Used redis-server command
results on commands.latency.histogram branch with latency tracking disabled
Used redis-server command
results on commands.latency.histogram branch with latency tracking enabled
Used redis-server command
ranshid left a comment
@filipecosta90 LGTM. I think @oranagra touched most of the remaining issues.
Placed some small comments, but looks good. Thank you for this!
…am allocators: iteration 2
…tiplied, do not wrap. zmalloc_oom_handler logs the num_items trying to allocate alongside the item size.
Co-authored-by: Oran Agra <[email protected]>
…ds:debug. Removed unnecessary stddef header in hdrhistogram
…locking commands test
For the record, the core team approved this PR in a voice meeting.
@filipecosta90 thank you.
Thank you for all the review cycles!
Will do so.
@oranagra @yoav-steinberg @itamarhaber redis-doc PR: redis/redis-doc#1733
Fixes #4707.
Short description
The Redis extended latency stats track per-command latencies and enable:

- exporting the per-command percentile distribution via the `INFO LATENCYSTATS` command (percentile distribution is not mergeable between cluster nodes);
- exporting the per-command cumulative latency distributions via the `LATENCY HISTOGRAM` command. Using the cumulative distribution of latencies we can merge several stats from different cluster nodes to calculate aggregate metrics.

By default, the extended latency monitoring is enabled since the overhead of keeping track of the command latency is very small.
If you don't want to track extended latency metrics, you can easily disable it at runtime using the command:

```
CONFIG SET latency-tracking no
```

By default, the exported latency percentiles are p50, p99, and p999. You can alter them at runtime using the command:

```
CONFIG SET latency-tracking-info-percentiles "0.0 50.0 100.0"
```

Some details:
overhead test using redis-benchmark on unstable vs commands.latency.histogram
I've tested using ping and set commands ( fast ones ) to assess the largest possible overhead of histogram latency tracking.
As seen below, there is no measurable overhead on the achievable ops/sec or the full latency spectrum on the client. This matches the expected behaviour, given that in previous microbenchmarks the hdr_record_value function (the function used) took on average 5-6 ns/op (see reference).
results on unstable (commit hash = af0b50f)
Used redis-server command
results on commands.latency.histogram branch with latency tracking disabled
Used redis-server command
results on commands.latency.histogram branch with latency tracking enabled
INFO LATENCYSTATS exposition format

```
latency_percentiles_usec_<CMDNAME>:p0=XX,p50=...
```

latencystats output sample:
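As an illustration of consuming this exposition format, a line of that shape could be parsed as follows. The sample line and its values are hypothetical, not real server output.

```python
# Parse one INFO LATENCYSTATS line of the form
#   latency_percentiles_usec_<CMDNAME>:p50=...,p99=...,p99.9=...
def parse_latencystats_line(line: str):
    """Return (command name, {percentile label: latency in usec})."""
    key, _, fields = line.partition(":")
    cmd = key[len("latency_percentiles_usec_"):]
    percentiles = {}
    for field in fields.split(","):
        name, _, value = field.partition("=")
        percentiles[name] = float(value)
    return cmd, percentiles

cmd, pcts = parse_latencystats_line(
    "latency_percentiles_usec_set:p50=0.5,p99=1.2,p99.9=3.1"  # hypothetical
)
print(cmd, pcts)  # set {'p50': 0.5, 'p99': 1.2, 'p99.9': 3.1}
```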
LATENCY HISTOGRAM [command ...] exposition format

Return a cumulative distribution of latencies in the format of a histogram for the specified command names.
The histogram is composed of a map of time buckets.
We reply with a map for each command in the format:

```
<command name> : { calls: <total command calls>, histogram : { <bucket 1> : latency, <bucket 2> : latency, ... } }
```

RESP2:
RESP3:
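Because the histogram reports cumulative per-bucket counts, replies for the same command from several cluster nodes can be merged by summing bucket values (the property percentiles lack, as noted above). A minimal sketch; the bucket boundaries and counts below are hypothetical:

```python
# Merge LATENCY HISTOGRAM-style replies from multiple nodes for one command.
# Cumulative bucket counts are additive, which is what makes this
# representation mergeable across cluster nodes.
from collections import defaultdict

def merge_histograms(replies):
    """replies: list of {"calls": int, "histogram": {bucket_usec: count}}."""
    total_calls = 0
    merged = defaultdict(int)
    for reply in replies:
        total_calls += reply["calls"]
        for bucket, count in reply["histogram"].items():
            merged[bucket] += count
    return {"calls": total_calls, "histogram": dict(merged)}

node_a = {"calls": 100, "histogram": {1: 40, 2: 90, 4: 100}}
node_b = {"calls": 50, "histogram": {1: 10, 2: 45, 4: 50}}
print(merge_histograms([node_a, node_b]))
# {'calls': 150, 'histogram': {1: 50, 2: 135, 4: 150}}
```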