
Added INFO LATENCYSTATS section: latency by percentile distribution/latency by cumulative distribution of latencies#9462

Merged
oranagra merged 50 commits into redis:unstable from filipecosta90:commands.latency.histogram
Jan 5, 2022

Conversation

@filipecosta90
Contributor

@filipecosta90 filipecosta90 commented Sep 5, 2021

Fixes #4707.

Short description

The Redis extended latency stats track per-command latencies and enable:

  • exporting the per-command percentile distribution via the INFO LATENCYSTATS command (the percentile distribution is not mergeable between cluster nodes).
  • exporting the per-command cumulative latency distribution via the LATENCY HISTOGRAM command. Using the cumulative distribution of latencies, we can merge stats from different cluster nodes to calculate aggregate metrics.
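Because the buckets hold cumulative counts, per-node histograms can be summed bucket-wise to get a cluster-wide view. A minimal Python sketch of how a client could do this merge (the helper and the sample bucket values are illustrative, not part of Redis):

```python
def merge_cumulative_histograms(hists):
    """Merge per-node cumulative histograms (bucket upper bound -> cumulative count).

    Empty buckets are not printed by Redis, so when a node lacks a bucket we
    carry forward its last cumulative count from the buckets below it.
    """
    buckets = sorted(set().union(*hists))
    merged = {}
    for b in buckets:
        total = 0
        for h in hists:
            present = [k for k in h if k <= b]
            total += h[max(present)] if present else 0
        merged[b] = total
    return merged

# Two hypothetical nodes reporting SET latencies (usec bucket -> cumulative calls)
node_a = {1: 99986, 4: 99991, 16: 100000}
node_b = {1: 99990, 8: 100000}
merged = merge_cumulative_histograms([node_a, node_b])
```

Note that the same carry-forward trick does not work for the percentile distribution, which is why only the cumulative form is mergeable.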

By default, extended latency monitoring is enabled, since the overhead of tracking command latency is very small.

If you don't want to track extended latency metrics, you can easily disable it at runtime using the command:

  • CONFIG SET latency-tracking no

By default, the exported latency percentiles are the p50, p99, and p999. You can alter them at runtime using the command:

  • CONFIG SET latency-tracking-info-percentiles "0.0 50.0 100.0"

Some details:

  • The total size per histogram should sit around 40 KiB. We only allocate those 40 KiB the first time a command is called.
  • Regarding write overhead: as seen below, there is no measurable overhead on the achievable ops/sec or on the full latency spectrum observed by the client. The measured redis-benchmark results for unstable vs. this branch are included below.
  • We track from 1 nanosecond to 1 second (everything above 1 second is considered +Inf).

overhead test using redis-benchmark on unstable vs commands.latency.histogram

I've tested using the PING and SET commands (fast ones) to assess the largest possible overhead of histogram latency tracking.
As seen below, there is no measurable overhead on the achievable ops/sec or on the full latency spectrum on the client. This matches the expected behaviour, given that previous microbenchmarks of hdr_record_value (the function used; see reference) took on average 5-6 ns/op.

results on unstable (commit hash = af0b50f)

Used redis-server command

taskset -c 0 ./src/redis-server --save "" --daemonize yes
taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","212512.75","0.220","0.040","0.215","0.279","0.295","24.383"
"PING_MBULK","220716.45","0.213","0.040","0.215","0.263","0.287","28.127"
"SET","175198.86","0.269","0.040","0.263","0.359","0.391","20.383"
"GET","179121.59","0.263","0.040","0.263","0.351","0.367","32.127"

results on commands.latency.histogram branch with latency track disabled

Used redis-server command

taskset -c 0 ./src/redis-server --save "" --latency-tracking no --daemonize yes
taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","212408.94","0.221","0.040","0.223","0.279","0.295","24.383"
"PING_MBULK","219375.22","0.214","0.032","0.215","0.263","0.287","28.367"
"SET","173644.27","0.270","0.040","0.263","0.367","0.391","28.207"
"GET","177619.89","0.265","0.040","0.263","0.351","0.375","28.127"

results on commands.latency.histogram branch with latency track enabled

taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","211528.28","0.223","0.040","0.223","0.279","0.295","24.191"
"PING_MBULK","219664.36","0.214","0.040","0.215","0.263","0.287","24.127"
"SET","175204.98","0.271","0.040","0.263","0.359","0.383","24.399"
"GET","179172.94","0.263","0.040","0.263","0.351","0.375","24.239"

INFO LATENCYSTATS exposition format

  • Format: latency_percentiles_usec_<CMDNAME>:p0=XX,p50....

latencystats output sample:

root@hpe10:~# redis-cli INFO LATENCYSTATS
# Latencystats - latency by percentile distribution
latency_percentiles_usec_zadd:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_hset:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_rpop:p50.000000=0.001,p99.000000=1.003,p99.900000=6.015
latency_percentiles_usec_set:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_zpopmin:p50.000000=0.001,p99.000000=1.003,p99.900000=3.007
latency_percentiles_usec_spop:p50.000000=0.001,p99.000000=1.003,p99.900000=3.007
latency_percentiles_usec_lpush:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_ping:p50.000000=0.001,p99.000000=1.003,p99.900000=1.003
latency_percentiles_usec_lpop:p50.000000=0.001,p99.000000=1.003,p99.900000=5.023
latency_percentiles_usec_lrange:p50.000000=14.015,p99.000000=22.015,p99.900000=56.063
latency_percentiles_usec_incr:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_get:p50.000000=0.001,p99.000000=1.003,p99.900000=3.007
latency_percentiles_usec_mset:p50.000000=1.003,p99.000000=1.003,p99.900000=2.007
latency_percentiles_usec_rpush:p50.000000=0.001,p99.000000=1.003,p99.900000=4.015
latency_percentiles_usec_sadd:p50.000000=0.001,p99.000000=1.003,p99.900000=3.007
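The lines above follow a regular shape and are easy to parse mechanically; a small Python sketch (the helper name is mine, not part of redis-cli):

```python
def parse_latencystats_line(line):
    """Parse one 'latency_percentiles_usec_<cmd>:p<pct>=<usec>,...' INFO line."""
    prefix = "latency_percentiles_usec_"
    name, _, rest = line.partition(":")
    cmd = name[len(prefix):]
    percentiles = {}
    for pair in rest.split(","):
        pct, _, value = pair.partition("=")
        # 'p50.000000' -> 50.0, value is latency in usec
        percentiles[float(pct.lstrip("p"))] = float(value)
    return cmd, percentiles

cmd, pcts = parse_latencystats_line(
    "latency_percentiles_usec_get:p50.000000=0.001,p99.000000=1.003,p99.900000=3.007"
)
```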

LATENCY HISTOGRAM [command ...] exposition format

Return a cumulative distribution of latencies in the format of a histogram for the specified command names.

The histogram is composed of a map of time buckets:

  • Each bucket represents a latency range between 1 nanosecond and roughly 1 second.
  • Each bucket covers twice the previous bucket's range.
  • Empty buckets are not printed.
  • Everything above 1 second is considered +Inf.
  • There will be at most log2(1,000,000,000) ≈ 30 buckets.
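The 30-bucket upper bound follows directly from doubling bucket widths starting at 1 ns; a quick check in Python:

```python
# Power-of-two bucket upper bounds from 1 ns up to 1 s (10**9 ns);
# anything above the last bucket falls into +Inf.
bounds = []
b = 1
while b < 1_000_000_000:
    bounds.append(b)
    b *= 2
```

Since 2**29 < 10**9 <= 2**30, the last bound below one second is 2**29, giving 30 buckets in total.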

We reply with a map for each command in the format <command name> : { calls: <total command calls>, histogram: { <bucket 1>: count, <bucket 2>: count, ... } }

RESP2:

 redis-cli latency histogram set spop
1) "set"
2) 1) "calls"
   2) (integer) 100000
   3) "histogram_usec"
   4) 1) (integer) 1
      2) (integer) 99986
      3) (integer) 4
      4) (integer) 99991
      5) (integer) 16
      6) (integer) 100000
3) "spop"
4) 1) "calls"
   2) (integer) 100000
   3) "histogram_usec"
   4) 1) (integer) 1
      2) (integer) 99991
      3) (integer) 4
      4) (integer) 99995
      5) (integer) 8
      6) (integer) 99997
      7) (integer) 16
      8) (integer) 100000

RESP3:

# redis-cli -3 latency histogram
 1# "set" => 1# "calls" => (integer) 100000
    2# "histogram_usec" => 1# (integer) 1 => (integer) 99986
       2# (integer) 4 => (integer) 99991
       3# (integer) 16 => (integer) 100000
 2# "spop" => 1# "calls" => (integer) 100000
    2# "histogram_usec" => 1# (integer) 1 => (integer) 99991
       2# (integer) 4 => (integer) 99995
       3# (integer) 8 => (integer) 99997
       4# (integer) 16 => (integer) 100000
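A cumulative histogram like the replies above also lets a client estimate percentiles. A hedged Python sketch using the SET histogram from the RESP2 example (the helper is illustrative, not a Redis API):

```python
def estimate_percentile(hist, total_calls, pct):
    """Return the smallest bucket upper bound whose cumulative count
    covers pct percent of all calls."""
    threshold = total_calls * pct / 100.0
    for bucket in sorted(hist):
        if hist[bucket] >= threshold:
            return bucket
    return float("inf")  # above the largest tracked bucket

# Cumulative histogram for SET from the RESP2 example above
set_hist = {1: 99986, 4: 99991, 16: 100000}
p50 = estimate_percentile(set_hist, 100000, 50.0)      # 99986 calls already fit in the 1 usec bucket
p9999 = estimate_percentile(set_hist, 100000, 99.99)   # first bucket covering 99990 calls is 4 usec
```

The estimate is only as precise as the bucket boundaries, so it is coarser than the percentiles INFO LATENCYSTATS reports from the underlying histogram.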

Member

@oranagra oranagra left a comment


Code LGTM (with minor code style issues).
I've briefly read the discussion in the issue, and the compromises you made seem OK.

The one part that's not solved is that the request and some of the discussion were about tracking the full command processing + waiting time, and that's too complicated to handle.
Considering the low impact of this PR on both complexity and performance, I suppose we can take it even if the value is not as high as we wished.

Regarding the format, the only concern I have is what happens if we want to add more metrics in the future, like one that includes the blocked time for blocking commands.
I.e., for normal command stats (total time) we can simply add more fields, but here if we want to do that we need to add more sections.

@madolson
Contributor

madolson commented Sep 6, 2021

@filipecosta90 For the sake of completeness, can you run your perf test again with an -r value greater than 100k? I'd like to make sure the benchmark holds up when basically everything isn't in some level of the CPU cache you probably tested on.

I think this is really exciting!

Contributor

@madolson madolson left a comment


Also looks like you're missing module init?

@YacineTALEB

The size of the hdr_histogram struct from

struct hdr_histogram
{
    int64_t lowest_trackable_value;
    int64_t highest_trackable_value;
    int32_t unit_magnitude;
    int32_t significant_figures;
    int32_t sub_bucket_half_count_magnitude;
    int32_t sub_bucket_half_count;
    int64_t sub_bucket_mask;
    int32_t sub_bucket_count;
    int32_t bucket_count;
    int64_t min_value;
    int64_t max_value;
    int32_t normalizing_index_offset;
    double conversion_ratio;
    int32_t counts_len;
    int64_t total_count;
    int64_t* counts;
};

is 104 bytes itself. The hdr_histogram counts array is stored in int64_t* counts and is of size counts_len when initialized.

Initializing a histogram with the values from aaac165, i.e.,

LATENCY_HISTOGRAM_MIN_VALUE 1L 
LATENCY_HISTOGRAM_MAX_VALUE 1000000L 
LATENCY_HISTOGRAM_PRECISION 3

Would lead to the creation of a histogram with a counts_len of 11264. Overall, the size overhead of a single histogram, according to

int hdr_calculate_bucket_config(
    int64_t lowest_trackable_value,
    int64_t highest_trackable_value,
    int significant_figures,
    struct hdr_histogram_bucket_config* cfg)
{
    int32_t sub_bucket_count_magnitude;
    int64_t largest_value_with_single_unit_resolution;
    if (lowest_trackable_value < 1 ||
        significant_figures < 1 || 5 < significant_figures ||
        lowest_trackable_value * 2 > highest_trackable_value)
    {
        return EINVAL;
    }
    cfg->lowest_trackable_value = lowest_trackable_value;
    cfg->significant_figures = significant_figures;
    cfg->highest_trackable_value = highest_trackable_value;
    largest_value_with_single_unit_resolution = 2 * power(10, significant_figures);
    sub_bucket_count_magnitude = (int32_t) ceil(log((double)largest_value_with_single_unit_resolution) / log(2));
    cfg->sub_bucket_half_count_magnitude = ((sub_bucket_count_magnitude > 1) ? sub_bucket_count_magnitude : 1) - 1;
    cfg->unit_magnitude = (int32_t) floor(log((double)lowest_trackable_value) / log(2));
    cfg->sub_bucket_count = (int32_t) pow(2, (cfg->sub_bucket_half_count_magnitude + 1));
    cfg->sub_bucket_half_count = cfg->sub_bucket_count / 2;
    cfg->sub_bucket_mask = ((int64_t) cfg->sub_bucket_count - 1) << cfg->unit_magnitude;
    if (cfg->unit_magnitude + cfg->sub_bucket_half_count_magnitude > 61)
    {
        return EINVAL;
    }
    cfg->bucket_count = buckets_needed_to_cover_value(highest_trackable_value, cfg->sub_bucket_count, (int32_t)cfg->unit_magnitude);
    cfg->counts_len = (cfg->bucket_count + 1) * (cfg->sub_bucket_count / 2);
    return 0;
}

would be (11264 * 8) + 104 = 90216 bytes.
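The arithmetic above can be reproduced without the C code. A Python sketch mirroring the counts_len computation of hdr_calculate_bucket_config (the helper name is mine; buckets_needed_to_cover_value is assumed to double the covered range until it reaches the highest trackable value):

```python
import math

def hdr_counts_len(lowest, highest, sig_figs):
    """Mirror hdr_calculate_bucket_config's counts_len computation."""
    largest_single_unit = 2 * 10 ** sig_figs
    sub_bucket_count_magnitude = math.ceil(math.log2(largest_single_unit))
    sub_bucket_half_count_magnitude = max(sub_bucket_count_magnitude, 1) - 1
    unit_magnitude = math.floor(math.log2(lowest))
    sub_bucket_count = 2 ** (sub_bucket_half_count_magnitude + 1)
    # buckets_needed_to_cover_value: keep doubling until `highest` is covered
    smallest_untrackable = sub_bucket_count << unit_magnitude
    bucket_count = 1
    while smallest_untrackable < highest:
        smallest_untrackable <<= 1
        bucket_count += 1
    return (bucket_count + 1) * (sub_bucket_count // 2)

counts_len = hdr_counts_len(1, 1_000_000, 3)
total_bytes = counts_len * 8 + 104  # counts array (int64_t each) + struct itself
```

With the values from the comment (min 1, max 1000000, precision 3) this reproduces counts_len = 11264 and a total of 90216 bytes.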

@oranagra
Member

oranagra commented Nov 2, 2021

@filipecosta90 do you wanna pick this up?

@filipecosta90
Contributor Author

@filipecosta90 do you wanna pick this up?

@oranagra this weekend will focus on it. 👍

@filipecosta90
Contributor Author

@oranagra and @madolson I've picked this one up again, and now all tests pass. I also added the ability to enable/disable this at runtime and to configure the exported percentiles.
IMO it's still missing extra testing and extra benchmarks, which I will address in the following days.

@filipecosta90 added the action:run-benchmark label Dec 27, 2021
@filipecosta90
Contributor Author

filipecosta90 commented Dec 27, 2021

@filipecosta90 For the sake of completeness, can you run your perf test again with an -r value greater than 100k? I'd like to make sure the benchmark holds up when basically everything isn't in some level of the CPU cache you probably tested on.

@madolson I've updated the benchmark results with the latest unstable vs. this feature branch (with the feature enabled and disabled) and a larger keyspace length, as requested.

overhead test using redis-benchmark on unstable vs commands.latency.histogram

I've tested using the PING and SET commands (fast ones) to assess the largest possible overhead of histogram latency tracking.
As seen below, there is no measurable overhead on the achievable ops/sec or on the full latency spectrum on the client. This matches the expected behaviour, given that previous microbenchmarks of hdr_record_value (the function used; see reference) took on average 5-6 ns/op.

results on unstable (commit hash = af0b50f)

Used redis-server command

taskset -c 0 ./src/redis-server --save "" --daemonize yes
taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","212512.75","0.220","0.040","0.215","0.279","0.295","24.383"
"PING_MBULK","220716.45","0.213","0.040","0.215","0.263","0.287","28.127"
"SET","175198.86","0.269","0.040","0.263","0.359","0.391","20.383"
"GET","179121.59","0.263","0.040","0.263","0.351","0.367","32.127"

results on commands.latency.histogram branch with latency track disabled

Used redis-server command

taskset -c 0 ./src/redis-server --save "" --latency-tracking no --daemonize yes
taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","212408.94","0.221","0.040","0.223","0.279","0.295","24.383"
"PING_MBULK","219375.22","0.214","0.032","0.215","0.263","0.287","28.367"
"SET","173644.27","0.270","0.040","0.263","0.367","0.391","28.207"
"GET","177619.89","0.265","0.040","0.263","0.351","0.375","28.127"

results on commands.latency.histogram branch with latency track enabled

Used redis-server command

taskset -c 0 ./src/redis-server --save "" --latency-tracking yes --daemonize yes
taskset -c 1,2 redis-benchmark -c 50 --threads 2 -d 1024 -r 1000000 -n 10000000 --csv -t ping,set,get
"test","rps","avg_latency_ms","min_latency_ms","p50_latency_ms","p95_latency_ms","p99_latency_ms","max_latency_ms"
"PING_INLINE","211528.28","0.223","0.040","0.223","0.279","0.295","24.191"
"PING_MBULK","219664.36","0.214","0.040","0.215","0.263","0.287","24.127"
"SET","175204.98","0.271","0.040","0.263","0.359","0.383","24.399"
"GET","179172.94","0.263","0.040","0.263","0.351","0.375","24.239"

@filipecosta90
Contributor Author

@madolson and @oranagra I believe all requested changes have been addressed. When you have time, can you check it?

Contributor

@ranshid ranshid left a comment


@filipecosta90 LGTM. I think @oranagra touched most of the remaining issues.
I placed some small comments, but it looks good. Thank you for this!

…tiplied, do not wrap. zmalloc_oom_handler logs the num_items trying to allocate alongside the item size.
filipecosta90 and others added 2 commits January 5, 2022 11:26
…ds:debug. Removed unecessary stddef header in hdrhistogram
@oranagra
Member

oranagra commented Jan 5, 2022

For the record, the core-team approved this PR on a voice meeting.

@oranagra oranagra merged commit 5dd1544 into redis:unstable Jan 5, 2022
@oranagra
Member

oranagra commented Jan 5, 2022

@filipecosta90 thank you.
can you please make a redis-doc PR for LATENCY HISTOGRAM and INFO

@filipecosta90
Contributor Author

@filipecosta90 thank you. can you please make a redis-doc PR for LATENCY HISTOGRAM and INFO

thank you for all the review cycles!

can you please make a redis-doc PR for LATENCY HISTOGRAM and INFO

will do so



Labels

  • action:run-benchmark: Triggers the benchmark suite for this Pull Request
  • approval-needed: Waiting for core team approval to be merged
  • release-notes: indication that this issue needs to be mentioned in the release notes
  • state:major-decision: Requires core team consensus
  • state:needs-doc-pr: requires a PR to redis-doc repository


Development

Successfully merging this pull request may close these issues.

Question : Latency Histograms per Command

6 participants