
Optimize client memory usage tracking operation while client eviction is disabled#11348

Merged
oranagra merged 15 commits into redis:unstable from
hpatro:optimize_client_memory_usage
Dec 7, 2022

Conversation

@hpatro
Contributor

@hpatro hpatro commented Oct 3, 2022

Issue

During the client input/output buffer processing, the memory usage is incrementally updated to keep track of clients going beyond the maxmemory-clients threshold, so that they can be evicted. However, this additional tracking wastes CPU cycles when no client eviction is possible. This applies in two cases.

  • maxmemory-clients is set to 0 which equates to no client eviction (applicable to all clients)
  • CLIENT NO-EVICT flag is set to ON, which exempts that particular client from eviction.

Solution

  • Disable client memory usage tracking during the read/write flow when maxmemory-clients is set to 0 or client no-evict is on. The memory usage is then tracked only in clientsCron, i.e. it is updated periodically.
  • Cleanup the clients from the memory usage bucket when client eviction is disabled.
  • When the maxmemory-clients config is enabled or disabled at runtime, we immediately update the memory usage buckets for all clients (tested scanning 80000 took some 20ms)
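The gating decision described above can be sketched as follows (a minimal model, not Redis source; the function and parameter names here are illustrative):

```python
# Sketch (not Redis source): models when per-command memory tracking is skipped.
# `maxmemory_clients` and `no_evict` mirror the config and flag described above.

def should_update_memory_and_bucket(maxmemory_clients: int, no_evict: bool) -> bool:
    """Return True only when client eviction can actually apply to this client."""
    if maxmemory_clients == 0:   # eviction disabled globally
        return False
    if no_evict:                 # CLIENT NO-EVICT ON for this client
        return False
    return True
```

With eviction disabled, the hot read/write path skips the tracking entirely; clientsCron still refreshes the usage figure roughly once per second.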

Benchmarks show that this can improve performance by about 5% in certain situations.

Testing

Redis Benchmark

./redis-benchmark -t set -r 100000 -n 10000000

While the above benchmark was running, I ran the following CPU profiling command:

perf record -g --pid $(pgrep redis-server) -F 999 -- sleep 60

Machine

c5.4xlarge

Observation

The CPU utilization of updateClientMemoryUsage dropped to zero, from 2.39% during writes and 1.07% during reads.

FlameGraph on the current branch: [image]

FlameGraph on the unstable branch: [image]

@hpatro hpatro requested review from madolson and oranagra October 3, 2022 03:40
@oranagra
Member

oranagra commented Oct 3, 2022

@hpatro can you post some benchmark results as well?

please keep me honest (didn't review the code carefully)
this change only affects the client eviction buckets, right?
it doesn't prevent any memory introspection on the clients' memory usage?
and when client eviction is eventually enabled, it'll become effective after 1 second (when cron scanned all clients), right? (without any accounting gaps)

@hpatro
Contributor Author

hpatro commented Oct 4, 2022

@hpatro can you post some benchmark results as well?

Will gather that shortly.

this change only affects the client eviction buckets, right?
it doesn't prevent any memory introspection on the clients' memory usage?

Correct. Memory introspection is still present but the accounting only happens via the clientsCron and not during input/output buffer modification when client-eviction is not enabled.

and when client eviction is eventually enabled, it'll become effective after 1 second (when cron scanned all clients), right? (without any accounting gaps)

When client-eviction is eventually enabled, I'm updating the memory usage and placing each client in the correct bucket during the config update itself. This is an O(N) operation (N = number of clients), but it avoids any race between client eviction and the client memory usage update.
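For illustration, the O(N) re-bucketing on CONFIG SET could look roughly like this (a sketch with made-up bucket bounds; Redis's actual constants and bucket scheme differ):

```python
# Hypothetical model of the O(N) re-bucketing done when the config is enabled:
# clients are grouped into power-of-two size buckets, clamped to a fixed range.
# MIN_LOG/MAX_LOG here are illustrative, not the real Redis constants.
MIN_LOG, MAX_LOG = 15, 33          # ~32KB .. ~8GB

def bucket_index(mem_usage: int) -> int:
    """Map a memory usage figure to a clamped log2 bucket index."""
    log = max(MIN_LOG, min(MAX_LOG, mem_usage.bit_length() - 1))
    return log - MIN_LOG

def rebucket_all(clients):
    """Scan every client once and place it in its bucket (O(N))."""
    buckets = {}
    for c in clients:
        buckets.setdefault(bucket_index(c["mem"]), []).append(c["id"])
    return buckets
```

A single pass like this over 80k clients is what the ~20ms figure mentioned later in the thread corresponds to.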

@oranagra
Member

oranagra commented Oct 4, 2022

ohh, so it's not just the buckets that aren't updated; it also means the INFO MEMORY fields are slightly outdated (up to the last second).
IIRC this was the situation in redis 6.2 (before client eviction was introduced).

I'm not sure an O(N) operation on all clients in CONFIG SET is acceptable. i'd rather let it kick in with a delay.
but on the other hand, we can't afford it to take any action until all the clients are set to their buckets.

@yoav-steinberg maybe you can have a look at this.

@hpatro
Contributor Author

hpatro commented Oct 4, 2022

ohh, so it's not just the buckets that's not updated, it also means the INFO MEMORY fields are slightly outdated (up to the last second). IIRC this was the situation in redis 6.2 (before client eviction was introduced).

Yes, the same behavior as in 6.2.

@yoav-steinberg
Contributor

I'm not sure an O(N) operation on all clients in CONFIG SET is acceptable. i'd rather let it kick in with a delay.
but on the other hand, we can't afford it to take any action until all the clients are set to their buckets.

Worth checking if there are some other slow CONFIG SETs. If so, it might be enough to document this and leave the code as is. Alternatively, you can track how many clients aren't flagged to be excluded from client eviction and how many clients are currently updated in the buckets; if the numbers don't match, you skip the client eviction checks, because you can assume the data isn't updated yet.
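The counter-matching alternative suggested above could be sketched roughly like this (hypothetical names; note this is not the approach that was merged):

```python
# Sketch of the suggested alternative: keep two counters and only run
# eviction once bucket accounting has caught up with the client population.

class EvictionState:
    def __init__(self):
        self.evictable_clients = 0   # clients not flagged CLIENT NO-EVICT
        self.bucketed_clients = 0    # clients currently placed in a bucket

    def eviction_safe(self) -> bool:
        # If the counts disagree, the cron hasn't scanned everyone yet,
        # so skip eviction rather than act on incomplete accounting.
        return self.evictable_clients == self.bucketed_clients
```

This would let the config change return immediately, at the cost of eviction kicking in only after the cron has visited every client.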

@hpatro
Contributor Author

hpatro commented Oct 20, 2022

Benchmark update

With a single redis-benchmark process, the redis-server wasn't getting saturated and the results were turning out to be flaky. Hence, I used the following approach to measure the performance for this.

  1. Used multiple redis-benchmark process(es) to saturate redis-server (CPU utilization 100%).
  2. A Python script was used to measure the average ops/sec based on the instantaneous_ops_per_sec reported in the info stats section. The average values reported below are computed over a period of at least 60s.
import redis
import time

# Poll instantaneous_ops_per_sec once per second and keep a running average.
r = redis.Redis(host='localhost', port=6379, db=0)
count = 0
total = 0
while True:
    count += 1
    cur_ops = r.info('stats')['instantaneous_ops_per_sec']
    total += cur_ops
    print("Current Ops:", cur_ops, "Average:", total / count)
    time.sleep(1)

Machine

c5.4xlarge

Benchmark 1: 2 x redis-benchmark -q -n 100000 -t set -l (Roughly 100 clients)

Average ops/sec with optimization: 187994
Average ops/sec without optimization: 183590
Gain in performance: 2.3%

Benchmark 2: 2 x redis-benchmark -q -r 100000 -n 100000 -t set,get -l (Roughly 100 clients)

Average ops/sec with optimization: 176319
Average ops/sec without optimization: 171587
Gain in performance: 2.7%

Benchmark 3: 2 x redis-benchmark -q -r 100000 -n 100000 -t set,get -l -c 100 (Roughly 200 clients)

Average ops/sec with optimization: 170203
Average ops/sec without optimization: 164883
Gain in performance: 3.2%

Benchmark 4: 4 x src/redis-benchmark -q -r 100000 -n 100000 -t set,get -l -c 1000 (Roughly 4000 clients)

Average ops/sec with optimization: 130238
Average ops/sec without optimization: 117926
Gain in performance: 10.4%

All the data points can be found here: https://gist.github.com/hpatro/a98297a9f29bd610315bd9731d428d52

@hpatro
Contributor Author

hpatro commented Oct 20, 2022

@oranagra Also, I tried switching the config maxmemory-clients from zero to non-zero multiple times with 10000/20000 open clients and the config set operation was instantaneous. I guess the O(N) operation here might not be an issue. I could add a note in redis.conf to highlight this if we proceed with the above approach.

@oranagra
Member

@hpatro can you measure the LATENCY LATEST or SLOWLOG value of that CONFIG SET? (instantaneous doesn't mean a lot to me).
also, i've seen cases with 70k clients, let's measure that.

@oranagra
Member

@hpatro any news here on my last request?

@hpatro
Contributor Author

hpatro commented Nov 22, 2022

@oranagra Couldn't get back to this earlier. I will try to pull the number this week.

@hpatro
Contributor Author

hpatro commented Nov 24, 2022

I tested the scenario by running multiple redis-benchmark processes with the following flags:

./redis-benchmark -q  -n 10000000 -t set -c 10000 -h <host> -p <port>

After reaching the desired client count, I ran the following:

  1. Disable client memory usage tracking.
     CONFIG SET maxmemory-clients 0
  2. Cleanup the info stats metrics.
     CONFIG RESETSTAT
  3. Enable client memory usage tracking.
     CONFIG SET maxmemory-clients 100%
  4. Get the metrics for config|set.
     INFO COMMANDSTATS
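For reference, a small helper to extract the average latency from the steps above (assuming the usual cmdstat field format returned by INFO COMMANDSTATS, e.g. "calls=1,usec=21106,usec_per_call=21106.00,..."):

```python
# Parse a cmdstat value string (as found under cmdstat_config|set in
# INFO COMMANDSTATS) and pull out the average microseconds per call.

def usec_per_call(commandstats_value: str) -> float:
    """Return the usec_per_call field from a 'k=v,k=v,...' cmdstat string."""
    fields = dict(kv.split("=", 1) for kv in commandstats_value.split(","))
    return float(fields["usec_per_call"])
```

Since CONFIG RESETSTAT runs right before the CONFIG SET being measured, calls=1 and usec_per_call equals the latency of that single invocation.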

And here are the numbers (3 runs):

| Active client count | Min (us) | Max (us) | Avg (us) |
|---|---|---|---|
| 80000 | 20630 | 21640 | 21106 |
| 60000 | 15656 | 16360 | 16047 |
| 40000 | 10991 | 12305 | 11745 |
| 20000 | 6447 | 7179 | 7021 |
| 10000 | 3895 | 4778 | 4263 |
| 5000 | 1511 | 2236 | 1963 |

The time spent on this operation seems reasonable to me. I tested a few more scenarios with a higher payload size via -d, but the execution time of CONFIG SET maxmemory-clients 100% was similar, presumably because the syscall count is the same.

@oranagra Let me know what you think.

@oranagra
Member

ok, so that config takes only some 20ms for 80k clients. sounds ok to me

* rename updateClientMemUsage so that if there are any uses in old branches and
  forks it'll fail to build (due to the role change)
* move updateClientMemUsage to create a smaller diff
* updateClientMemUsageAndBucket: make sure to clear the bucket even if
  client eviction is disabled
* clientsCron calls updateClientMemUsageAndBucket when client eviction is
  enabled
* unify test code of new tests and some minor cleanups
@oranagra
Member

oranagra commented Dec 5, 2022

using src/redis-benchmark -t set -r 100000 -n 10000000 -P 10 i can see 5% throughput improvement with default configuration (save and maxmemory-clients disabled)

@oranagra
Member

oranagra commented Dec 5, 2022

@hpatro @madolson i implemented my notes since i would like to include this in the next 7.0 release.
please have a look and let me know if i can proceed.

scan the clients only when the config changes state from enabled to
disabled or vice versa.

change server.client_mem_usage_buckets to be runtime allocated and use
it as an indication of the previous config.
@oranagra oranagra merged commit c0267b3 into redis:unstable Dec 7, 2022
@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Dec 7, 2022
@filipecosta90 filipecosta90 added the action:run-benchmark Triggers the benchmark suite for this Pull Request label Dec 7, 2022
@filipecosta90
Contributor

filipecosta90 commented Dec 7, 2022

Confirmed that the updateClientMemUsage overhead reported in #10981 (comment) on the default redis-benchmark run was mitigated:
doing a perf record of the default redis-benchmark before and after using:

perf record -g --pid $(pgrep redis-server) -F 999 --call-graph dwarf -o default-run.out -- taskset -c 1-3 redis-benchmark --csv -n 300000 --threads 3

I verified that before the change it took about 1% of CPU cycles on the
unstable branch - 8a315fc :

Samples: 52K of event 'cycles', Event count (approx.): 188766963769
  Children      Self  Command       Shared Objec  Symbol
-    0.99%     0.19%  redis-server  redis-server  [.] updateClientMemUsage                                                                                                                                 
   - 0.80% updateClientMemUsage                                                                                                                                                                            
        0.78% getClientMemoryUsage           

and after the change, 0.06% on unstable - c0267b3 (after "Optimize client memory usage tracking operation"):

Samples: 51K of event 'cycles', Event count (approx.): 183444348521
  Children      Self  Command       Shared Objec  Symbol
     0.06%     0.06%  redis-server  redis-server  [.] updateClientMemUsageAndBucket

@oranagra
Member

oranagra commented Dec 7, 2022

@filipecosta90
you should probably compare the calls for updateClientMemUsage with updateClientMemoryUsage, not updateClientMemUsageAndBucket.
i.e. all the places that called updateClientMemUsage now call updateClientMemoryUsage, and the AndBucket is almost never used.

i'm not sure we can compare a specific function to another one in this case; updateClientMemUsageAndBucket is getting called a lot but immediately returns, and there are other changes that complement that.

for this kind of change, i'd rather see the impact on the throughput rather than cost of a specific function (not being a local optimization to a function with equivalent capabilities).

@oranagra oranagra mentioned this pull request Dec 11, 2022
oranagra added a commit that referenced this pull request Dec 12, 2022
… is disabled (#11348)

Co-authored-by: Oran Agra <[email protected]>
(cherry picked from commit c0267b3)
oranagra pushed a commit that referenced this pull request Dec 28, 2022
…ors (#11657)

This call was introduced in #8687, but became irrelevant in #11348 and is currently a no-op.
In fact, #11348 had an unintended side effect: even if the client eviction config
is enabled, there are certain types of clients whose memory consumption is not accurately
tracked, and so unlike normal clients, their memory isn't reported correctly in INFO.
oranagra pushed a commit that referenced this pull request Jan 16, 2023
…ors (#11657)


(cherry picked from commit af0a4fe)
madolson pushed a commit to madolson/redis that referenced this pull request Apr 19, 2023
… is disabled (redis#11348)

enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
… is disabled (redis#11348)

enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
…ors (redis#11657)

