Optimize addReplyBulk on sds/int encoded strings: 2.2% to 4% reduction of CPU Time on GET high pipeline use-cases #13644
Merged
ShooterIT merged 3 commits into redis:unstable on Nov 26, 2024
…oid sdigits10() call on int encoded objects on addReplyBulk()
CE Performance Automation: step 2 of 2 (benchmark) FINISHED. This comment was automatically generated given a benchmark was triggered. Started benchmark suite at 2024-11-17 08:45:57.839755 and took 7430.827267 seconds to finish. In total will run 141 benchmarks.
sundb approved these changes on Nov 22, 2024
funny-dog pushed a commit to funny-dog/redis that referenced this pull request on Sep 17, 2025
…n of CPU Time on GET high pipeline use-cases (redis#13644)

### Summary

By profiling a 1KiB 100% GET use-case at high pipeline depth, we can see that addReplyBulk and its inner calls take 8.30% of the CPU cycles. This PR reduces the CPU time spent in addReplyBulk by 2.2 to 4.0 percentage points. Specifically for GET use-cases, the achievable ops/sec improved by 2.7% to 9.1%.

### Improvement

By reducing the duplicate work we can improve by around 2.7% on sds encoded strings, and around 9% on int encoded strings. This PR does the following:

- Avoid duplicate sdslen on addReplyBulk() for sds encoded objects
- Avoid duplicate sdigits10() call on int encoded objects on addReplyBulk()
- Avoid final "\r\n" addReplyProto() in the OBJ_ENCODING_INT type on addReplyBulk

Altogether these improvements result in the following improvement on the achievable ops/sec:

Encoding | unstable (commit 7f38c7b) | this PR | % improvement
-- | -- | -- | --
1KiB Values string SDS encoded | 1478081.88 | 1517635.38 | 2.7%
Values string "1" OBJ_ENCODING_INT | 1521139.36 | 1658876.59 | 9.1%

### CPU Time: Total of addReplyBulk

Encoding | unstable (commit 7f38c7b) | this PR | reduction of CPU Time: Total (percentage points)
-- | -- | -- | --
1KiB Values string SDS encoded | 8.30% | 6.10% | 2.2%
Values string "1" OBJ_ENCODING_INT | 7.20% | 3.20% | 4.0%

### To reproduce

Run redis with unix socket enabled

```
taskset -c 0 /root/redis/src/redis-server --unixsocket /tmp/1.socket --save '' --enable-debug-command local
```

#### 1KiB Values string SDS encoded

Load data

```
taskset -c 2-5 memtier_benchmark --ratio 1:0 -n allkeys --key-pattern P:P --key-maximum 1000000 --hide-histogram --pipeline 10 -S /tmp/1.socket
```

Benchmark

```
taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500 -S /tmp/1.socket --key-maximum 1000000 --json-out-file results.json
```

#### Values string "1" OBJ_ENCODING_INT

Load data

```
$ taskset -c 2-5 memtier_benchmark --command "SET __key__ 1" -n allkeys --command-key-pattern P --key-maximum 1000000 --hide-histogram -c 1 -t 1 --pipeline 100 -S /tmp/1.socket

# confirm we have the expected reply and format
$ redis-cli get memtier-1
"1"
$ redis-cli debug object memtier-1
Value at:0x7f14cec57570 refcount:2147483647 encoding:int serializedlength:2 lru:2861503 lru_seconds_idle:8
```

Benchmark

```
taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500 -S /tmp/1.socket --key-maximum 1000000 --json-out-file results.json
```
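To illustrate the kind of duplicate work the three changes above remove, here is a minimal standalone sketch in C. It is not the code from this PR and does not use Redis internals: `format_bulk_int` and `format_bulk_buf` are hypothetical names, and `snprintf` stands in for Redis's own integer-to-string and digit-counting helpers. The idea shown is only that a RESP bulk reply (`$<len>\r\n<payload>\r\n`) can be built from a length that is computed once, with the trailing `\r\n` written into the same buffer rather than appended by a separate call.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* Hypothetical helper (not a Redis function): format a RESP bulk string
 * reply for an integer value into `out`. The integer is converted to text
 * exactly once, and the length returned by that conversion is reused for
 * the "$<len>\r\n" header, so no separate digit-counting pass is needed.
 * The trailing "\r\n" goes into the same buffer in the same call.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or 0 if `out` is too small. */
size_t format_bulk_int(char *out, size_t outlen, long long value) {
    char digits[32];
    int dlen = snprintf(digits, sizeof(digits), "%lld", value);
    if (dlen < 0 || (size_t)dlen >= sizeof(digits)) return 0;

    int n = snprintf(out, outlen, "$%d\r\n%s\r\n", dlen, digits);
    if (n < 0 || (size_t)n >= outlen) return 0;
    return (size_t)n;
}

/* Hypothetical helper (not a Redis function): format a RESP bulk string
 * reply for a buffer whose length is already known, as it is for an
 * sds-style string. `len` is passed in once and reused for both the header
 * and the copy. The output is not NUL-terminated.
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t format_bulk_buf(char *out, size_t outlen, const char *data, size_t len) {
    int h = snprintf(out, outlen, "$%zu\r\n", len);
    if (h < 0 || (size_t)h + len + 2 > outlen) return 0;

    memcpy(out + h, data, len);          /* payload */
    memcpy(out + h + len, "\r\n", 2);    /* terminator, same buffer */
    return (size_t)h + len + 2;
}
```

Under these assumptions, `format_bulk_int(buf, sizeof(buf), 1)` writes the 7 bytes `$1\r\n1\r\n`, which is the wire form of the `"1"` reply checked with `redis-cli get memtier-1` above.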