Optimize addReplyBulk on sds/int encoded strings: 2.2% to 4% reduction of CPU Time on GET high pipeline use-cases #13644

Merged
ShooterIT merged 3 commits into redis:unstable from filipecosta90:optimize.addReplyBulk on Nov 26, 2024

@fcostaoliveira (Collaborator)

Summary

Profiling a 100% GET workload with 1KiB values at high pipeline depth shows that addReplyBulk and its inner calls take 8.30% of the CPU cycles. This PR reduces the CPU time spent in addReplyBulk by 2.2 to 4 percentage points of total CPU time. For GET use-cases specifically, achievable ops/sec improved by 2.7% to 9.1%.

Improvement

By removing duplicate work we improve throughput by around 2.7% on sds encoded strings and by around 9% on int encoded strings. This PR does the following:

  • Avoid the duplicate sdslen() call in addReplyBulk() for sds encoded objects
  • Avoid the duplicate sdigits10() call in addReplyBulk() for int encoded objects
  • Avoid the final "\r\n" addReplyProto() call for the OBJ_ENCODING_INT type in addReplyBulk()
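
The three changes share one idea: compute the payload length once, then emit the header, the payload, and the trailing CRLF together. The sketch below illustrates that idea in standalone C. It is not the code from this PR: `digits10`, `format_bulk`, and `format_bulk_int` are hypothetical helpers that write into a caller-supplied buffer instead of Redis's client reply buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Redis code): number of decimal digits in v,
 * so the header size is known before anything is formatted. */
static size_t digits10(size_t v) {
    size_t n = 1;
    while (v >= 10) { v /= 10; n++; }
    return n;
}

/* Build a complete RESP bulk reply "$<len>\r\n<payload>\r\n" in one buffer.
 * `len` is computed once by the caller and reused for the header, the copy,
 * and the capacity check. Returns the bytes written, or 0 if `cap` is too
 * small. The result is not NUL-terminated. */
static size_t format_bulk(char *dst, size_t cap, const char *payload, size_t len) {
    size_t hdr = 1 + digits10(len) + 2;   /* '$' + digits + CRLF */
    size_t total = hdr + len + 2;         /* + payload + trailing CRLF */
    if (total + 1 > cap) return 0;        /* +1: snprintf also writes a NUL */
    int w = snprintf(dst, cap, "$%zu\r\n", len);
    assert(w > 0 && (size_t)w == hdr);
    memcpy(dst + hdr, payload, len);
    memcpy(dst + hdr + len, "\r\n", 2);   /* CRLF appended in the same buffer */
    return total;
}

/* Integer-encoded value: render the digits once and reuse that length,
 * instead of counting digits for the header and formatting again for the body. */
static size_t format_bulk_int(char *dst, size_t cap, long long value) {
    char digits[32];
    int n = snprintf(digits, sizeof(digits), "%lld", value);
    assert(n > 0 && (size_t)n < sizeof(digits));
    return format_bulk(dst, cap, digits, (size_t)n);
}
```

Calling `format_bulk_int(buf, sizeof(buf), 1)` in this sketch produces the 7-byte reply `$1\r\n1\r\n` in a single write to the buffer.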

Altogether, these changes result in the following improvement in achievable ops/sec:

Encoding | unstable (commit 9906daf), ops/sec | this PR, ops/sec | % improvement
-- | -- | -- | --
1KiB Values string SDS encoded | 1478081.88 | 1517635.38 | 2.7%
Values string "1" OBJ_ENCODING_INT | 1521139.36 | 1658876.59 | 9.1%
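
The "% improvement" column is the relative gain in ops/sec. As a quick check of the arithmetic, here is a minimal sketch; `pct_improvement` is a hypothetical helper name, not part of Redis or memtier_benchmark.

```c
#include <assert.h>

/* Relative throughput gain in percent: (after - before) / before * 100. */
static double pct_improvement(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the table's values, `pct_improvement(1478081.88, 1517635.38)` is about 2.7 and `pct_improvement(1521139.36, 1658876.59)` is about 9.1.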

CPU Time: Total of addReplyBulk

Encoding | unstable (commit 9906daf) | this PR | reduction of CPU Time: Total (percentage points)
-- | -- | -- | --
1KiB Values string SDS encoded | 8.30% | 6.10% | 2.2
Values string "1" OBJ_ENCODING_INT | 7.20% | 3.20% | 4.0
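
The reduction reported here is a difference between two shares of total CPU time, so it is measured in percentage points rather than as a relative change. The sketch below separates the two readings; `point_reduction` and `relative_reduction` are hypothetical helper names used only for illustration.

```c
#include <assert.h>

/* Absolute reduction in percentage points: share before minus share after. */
static double point_reduction(double before_pct, double after_pct) {
    return before_pct - after_pct;
}

/* Relative reduction in percent: how much of the function's own cost went away. */
static double relative_reduction(double before_pct, double after_pct) {
    assert(before_pct > 0.0);
    return (before_pct - after_pct) / before_pct * 100.0;
}
```

Read this way, the drop from 8.30% to 6.10% is 2.2 points, or roughly 26.5% of addReplyBulk's own cost, and the drop from 7.20% to 3.20% is 4.0 points, or roughly 55.6%.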

To reproduce

Run redis with unix socket enabled

taskset -c 0 /root/redis/src/redis-server  --unixsocket /tmp/1.socket --save '' --enable-debug-command local

1KiB Values string SDS encoded

Load data

taskset -c 2-5 memtier_benchmark  --ratio 1:0 -n allkeys --key-pattern P:P --key-maximum 1000000  --hide-histogram  --pipeline 10 -S /tmp/1.socket

Benchmark

taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500  -S /tmp/1.socket --key-maximum 1000000 --json-out-file results.json

Values string "1" OBJ_ENCODING_INT

Load data

$ taskset -c 2-5 memtier_benchmark  --command "SET __key__ 1" -n allkeys --command-key-pattern P --key-maximum 1000000  --hide-histogram -c 1 -t 1  --pipeline 100 -S /tmp/1.socket

# confirm we have the expected reply and format 
$ redis-cli get memtier-1
"1"

$ redis-cli debug object memtier-1
Value at:0x7f14cec57570 refcount:2147483647 encoding:int serializedlength:2 lru:2861503 lru_seconds_idle:8

Benchmark

taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500  -S /tmp/1.socket --key-maximum 1000000 --json-out-file results.json

fcostaoliveira added the action:run-benchmark label (Triggers the benchmark suite for this Pull Request) on Nov 6, 2024
fcostaoliveira (Collaborator, Author) commented Nov 14, 2024

CE Performance Automation : step 2 of 2 (benchmark) FINISHED.

This comment was automatically generated given a benchmark was triggered.

Started benchmark suite at 2024-11-17 08:45:57.839755 and took 7430.827267 seconds to finish.
Status: [################################################################################] 100.0% completed.

In total will run 141 benchmarks.
- 0 pending.
- 141 completed:
- 0 successful.
- 141 failed.
You can check the status in detail via the grafana link

ShooterIT (Member) left a comment

looks great

ShooterIT merged commit a106198 into redis:unstable on Nov 26, 2024
sundb added this to Redis 8.0 on Aug 15, 2025
sundb moved this to Done in Redis 8.0 on Aug 15, 2025
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
…n of CPU Time on GET high pipeline use-cases (redis#13644)
