Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation #13505

fcostaoliveira · 2024-08-29T23:09:26Z

Avoid addReplyLongLong (which converts back to string) the value we already have as a robj, by using addReplyProto + addReply
Avoid doing dbFind Twice for the same dictEntry on INCR*/DECR*/SETRANGE/APPEND commands.
Avoid multiple sdslen calls with the same input on setrangeCommand and appendCommand
Introduce setKeyWithDictEntry, which is like setKey(), but accepts an optional dictEntry input: Avoids the second dictFind in SET command

TODO:

confirm green CI
confirm green daily
cover it on benchmark spec.
Update this PR with benchmark spec automated comment. check Optimize SET/INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation #13505 (comment)

Hotspot data that triggered this effort

Hotspots sample on incrDecr* commands (we can save 3.3% on the 2nd dictFind)

Hotspots on set command (we can save 2.1% on the 2nd dictFind)

Measuring the improvements

pip3 install redis-benchmarks-specification==0.1.222
redis-benchmarks-spec-client-runner --tests-regexp "memtier_benchmark-1Mkeys-string-append-1-100B|memtier_benchmark-1Mkeys-string-incrby|memtier_benchmark-1Mkeys-string-incrbyfloat|memtier_benchmark-1Mkeys-string-decr|memtier_benchmark-1Mkeys-string-setrange-100B|memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10|memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10" --flushall_on_every_test_start --flushall_on_every_test_end --override-memtier-test-time 60 --cpuset_start_pos 1

Improvements on STRING SET command

Throughput

Test Name	baseline unstable ( `3fcddfb` )	comparison( `93fae76` )	% change
-----------------------------------------------------------------	-----------:	-----------:	-----------:
memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10	578505	592740	2.46%
memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10	644028	663759	3.06%

Avg. Latency

Test Name	baseline unstable ( `3fcddfb` )	comparison( `93fae76` )	% change
-----------------------------------------------------------------	-----------:	-----------:	-----------:
memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10	3.455	3.372	2.40%
memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10	3.103	3.011	2.96%

Improvements on INCRBY command

Baseline

root@hpe10:~/redis# taskset -c 1-5 memtier_benchmark --pipeline 10 --test-time 60  --hide-histogram --command='INCRBY __key__ 1'
(...)
[RUN #1 100%,  60 secs]  0 threads:    45616940 ops,  750371 (avg:  760212) ops/sec, 36.42MB/sec (avg: 36.91MB/sec),  2.66 (avg:  2.63) msec latency

4         Threads
50        Connections per thread
60        Seconds


ALL STATS
==================================================================================================
Type         Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
--------------------------------------------------------------------------------------------------
Incrbys    760212.36         2.62864         2.55900         5.15100         9.98300     37795.06 
Totals     760212.36         2.62864         2.55900         5.15100         9.98300     37795.06

Comparison

root@hpe10:~/redis# taskset -c 1-5 memtier_benchmark --pipeline 10 --test-time 60  --hide-histogram --command='INCRBY __key__ 1'
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%,  60 secs]  0 threads:    47423740 ops,  789698 (avg:  790317) ops/sec, 37.92MB/sec (avg: 37.95MB/sec),  2.53 (avg:  2.53) msec latency

4         Threads
50        Connections per thread
60        Seconds


ALL STATS
==================================================================================================
Type         Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
--------------------------------------------------------------------------------------------------
Incrbys    790316.92         2.52839         2.44700         4.92700         9.91900     38860.38 
Totals     790316.92         2.52839         2.44700         4.92700         9.91900     38860.38

src/t_string.c

…void multiple sdslen calls with the same input on setrangeCommand and appendCommand.

… optional dictEntry input: Avoids the second dictFind in SET command

fcostaoliveira · 2024-08-30T23:52:45Z

Automated performance analysis summary

This comment was automatically generated given there is performance data available.

Using platform named: intel64-ubuntu22.04-redis-icx1 to do the comparison.

In summary:

Detected a total of 7 improvements above the improvement water line.

You can check a comparison in detail via the grafana link

Comparison between unstable and perf.incrby.

Time Period from 5 months ago. (environment used: oss-standalone)

Improvements Table

Test Case	Baseline unstable (median obs. +- std.dev)	Comparison perf.incrby (median obs. +- std.dev)	% change (higher-better)	Note
memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10	636952 +- 1.4% (7 datapoints)	658638	3.4%	IMPROVEMENT
memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10	753946 +- 2.0% (7 datapoints)	786432	4.3%	waterline=2.0%. IMPROVEMENT
memtier_benchmark-1Mkeys-string-append-1-100B-pipeline-10	767185	792867	3.3%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-incrby-pipeline-10	891079	906416	1.7%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10	449176	457083	1.8%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10	614291	686197	11.7%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-setrange-100B-pipeline-10	812508	826357	1.7%	IMPROVEMENT

Improvements test regexp names: memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10|memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10|memtier_benchmark-1Mkeys-string-append-1-100B-pipeline-10|memtier_benchmark-1Mkeys-string-incrby-pipeline-10|memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10|memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10|memtier_benchmark-1Mkeys-string-setrange-100B-pipeline-10

Full Results table:

Test Case	Baseline unstable (median obs. +- std.dev)	Comparison perf.incrby (median obs. +- std.dev)	% change (higher-better)	Note
memtier_benchmark-1Mkeys-load-string-with-100B-values-pipeline-10	636952 +- 1.4% (7 datapoints)	658638	3.4%	IMPROVEMENT
memtier_benchmark-1Mkeys-load-string-with-10B-values-pipeline-10	753946 +- 2.0% (7 datapoints)	786432	4.3%	waterline=2.0%. IMPROVEMENT
memtier_benchmark-1Mkeys-string-append-1-100B-pipeline-10	767185	792867	3.3%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-incrby-pipeline-10	891079	906416	1.7%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-incrbyfloat-pipeline-10	449176	457083	1.8%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-setex-100B-pipeline-10	614291	686197	11.7%	IMPROVEMENT
memtier_benchmark-1Mkeys-string-setrange-100B-pipeline-10	812508	826357	1.7%	IMPROVEMENT

…ts an optional dictEntry input: Avoids the second kvstoreDictFind in SETEX command

src/db.c

src/t_string.c

Co-authored-by: debing.sun <[email protected]>

src/db.c

src/t_string.c

…UnshareStringValueWithDictEntry. Commented the dictEntry reference input in lookupKey()

src/db.c

oranagra

LGTM. (assuming all comments are addressed)

src/db.c

src/networking.c

Co-authored-by: debing.sun <[email protected]>

src/t_string.c

Co-authored-by: debing.sun <[email protected]>

### New Features in binary distributions - 7 new data structures: JSON, Time series, Bloom filter, Cuckoo filter, Count-min sketch, Top-k, t-digest - Redis scalable query engine (including vector search) ### Potentially breaking changes - #12272 `GETRANGE` returns an empty bulk when the negative end index is out of range - #12395 Optimize `SCAN` command when matching data type ### Bug fixes - #13510 Fix `RM_RdbLoad` to enable AOF after RDB loading is completed - #13489 `ACL CAT` - return module commands - #13476 Fix a race condition in the `cache_memory` of `functionsLibCtx` - #13473 Fix incorrect lag due to trimming stream via `XTRIM` command - #13338 Fix incorrect lag field in `XINFO` when tombstone is after the `last_id` of the consume group - #13470 On `HDEL` of last field - update the global hash field expiration data structure - #13465 Cluster: Pass extensions to node if extension processing is handled by it - #13443 Cluster: Ensure validity of myself when loading cluster config - #13422 Cluster: Fix `CLUSTER SHARDS` command returns empty array ### Modules API - #13509 New API calls: `RM_DefragAllocRaw`, `RM_DefragFreeRaw`, and `RM_RegisterDefragCallbacks` - defrag API to allocate and free raw memory ### Performance and resource utilization improvements - #13503 Avoid overhead of comparison function pointer calls in listpack `lpFind` - #13505 Optimize `STRING` datatype write commands - #13499 Optimize `SMEMBERS` command - #13494 Optimize `GEO*` commands reply - #13490 Optimize `HELLO` command - #13488 Optimize client query buffer - #12395 Optimize `SCAN` command when matching data type - #13529 Optimize `LREM`, `LPOS`, `LINSERT`, and `LINDEX` commands - #13516 Optimize `LRANGE` and other commands that perform several writes to client buffers per call - #13431 Avoid `used_memory` contention when updating from multiple threads ### Other general improvements - #13495 Reply `-LOADING` on replica while flushing the db ### CLI tools - #13411 redis-cli: Fix wrong `dbnum` showed after the client reconnected ### Notes - No backward compatibility for replication or persistence. - Additional distributions, upgrade paths, features, and improvements will be introduced in upcoming pre-releases. - With the GA release of 8.0 we will deprecate Redis Stack.

oranagra · 2025-03-24T06:58:53Z

src/t_string.c

        } else {
            dbAdd(c->db,c->argv[1],new);
        }
    }
    signalModifiedKey(c,c->db,c->argv[1]);
    notifyKeyspaceEvent(NOTIFY_STRING,"incrby",c->argv[1],c->db->id);
    server.dirty++;
-    addReplyLongLong(c, value);
+    addReplyLongLongFromStr(c,new);


this change can cause use-after-free in the tests/unit/moduleapi/keyspace_events.tcl test, when the test uses RM_StringDMA to modify the value from within the KSN, and this command uses it after it was released.
the reason it doesn't happen is because the eviction policy allows for shared integers, so despite "releasing" this object from within the KSN. it doesn't free the memory.

it can be solved by moving the reply a few lines upwards.
we can also argue that the module is violating redis, but it used to work so far (specifically for INCR), and the optimization broke it, so shall we fix that?
WDYT?

in redis#13505, we changed the code to use the string value of the key rather than the integer value on the stack, but we have a test in unit/moduleapi/keyspace_events that uses keyspace notification hook to modify the value with RM_StringDMA, which can cause this value to be released before used. the reason it didn't happen so far is because we were using shared integers, so releasing the object doesn't free it.

in #13505, we changed the code to use the string value of the key rather than the integer value on the stack, but we have a test in unit/moduleapi/keyspace_events that uses keyspace notification hook to modify the value with RM_StringDMA, which can cause this value to be released before used. the reason it didn't happen so far is because we were using shared integers, so releasing the object doesn't free it.

fcostaoliveira added 2 commits August 29, 2024 23:18

Avoid _addReplyLongLongWithPrefix in incrby/decrby commands

31d0a41

Avoid 2x dbFind for the same dictEntry*

f1dddc2

sundb reviewed Aug 30, 2024

View reviewed changes

src/t_string.c Outdated Show resolved Hide resolved

fcostaoliveira changed the title ~~Optimize INCRBY/DECRBY by reducing duplicate computation~~ Optimize INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation Aug 30, 2024

fcostaoliveira added 2 commits August 30, 2024 11:22

Applied same dbFind reusal to SETRANGE/APPEND/INCRBYFLOAT commands. A…

8580a59

…void multiple sdslen calls with the same input on setrangeCommand and appendCommand.

Introduce setKeyWithDictEntry, which is like setKey(), but accepts an…

93fae76

… optional dictEntry input: Avoids the second dictFind in SET command

fcostaoliveira changed the title ~~Optimize INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation~~ Optimize SET/INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation Aug 30, 2024

Introduce setExpireWithDictEntry, which is like seExpire(), but accep…

f1f45f9

…ts an optional dictEntry input: Avoids the second kvstoreDictFind in SETEX command

fcostaoliveira requested review from sundb, oranagra, YaacovHazan and yossigo September 1, 2024 17:07

sundb reviewed Sep 2, 2024

View reviewed changes

src/db.c Outdated Show resolved Hide resolved

src/db.c Outdated Show resolved Hide resolved

src/db.c Outdated Show resolved Hide resolved

src/t_string.c Outdated Show resolved Hide resolved

Apply suggestions from code review

b370e9e

Co-authored-by: debing.sun <[email protected]>

oranagra reviewed Sep 2, 2024

View reviewed changes

src/db.c Outdated Show resolved Hide resolved

src/db.c Show resolved Hide resolved

src/t_string.c Outdated Show resolved Hide resolved

src/t_string.c Outdated Show resolved Hide resolved

sundb and others added 3 commits September 2, 2024 13:56

Returning entry ref instead of dbFind

54fce49

Added addReplyLongLongFromStr()

13eb4f4

Fixes per PR review: sharing code between dbUnshareStringValue and db…

ec5be23

…UnshareStringValueWithDictEntry. Commented the dictEntry reference input in lookupKey()

oranagra reviewed Sep 2, 2024

View reviewed changes

src/db.c Show resolved Hide resolved

oranagra reviewed Sep 2, 2024

View reviewed changes

src/db.c Outdated Show resolved Hide resolved

fcostaoliveira added 3 commits September 2, 2024 14:32

Documented the optional 'deref' output of lookupKey()

77cf67a

Fixed missing return of dbUnshareStringValue()

df0ddd1

Removed extra ; in setExpireWithDictEntry()

a14ce6c

oranagra previously approved these changes Sep 2, 2024

View reviewed changes

fcostaoliveira requested a review from sundb September 2, 2024 13:43

sundb reviewed Sep 2, 2024

View reviewed changes

src/db.c Outdated Show resolved Hide resolved

src/networking.c Outdated Show resolved Hide resolved

Apply suggestions from code review

d91a2fd

Co-authored-by: debing.sun <[email protected]>

fcostaoliveira dismissed oranagra’s stale review via d91a2fd September 2, 2024 13:59

fcostaoliveira requested a review from oranagra September 2, 2024 13:59

fcostaoliveira requested a review from sundb September 2, 2024 13:59

sundb previously approved these changes Sep 2, 2024

View reviewed changes

sundb reviewed Sep 3, 2024

View reviewed changes

src/t_string.c Outdated Show resolved Hide resolved

fcostaoliveira dismissed sundb’s stale review via ba249e2 September 3, 2024 14:35

fcostaoliveira and others added 2 commits September 3, 2024 15:35

Update src/t_string.c

ba249e2

Co-authored-by: debing.sun <[email protected]>

only passing reference to dictEntry if the val is not expired

0e6e3d7

sundb approved these changes Sep 4, 2024

View reviewed changes

oranagra approved these changes Sep 4, 2024

View reviewed changes

sundb merged commit 05aed4c into redis:unstable Sep 4, 2024
14 checks passed

YaacovHazan added the release-notes indication that this issue needs to be mentioned in the release notes label Sep 5, 2024

YaacovHazan mentioned this pull request Sep 11, 2024

Redis 8.0 M01 #13538

Merged

oranagra reviewed Mar 24, 2025

View reviewed changes

oranagra mentioned this pull request Mar 24, 2025

avoid possible use-after-free with module KSN changes #13875

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation #13505

Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation #13505

fcostaoliveira commented Aug 29, 2024 •

edited

Loading

fcostaoliveira commented Aug 30, 2024 •

edited

Loading

oranagra left a comment

oranagra Mar 24, 2025

oranagra Mar 24, 2025

Optimize SET/INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation #13505

Optimize SET/INCR*/DECR*/SETRANGE/APPEND by reducing duplicate computation #13505

Conversation

fcostaoliveira commented Aug 29, 2024 • edited Loading

Hotspot data that triggered this effort

Measuring the improvements

Improvements on STRING SET command

Improvements on INCRBY command

fcostaoliveira commented Aug 30, 2024 • edited Loading

Automated performance analysis summary

Comparison between unstable and perf.incrby.

Improvements Table

oranagra left a comment

Choose a reason for hiding this comment

oranagra Mar 24, 2025

Choose a reason for hiding this comment

oranagra Mar 24, 2025

Choose a reason for hiding this comment

Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation #13505

Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation #13505

fcostaoliveira commented Aug 29, 2024 •

edited

Loading

fcostaoliveira commented Aug 30, 2024 •

edited

Loading