Optimize LREM, LPOS, LINSERT, LINDEX: Avoid N-1 sdslen() calls on listTypeEqual #13529

Merged: 1 commit, Sep 10, 2024

@fcostaoliveira (Collaborator) commented on Sep 9, 2024

This is a simple optimization that avoids recomputing the object length on every comparison in LREM, LPOS, LINSERT, and LINDEX.

In the profile below, sdslen accounts for 7.7% of the total CPU cycles of the benchmark.

| Function Stack | CPU Time: Total | CPU Time: Self | Module | Function (Full) | Source File | Start Address |
|---|---|---|---|---|---|---|
| listTypeEqual | 15.50% | 2.346s | redis-server | listTypeEqual | t_list.c | 0x845dd |
| sdslen | 7.70% | 2.300s | redis-server | sdslen | sds.h | 0x845e4 |
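
The idea behind the change can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the actual Redis code: `toy_str`, `toy_len`, `entry_equal_old`, `entry_equal_new`, `find_old`, and `find_new` are hypothetical names, and `toy_len` only stands in for `sdslen`. The sketch shows the needle length being read once before the scan and passed into the comparison, instead of being read again for each of the N elements.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-prefixed string (not the real sds layout). */
typedef struct {
    size_t len;
    const char *buf;
} toy_str;

/* Counts how many times the needle length is read, to make the saving visible. */
static size_t toy_len_calls = 0;

/* Hypothetical stand-in for sdslen(). */
static size_t toy_len(const toy_str *s) {
    toy_len_calls++;
    return s->len;
}

/* Before: the needle length is recomputed inside every comparison. */
static int entry_equal_old(const toy_str *entry, const toy_str *needle) {
    size_t needle_len = toy_len(needle);
    return entry->len == needle_len &&
           memcmp(entry->buf, needle->buf, needle_len) == 0;
}

/* After: the caller supplies the length it already computed. */
static int entry_equal_new(const toy_str *entry, const toy_str *needle,
                           size_t needle_len) {
    return entry->len == needle_len &&
           memcmp(entry->buf, needle->buf, needle_len) == 0;
}

/* Linear scan that reads the needle length once per visited element. */
long find_old(const toy_str *list, size_t n, const toy_str *needle) {
    for (size_t i = 0; i < n; i++) {
        if (entry_equal_old(&list[i], needle)) return (long)i;
    }
    return -1;
}

/* Linear scan that reads the needle length once, before the loop. */
long find_new(const toy_str *list, size_t n, const toy_str *needle) {
    size_t needle_len = toy_len(needle);
    for (size_t i = 0; i < n; i++) {
        if (entry_equal_new(&list[i], needle, needle_len)) return (long)i;
    }
    return -1;
}
```

Both scans return the same index. The difference is that `find_old` calls `toy_len` once per visited element, while `find_new` calls it once per scan, which is where the N-1 saved calls in the PR title come from.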

Preliminary data shows a 4% improvement in the achievable ops/sec of LPOS on string elements, and 2% on integer elements.

Checks:

To benchmark:

pip3 install redis-benchmarks-specification==0.1.235
taskset -c 0 ./src/redis-server --save '' --protected-mode no --daemonize yes
redis-benchmarks-spec-client-runner --tests-regexp ".*lpos.*" --flushall_on_every_test_start --flushall_on_every_test_end  --cpuset_start_pos 2 --override-memtier-test-time 60

Preliminary benchmark:

Unstable ac03e37:

| Test Name | Metric JSON Path | Metric Value |
|---|---|---|
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Ops/sec" | 5446.820 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Latency" | 36.711 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Misses/sec" | 0.000 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Percentile Latencies"."p50.00" | 35.839 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Ops/sec" | 4157.120 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Latency" | 48.095 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Misses/sec" | 0.000 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Percentile Latencies"."p50.00" | 47.103 |

This PR e7fed24:

| Test Name | Metric JSON Path | Metric Value |
|---|---|---|
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Ops/sec" | 5661.890 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Latency" | 35.317 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Misses/sec" | 0.000 |
| memtier_benchmark-1key-list-10K-elements-lpos-string | "ALL STATS".Totals."Percentile Latencies"."p50.00" | 34.303 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Ops/sec" | 4245.360 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Latency" | 47.100 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Misses/sec" | 0.000 |
| memtier_benchmark-1key-list-10K-elements-lpos-integer | "ALL STATS".Totals."Percentile Latencies"."p50.00" | 46.079 |
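
The quoted 4% and 2% gains can be checked against the two result sets above with a small helper. This is only an arithmetic sketch: `pct_change` and `print_lpos_summary` are hypothetical names, and the constants are copied from the Ops/sec and Latency rows of the two runs.

```c
#include <stdio.h>

/* Relative change of `after` versus `before`, in percent. */
double pct_change(double before, double after) {
    return (after - before) / before * 100.0;
}

/* Prints the relative change for the LPOS results reported in this PR. */
void print_lpos_summary(void) {
    /* Ops/sec: unstable ac03e37 vs. this PR e7fed24. */
    printf("lpos-string  ops/sec: %+.2f%%\n", pct_change(5446.820, 5661.890));
    printf("lpos-integer ops/sec: %+.2f%%\n", pct_change(4157.120, 4245.360));
    /* Average latency: a negative change means lower latency. */
    printf("lpos-string  latency: %+.2f%%\n", pct_change(36.711, 35.317));
    printf("lpos-integer latency: %+.2f%%\n", pct_change(48.095, 47.100));
}
```

Calling `print_lpos_summary()` from a `main` function prints the four relative changes. The throughput figures come out just under 4% for strings and slightly above 2% for integers, consistent with the rounded numbers quoted in the description.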

@fcostaoliveira fcostaoliveira requested a review from sundb September 9, 2024 12:17
@sundb sundb merged commit bcae770 into redis:unstable Sep 10, 2024
14 checks passed
@YaacovHazan YaacovHazan mentioned this pull request Sep 11, 2024
YaacovHazan added a commit that referenced this pull request Sep 12, 2024
### New Features in binary distributions

- 7 new data structures: JSON, Time series, Bloom filter, Cuckoo filter,
Count-min sketch, Top-k, t-digest
- Redis scalable query engine (including vector search)

### Potentially breaking changes

- #12272 `GETRANGE` returns an empty bulk when the negative end index is
out of range
- #12395 Optimize `SCAN` command when matching data type

### Bug fixes

- #13510 Fix `RM_RdbLoad` to enable AOF after RDB loading is completed
- #13489 `ACL CAT` - return module commands
- #13476 Fix a race condition in the `cache_memory` of `functionsLibCtx`
- #13473 Fix incorrect lag due to trimming stream via `XTRIM` command
- #13338 Fix incorrect lag field in `XINFO` when tombstone is after the
`last_id` of the consumer group
- #13470 On `HDEL` of last field - update the global hash field
expiration data structure
- #13465 Cluster: Pass extensions to node if extension processing is
handled by it
- #13443 Cluster: Ensure validity of myself when loading cluster config
- #13422 Cluster: Fix `CLUSTER SHARDS` command returns empty array

### Modules API

- #13509 New API calls: `RM_DefragAllocRaw`, `RM_DefragFreeRaw`, and
`RM_RegisterDefragCallbacks` - defrag API to allocate and free raw
memory

### Performance and resource utilization improvements

- #13503 Avoid overhead of comparison function pointer calls in listpack
`lpFind`
- #13505 Optimize `STRING` datatype write commands
- #13499 Optimize `SMEMBERS` command
- #13494 Optimize `GEO*` commands reply
- #13490 Optimize `HELLO` command
- #13488 Optimize client query buffer
- #12395 Optimize `SCAN` command when matching data type
- #13529 Optimize `LREM`, `LPOS`, `LINSERT`, and `LINDEX` commands
- #13516 Optimize `LRANGE` and other commands that perform several
writes to client buffers per call
- #13431 Avoid `used_memory` contention when updating from multiple
threads

### Other general improvements

- #13495 Reply `-LOADING` on replica while flushing the db

### CLI tools

- #13411 redis-cli: Fix wrong `dbnum` shown after the client
reconnected

### Notes

- No backward compatibility for replication or persistence.
- Additional distributions, upgrade paths, features, and improvements
will be introduced in upcoming pre-releases.
- With the GA release of 8.0 we will deprecate Redis Stack.