Optimize client type check on reply hot code paths #13516
`if ((c->reply_bytes == 0 && getClientType(c) != CLIENT_TYPE_SLAVE) ||`
Missing this line?
Automated performance analysis summary
This comment was automatically generated given there is performance data available. Using platform named: intel64-ubuntu22.04-redis-icx1 to do the comparison. In summary:
You can check a comparison in detail via the grafana link.
Comparison between unstable and client.type.perf. Time period: from 5 months ago (environment used: oss-standalone).
Improvements Table
Improvements test regexp names: memtier_benchmark-10Mkeys-load-hash-5-fields-with-10B-values-pipeline-10|memtier_benchmark-2keys-set-10-100-elements-sunion
Full Results table:
### New Features in binary distributions
- 7 new data structures: JSON, Time series, Bloom filter, Cuckoo filter, Count-min sketch, Top-k, t-digest
- Redis scalable query engine (including vector search)

### Potentially breaking changes
- #12272 `GETRANGE` returns an empty bulk when the negative end index is out of range
- #12395 Optimize `SCAN` command when matching data type

### Bug fixes
- #13510 Fix `RM_RdbLoad` to enable AOF after RDB loading is completed
- #13489 `ACL CAT` - return module commands
- #13476 Fix a race condition in the `cache_memory` of `functionsLibCtx`
- #13473 Fix incorrect lag due to trimming stream via `XTRIM` command
- #13338 Fix incorrect lag field in `XINFO` when tombstone is after the `last_id` of the consume group
- #13470 On `HDEL` of last field - update the global hash field expiration data structure
- #13465 Cluster: Pass extensions to node if extension processing is handled by it
- #13443 Cluster: Ensure validity of myself when loading cluster config
- #13422 Cluster: Fix `CLUSTER SHARDS` command returns empty array

### Modules API
- #13509 New API calls: `RM_DefragAllocRaw`, `RM_DefragFreeRaw`, and `RM_RegisterDefragCallbacks` - defrag API to allocate and free raw memory

### Performance and resource utilization improvements
- #13503 Avoid overhead of comparison function pointer calls in listpack `lpFind`
- #13505 Optimize `STRING` datatype write commands
- #13499 Optimize `SMEMBERS` command
- #13494 Optimize `GEO*` commands reply
- #13490 Optimize `HELLO` command
- #13488 Optimize client query buffer
- #12395 Optimize `SCAN` command when matching data type
- #13529 Optimize `LREM`, `LPOS`, `LINSERT`, and `LINDEX` commands
- #13516 Optimize `LRANGE` and other commands that perform several writes to client buffers per call
- #13431 Avoid `used_memory` contention when updating from multiple threads

### Other general improvements
- #13495 Reply `-LOADING` on replica while flushing the db

### CLI tools
- #13411 redis-cli: Fix wrong `dbnum` showed after the client reconnected

### Notes
- No backward compatibility for replication or persistence.
- Additional distributions, upgrade paths, features, and improvements will be introduced in upcoming pre-releases.
- With the GA release of 8.0 we will deprecate Redis Stack.
CE Performance Automation: step 2 of 2 (benchmark) RUNNING...
This comment was automatically generated given a benchmark was triggered. Started benchmark suite at 2024-10-20 16:23:47.408782 and took 1.185164 seconds up until now. In total will run 135 benchmarks.
Proposed improvement
This PR introduces the static inline function `clientTypeIsSlave`, which performs only 1 condition check versus the 3 checks of `getClientType`, and also uses `unlikely` to tell the compiler that the most common outcome is for the client not to be a slave.
Preliminary data show a 3% improvement in achievable ops/sec on the specific LRANGE benchmark. After running the entire suite we see up to 5% improvement in 2 tests. #13516 (comment)
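To make the difference between the two checks concrete, here is a minimal, self-contained sketch. It is a simplified model and not the Redis source: the `client` struct is reduced to a single `flags` field, the flag bit values and type codes are invented for illustration, and the order and content of the three checks in `getClientType` (including the MONITOR exception) are assumptions about how such a function could look. The `unlikely` macro is assumed to wrap the GCC/Clang builtin `__builtin_expect`.

```c
#include <stdint.h>

/* Hypothetical flag bits and type codes for this sketch only;
 * the values are not taken from the Redis source. */
#define CLIENT_SLAVE   (1ULL << 0)
#define CLIENT_MASTER  (1ULL << 1)
#define CLIENT_MONITOR (1ULL << 2)
#define CLIENT_PUBSUB  (1ULL << 3)

enum {
    CLIENT_TYPE_NORMAL,
    CLIENT_TYPE_SLAVE,
    CLIENT_TYPE_PUBSUB,
    CLIENT_TYPE_MASTER
};

/* Branch hint: tells the compiler the condition is expected to be false.
 * Falls back to a plain expression on compilers without the builtin. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (!!(x))
#endif

/* Simplified stand-in for the real client structure. */
typedef struct client {
    uint64_t flags;
} client;

/* General classifier: up to three checks before the type is known
 * (assumed shape, for illustration). */
static int getClientType(const client *c) {
    if (c->flags & CLIENT_MASTER) return CLIENT_TYPE_MASTER;
    if ((c->flags & CLIENT_SLAVE) && !(c->flags & CLIENT_MONITOR))
        return CLIENT_TYPE_SLAVE;
    if (c->flags & CLIENT_PUBSUB) return CLIENT_TYPE_PUBSUB;
    return CLIENT_TYPE_NORMAL;
}

/* Specialized check for hot paths: one condition, hinted as rarely true.
 * Returns 1 if the client is a slave, 0 otherwise. In this sketch it agrees
 * with getClientType(c) == CLIENT_TYPE_SLAVE for any client that does not
 * also carry the master flag. */
static inline int clientTypeIsSlave(const client *c) {
    return unlikely((c->flags & CLIENT_SLAVE) && !(c->flags & CLIENT_MONITOR));
}
```

On a reply path, a test such as `getClientType(c) != CLIENT_TYPE_SLAVE` would then become `!clientTypeIsSlave(c)`, which skips the unrelated master and pub/sub checks and lets the compiler inline the single flag test.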
Context
This optimization effort comes from analyzing the profile info from the memtier_benchmark-1key-list-1K-elements-lrange-all-elements benchmark.
Going over it, we can see that `getClientType` consumes 2% of the CPU time, strictly to check whether the client is a slave (https://github.com/redis/redis/blob/unstable/src/networking.c#L397 and https://github.com/redis/redis/blob/unstable/src/networking.c#L1254).
TODO:
Manual preliminary results
unstable branch ea3e8b7
This PR 4263c2f