redis-benchmark: Error/Warning handling updates.#8869
redis-benchmark: Error/Warning handling updates.#8869oranagra merged 4 commits intoredis:unstablefrom
Conversation
…opology updates. Fixed wrongfully failing on config fetch error (warning only).
| if (r->type == REDIS_REPLY_ERROR) { | ||
| /* Try to update slots configuration if reply error is | ||
| * MOVED/ASK/CLUSTERDOWN and the key(s) used by the command | ||
| * contain(s) the slot hash tag. */ | ||
| if (c->cluster_node && c->staglen) { | ||
| int fetch_slots = 0, do_wait = 0; | ||
| if (!strncmp(r->str,"MOVED",5) || !strncmp(r->str,"ASK",3)) | ||
| fetch_slots = 1; | ||
| else if (!strncmp(r->str,"CLUSTERDOWN",11)) { | ||
| /* Usually the cluster is able to recover itself after | ||
| * a CLUSTERDOWN error, so try to sleep one second | ||
| * before requesting the new configuration. */ | ||
| fetch_slots = 1; | ||
| do_wait = 1; | ||
| printf("Error from server %s:%d: %s\n", | ||
| c->cluster_node->ip, | ||
| c->cluster_node->port, | ||
| r->str); | ||
| } | ||
| if (do_wait) sleep(1); | ||
| if (fetch_slots && !fetchClusterSlotsConfiguration(c)) | ||
| exit(1); |
There was a problem hiding this comment.
it's a little bit hard to review this diff. IIUC, it didn't really change (other than the comment below.
what changed is that it went inside an additional separate if (on r->type == REDIS_REPLY_ERROR instead of is_err). and the first part at the stop about showerrorsmoved to the bottom (as anelsewithout anif`).
so logically, the only changes in that huge diff block is that:
- we don't have the possibility to print the error twice.
- normally MOVED, and ASK will not print any error
- we
exiton any unexpected error (in addition to the failure to fetch cluster slots configuration)
let me know if there's anything else i'm missing that changed in that function.
p.s. maybe there was a possibility to arrange the code in a way that it'll create a smaller diff, but never mind now.
There was a problem hiding this comment.
@oranagra thank you for the fast review. with regards to:
- we don't have the possibility to print the error twice.
correct.- normally MOVED, and ASK will not print any error
Missed it. Fixed it on the latest commit. We now print the MOVED and ASK same a before- we exit on any unexpected error (in addition to the failure to fetch cluster slots configuration)
correct.
let me know if there's anything else i'm missing that changed in that function.
All covered above.
p.s. maybe there was a possibility to arrange the code in a way that it'll create a smaller diff, but never mind now.
I also wanted to reduce the changes to the minimal but the identation change makes the line number to be large.
There was a problem hiding this comment.
do we wanna print an error on MOVED and ASK?
i suppose it'll be printed only once (per thread?) since we re-fetch the cluster configuration, right?
note another difference is that the double errors in the past were throttled to be printed once per second, now we don't have double error print cases, but also no throttling.
- the
exitworks as a way of "throttling" the unexpected errors. - MOVED and ASK are maybe solved by fetchClusterSlotsConfiguration (so they won't flood the console)
@filipecosta90 can you test to confirm that? - and CLUSTERDOWN is in addition to that throttled by the sleep
There was a problem hiding this comment.
@oranagra you're totally right on flooding the console.
So my question is:
- should we print the MOVED/ASK? If yes then we need the double errors in the past were throttled to be printed once per second. If not then we're ok with the previous print position. WDYT?
$ ../../src/redis-benchmark -r 100000 -n 1000000000 -t set -p 30002 --cluster
Cluster has 3 master nodes:
Master 0: ff62a57dec6aae7bf257f32fc11f709c6a6529d5 127.0.0.1:30001
Master 1: b4e30ae77b444361619b8ec3d182c1e5056c2939 127.0.0.1:30002
Master 2: 7c2512ef5932a70149343780a555ff95bdde9a6c 127.0.0.1:30003
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001. 0.194)3))
Updating slots configuration.
WARNING: Cluster slots configuration changed, fetching new one...
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
Error from server 127.0.0.1:30002: MOVED 5461 127.0.0.1:30001.
Updating slots configuration.
There was a problem hiding this comment.
i think that MOVED and ASK are part of the protocol, and are handled correctly by the benchmark.
they can be completely silenced IMHO.
in fact, before this PR, they were also completely silence by default, right? (when the showerrors was false)
There was a problem hiding this comment.
@oranagra fixed per PR review: printing only CLUSTERDOWN error message.
- Immediately exit on errors that are not related to topology updates. - Deprecates the `-e` option ( retro compatible ) and warns that we now exit immediately on errors that are not related to topology updates. - Fixed wrongfully failing on config fetch error (warning only). This only affects RE. Bottom line: - MOVED and ASK errors will not show any warning (unlike the throttled error with `-e` before). - CLUSTERDOWN still prints an error unconditionally and sleeps for 1 second. - other errors are fatal. (cherry picked from commit ef6f902)
Redis Enterprise supports the CONFIG GET command, but it replies with am empty array since the save and appendonly configs are not supported. before this fix redis-benchmark would segfault for trying to access the error string on an array type reply. see #8869
- Immediately exit on errors that are not related to topology updates. - Deprecates the `-e` option ( retro compatible ) and warns that we now exit immediately on errors that are not related to topology updates. - Fixed wrongfully failing on config fetch error (warning only). This only affects RE. Bottom line: - MOVED and ASK errors will not show any warning (unlike the throttled error with `-e` before). - CLUSTERDOWN still prints an error unconditionally and sleeps for 1 second. - other errors are fatal.
Redis Enterprise supports the CONFIG GET command, but it replies with am empty array since the save and appendonly configs are not supported. before this fix redis-benchmark would segfault for trying to access the error string on an array type reply. see redis#8869
- Immediately exit on errors that are not related to topology updates. - Deprecates the `-e` option ( retro compatible ) and warns that we now exit immediately on errors that are not related to topology updates. - Fixed wrongfully failing on config fetch error (warning only). This only affects RE. Bottom line: - MOVED and ASK errors will not show any warning (unlike the throttled error with `-e` before). - CLUSTERDOWN still prints an error unconditionally and sleeps for 1 second. - other errors are fatal.
Redis Enterprise supports the CONFIG GET command, but it replies with am empty array since the save and appendonly configs are not supported. before this fix redis-benchmark would segfault for trying to access the error string on an array type reply. see redis#8869
This PR fixes #5854 and #2337 and addresses some minor issue introduced on #8855.
It:
-eoption ( retro compatible ) and warns that we now exit immediately on errors that are not related to topology updates.Bottom line:
-ebefore).