Skip to content

Fix slowlog crash when redirection fails#129

Merged
doyoubi merged 5 commits intoeleme:masterfrom
doyoubi:fix/SlowlogCrash
Sep 11, 2017
Merged

Fix slowlog crash when redirection fails#129
doyoubi merged 5 commits intoeleme:masterfrom
doyoubi:fix/SlowlogCrash

Conversation

@doyoubi
Copy link
Copy Markdown
Contributor

@doyoubi doyoubi commented Sep 4, 2017

Corvus may crash when it fails to connect to redis and try to create a sub-slowlog for multiple key command.

The Buggy Code

struct slowlog_entry *slowlog_create_sub_entry(struct command *cmd, int64_t total_latency)
{
    ...
    int64_t max_remote_latency = 0;
    struct command *c, *slowest_sub_cmd = NULL;
    STAILQ_FOREACH(c, &cmd->sub_cmds, sub_cmd_next) {
        // When this command fails because of not being able to connect to redis,
        // `remote_latency` will be zero (or maybe nagative) which result in a NULL `slowest_sub_cmd`.
        int64_t remote_latency = c->rep_time[1] - c->rep_time[0];
        if (remote_latency > max_remote_latency) {
            max_remote_latency = remote_latency;
            slowest_sub_cmd = c;
        }
    }
   // NULL `slowest_sub_cmd` will create a crash here.
    return slowlog_create_entry(slowest_sub_cmd, max_remote_latency / 1000, total_latency);
}

How To Reproduce

(1) Use the following Python 3 script to mock a redis cluster and let corvus connect to it.
Note that the slowlog-log-slower-than should be 0 and slowlog-max-len should be positive.

import time
import socket

server_address = ('localhost', 10000)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(server_address)
sock.listen(1)

CLUSTER_NODES = b'*2\r\n$7\r\nCLUSTER\r\n$5\r\nNODES\r\n'
CLUSTER_NODES_RESP = 'ffffffffffffffffffff00000000490000000265 127.0.0.1:10000 myself,master - 0 0 1 connected 0-16383\n'
RESP = '${}\r\n{}\r\n'.format(len(CLUSTER_NODES_RESP), CLUSTER_NODES_RESP)

while True:
    connection, client_address = sock.accept()
    data = connection.recv(1000)
    if data == CLUSTER_NODES:
        connection.sendall(RESP.encode('utf-8'))
        raise Exception('Let it crash')

(2) Send a MGET SomeKey to corvus and trigger a crash.

@doyoubi doyoubi requested a review from tevino September 4, 2017 11:47
Copy link
Copy Markdown
Contributor

@tevino tevino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

A test case would be great.

@doyoubi doyoubi force-pushed the fix/SlowlogCrash branch 5 times, most recently from 05cfcf1 to 6c2bd6f Compare September 11, 2017 02:44
@doyoubi doyoubi merged commit 59d1bcb into eleme:master Sep 11, 2017
@doyoubi doyoubi mentioned this pull request Sep 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants