Implements sendfile for redis.#7096
Conversation
|
Hello @ShooterIT, thanks for your PR. Now the window for such a change in Redis 6 is closed. But there could be space for this for Redis 7. I'm pinging @oranagra that may want to give a look to evaluate this code. I believe that it should be proven to have a performance advantage of any kind before using it btw. Cheers. |
|
I suppose there's no disadvantage in that and the cost of complexity is small. |
|
Yes, we should be rigorous for every change @antirez @oranagra My two server machines are the same, Linux kernel is 3.10.0, CPU is E5-2620 v2 @ 2.10GHz.
To be honest, I only verified the correctness on my MacBook for macOS. I need to do some performance tests. |
|
The Linux server machines' configurations are the same, as above. I did benchmark on master redis meanwhile replicas received RDB file. Some tests are as follows. But I get CPU usage without benchmark. Benchmark: Config: One replica syncs from master.
Two replicas sync from master.
|
|
There are some my tests, it have advantage at reducing CPU usage @antirez @oranagra For macOS, my macbook has poor configurations, so I can't do some large capacity tests. but I could exactly find redis will cost low cpu if use |
|
@antirez i think this can be merged. i honestly don't see any risks, and the fact the test suite passes, seem to be enough. BTW, @ShooterIT, how about using |
|
hello @oranagra |
|
@ShooterIT i don't understand what you mean. readSyncBulkPayload reads from a socket and write so a file. i think it can use |
|
AFAIK, one fd of splice arguments must be a pipe fd
|
|
What about merging this only into unstable and not into Redis 6? @oranagra @ShooterIT? |
|
That's OK @antirez |
|
I would have argued that this is a safe improvement.. But considering my recent simple fix exploded twice.. I can't.. So since this is not a critical issue, but rather just efficiency, not putting into 6 is ok. |
|
@oranagra yep given the intrinsic issues with programming... better to be safe :-D Anyway we can promote it to Redis 6 after some time. |
|
Merged. |
|
@ShooterIT, not sure if you noticed, but this change was released as part of redis 6.0.4. Later we received some reports that sendfile introduces latency implications. We may have to revert it, but first, i'd like to check if you have anything to see if you happen to have any feedback on that? In any case, thanks for contributing (quite a lot recently). much appreciated. |
|
Thanks for notifying me @oranagra
|
|
I think |
|
@ShooterIT AFAIK it's not an incorrect usage, but rather in inefficient blocking implementation in the kernel. Have a look at this paper. |
|
one possible solution is maybe to use |
|
Thanks @oranagra I will read it ASAP. |
|
I'm not sure it is the same issue but Kafka also had a trouble with sendfile. |
|
Hi @ShooterIT I originally noticed the problems from test failures in KeyDB after merging 6.0.4 and narrowed it down to the sendfile change. The root cause was increased synchronization time which the test were sensitive too. sendfile() is no longer in KeyDB, but I've included a checkpoint where it existed so you can repro.
The active replication tests don't confirm that synchronization has finished before starting which makes them quite sensitive to latency. Adding a "after 2000" delay before the first test resolves the failure as does removing sendfile() completely which I unfortunately had to do. I did spend some time researching why sendfile() is unexpectedly slower, what I found was that because of its reliance on DMA it has to ensure the last TCP packet is ACK'd before returning (and therefore that the TCP buffer is not needed). This means that the call is now affected by TCP round trip latency. It seems to be designed to send bulk transfers not short bursts as is done in Redis/KeyDB. |
|
The other part to point out is that the tests rely on loopback sockets which have notoriously different perf characteristics to real NICs. I'm not a kernel expert so its quite possible some NICs have more efficient sendfile implementations. Nonetheless loopback use is an important scenario for KeyDB and I assume for Redis as well. |
|
it would be nice to benchmark it on various platforms if someone has the time. in any case, even if there are severe implications on just one platform, we probably can't afford to keep it as it is, but maybe we'll decide that it is important enough to consider using in a thread, or another solution comes up. |
|
Thanks @JohnSully Just as @oranagra said,
@antirez @oranagra You could revert it. I'm truly sorry to bring latency implications. There is a test to see I came up with an idea, Maybe using a thread to send RDB file is a good idea, but that brings much complexity especially many replicas need to receive RDB files at the same time. |
|
When I saw this change I thought it was a really good idea. I’m a bit sad it didn’t work out as well as hoped. A separate thread to send the RDB would be cool. I get a consistent repro on my Ubuntu 18.04 machine run in WSL2’s virtual machine but I’ve seen it in more pure linux setups as well. |
|
Hello @JohnSully would you mind to help verify my guess
You could add the following code in https://github.com/antirez/redis/blob/unstable/src/replication.c#L1351, and do your tests in your environment. |
|
@ShooterIT I repro'd without tls compiled and in tests without TLS enabled. Are you sure this will still have an effect? |
|
Hi @JohnSully That's OK. Now, TLS must be disable to enable and add |
|
I can pass the test that you told me by above modifications, although passed the test before on my environments. |
|
Hi @JohnSully @oranagra @antirez I have some new discoveries. What‘s more, I read
In freeBSD source code, |
|
@ShooterIT it seems you are right. |
|
Please revert it @antirez I am truly sorry. |
|
Hi @ShooterIT It will pass the test but only because sendfile is disabled for active replication which that test enables. It doesn't fundamentally fix the issue of slow startup. Active replication is a bit more sensitive because each server will send an RDB to the other so latency is 2x normal. I initially disabled it only for active replication when I hadn't yet understood the impact of sendfile(). |
|
@ShooterIT no problem, there are complex interactions, we never are aware, you did nothing wrong here, instead we maintainers made an error by letting a new feature in that was just a performance improvement, without giving it the time to stay enough into unstable. A big thank you to @JohnSully for the heads up on that. To start I can revert the feature just this way: I reverted the implementation both on Redis 6.0 and unstable. |
Implements sendfile for redis.
I implement
sendfilefor redis, but now only support Linux and macOS since I only have these two operation systems. For performance, it may couldn't save time but can reduce CPU usage.If we implement it on BSD systems, please notice that
sendfilemay return -1 and errno set is to EAGAIN even if some bytes have been sent successfully and the the len argument is set correctly when using a socket marked for non-blocking I/O just like on macOS.