Remove tmp rdb file in background thread#7762
Conversation
|
@ShooterIT why is the loop in |
|
@oranagra Just it is a good programming style, we closed all files we opened when process exited even OS could clean up everything as much as possible, actually, Linux can't clean somethings such as POSIX semaphores when process exits, https://man7.org/linux/man-pages/man7/sem_overview.7.html just above reason, I want to unlink file exactly when |
|
From my experience, cleanup code at exit can often do more harm than good (like prevent the process from exiting or delay it, thus delaying the restart and making the service unavailable). p.s. i would rather have |
|
Another thing we should care about is we only can call async-signal-safe functions from signal handlers safely, I also like to use a |
|
hi @oranagra I took a more look at |
|
@ShooterIT i see that indeed snprintf isn't listed as async-signal-safe (not required to be so in POSIX), but i actually don't understand why... unlike printf which messes with stdout, sprintf is just a utility function, so maybe it is safe although not required to be. but maybe an easier solution would be to just use |
|
@oranagra I also don't know it clearly. There are some discussions, https://stackoverflow.com/questions/6758222/snprintf-in-signal-handler-creates-segmentation-fault-if-started-with-valgrind We can't use |
|
I think we should just adopt a async-signal-safe implementation of a limited Regarding cleaning up, I agree with @ShooterIT this is a good practice but I also agree with @oranagra that we probably don't want to block a terminating process only because we're waiting for |
|
@yossigo why import another snprintf implementation and not just use sdscatfmt for this one case? i can argue that even when |
|
@oranagra Why do you assume |
|
@oranagra In signal handle, we only can use async-signal-safe functions, In |
|
sorry, my bad about malloc. i prefer |
a645306 to
0d77dbf
Compare
|
I also remove waiting loop in |
|
In order to test file still is not closed, i add |
|
@oranagra Please review my comments in |
|
@ShooterIT i see the new test failed on MacOS, can you look into it? |
| for {set i 0} {$i < 20} {incr i} { | ||
| r set $i $i | ||
| } | ||
| # It will cost 2s(20 * 100ms) to dump rdb |
There was a problem hiding this comment.
here, total time is 2s
|
|
||
| # Child is dumping rdb | ||
| r bgsave | ||
| after 100 |
There was a problem hiding this comment.
Oh, is this problem? @oranagra yesterday, you pointed out my some similar problems, after 100 freezes tcl code?
#7819 (comment)
There was a problem hiding this comment.
the assertion that fails is the one in line 42 (first assert in the second test).
and in this case what could have happened is that the after 100 was not enough, and we tested the existence of the rdb file before redis started to save it (since redis was for some reason frozen for some 100ms).
i guess we should change this to a wait_for_condition loop that will usually exit a lot sooner, but at the same time have longer timeout.
i suppose similar changes can be made in a few other places in these new tests (to reflect the spirit of the changes i asked for in #7819)
please make sure you can reproduce this (possibly bu pushing an experimental branch into github to use the actions) before attempting to solve it.
We're already using bg_unlink in several places to delete the rdb file in the background, and avoid paying the cost of the deletion from our main thread. This commit uses bg_unlink to remove the temporary rdb file in the background too. However, in case we delete that rdb file just before exiting, we don't actually wait for the background thread or the main thread to delete it, and just let the OS clean up after us. i.e. we open the file, unlink it and exit with the fd still open. Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is not async-signal-safe, we now use ll2string instead. (cherry picked from commit b002d2b)
We're already using bg_unlink in several places to delete the rdb file in the background, and avoid paying the cost of the deletion from our main thread. This commit uses bg_unlink to remove the temporary rdb file in the background too. However, in case we delete that rdb file just before exiting, we don't actually wait for the background thread or the main thread to delete it, and just let the OS clean up after us. i.e. we open the file, unlink it and exit with the fd still open. Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is not async-signal-safe, we now use ll2string instead.
We're already using bg_unlink in several places to delete the rdb file in the background, and avoid paying the cost of the deletion from our main thread. This commit uses bg_unlink to remove the temporary rdb file in the background too. However, in case we delete that rdb file just before exiting, we don't actually wait for the background thread or the main thread to delete it, and just let the OS clean up after us. i.e. we open the file, unlink it and exit with the fd still open. Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is not async-signal-safe, we now use ll2string instead. (cherry picked from commit b002d2b)
Remove tmp rdb file in background thread to avoid unnecessary time cost. Maybe removing tmp aof file is similar.