Deterministically shut down all connections on ApiListener::Stop() #10460
Conversation
Force-pushed 12aa63f to a5b8289
Force-pushed a5b8289 to 8a201f9
No need to review this now. I'm still looking at a few things. I will update the PR description and add a comment here when I'm done.
Force-pushed 8a201f9 to f38c98a
Changes:
Caveats:
Force-pushed 350c91a to 7ca5994
Incorporated some offline suggestions and fixed a crash due to a pointer initialization I managed to "clean up" during one of the recent rebases.
lib/remote/apilistener.cpp
Outdated
```cpp
try {
	acceptor->close();
} catch (const std::exception& ex) {
	Log(LogCritical, "ApiListener")
		<< "Failed to close acceptor socket: " << ex.what();
}
```
Force-pushed 7ca5994 to e1e0467
Again I've simplified and reduced the scope of the PR, following the suggestion of @Al2Klimov. I also think I've addressed the remaining comments that still apply.
lib/remote/jsonrpcconnection.cpp
Outdated
```cpp
std::shared_lock wgLock(*m_WaitGroup, std::try_to_lock);
if (!wgLock) {
	return;
}
```
With this lock held not only for the message processing itself but also during socket operations, this raises a question: are these operations unblocked within a reasonable time (very few seconds) during shutdown?
From my testing, yes. In the previous iteration of the PR I was worried this would be a situation similar to the async_write() in HttpServerConnection blocking the shutdown. But unlike there, here the coroutine mostly sits in the async_read() without actually making progress (i.e. giving other operations a chance to run) or actively reading the rather small RPC messages (like check results).
Then again, the same corner cases could arise: large messages over very, very slow connections. But even in that case, I think it would be better to let the message finish than to interrupt reading it in the middle, even if it takes a second or two.
To work around this, acquiring the lock somewhere else isn't even enough; I would have to pass something into NetString::ReadStringFromStream to interrupt reading the message, which I really don't want to do.
After thinking about this some more: unlike in the previous iteration of the PR (which had the WaitGroup::Cancel() in a timeout), I could now get away with just locking the WaitGroup per iteration, after the read. But again, it depends on what we want: do we want to shut down fast, or do we want to make sure things we started get finished? We can't have our cake and eat it too.
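The per-iteration variant could be sketched like this. This is a simplified model, not the actual Icinga 2 code: WaitGroup here is just a std::shared_mutex wrapper, and the inbox queue stands in for the network read.

```cpp
#include <cassert>
#include <queue>
#include <shared_mutex>
#include <string>
#include <vector>

// Simplified stand-in for the connection's wait group: Stop() would take
// the exclusive side, after which no new shared locks can be acquired.
struct WaitGroup {
	std::shared_mutex Mutex;
};

// "Lock per iteration, after the read": each loop turn first takes a
// message (standing in for NetString::ReadStringFromStream()), then
// try-locks the wait group only for the processing step.
inline std::vector<std::string> ProcessUntilStopped(WaitGroup& wg, std::queue<std::string>& inbox)
{
	std::vector<std::string> processed;

	while (!inbox.empty()) {
		std::string message = inbox.front();
		inbox.pop();

		std::shared_lock lock(wg.Mutex, std::try_to_lock);
		if (!lock) {
			// Wait group is joining: stop before handling further messages.
			break;
		}

		processed.push_back(message); // stands in for MessageHandler()
	}

	return processed;
}
```

With this shape, a message whose read already completed may still be dropped if shutdown begins before its processing starts, which is the trade-off discussed above.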
or do we want to make sure things we started get finished
That just depends on your interpretation of what counts as started. I'd be totally fine with only counting the message processing itself, not the read operation.
The other extreme (I haven't checked in detail yet whether something like this could actually happen) would be that the remote crashes so that the connection just hangs in the middle of reading a single JSON-RPC message. In that situation, it shouldn't happen that the shutdown blocks until the CheckLiveness() coroutine kills the connection.
The nicest option would be to signal the intent to shut down to the connection, process all outstanding messages and then cleanly shut down the connection, with an additional quick timeout that kills the connection if this doesn't happen within a few seconds and then only waits for the processing of the current message (so that if processing of a message started, it also finishes). However, I don't see how an implementation without introducing new cluster messages to notify the peer about the intent to shut down would work any better than more or less randomly selecting a last message after which the connection is closed, which should be pretty much what you get when locking around each MessageHandler() individually.
I don't see how an implementation without introducing new cluster messages [...]
I get that and also commented on it here (last point).
I'd be totally fine with only counting the message processing itself, not the read operation.
Alright, so I'd move the lock to the top of MessageHandler(). Is that fine? Should the lock in WriteOutgoingMessages() stay then, or should I rather remove it entirely? There's a timeout in Disconnect(), but I guess that can fail to cancel the operation too under certain conditions.
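Moving the lock to the top of the handler would look roughly like this. MessageHandler's real signature differs; this is only a schematic model where only the processing of a message, not the read, counts as "started":

```cpp
#include <cassert>
#include <shared_mutex>
#include <string>

// Simplified stand-in for the connection's wait group.
struct WaitGroup {
	std::shared_mutex Mutex;
};

// Hypothetical MessageHandler() shape with the lock at the top: returns
// false (the caller would then close the connection) once shutdown began.
inline bool MessageHandler(WaitGroup& wg, const std::string& message)
{
	std::shared_lock lock(wg.Mutex, std::try_to_lock);
	if (!lock) {
		return false; // joining: don't start processing a new message
	}

	// ... handle the JSON-RPC message while holding the shared lock ...
	(void)message;
	return true;
}
```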
Force-pushed e1e0467 to 44c902d
I've pushed an update that should address the remaining comments, with the exception of the wait group lock in [...].
Another change is that I'm now using a boost::signal + post instead of the post + coroutine I used to stop the listener before.
I've also moved the lock inside the loop because I had, after all, encountered rare situations where, like @julianbrost warned, the [...]
Did this just happen randomly? Were you able to make this happen more reliably by creating some specific scenario?
Does that prevent the [...]?
Sorry, I forgot to address the following in a previous comment/review:
For me, it's pretty similar to the one in the read loop, and I wouldn't add it unless:
I think it's the same situation I observed in #5581, with the coroutine yielding again when the async operation is actually already done. Here, this would mean closing the socket comes too late to make async_accept() throw, but obviously still closes the socket. I thought about alternatively checking whether the wait group is no longer lockable when catching the [...]
Well, it acts as both a lock and a loop condition. I can also keep it outside the loop and just do [...]
We can ensure they are unblocked, but not guarantee that it happens within a reasonable time, even when the 5s timeout in [...]
Is that really the correct PR you wanted to reference?

I was asking primarily because I didn't understand how changing the scope of the locking should solve bad file descriptor errors, and if these appear randomly, that's always a hint that something could be wrong with the code.¹

In the current state of the PR, I think there could still be a small race condition in that respect:

icinga2/lib/remote/apilistener.cpp, lines 519 to 524 in 44c902d

The accepting socket is already closed by line 521, but the wait group you check before starting a new [...]

icinga2/lib/remote/apilistener.cpp, lines 541 to 548 in 44c902d

So I think [...]
Force-pushed 44c902d to fb81054
Sorry, no. I screwed up the auto complete. I meant #10139.
I mean, it didn't appear randomly, it appeared sporadically when, due to timing, closing the socket didn't cancel an ongoing [...].
Anyway, I've just pushed yet another update.
Note: I just noticed that [...]
Randomly in the sense that I don't understand the pattern and it looks random to me. ;) Just remove "randomly" from my sentence. Seeing a bad file descriptor error anywhere is rarely a good sign.
julianbrost
left a comment
We should be getting close to the finish line now as I'm getting more into some details and a bit of code style. Overall functionality should be good from looking at the code, I'll give it a test later today.
@Al2Klimov @yhabteab I only expect some small changes here, so in case you want to take another look, now (or after the next push) would be a good time.
The wait group gets passed to HttpServerConnection, then down to the HttpHandlers. For those handlers that modify the program state, the wait group is locked so ApiListener will wait on Stop() for the request to complete. If the request iterates over config objects, a further check on the state of the wait group is added to abort early and not delay program shutdown. In that case, 503 responses will be sent to the client. Additionally, in HttpServerConnection, no requests beyond the one already started will be allowed once the wait group is joining.
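The per-connection gate described above can be modeled schematically. Types and names here are illustrative, not the actual Icinga 2 API: the request already started finishes under a shared lock, and once the wait group is joining, no further requests are served on the connection.

```cpp
#include <atomic>
#include <cassert>
#include <shared_mutex>
#include <string>
#include <vector>

// Simplified wait group with a cheap "is shutdown in progress?" probe.
struct WaitGroup {
	std::shared_mutex Mutex;
	std::atomic<bool> Joining{false};

	bool IsLockable() const { return !Joining; }
};

// Serve queued requests until the wait group starts joining.
inline int ServeRequests(WaitGroup& wg, const std::vector<std::string>& requests)
{
	int served = 0;

	for (const std::string& request : requests) {
		if (!wg.IsLockable()) {
			break; // joining: don't start another request on this connection
		}

		std::shared_lock lock(wg.Mutex, std::try_to_lock);
		if (!lock) {
			break;
		}

		(void)request; // dispatching to the HttpHandler elided
		served++;
	}

	return served;
}
```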
Force-pushed fb81054 to 82bb636
Fixed the trailing whitespace, added the [...]
julianbrost
left a comment
Thank you very much!
@Al2Klimov @yhabteab I only expect some small changes here, so in case you want to take another look, now (or after the next push) would be a good time.
Not merging yet until I get some feedback there.
yhabteab herewith dismisses all his requested changes :)! Just joking - I didn't explicitly request anything here on purpose, but only submitted a few comments, so my review shouldn't block this in any way. Since Alex already requested a review from himself, it should be fine when both of you approve it!
My overloaded Docker setup found nothing suspicious, so this should be fine. Thanks to all of you 🎉!
Deterministically shut down all connections on ApiListener::Stop()
Currently, while the shutdown order is determined by icinga2 object priorities, the connection objects on the backend can continue to have effects on the state of the program, even after seemingly having been shut down (i.e. after the Stop() method of ApiListener has been called). In practice, the connection objects will mostly stay alive until the very end when icinga2 exits, and will answer all kinds of requests until then.

To mitigate some of the effects from ApiListener's associated objects after Stop() has returned, with this PR, these additional steps will be performed (in order):

- Exit from the ListenerCoroutineProc
- Shut down the remaining endpoint and anonymous JsonRpcConnections
- Abort HTTP API requests with 503: Service unavailable

This PR relates to #10179 and implements one of the TODOs in @Al2Klimov's comment.
Exiting from ListenerCoroutineProc

The ListenerCoroutineProc is stopped by closing the acceptor socket, then waiting for the completion of outstanding NewClientHandlers using a separate wait group.

Shutting down the remaining endpoint and anonymous JsonRpcConnections
Since the ListenerCoroutineProc will not take any new connections now, and m_ReconnectTimer has already been stopped, we can be sure that no endpoints are currently connecting, and we can simply iterate over all endpoint objects and Disconnect() those that are still connected.

This means that some check results are still discarded when the wait group is stopped. A solution to this will require a message to be passed between the nodes and could be addressed in a future PR.
Aborting HTTP API requests
In API requests that modify or have the potential to modify the program state, the wait group will be locked and in case of potentially longer running requests, the stopped condition will be used to abort loops that iterate over config objects early and return with a 503 error code:
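A schematic of such a handler, with hypothetical names standing in for the real HttpHandler plumbing; the wait group here is modeled as a shared mutex plus a stopped flag:

```cpp
#include <atomic>
#include <cassert>
#include <shared_mutex>
#include <vector>

// Simplified wait group: shared lock for running requests, plus a
// stopped flag that long-running loops can poll.
struct WaitGroup {
	std::shared_mutex Mutex;
	std::atomic<bool> Stopped{false};
};

// Hypothetical state-modifying handler: lock the wait group up front,
// and re-check the stopped condition inside the loop over config
// objects so shutdown isn't delayed by a long iteration.
inline int HandleObjectsRequest(WaitGroup& wg, std::vector<int>& objects)
{
	std::shared_lock lock(wg.Mutex, std::try_to_lock);
	if (!lock) {
		return 503; // Service Unavailable: shutdown already in progress
	}

	for (int& object : objects) {
		if (wg.Stopped) {
			return 503; // abort the long-running iteration early
		}
		object++; // stands in for modifying the config object
	}

	return 200;
}
```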
fixes #10179