Fix nullptr clientModel access during shutdown #801
Conversation
```cpp
void BitcoinGUI::updateNetworkState()
{
    if (!clientModel) return;
```
Not sure, this could still race, since there is no lock, so clientModel may still be deleted after this statement?
> Not sure, this could still race, since there is no lock, so clientModel may still be deleted after this statement?
Check it now.
BitcoinApplication::requestShutdown() sets the main window's (BitcoinGUI) clientModel to nullptr without clearing the subscribed signals (this is the window->setClientModel(nullptr) line), which leaves a window of time where the signal handlers are still active but the main window's client model is nullptr. So, by unsubscribing the signals prior to the setClientModel(nullptr) call, we avoid any concurrent event landing between the nulling of the main window's client model and the final clientModel destruction at the end of BitcoinApplication::requestShutdown() (which was what used to tear down the event handlers).
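A minimal sketch of the reordering described above (the body of requestShutdown() is abbreviated, and the surrounding lines are assumptions based on this discussion, not the exact project code):

```cpp
void BitcoinApplication::requestShutdown()
{
    // ...
    // Disconnect the boost core-signal handlers *before* nulling the model,
    // so no handler can run while the window holds a nullptr clientModel.
    window->unsubscribeFromCoreSignals();
    window->setClientModel(nullptr);
    // ...
    // Final destruction at the end of shutdown; previously this was also
    // what tore down the still-subscribed handlers.
    delete clientModel;
    clientModel = nullptr;
    // ...
}
```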
What about events that are already running? Does the destructor ensure they are flushed/discarded, or will it just disconnect?
> What about events that are already running? Does the destructor ensure they are flushed/discarded, or will it just disconnect?
I haven't looked in detail, but it seems like in principle, if any boost signal events are currently running, and they send asynchronous Q_EMIT Qt signals that run in the GUI event loop thread, there should not be a problem. That is assuming this method always runs in the GUI event loop thread, and the code deleting clientModel also runs in the GUI event loop thread.
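For context, the bridge in question looks roughly like this (a sketch; the handler and slot names are illustrative and may not match the exact project signatures). A boost handler invoked on a backend thread enqueues the real work onto the GUI thread:

```cpp
// Boost core-signal handler, potentially invoked on a backend thread.
// Qt::QueuedConnection posts the call to the receiver's (GUI) event loop,
// so the slot itself runs asynchronously on the GUI thread.
static void NotifyNetworkActiveChanged(BitcoinGUI* gui, bool network_active)
{
    QMetaObject::invokeMethod(gui, "setNetworkActive", Qt::QueuedConnection,
                              Q_ARG(bool, network_active));
}
```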
Ok, but then it means this check in this line could be removed?
> Ok, but then it means this check in this line could be removed?
I think not if the events are queued, but again I'm not too sure.
A bit late but here again. Sorry, life called.
> it means this check in this line could be removed?
Not really. While requestShutdown() is being executed, no other task can be processed on the main thread, but new tasks can be enqueued to be executed by the main thread afterwards. So, any event coming from the backend while requestShutdown() is running (prior to the unsubscribeFromCoreSignals call) will be executed right after it finishes.
So.. this means we also need to guard the RPCConsole::updateNetworkState() function against a nullptr clientModel. The same segfault could arise there too.
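The guard there would mirror the BitcoinGUI one quoted at the top of this thread (a sketch, with the rest of the function body elided):

```cpp
void RPCConsole::updateNetworkState()
{
    // Same early return as in BitcoinGUI::updateNetworkState(): the model
    // may already have been cleared by requestShutdown().
    if (!clientModel) return;
    // ...
}
```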
> I think not if the events are queued, but again I'm not too sure.
Yes. Both events run on the main thread. The difference is who enqueues them: the quit action is appended by the main thread (when we press the quit button), and the "network active change" is enqueued by the backend thread, when the RPC changes the value.
That being said, it is all theoretical. Let me do some tests.
> I think not if the events are queued, but again I'm not too sure.
>
> Yes. Both events run on the main thread. The difference is who enqueues them: the quit action is appended by the main thread (when we press the quit button), and the "network active change" is enqueued by the backend thread, when the RPC changes the value.
Thanks for clarifying. Indeed, if the events all run on a single thread (haven't checked this), this fix should be correct.
I guess I was confusing myself with the second commit.
> if the events all run on a single thread (haven't checked this), this fix should be correct.
Yeah. As far as I have tested, they run on the main thread. But this doesn't mean the code will always act the same way: we are using the default Qt connect() call, which uses Qt::ConnectionType = Qt::AutoConnection; per https://doc.qt.io/qt-6/qt.html#ConnectionType-enum, that may resolve to Qt::DirectConnection, which executes the slot on the signalling thread.
To solve this, we could force main-thread execution by changing the slot connection side but, as I haven't been able to trigger the segfault after the fix, I'll leave this investigation for a follow-up.
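For the record, forcing main-thread execution would amount to making the connection type explicit on the slot side, something like the following sketch (the signal and slot names are assumptions for illustration):

```cpp
// Request an explicitly queued connection instead of Qt::AutoConnection,
// so the slot always runs on the receiver's (GUI) thread even if the
// signal is ever emitted from a backend thread.
connect(clientModel, &ClientModel::networkActiveChanged,
        this, &BitcoinGUI::setNetworkActive, Qt::QueuedConnection);
```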
force-pushed from cc337f8 to 4b47c78
During shutdown, already queued events dispatched from the backend, such as 'numConnectionsChanged' and 'networkActiveChanged', could try to access the clientModel object, which might not exist because we manually delete it inside 'BitcoinApplication::requestShutdown()'.
force-pushed from 4b47c78 to f357578
The second commit is a refactor? If yes, it would be good to clarify.
Concept ACK. I did my own research and I agree with @furszy's analysis of the problem.
hebasto left a comment
ACK f357578, I have reviewed the code and it looks OK. Tested on Ubuntu 23.10.
nit: The second commit message might be amended according to #801 (comment).
Preventing dangling signals.
force-pushed from f357578 to b7aa717
hebasto left a comment
re-ACK b7aa717
Summary:
```
During shutdown, already queued events dispatched from the backend, such as
'numConnectionsChanged' and 'networkActiveChanged', could try to access the
clientModel object, which might not exist because we manually delete it inside
'BitcoinApplication::requestShutdown()'.

This happens because boost does not clear the queued events when they arise
concurrently with the signal disconnection (see
https://www.boost.org/doc/libs/1_55_0/doc/html/signals2/thread-safety.html).

From the docs:

"Note that since we unlock the connection's mutex before executing its
associated slot, it is possible a slot will still be executing after it has
been disconnected by a connection::disconnect(), if the disconnect was called
concurrently with signal invocation."

"The fact that concurrent signal invocations use the same combiner object
means you need to insure any custom combiner you write is thread-safe"

So, we need to guard clientModel before accessing it at the handler side.
```

Backport of [[bitcoin-core/gui#801 | core-gui#801]].

Test Plan:
ninja all check-all

On OSX, run the Qt client with Chronik enabled, let it sync a bit, then kill it. Check that it doesn't segfault upon shutdown.

Reviewers: #bitcoin_abc, PiRK

Reviewed By: #bitcoin_abc, PiRK

Differential Revision: https://reviews.bitcoinabc.org/D17682
Fixing #800.
During shutdown, already queued events dispatched from the backend, such as
'numConnectionsChanged' and 'networkActiveChanged', could try to access the
clientModel object, which might not exist because we manually delete it inside
'BitcoinApplication::requestShutdown()'.
This happens because boost does not clear the queued events when they arise
concurrently with the signal disconnection (see https://www.boost.org/doc/libs/1_55_0/doc/html/signals2/thread-safety.html).
From the docs:

> "Note that since we unlock the connection's mutex before executing its associated slot, it is possible a slot will still be executing after it has been disconnected by a connection::disconnect(), if the disconnect was called concurrently with signal invocation."

So, we need to guard clientModel before accessing it at the handler side.
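A standalone sketch (not project code) of why the handler-side guard is needed: boost::signals2 disconnection only prevents future invocations, so a handler must re-check any shared state it dereferences.

```cpp
#include <boost/signals2.hpp>
#include <atomic>
#include <iostream>

struct Model { int value = 42; };

int main()
{
    boost::signals2::signal<void()> sig;
    std::atomic<Model*> model{new Model()};

    boost::signals2::connection conn = sig.connect([&] {
        // Guard: shutdown code may clear the pointer between signal
        // invocation and slot execution.
        if (Model* m = model.load()) std::cout << m->value << '\n';
    });

    sig();                          // model alive: prints 42
    delete model.exchange(nullptr); // "shutdown" clears the model
    sig();                          // guard skips the dead model
    conn.disconnect();              // only blocks *future* invocations
}
```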