-
Notifications
You must be signed in to change notification settings - Fork 38.6k
qt: lock cs_main, m_cached_tip_mutex in that order #19132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
qt: lock cs_main, m_cached_tip_mutex in that order #19132
Conversation
|
Notice: this is just one possible fix. Other possible fixes are:
cc @furszy |
src/qt/clientmodel.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the requirement to check IsNull twice is dropped, then I think this whole function can be written shorter in just 4 lines of code:
uint256 tip{WITH_LOCK(m_cached_tip_mutex, return m_cached_tip_blocks)};
if (!tip.IsNull()) return tip;
tip = m_node.getBestBlockHash(); // Will lock `cs_main` (and release it). This is needed to be done early to not violate the implicit lock order cs_main -> m_cached_tip_mutex
return WITH_LOCK(m_cached_tip_mutex, return m_cached_tip_blocks = tip);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above snippet can end up setting/overwriting m_cached_tip_blocks if it is not null. I.e. if it is null during the if, later some other thread sets it to some non null value (while we are calling m_node.getBestBlockHash()) and we end up overwriting what the other thread set.
Maybe that is ok in this case or maybe it cannot happen due to some constraints outside of this function. This code is new to me so I played safe :)
Btw if (A) return B; has two disadvantages compared to putting the return on a separate line:
- it is not possible to set a breakpoint on
return B;only. For example if the condition is evaluated 1000 times, butAistrueonly a few times and one wants to break whenreturn B;is executed. - it will skew line-based code coverage.
Always lock the mutexes `cs_main` and `m_cached_tip_mutex` in the same order: `cs_main`, `m_cached_tip_mutex`. Otherwise we may end up in a deadlock. `ClientModel::m_cached_tip_blocks` is protected by `ClientModel::m_cached_tip_mutex`. There are two access paths that lock the two mutexes in opposite order: ``` validation.cpp:2868 CChainState::ActivateBestChain(): lock cs_main validation.cpp:2916 CChainState::ActivateBestChain(): call uiInterface.NotifyBlockTip() ui_interface.cpp:52 CClientUIInterface::NotifyBlockTip(): go deep in boost ... qt/clientmodel.cpp:255 BlockTipChanged(): lock m_cached_tip_mutex ``` and ``` qt/clientmodel.cpp:119 ClientModel::getBestBlockHash(): lock m_cached_tip_mutex qt/clientmodel.cpp:121 ClientModel::getBestBlockHash(): call m_node.getBestBlockHash() interfaces/node.cpp:200 NodeImpl::getBestBlockHash(): lock cs_main ``` From `debug.log`: ``` POTENTIAL DEADLOCK DETECTED Previous lock order was: m_cs_chainstate validation.cpp:2851 (1) cs_main validation.cpp:2868 ::mempool.cs validation.cpp:2868 (2) clientmodel->m_cached_tip_mutex qt/clientmodel.cpp:255 Current lock order is: (2) m_cached_tip_mutex qt/clientmodel.cpp:119 (1) ::cs_main interfaces/node.cpp:200 ``` The possible deadlock was introduced in bitcoin#17993
367c8a6 to
f46b678
Compare
|
Hmm, good catch. Would actually think it on the other way around. The problem shouldn't be on the slot side (the slot can be inside the same process or be another process -- GUI division -- or whatever). Better to review the reason behind Why not doing the same as what is being done on the |
|
See commit d86edd3 |
|
Tested ACK f46b678 |
f46b678 qt: lock cs_main, m_cached_tip_mutex in that order (Vasil Dimov) Pull request description: Always lock the mutexes `cs_main` and `m_cached_tip_mutex` in the same order: `cs_main`, `m_cached_tip_mutex`. Otherwise we may end up in a deadlock. `ClientModel::m_cached_tip_blocks` is protected by `ClientModel::m_cached_tip_mutex`. There are two access paths that lock the two mutexes in opposite order: ``` validation.cpp:2868 CChainState::ActivateBestChain(): lock cs_main validation.cpp:2916 CChainState::ActivateBestChain(): call uiInterface.NotifyBlockTip() ui_interface.cpp:52 CClientUIInterface::NotifyBlockTip(): go deep in boost ... qt/clientmodel.cpp:255 BlockTipChanged(): lock m_cached_tip_mutex ``` and ``` qt/clientmodel.cpp:119 ClientModel::getBestBlockHash(): lock m_cached_tip_mutex qt/clientmodel.cpp:121 ClientModel::getBestBlockHash(): call m_node.getBestBlockHash() interfaces/node.cpp:200 NodeImpl::getBestBlockHash(): lock cs_main ``` From `debug.log`: ``` POTENTIAL DEADLOCK DETECTED Previous lock order was: m_cs_chainstate validation.cpp:2851 (1) cs_main validation.cpp:2868 ::mempool.cs validation.cpp:2868 (2) clientmodel->m_cached_tip_mutex qt/clientmodel.cpp:255 Current lock order is: (2) m_cached_tip_mutex qt/clientmodel.cpp:119 (1) ::cs_main interfaces/node.cpp:200 ``` The possible deadlock was introduced in bitcoin#17993 ACKs for top commit: jonasschnelli: Tested ACK f46b678 Tree-SHA512: 904f24b39bdc97c4d0ecb897a6980d8d479814535eb167e23105238800ea2f1f85273e3370cf894db58bc597f94c4f2e81fb68d0ff3362d468c16af5ce8f5d78
Summary: PR description > Rationale: > > The height based polling in the GUI is an issue on chain reorgs. Any new tip signal with the same height as the one that it's cached in the model was not triggering the GUI update (interpreting it as the same same block when it could receive a different one). Ending up with bad information presented in the GUI. > > This PR essentially changes the last cached height to be a last cached block hash. bitcoin/bitcoin@2f86720 > [ClientModel] best header/block hash cached. bitcoin/bitcoin@f46b678 > qt: lock cs_main, m_cached_tip_mutex in that order > > Always lock the mutexes `cs_main` and `m_cached_tip_mutex` in > the same order: `cs_main`, `m_cached_tip_mutex`. Otherwise we may end up > in a deadlock. > > `ClientModel::m_cached_tip_blocks` is protected by > `ClientModel::m_cached_tip_mutex`. There are two access paths that > lock the two mutexes in opposite order: > > ``` > validation.cpp:2868 CChainState::ActivateBestChain(): lock cs_main > validation.cpp:2916 CChainState::ActivateBestChain(): call uiInterface.NotifyBlockTip() > ui_interface.cpp:52 CClientUIInterface::NotifyBlockTip(): go deep in boost > ... > qt/clientmodel.cpp:255 BlockTipChanged(): lock m_cached_tip_mutex > ``` > > and > > ``` > qt/clientmodel.cpp:119 ClientModel::getBestBlockHash(): lock m_cached_tip_mutex > qt/clientmodel.cpp:121 ClientModel::getBestBlockHash(): call m_node.getBestBlockHash() > interfaces/node.cpp:200 NodeImpl::getBestBlockHash(): lock cs_main > ``` > > From `debug.log`: > > ``` > POTENTIAL DEADLOCK DETECTED > Previous lock order was: > m_cs_chainstate validation.cpp:2851 > (1) cs_main validation.cpp:2868 > ::mempool.cs validation.cpp:2868 > (2) clientmodel->m_cached_tip_mutex qt/clientmodel.cpp:255 > Current lock order is: > (2) m_cached_tip_mutex qt/clientmodel.cpp:119 > (1) ::cs_main interfaces/node.cpp:200 > ``` > > The possible deadlock was introduced in > bitcoin/bitcoin#17993 This is a backport of Core [[bitcoin/bitcoin#17993 | PR17993]] [1/2] and [[bitcoin/bitcoin#19132 | PR19132]] Test Plan: `ninja && src/qt/bitcoin-qt` Send a transaction, wait for a few block and check that the tx status is correctly updated in the interface. Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Subscribers: majcosta Differential Revision: https://reviews.bitcoinabc.org/D9253
Always lock the mutexes
cs_mainandm_cached_tip_mutexinthe same order:
cs_main,m_cached_tip_mutex. Otherwise we may end upin a deadlock.
ClientModel::m_cached_tip_blocksis protected byClientModel::m_cached_tip_mutex. There are two access paths thatlock the two mutexes in opposite order:
and
From
debug.log:The possible deadlock was introduced in #17993