-
Notifications
You must be signed in to change notification settings - Fork 38.8k
refactor: drop boost::signals2 in validationinterface #18524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Stop using boost::signals2 internally in validationinterface. Replace with std::list and Add/Remove/Clear/Iterate helper functions. Motivation for change is to reduce dependencies and avoid issues happening with boost versions before 1.59: bitcoin#18517 bitcoin#18471
|
Concept ACK: less boost is better |
|
Concept ACK: it solved the issue I had. |
|
Concept ACK. |
|
Updated ba8312c -> ad067a9 ( |
promag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested that this fixes the issue. I also prefer dropping boost signals2.
I was going a different way. From https://stackoverflow.com/a/2265979
I suggest you go the other way around and connect a dummy slot, then disconnect it when your "real" slot is invoked. Connecting another slot will clean up stale connections, so your slot should be released.
| void Add(std::shared_ptr<CValidationInterface> callbacks) | ||
| { | ||
| LOCK(m_mutex); | ||
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we add the same twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
So we add the same twice?
There's two emplaces because of the map and list (if that's the question)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this is allowing adding the same callbacks. Why not assert(inserted.second)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
I mean this is allowing adding the same callbacks. Why not
assert(inserted.second)?
Unit tests fail if the unregister function is not idempotent, so it seems good to me that the register function is idempotent as well. I could imagine an idempotent API here making calling code simpler, even though I could also imagine an assert catching potential bugs. I think asserts tend to work a better at short range within a module instead of being used to make an API rigid in an attempt to catch external bugs. But I don't think there is a right answer here and would be ok with a PR that took a different approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think these shouldn't be idempotent because makes sense to call just once. For instance std::fstream::open is not idempotent, fails if file is already opened. I'd rather fix unit tests. Just an opinion and should not prevent this change going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Personally I think these shouldn't be idempotent because makes sense to call just once. For instance
std::fstream::openis not idempotent, fails if file is already opened. I'd rather fix unit tests. Just an opinion and should not prevent this change going forward.
Feel free to open a followup. If I wanted to change calling code not to use the same pointer more than once, I would change these to return bool and have calling code assert that they return true, rather having these crash on cases they can reasonably handle and making assumptions about calling code. I do find API's like std::map's insert/erase that are idempotent convenient for avoiding boilerplate code, so that's another reason I like the keeping the boost behavior beyond not wanting to increase scope of this PR and not liking asserts across an api boundary
|
|
||
| template<typename F> void Iterate(F&& f) | ||
| { | ||
| WAIT_LOCK(m_mutex, lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about local copy and then iterate it lock free?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
How about local copy and then iterate it lock free?
I'm not sure what approach could be entirely lock free since the list is global and can be modified from any thread.
I just implemented something simple that can be changed and optimized in the future. The lock is not held when calling CValidation interface methods. It avoids copying so allocations aren't needed just to iterate the list. Assumption is that the list is iterated frequently and modified infrequently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So at best case it would be lock free I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I wanted to delete my own comment, but it seems I misclicked and deleted @promag's. Restoring from mail:
Assumption is that the list is iterated frequently and modified infrequently
Right. What I had in mind is 2 lists and an atomic book list_changed. On iterating it would then sync the list with the mutex locked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) I was wondering what happened. So draft:
if (list_changed) {
lock;
iterate_list = list;
list_changed = false
}
for (f : iterate_list) f()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Right. What I had in mind is 2 lists and an atomic book list_changed. On iterating it would then sync the list with the mutex locked.
I don't understand the suggestion from the pseudocode. I don't love the idea of having multiple lists, but even if you have them, I don't see how you avoid locks copying the list if you do create copies, or avoid needing a condition variable to delay modifying the list if you don't create copies. I'm probably just making an incorrect assumption about what you want to accomplish here. Feel free to post sample code, or just change the implementation in an alternate PR or followup PR. I'd be especially interested if it's a simplification and not just an optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not possible that there are two invocations of Iterate simultaneously?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Is it not possible that there are two invocations of Iterate simultaneously?
It shouldn't be the most common thing but should be possible because notifications like CMainSignals::BlockChecked and CMainSignals::NewPoWValidBlock run on calling thread. Other notifications are sent from the scheduler so shouldn't happen simultaneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not possible that there are two invocations of Iterate simultaneously?
My previous suggestion isn't possible in this case.
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested 01639a2 on Ubuntu 16.04.6 LTS, it works as expected.
The end of output:
...
2020-04-04T20:13:51Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) started
2020-04-04T20:13:51Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) completed (0.00s)
2020-04-04T20:13:51Z [default wallet] Releasing wallet
2020-04-04T20:13:51Z Shutdown: done
The only concern is about failed CentOS 7 build on Travis.
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 01639a2 -> 3d463ad (pr/nosig.3 -> pr/nosig.4, compare) to fix centos 7 compiler error
| void Add(std::shared_ptr<CValidationInterface> callbacks) | ||
| { | ||
| LOCK(m_mutex); | ||
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
So we add the same twice?
There's two emplaces because of the map and list (if that's the question)
|
|
||
| template<typename F> void Iterate(F&& f) | ||
| { | ||
| WAIT_LOCK(m_mutex, lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
How about local copy and then iterate it lock free?
I'm not sure what approach could be entirely lock free since the list is global and can be modified from any thread.
I just implemented something simple that can be changed and optimized in the future. The lock is not held when calling CValidation interface methods. It avoids copying so allocations aren't needed just to iterate the list. Assumption is that the list is iterated frequently and modified infrequently
| void Clear() | ||
| { | ||
| LOCK(m_mutex); | ||
| for (auto it = m_list.begin(); it != m_list.end();) { | ||
| if (--it->count == 0) it = m_list.erase(it); else ++it; | ||
| } | ||
| m_map.clear(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It follows from the code of the Clear() function that m_list could be non-empty when the function returns.
Is it consistent state if m_list.empty() == false and m_map.empty() == true ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
It follows from the code of the
Clear()function thatm_listcould be non-empty when the function returns.Is it consistent state if
m_list.empty() == falseandm_map.empty() == true?
Yes, this happens when the reference count of one or more callbacks in the list is nonzero, which means they are currently being called, and will be removed by the iterate function when they are done being called, clearing the list.
|
Updated 3d463ad -> b5fea24 ( Also I'm noticing random appveyor test failures with |
| m_list.emplace_back(); | ||
| inserted.first->second = std::prev(m_list.end()); | ||
| } | ||
| inserted.first->second->callbacks = std::move(callbacks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why callbacks are moved into the m_list always, but not only when an insertion into m_map occurs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Why
callbacksare moved into them_listalways, but not only when an insertion intom_mapoccurs?
Shouldn't make a difference in practice, but it seemed better to me to call operator=(&&) and do empty destroy on the RHS object consistently instead of having different call sequences depending on a basically unrelated condition. Also, this will probably never matter but maybe always updating could be useful and worth guaranteeing in the future if there's internal property of the pointer worth updating like a new custom deleter.
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK b5fea24
|
ACK b5fea24 |
src/validationinterface.cpp
Outdated
| boost::signals2::signal<void (const CBlockIndex *, const std::shared_ptr<const CBlock>&)> NewPoWValidBlock; | ||
|
|
||
| Mutex m_mutex; | ||
| struct Entry { std::shared_ptr<CValidationInterface> callbacks; int count = 1; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in stating the invariants here?
- There are two types of callbacks, ones that are registered (in m_map) and ones that are not (because they're still being executed while being deleted).
- m_list contains all callbacks of both types
- entry.count is equal to the number of current executions of that entry, plus 1 if it's registered. It cannot be 0 (because that would imply being unregistered, and not being executed).
Perhaps it's worth spelling these out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in stating the invariants here?
Yes, that's all correct and well stated. I'll add this as a comment
|
One way to test this change is to load and unload wallets repeatedly like in #18362. Each time a wallet is loaded and unloaded, a CValidationInterface instance gets registered and unregistered. If this is done during a sync when there are lots of UpdateTip and BlockConnected callbacks, it can be a good way to mix registrations with notifications and try to trigger deadlocks & segfaults we've previously seen here. Some command lines I used for testing this: src/bitcoind -testnet -debug=1 -debugexclude=libevent -server=1 -printtoconsole
src/bitcoin-cli -testnet createwallet w1
src/bitcoin-cli -testnet unloadwallet w1
src/bitcoin-cli -testnet createwallet w2
src/bitcoin-cli -testnet unloadwallet w2
while src/bitcoin-cli -testnet loadwallet w1 && src/bitcoin-cli -testnet unloadwallet w1; do true; done
while src/bitcoin-cli -testnet loadwallet w2 && src/bitcoin-cli -testnet unloadwallet w2; do true; done |
|
Out of curiosity, have you considered a read write lock? |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated b5fea24 -> c3a4716 (pr/nosig.5 -> pr/nosig.6, compare) adding suggested comments and renaming a few things to match comments.
Out of curiosity, have you considered a read write lock?
I was thinking about one in the context of the solution you were proposing. If we just used a read-write lock in the most straightforward way here without reference counts registering and unregistering would block while callbacks were executing which would be changing behavior from boost signals and might lead to external deadlocks
src/validationinterface.cpp
Outdated
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); | ||
| if (inserted.second) { | ||
| m_list.emplace_back(); | ||
| inserted.first->second = std::prev(m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these lines can be combined into inserted.first->second = m_list.emplace(m_list.end());.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
I think these lines can be combined into
inserted.first->second = m_list.emplace(m_list.end());.
Thanks, switched to this
| boost::signals2::signal<void (const CBlock&, const BlockValidationState&)> BlockChecked; | ||
| boost::signals2::signal<void (const CBlockIndex *, const std::shared_ptr<const CBlock>&)> NewPoWValidBlock; | ||
|
|
||
| Mutex m_mutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all of these member variables/types need to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Do all of these member variables/types need to be public?
No none do, added private/public sections
|
ACK 9617600 |
promag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one little difference from boost signals2 when some validation interface is registered in the middle of some Iterate which results in a different order of callback execution. However I don't see how this could be a problem.
| void Add(std::shared_ptr<CValidationInterface> callbacks) | ||
| { | ||
| LOCK(m_mutex); | ||
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think these shouldn't be idempotent because makes sense to call just once. For instance std::fstream::open is not idempotent, fails if file is already opened. I'd rather fix unit tests. Just an opinion and should not prevent this change going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 9617600 -> d6815a2 (pr/nosig.7 -> pr/nosig.8, compare) just adding public/private, simplifying emplace, and making erase calls more consistent
re: #18524 (review)
There is one little difference from boost signals2 when some validation interface is registered in the middle of some
Iteratewhich results in a different order of callback execution. However I don't see how this could be a problem.
Curious what the boost behavior is when registrations change during a call. Seems like it would have to make a snapshot copy or use a kind of versioning to be able to ignore additions / removals
| boost::signals2::signal<void (const CBlock&, const BlockValidationState&)> BlockChecked; | ||
| boost::signals2::signal<void (const CBlockIndex *, const std::shared_ptr<const CBlock>&)> NewPoWValidBlock; | ||
|
|
||
| Mutex m_mutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Do all of these member variables/types need to be public?
No none do, added private/public sections
| void Add(std::shared_ptr<CValidationInterface> callbacks) | ||
| { | ||
| LOCK(m_mutex); | ||
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
Personally I think these shouldn't be idempotent because makes sense to call just once. For instance
std::fstream::openis not idempotent, fails if file is already opened. I'd rather fix unit tests. Just an opinion and should not prevent this change going forward.
Feel free to open a followup. If I wanted to change calling code not to use the same pointer more than once, I would change these to return bool and have calling code assert that they return true, rather having these crash on cases they can reasonably handle and making assumptions about calling code. I do find API's like std::map's insert/erase that are idempotent convenient for avoiding boilerplate code, so that's another reason I like the keeping the boost behavior beyond not wanting to increase scope of this PR and not liking asserts across an api boundary
src/validationinterface.cpp
Outdated
| auto inserted = m_map.emplace(callbacks.get(), m_list.end()); | ||
| if (inserted.second) { | ||
| m_list.emplace_back(); | ||
| inserted.first->second = std::prev(m_list.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #18524 (comment)
I think these lines can be combined into
inserted.first->second = m_list.emplace(m_list.end());.
Thanks, switched to this
|
ACK d6815a2 |
Yes, that's the case, slots are called without any lock using a local list. |
promag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
||
| explicit MainSignalsInstance(CScheduler *pscheduler) : m_schedulerClient(pscheduler) {} | ||
|
|
||
| void Register(std::shared_ptr<CValidationInterface> callbacks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, could avoid incrementing usage count - receive reference.
| //! count is equal to the number of current executions of that entry, plus 1 | ||
| //! if it's registered. It cannot be 0 because that would imply it is | ||
| //! unregistered and also not being executed (so shouldn't exist). | ||
| struct ListEntry { std::shared_ptr<CValidationInterface> callbacks; int count = 1; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit int count{1};
| LOCK(m_mutex); | ||
| auto it = m_map.find(callbacks); | ||
| if (it != m_map.end()) { | ||
| if (!--it->second->count) m_list.erase(it->second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, == 0.
|
ACK d6815a2 I like the general direction of this (to move away from boost::signals2), and as it works around a problem with boost I think this is the preferable way to fix this (both in 0.20 and master).
FWIW asserts in destructors have turned out to be pretty terrible in generating shutdown crashes in unexpected conditions such as errors. So I'm not sure I like this. |
|
re-ACK d6815a2 |
It could be something that is only enabled on |
|
Thanks for suggestions and reviews! I was pushing back on some suggestions that would have expanded scope of this PR or prevented it from being a refactor, but I'm working on a followup PR to implement these. |
| void Clear() | ||
| { | ||
| LOCK(m_mutex); | ||
| for (auto it = m_list.begin(); it != m_list.end();) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct that this iterates over the entire list? I think it should only iterate over entries that are in the map (the count of those that are already unregistered shouldn't be decremented further).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested fix here: #18551
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin#18524 and is fixed by bitcoin#18551
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin#18524 and is fixed by bitcoin#18551
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin#18524 and is fixed by bitcoin#18551
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin#18524 and is fixed by bitcoin#18551
Summary: d6815a2313158862d448733954a73520f223deb6 refactor: drop boost::signals2 in validationinterface (Russell Yanofsky) Pull request description: Stop using boost::signals2 internally in validationinterface. Replace with std::list and Add/Remove/Clear/Iterate helper functions. Motivation for change is to reduce dependencies and avoid issues happening with boost versions before 1.59: bitcoin/bitcoin#18517, bitcoin/bitcoin#18471 ACKs for top commit: MarcoFalke: ACK d6815a2313158862d448733954a73520f223deb6 laanwj: ACK d6815a2313158862d448733954a73520f223deb6 hebasto: re-ACK d6815a2313158862d448733954a73520f223deb6 promag: ACK d6815a2313158862d448733954a73520f223deb6. Tree-SHA512: 4fc0f14a8446e8616cc142af6c3d36815f3254525d30348ba8e4d4bc74c249a5a8c9bc119bdd1be7ebd7abe0b784bc0c5551a3e156a766890cb2fdd891a95919 Backport of Core [[bitcoin/bitcoin#18524 | PR18524]] Depends on D6652 Test Plan: `ninja check check-functional` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D6653
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin/bitcoin#18524 and is fixed by bitcoin/bitcoin#18551
Bug in MainSignalsInstance::Clear could cause validation interface callbacks to be deleted during execution if UnregisterAllValidationInterfaces was called more than once. Bug was introduced in bitcoin/bitcoin#18524 and is fixed by bitcoin/bitcoin#18551
Stop using boost::signals2 internally in validationinterface. Replace with std::list and Add/Remove/Clear/Iterate helper functions.
Motivation for change is to reduce dependencies and avoid issues happening with boost versions before 1.59: #18517, #18471