[Consensus] Protect mapBlockIndex with its own mutex#951
codeofalltrades merged 2 commits into Veil-Project:master
Conversation
CaveSpectre11 left a comment:
One thing that bugs me is that cs_main is used as a general lock, when we should be locking individual things. Since this is targeting mapBlockIndex, do you see a reason not to create a lock specifically for mapBlockIndex, so we can start separating from this giant super-lock that is preventing significant multithreading from happening?
src/validation.cpp (Outdated)

```cpp
LOCK(cs_main);
// If mapBlockIndex isn't bloated, don't bother taking the time.
if (chainActive.Height() > static_cast<int>(mapBlockIndex.size() - PRUNE_COUNT)) {
    return;
```
This area of code was a major source of inefficiency (primarily later on, since IsAncestor took far longer than chainActive.Contains, and it was being done first). It's probably not something I want to lock through the whole thing, since it's looking for stale tips that are old (PRUNE_DEPTH).
I think I would prefer locking to check whether mapBlockIndex is bloated; and if so, make a copy, unlock, and then let it run with the copy even if mapBlockIndex changes; so we're not locked for the whole check if it's a long one [e.g. on startup].
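The snapshot-then-scan pattern suggested here can be sketched generically as follows. This is an illustrative stand-in, not Veil's actual code: the mutex, map, and threshold names are hypothetical, and the real index maps uint256 hashes to CBlockIndex pointers.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins for cs_mapblockindex and mapBlockIndex.
std::mutex g_index_mutex;
std::map<std::string, int> g_block_index;

// Hold the lock only long enough to decide whether the index is bloated
// and, if so, to snapshot it; the long stale-tip scan then runs on the
// copy while the live index stays available to other threads.
std::vector<int> SnapshotIfBloated(std::size_t prune_count)
{
    std::vector<int> snapshot;
    {
        std::lock_guard<std::mutex> lock(g_index_mutex);
        // If the index isn't bloated, don't bother taking the time.
        if (g_block_index.size() <= prune_count)
            return snapshot;
        snapshot.reserve(g_block_index.size());
        for (const auto& entry : g_block_index)
            snapshot.push_back(entry.second);
    }
    // The expensive checks (IsAncestor and chainActive.Contains in the
    // real code) would run here on the snapshot without holding the lock;
    // the live index may change in the meantime, which is acceptable when
    // pruning stale tips.
    return snapshot;
}
```

The trade-off is that the scan may see slightly stale data, which is fine for this use: a tip that appears right after the snapshot is by definition not old enough to prune.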
I updated to use a separate lock, but missed this section in the subsequent passes. Will get back to this when I have a chance.
Updated to split this into two critical sections, and to not use cs_main at all.
Let me look into using a separate lock. I think at the time it was not especially clear whether cs_main was already locking for mapBlockIndex in some places.
It was definitely requiring it (in some places ;))
Force-pushed from fb97aab to 042bcdc.
tsan reported a data race in GetAncestor, so this is not ready to merge. I will have to look into that.
Declare mapBlockIndex to be protected by cs_mapblockindex. Move a lot of generic lookups to LookupBlockIndex, while other more complex uses (e.g. atomic lookup+insert) are left alone. Make sure the lock is taken everywhere necessary (places found with clang thread-safety warnings) and only as tightly as possible to avoid potential lock cycles (detectable by tsan).

Many uses of cs_main are replaced by cs_mapblockindex where they were only protecting mapBlockIndex, or removed (LookupBlockIndex takes the lock itself), or moved to after the LookupBlockIndex calls. Rework PruneStaleBlockIndexes to do most of its work without the lock.
=== Veil: cannot presently remove the SetNull helper due to other uses; however, this is still worthwhile due to the initialization changes.
Cherry-picked explicit default initialization in CBlockIndex; this quiets the tsan warning about a read conflicting with the write in operator new.
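The cherry-picked change can be illustrated with a simplified stand-in for CBlockIndex (the field names here are a small subset; the struct is hypothetical, not the real class). With default member initializers, every field is written deterministically during construction, so tsan no longer sees uninitialized memory from the allocation being read before a racing write:

```cpp
// Simplified illustration of explicit default initialization; the real
// CBlockIndex has many more fields.
struct BlockIndexLite {
    int nHeight{0};
    unsigned int nTime{0};
    BlockIndexLite* pprev{nullptr};

    // SetNull is kept because other callers still use it, but a freshly
    // constructed object no longer depends on it for a well-defined state.
    void SetNull()
    {
        nHeight = 0;
        nTime = 0;
        pprev = nullptr;
    }
};
```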
Problem
tsan identified a data race on CChainState, particularly between AcceptBlock and ActivateBestChain (used often by the miner threads).
Solution
Mark mapBlockIndex as guarded by cs_mapblockindex, and ensure that the lock is taken prior to any use. Some code has been reorganized slightly to reduce the time spent holding the lock and to avoid locking a second time for the same information. Move many lookups to use LookupBlockIndex, which grabs the lock itself, and remove many cs_main uses that would otherwise have blocked only for LookupBlockIndex.
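A lookup helper that takes the lock itself might look like the sketch below. This is a simplified stand-in (std::string keys and a small value struct instead of uint256 and CBlockIndex*), not the PR's actual implementation:

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical simplified index entry; the real map stores CBlockIndex*.
struct BlockIndexEntry { int nHeight = 0; };

std::mutex cs_mapblockindex;
std::map<std::string, BlockIndexEntry> mapBlockIndex;

// LookupBlockIndex-style helper: it acquires the index lock internally,
// so callers no longer need to hold cs_main just to read the map.
// Returning a pointer into the map is safe because std::map never
// invalidates references to existing elements on insertion.
BlockIndexEntry* LookupBlockIndex(const std::string& hash)
{
    std::lock_guard<std::mutex> lock(cs_mapblockindex);
    auto it = mapBlockIndex.find(hash);
    return it == mapBlockIndex.end() ? nullptr : &it->second;
}
```

Callers that previously wrapped the lookup in `LOCK(cs_main)` can simply drop the lock, which is exactly the "removed (LookupBlockIndex takes the lock itself)" case in the commit message.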
Tested
Original: configured with --with-sanitizers=tsan, built with clang, run on regtest with 2 SHA mining threads enabled.
With cs_mapblockindex: configured with --with-sanitizers=tsan, built with clang, run on testnet to sync to the tip and mine with 2 SHA threads.