Skip to content

Conversation

@maflcko
Copy link
Member

@maflcko maflcko commented May 25, 2021

There is no need for any lock to be held, because net processing is single threaded. So holding the validation lock cs_main for sending a feefilter is confusing and might even degrade blockchain-related RPC performance minimally.

MaybeSendFeefilter will break when called in multiple threads, because g_filter_rounder.round isn't thread safe (among others). Thus, annotating the function with m_mutex_message_handling seems a nice belt-and-suspenders compiler check for now. If there are more threads in the future, a special feefilter mutex can be introduced.

@maflcko
Copy link
Member Author

maflcko commented May 25, 2021

m_mutex_message_handling is taken from #21527, but should be independent and fine to merge in either this pull or the other.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 25, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jnewbery
Copy link
Contributor

Code review ACK faca41443120c254f45d7a4dd72d5d80efacaf59

Copy link
Contributor

@lsilva01 lsilva01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on mainnet ACK faca414

FEEFILTER messages sent normally.

@maflcko
Copy link
Member Author

maflcko commented May 30, 2021

cc @vasild @ajtowns for Approach ACK/NACK

@ajtowns
Copy link
Contributor

ajtowns commented May 31, 2021

Approach ACK

Suggest adding a // GUARDED_BY(...) comment on g_filter_rounder (along the lines of 9ac4792).

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concerns expressed in #21527 (comment) are valid for this PR too.

In addition, this PR introduces the mutex without any GUARDED_BY(the new mutex). So it is unclear what is being protected by it. There is this vague comment: "anything only accessed by ProcessMessage(s) or SendMessages" which better be replaced by GUARDED_BY() directives on the variables protected by this mutex.

We have the following call path:

CConnman::ThreadMessageHandler() (single thread) ->
  PeerManagerImpl::SendMessages(CNode pto) (protected by CNode::cs_sendProcessing) ->
    PeerManagerImpl::MaybeSendFeefilter(CNode pto)

And PeerManagerImpl::MaybeSendFeefilter(CNode pto) accesses the following variables directly or indirectly:

bool PeerManagerImpl::m_ignore_incoming_txs
std::unique_ptr<TxRelay> CNode::m_tx_relay
CNode::GetCommonVersion()
  std::atomic<int> CNode::m_greatest_common_version
CNode::HasPermission()
  NetPermissionFlags CNode::m_permissionFlags
CTxMemPool& PeerManagerImpl::m_mempool
ChainstateManager& PeerManagerImpl::m_chainman
ChainstateManager::ActiveChainstate()
  CChainState* ChainstateManager::m_active_chainstate GUARDED_BY(cs_main)
CChainState::IsInitialBlockDownload()
  std::atomic<bool> CChainState::m_cached_finished_ibd // ok
  CChain CChainState::m_chain // implicit GUARDED_BY(cs_main)
FeeFilterRounder g_filter_rounder
FeeFilterRounder::round()
  FeeFilterRounder::feeset
CAmount CNode::TxRelay::lastSentFeeFilter
std::chrono::microseconds CNode::TxRelay::m_next_send_feefilter
(global) CFeeRate minRelayTxFee
CFeeRate::GetFeePerK()
  CFeeRate::GetFee()
    CAmount CFeeRate::nSatoshisPerK
PeerManagerImpl::m_connman
CConnman::PushMessage()
  const CAddress CNode::addr
  std::unique_ptr<TransportSerializer> CNode::m_serializer
  TransportSerializer::prepareForTransport()

I think that the above variables should be analyzed with respect to access from other threads before changing MaybeSendFeefilter() to be executed without holding cs_main:

  • If a variable is not accessed by other threads than ThreadMessageHandler(), then it is irrelevant (I understand that in this case you may want to add GUARDED_BY(the newly added mutex))
  • If a variable can be accessed by other threads, then it should be clear which mutex is protecting it, have GUARDED_BY() and the mutex held when calling MaybeSendFeefilter().

Comment on lines +285 to +337
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How guarding with a mutex guarantees "it's only accessed by a single thread"? Is the intention to say "it's only accessed by a single thread at a given point in time"? If yes, then this applies to any mutex and the comment better be removed to avoid confusion. If it means that the mutex guarantees "it's only accessed by the single thread ThreadMessageHandler()", then it is not true because any other thread can grab the mutex and access the relevant variables.

@maflcko
Copy link
Member Author

maflcko commented Jan 24, 2022

Suggest adding a // GUARDED_BY(...) comment on g_filter_rounder (along the lines of 9ac4792).

Thanks, added a // needed for ... comment.

My concerns expressed in #21527 (comment) are valid for this PR too.

Thanks, I've replied there. Though, I think this pull makes sense on its own because it (minimally) increases the performance of blockchain-related RPCs that actually need the cs_main lock. As of now, the cs_main lock is both confusing and (minimally) harmful.

this PR introduces the mutex without any GUARDED_BY(the new mutex). So it is unclear what is being protected by it.

Thanks, added a comment // needed for ...

bool PeerManagerImpl::m_ignore_incoming_txs

This is const, so there can't possibly be any race.

std::unique_ptr CNode::m_tx_relay

This is initialized in the constructor, so there can't possibly be any race.

std::atomic CNode::m_greatest_common_version

This is atomic, so there can't possibly be any race. Also, there can't be a logical race introduced in this pr, because it is not guarded by cs_main in other threads.

NetPermissionFlags CNode::m_permissionFlags

This is initialized before the Peer is initialized in net_processing. MaybeSendFeefilter isn't called for objects without a Peer, so there is no race introduced by this patch.

CTxMemPool& PeerManagerImpl::m_mempool

A reference as member variable is like a const pointer, so there can't possibly be any race reading the pointer.

ChainstateManager& PeerManagerImpl::m_chainman

Same

ChainstateManager::ActiveChainstate()

The lock is taken by this function. Currently we don't allow loading the active chainstate after init, so this reference can be treated like a "const pointer". If there is a patch in the future that allows re-seating the active chainstate, a lot more places need to be fixed up. My point is that re-seating shouldn't be done here, but this is a discussion for the future.

CChainState::IsInitialBlockDownload()

The appropriate locks are taken by this function where needed. This function doesn't care about the return value anyway, as long as it is a bool that goes from true->false.

FeeFilterRounder g_filter_rounder

Added comment in this pull

FeeFilterRounder::round()

Same

CAmount CNode::TxRelay::lastSentFeeFilter

This is only read and written by this function. Happy to add the annotation here as well, if reviewers think it is worth it?

std::chrono::microseconds CNode::TxRelay::m_next_send_feefilter

Same

(global) CFeeRate minRelayTxFee

This is set in init, and treated as const afterward. If there is a patch in the future making it mutable, that patch should add the appropriate locks.

PeerManagerImpl::m_connman

A reference as member variable is like a const pointer, so there can't possibly be any race reading the pointer.

CConnman::PushMessage()

This function takes locks where needed and is thread-safe.

@maflcko
Copy link
Member Author

maflcko commented Jan 24, 2022

Thanks for the review. Addressed all feedback.

Since message handling is single threaded, this can simplify
guarding any variables only accessed during message handling.
@maflcko
Copy link
Member Author

maflcko commented Feb 22, 2022

Rebased and clarified the goals (as initially pointed out in OP) in the source code as well. The new comment also explains the history and future, so that developers don't have to use git to lookup this discussion thread.

@maflcko
Copy link
Member Author

maflcko commented Feb 22, 2022

Still trivial to re-ACK as there are no code changes: git range-diff bitcoin-core/master faca414431 fab56fb47d531.

@vasild
Copy link
Contributor

vasild commented Feb 22, 2022

After re-reading carefully the comments and the diff, I still think this is flawed because it introduces a mutex that protects "anything only accessed by" func1(), func2() or func3(). Mutexes protect data (variables). Those variables must be annotated with GUARDED_BY(). No such clause is used with the newly introduced mutex. It is supposed to protect something vague which is hard to check and very easy to change or break in the future - all the variables accessed directly or indirectly by a list of functions.

What about this:

  1. Make the newly introduced mutex global
  2. Make g_filter_rounder global and annotate it with GUARDED_BY(the new mutex)
  3. Annotate CNode::TxRelay::lastSentFeeFilter and CNode::TxRelay::m_next_send_feefilter with GUARDED_BY(the new mutex)

A reference as member variable is like a const pointer

I don't think so, a const pointer cannot change, but a reference member can. The following does not compile:

member reference vs const pointer
#include <vector>

class A
{
public:
    A(std::vector<int>& arg) : m1{arg}, m2{nullptr} {};

    std::vector<int>& m1;
    std::vector<int>* const m2;
};

int main(int, char**)
{
    std::vector<int> v1;
    std::vector<int> v2;
    A a(v1);

    // ok
    a.m1 = v2;

    // error: cannot assign to non-static data member 'm2' with const-qualified type 'std::vector<int> *const'
    a.m2 = nullptr;

    return 0;
}

@maflcko
Copy link
Member Author

maflcko commented Feb 22, 2022

I don't think so, a const pointer cannot change, but a reference member can. The following does not compile:

TIL

@maflcko
Copy link
Member Author

maflcko commented Feb 22, 2022

Mutexes protect data (variables).

Yes, but this is not the only use of Mutexes. In this context the mutex is not strictly needed to guard from UB. It is possible to remove the existing cs_main annotation in the function and compile without compiler warnings and run the program without UB.

The goal of the mutex here is to allow only one caller to execute the function.

very easy to change or break in the future

The assumption that I touch here is documented in the source code and has a compiler annotation, both before the changes here and after. Breaking it would require removing the compiler annotation and runtime check, as well as the comment.

@vasild
Copy link
Contributor

vasild commented Feb 22, 2022

I mean that "anything only accessed by ...list of functions..." is "something vague which is hard to check and very easy to change or break in the future".

@maflcko
Copy link
Member Author

maflcko commented Feb 22, 2022

It is trivial to check with AssertLockHeld. Forgetting to add the annotation to a function is no different than forgetting to add the annotation to a data member.

In any case, this pull isn't changing anything about that. Previously there was a AssertLockHeld for cs_main, now there is one for a different mutex. If you have concerns about the code in current master, it might be best to create a separate discussion thread.

I think the discussion in this pull request should focus on the goal and how the code changes help to achieve the goal. The goal is to reduce the use of cs_main to mean "global lock for everything" and work toward cs_main to mean "validation lock for validation stuff only".

@vasild
Copy link
Contributor

vasild commented Feb 22, 2022

Then just moving the call to MaybeSendFeefilter() after cs_main is released would achieve the goal. It is 1-line change.

Another attempt to explain: right now there are e.g. 25 variables accessed directly or indirectly by ProcessMessage(), ProcessMessages() and SendMessages(). If we presume that they are protected by the newly added m_mutex_message_handling, then any other change anywhere in the code that touches existent variables has to check (manually by code inspection!) if the touched variable is any of those 25 and if it is, it has to acquire m_mutex_message_handling. If it does not acquire it, then there will be a race without a compilation or runtime error. This comment, added in this PR, is confusing (or even bizarre):

/** Message handling mutex.
 *  Message processing is single-threaded, so anything only accessed
 *  by ProcessMessage(s) or SendMessages can be guarded by this mutex,
 *  which guarantees it's only accessed by a single thread.

at least somebody else finds the comment confusing too: #21527 (comment)

@maflcko
Copy link
Member Author

maflcko commented Feb 23, 2022

Then just moving the call to MaybeSendFeefilter() after cs_main is released would achieve the goal. It is 1-line change.

It would require also removing the AssertLockHeld(cs_main);. If reviewers think this 2-line patch is preferable, I am more than happy to go with it, as long as I can make progress removing cs_main.

@vasild
Copy link
Contributor

vasild commented Feb 23, 2022

Then just moving the call to MaybeSendFeefilter() after cs_main is released would achieve the goal. It is 1-line change.

It would require also removing the AssertLockHeld(cs_main);. If reviewers think this 2-line patch is preferable, I am more than happy to go with it, as long as I can make progress removing cs_main.

+1, should be ok, I will review carefully if you go that way.

@maflcko
Copy link
Member Author

maflcko commented Feb 23, 2022

Closing for now. Reviewers, please ACK/NACK #24427 instead.

@maflcko maflcko closed this Feb 23, 2022
@maflcko maflcko deleted the 2105-lessCsMain branch February 23, 2022 09:29
@bitcoin bitcoin locked and limited conversation to collaborators Feb 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants