Skip to content

Conversation

@jonatack
Copy link
Member

@jonatack jonatack commented Apr 1, 2022

PRs #22736, #22904 and #23223 changed lock contention logging from a DEBUG_LOCKCONTENTION compile-time preprocessor directive to a runtime lock log category and improved the logging output. This changed the locking from using lock() to try_lock():

  • void Mutex::UniqueLock::lock() acquires the mutex and blocks until it gains access to it

  • bool Mutex::UniqueLock::try_lock() doesn't block but instead immediately returns whether it acquired the mutex; it may be used by lock() internally as part of the deadlock-avoidance algorithm

In theory the cost of try_lock might be essentially the same relative to lock. The test-and-set logic of these calls is purported to be ~ constant time, optimised and light/quick if used carefully (i.e. no mutex convoying), compared to system calls, memory/cache coherency and fences, wait queues, and (particularly) lock contentions. See the discussion around #22736 (comment) and after with respect to performance/cost aspects. However, there are reasonable concerns (see here and here) that Base::try_lock() may be potentially costly or risky compared to Base::lock() in this very frequently called code.

One alternative to keep the run-time lock logging would be to gate the try_lock call behind the logging conditional, for example as proposed in ccd73de and ACKed here. However, this would add the cost of if (LogAcceptCategory(BCLog::LOCK)) to the hotspot, instead of replacing lock with try_lock, for the most frequent happy path (non-contention).

It turns out we can keep the advantages of the runtime lock contention logging (the ability to turn it on/off at runtime) while out of prudence putting the try_lock() call and lock logging category behind a DEBUG_LOCKCONTENTION compile-time preprocessor directive, and also still retain the lock logging enhancements of the mentioned PRs, as suggested in #24734 (comment) by W. J. van der Laan, in #22736 (comment), and in the linked IRC discussion.

Proposed here and for backport to v23.

@laanwj
Copy link
Member

laanwj commented Apr 1, 2022

Alternatively, we could keep the lock category only when DEBUG_LOCKCONTENTION is set. This still has the intended effect of making lock contention run-time switchable for people that want it enabled. No strong opinion though. This is definitely simpler.

Obvious concept ACK.

@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

Alternatively, we could keep the lock category only when DEBUG_LOCKCONTENTION is set. This still has the intended effect of making lock contention run-time switchable for people that want it enabled. No strong opinion though.

Agree! This would allow leaving the flag defined for development while being able to switch it off at runtime. All the benefits without the drawbacks. Will update.

@DrahtBot DrahtBot added the Tests label Apr 1, 2022
@jonatack jonatack changed the title sync: revert lock contention from log category to preprocessor directive Put lock contention logging behind DEBUG_LOCKCONTENTION preprocessor directive Apr 1, 2022
@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

Updated code and PR title/description per @laanwj's and @ajtowns suggestion. This is now a much smaller change, too.

Edit: overlooked updating the commit title, will do after CI feedback.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from 0d1248b to 73759f8 Compare April 1, 2022 12:56
Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 73759f88537e472dc443318a16545d63e6c22024

Let me summarize how I understood this PR:

  • This PR introduced (back) the DEBUG_LOCKCONTENTION compile-time processor directive to allow the user to opt-out of potentially costly and dangerous run time locking log using Base::try_lock().

If my understanding of this PR is correct, I agree with its concept of it:

  • I like the prudent approach in case we find out using Base::try_lock() was not the efficient way of logging.
  • This keeps the earlier benefits introduced with run-time logging, so we are not losing a beneficial feature.

Considering the comments of @laanwj and @ajtowns

  • The reduced diff looks much cleaner than the earlier version of this PR.
  • As per @laanwj suggestion, the updated code keeps the benefits of run-time logging instead of completely reverting them, which I agree with.

@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

Mind that updating the checkqueue_tests setup doesn't seem needed, but can do if people prefer. -> Edit: done, as the test without DEBUG_LOCKCONTENTION defined would generate warnings.

@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 1, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #24757 (build, ci: add DEBUG_LOCKCONTENTION to --enable-debug and CI by jonatack)
  • #23443 (p2p: Erlay support signaling by naumenkogs)
  • #21603 (log: Mitigate disk filling attacks by rate limiting LogPrintf by dergoegge)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

With this change, if lock logging is requested and DEBUG_LOCKCONTENTION is not defined:

$ bitcoind -debug=lock
Warning: Unsupported logging category -debug=lock.
2022-04-01T13:57:44Z [init] Bitcoin Core version v23.99.0-73759f88537e (debug build)
2022-04-01T13:57:44Z [init] Warning: Unsupported logging category -debug=lock.
$ bitcoin-cli logging '["lock"]'
error code: -8
error message:
unknown logging category lock

which is better than being silently inactive.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from 73759f8 to 22bbd84 Compare April 1, 2022 14:01
@maflcko
Copy link
Member

maflcko commented Apr 1, 2022

So this will be disabled for all CI tasks? If yes, could enable it for some, or by default in --enable-debug? See also #24709.

Otherwise the code is never compiled, which makes me wonder if it ever was compiled before the changes?

@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

So this will be disabled for all CI tasks? If yes, could enable it for some, or by default in --enable-debug? See also #24709.

I was thinking about that, as personally I would always want to define DEBUG_LOCKCONTENTION for debug building. Can add here to --enable-debug the same as DDEBUG_LOCKORDER and mention in the developer notes.

@maflcko
Copy link
Member

maflcko commented Apr 1, 2022

If it was never enabled previously, it can also be done as a follow-up. Just wanted to raise the point.

@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

Follow-up SGTM. Right, the previous situation AFAIK was that it was only compiled if manually defined by the user.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from 22bbd84 to fd56ee1 Compare April 1, 2022 15:39
@jonatack
Copy link
Member Author

jonatack commented Apr 1, 2022

Made the src/sync.h logging include headers conditional on DEBUG_LOCKCONTENTION as well, added missing includes in src/net.{h,cpp} that this revealed and moved a method from the header to the implementation to not include logging in the header file.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch 2 times, most recently from 2a439ea to b2f3bf4 Compare April 1, 2022 16:48
jonatack added 2 commits April 1, 2022 19:01
Among these includes, src/net.{h,cpp} is relying on src/sync.h for <logging.h>
and in turn on src/logging.h for <list>.  The compiler raises when they are
removed from src/sync.h in the last commit of this PR, so they need to be added.
which allows dropping `#include <logging>` in the header file.
@jonatack
Copy link
Member Author

jonatack commented Apr 4, 2022

Added developer notes documentation.

@laanwj
Copy link
Member

laanwj commented Apr 4, 2022

Concept and code review ACK 2623fb4ff235b199e167695fe304a689b072c560

(reasoning for adding backport label is that it removes the try_locklock sequence overhead for every locking in the application, which adds some overhead and is potentially risky)

@jonatack
Copy link
Member Author

jonatack commented Apr 4, 2022

(reasoning for adding backport label is that it removes the try_locklock sequence overhead for every locking in the application, which adds some overhead and is potentially risky)

Agree, really happy that this change avoids unnecessary risk while preserving the benefits of the recent improvements.

@jonatack
Copy link
Member Author

jonatack commented Apr 4, 2022

So this will be disabled for all CI tasks? If yes, could enable it for some, or by default in --enable-debug? See also #24709.

Done in #24757.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from 2623fb4 to f5257c4 Compare April 5, 2022 08:35
@jonatack
Copy link
Member Author

jonatack commented Apr 5, 2022

Updated per git diff 2623fb4 7ea951d to avoid Unsupported logging category -debugexclude=lock warnings when running the checkqueue tests without DEBUG_LOCKCONTENTION defined. Edit: re-pushed to appease linter.

@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from f5257c4 to 523e7b3 Compare April 5, 2022 09:00
@jonatack jonatack force-pushed the change-lock-logging-to-DEBUG_LOCKCONTENTION branch from 523e7b3 to 7ea951d Compare April 5, 2022 09:17
@jonatack
Copy link
Member Author

jonatack commented Apr 5, 2022

ACK 73759f8

@shaavan mind having another look? (thanks!)

Copy link
Member

@fanquake fanquake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could just drop the first two commits, and add 2 missing includes to the 3rd commit, i.e https://github.com/fanquake/bitcoin/tree/simplfied_24734. That would seem to achieve the same outcome with less code changes, and is simpler if this is meant to be backported.

Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes since my last review:

  1. Added documentation for DEBUG_LOCKCONTENTION in developer notes.
  2. Added missing include headers.
  3. Moved StartExtraBlockRelayPeers from net.h to the net.cpp

I like the added developer documentation for its clear and concise writing. It explains the purpose of the DEBUG_LOCKCONTENTION macro and describes how to use it effectively.

However, I am not able to reason with the following added changes:

  • Keeping all the included headers into a separate commit.
    • Though it makes reviewing easier. I don’t feel it needs to be in a separate commit.
  • Moving StartExtraBlockRelayPeers from net.h to the net.cpp
    • Though it is mentioned in this comment to refer to the first commit message, I can still not completely understand the reasoning. Hence, for now, I am concept 0 on this particular change.

I took a look into #24770, which has all the changes I agree with from this PR, and hence I think we can focus on getting that merged first.

@jonatack
Copy link
Member Author

jonatack commented Apr 5, 2022

@shaavan thanks for having a look.

* **Keeping all the included headers into a separate commit.**
  
  * Though it makes reviewing easier. I don’t feel it needs to be in a separate commit.

The missing headers are an unrelated issue, apart from needing to be fixed for the object of this pull. Missing headers are generally done in a separate commit, see https://github.com/bitcoin/bitcoin/pull/24740/commits for a recent example.

* **Moving `StartExtraBlockRelayPeers` from net.h to the net.cpp**
  
  * Though it is mentioned in [this](https://github.com/bitcoin/bitcoin/pull/24734#discussion_r842591735) comment to refer to the first commit message, I can still not completely understand the reasoning. Hence, for now, I am concept 0 on this particular change.

Moving the function definition into the implementation (cpp) allows dropping an include header. Doing so can speed up compilation for the many files that include net.h. It's not strictly needed for backport but is trivial to review.

@ajtowns
Copy link
Contributor

ajtowns commented Apr 6, 2022

This should be closed or rebased on top of #24770 if that's what you're now intending to get merged...

@jonatack
Copy link
Member Author

jonatack commented Apr 6, 2022

Yes, this is my preferred version and plan to update or close whichever ones aren't merged.

Copy link
Contributor

@shaavan shaavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 7ea951d

Missing headers are generally done in a separate commit.

Though I am not 100% convinced, I cannot think of any blocking points for this change. Hence I am Concept 0 on this and will allow fellow reviewers to decide the optimal way forward on this particular change.

Moving the function definition into the implementation (cpp) allows dropping an include header. Doing so can speed up compilation for the many files that include the net.h.

That is acceptable reasoning, and I find it to be logically sound. Hence, I agree with this change.

fanquake added a commit to bitcoin-core/gui that referenced this pull request Apr 8, 2022
…TION preprocessor directive

4394733 Add DEBUG_LOCKCONTENTION documentation to the developer notes (Jon Atack)
39a34b6 Put lock logging behind DEBUG_LOCKCONTENTION preprocessor directive (Jon Atack)

Pull request description:

  This is a more minimal, no-frills version of #24734 for backport. The other fixes and improvements in that pull can be done after.

  *Copy of the PR 24734 description:*

  PRs #22736, #22904 and #23223 changed lock contention logging from a `DEBUG_LOCKCONTENTION` compile-time preprocessor directive to a runtime `lock` log category and improved the logging output. This changed the locking from using `lock()` to `try_lock()`:

  - `void Mutex::UniqueLock::lock()` acquires the mutex and blocks until it gains access to it

  - `bool Mutex::UniqueLock::try_lock()` doesn't block but instead immediately returns whether it acquired the mutex; it may be used by `lock()` internally as part of the deadlock-avoidance algorithm

  In theory the cost of `try_lock` might be essentially the [same](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-697) relative to `lock`. The test-and-set logic of these calls is purported to be ~ constant time, optimised and light/quick if used carefully (i.e. no mutex convoying), compared to system calls, memory/cache coherency and fences, wait queues, and (particularly) lock contentions. See the discussion around bitcoin/bitcoin#22736 (comment) and after with respect to performance/cost aspects.  However, there are reasonable concerns (see [here](bitcoin/bitcoin#22736 (comment)) and [here](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-620)) that `Base::try_lock()` may be potentially [costly](https://www.erisian.com.au/bitcoin-core-dev/log-2022-03-31.html#l-700) or [risky](bitcoin/bitcoin#22904 (comment)) compared to `Base::lock()` in this very frequently called code.

  One alternative to keep the run-time lock logging would be to gate the `try_lock` call behind the logging conditional, for example as proposed in bitcoin/bitcoin@ccd73de and ACKed [here](bitcoin/bitcoin#22736 (comment)). However, this would add the [cost](bitcoin/bitcoin#22736 (comment)) of `if (LogAcceptCategory(BCLog::LOCK))` to the hotspot, instead of replacing `lock` with `try_lock`, for the most frequent happy path (non-contention).

  It turns out we can keep the advantages of the runtime lock contention logging (the ability to turn it on/off at runtime) while out of prudence putting the `try_lock()` call and `lock` logging category behind a  `DEBUG_LOCKCONTENTION` compile-time preprocessor directive, and also still retain the lock logging enhancements of the mentioned PRs, as suggested in bitcoin/bitcoin#24734 (comment) by W. J. van der Laan, in bitcoin/bitcoin#22736 (comment), and in the linked IRC discussion.

  Proposed here and for backport to v23.

ACKs for top commit:
  laanwj:
    Code review ACK 4394733

Tree-SHA512: 89b1271cae1dca0eb251914b1a60fc5b68320aab4a3939c57eec3a33a3c8f01688f05d95dfc31f91d71a6ed80cfe2d67b77ff14742611cc206175e47b2e5d3b1
@maflcko
Copy link
Member

maflcko commented Apr 9, 2022

Closing as this has been merged

@maflcko maflcko closed this Apr 9, 2022
@jonatack
Copy link
Member Author

Thank you for revewing, @shaavan. I moved the remaining changes here to #24757.

@bitcoin bitcoin locked and limited conversation to collaborators Apr 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants