-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Preserve the LockData initial state if "potential deadlock detected" exception thrown
#19340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
vasild
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if the offending entry is still in the lock stack by the time potential_deadlock_detected() is called because it prints the stack and we want to see the offender in the debug log.
Also, I think that it would be best to undo all mods before abort()/throw. We also inserted into lockdata.lockorders and in lockdata.invlockorders. Maybe something like:
diff --git i/src/sync.cpp w/src/sync.cpp
index 3098e402c..b21154d6e 100644
--- i/src/sync.cpp
+++ w/src/sync.cpp
@@ -124,17 +124,12 @@ static void potential_deadlock_detected(const LockPair& mismatch, const LockStac
}
if (i.first == mismatch.second) {
LogPrintf(" (2)"); /* Continued */
}
LogPrintf(" %s\n", i.second.ToString());
}
- if (g_debug_lockorder_abort) {
- tfm::format(std::cerr, "Assertion failed: detected inconsistent lock order at %s:%i, details in debug log.\n", __FILE__, __LINE__);
- abort();
- }
- throw std::logic_error("potential deadlock detected");
}
static void double_lock_detected(const void* mutex, LockStack& lock_stack)
{
LogPrintf("DOUBLE LOCK DETECTED\n");
LogPrintf("Lock order:\n");
@@ -186,14 +181,28 @@ static void push_lock(MutexType* c, const CLockLocation& locklocation)
if (lockdata.lockorders.count(p1))
continue;
lockdata.lockorders.emplace(p1, lock_stack);
const LockPair p2 = std::make_pair(c, i.first);
lockdata.invlockorders.insert(p2);
- if (lockdata.lockorders.count(p2))
+ if (lockdata.lockorders.count(p2)) {
potential_deadlock_detected(p1, lockdata.lockorders[p2], lockdata.lockorders[p1]);
+
+ lockdata.invlockorders.erase(p2);
+ lockdata.lockorders.erase(p1);
+ lock_stack.pop_back();
+
+ if (g_debug_lockorder_abort) {
+ tfm::format(std::cerr,
+ "Assertion failed: detected inconsistent lock order at %s:%i, "
+ "details in debug log.\n",
+ __FILE__, __LINE__);
+ abort();
+ }
+ throw std::logic_error("potential deadlock detected");
+ }
}
}
static void pop_lock()
{
LockData& lockdata = GetLockData();
src/sync.cpp
Outdated
| tfm::format(std::cerr, "Assertion failed: detected inconsistent lock order at %s:%i, details in debug log.\n", __FILE__, __LINE__); | ||
| abort(); | ||
| } | ||
| throw std::logic_error("potential deadlock detected"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to unconditionally remove the entry from the stack here, after printing the stack and before abort()/throw. This cleanup is warranted from all code paths that call EnterCritical().
| throw std::logic_error("potential deadlock detected"); | |
| // Undo the insertion from push_lock(). We will not lock the mutex. | |
| s2.pop_back(); | |
| throw std::logic_error("potential deadlock detected"); |
(this should be before the if, 4 lines above, but github does not allow me to put the suggestion there)
It prints from |
|
You are right! I did not realize that the Anyway - I think, for |
|
Updated c810cd4 -> 236e760 (pr19340.01 -> pr19340.02, diff):
I'd leave it as is since it seems unrelated to this PR goal. |
|
Reworked after the discussion with @vasild on IRC.
|
LockData initial state if "potential deadlock detected" exception thrown
vasild
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK f88222a
|
I'm not sure I fully get what's going on here, but the change looks fine to me. Can you explain how making the copy of the lockstack actually helps? It looks like a race condition to even copy it? |
|
@JeremyRubin thanks for asking! The key here is the added Let me elaborate with an example:
Notice that at the end |
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK f88222a
just some questions. Feel free to ignore.
|
|
||
| const LockPair p1 = std::make_pair(i.first, c); | ||
| const LockPair p2 = std::make_pair(c, i.first); | ||
| if (lockdata.lockorders.count(p2)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this is now being done before if (lockdata.lockorders.count(p1))continue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Updated f88222a -> cab80f4 (pr19340.03 -> pr19340.04, diff):
Rebased cab80f4 -> 63e9e40 (pr19340.04 -> pr19340.05) to fix Travis "lint" job errors. |
vasild
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 63e9e40
|
re-ACK 63e9e40 |
…ial deadlock detected" exception thrown 63e9e40 test: Add LockStackEmpty() (Hennadii Stepanov) 42b2a95 test: Repeat deadlock tests (Hennadii Stepanov) 1f96be2 Preserve initial state if push_lock() throws exception (Hennadii Stepanov) Pull request description: On master (e3fa3c7) if the `push_lock()` throws the "potential deadlock detected" exception (via the `potential_deadlock_detected()` call), the `LockData` instance internal state differs from one when the `push_lock()` was called. This non-well behaviour makes (at least) testing brittle. This PR preserves the `LockData` instance initial state if `push_lock()` throws an exception, and improves the `sync_tests` unit test. ACKs for top commit: MarcoFalke: re-ACK 63e9e40 vasild: ACK 63e9e40 Tree-SHA512: 7679182154ce5f079b44b790faf76eb5f553328dea70a326ff6b600db70e2f9ae015a33a104ca070cb660318280cb79b6b42e37ea5166f26f9e627ba721fcdec
…exception thrown Summary: > Preserve initial state if push_lock() throws exception > test: Repeat deadlock tests > test: Add LockStackEmpty() This is a backport of [[bitcoin/bitcoin#19340 | core#19340]] Test Plan: With TSAN: `ninja check check-functional` Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D10059
…l deadlock detected" exception thrown
…l deadlock detected" exception thrown
On master (e3fa3c7) if the
push_lock()throws the "potential deadlock detected" exception (via thepotential_deadlock_detected()call), theLockDatainstance internal state differs from one when thepush_lock()was called. This non-well behaviour makes (at least) testing brittle.This PR preserves the
LockDatainstance initial state ifpush_lock()throws an exception, and improves thesync_testsunit test.