perf: using asynchronous worker to validate BLS signatures in quorum commitments #6692
Walkthrough

Asynchronous parallel BLS signature verification was added for LLMQ final commitments. A llmq::utils::BlsCheck type and its operator() implementation were introduced, plus a CCheckQueue-based m_bls_queue on CQuorumBlockProcessor. CFinalCommitment gained VerifySignatureAsync, and Verify delegates signature checks to it; ProcessBlock enqueues async BLS checks via the BLS queue and waits for results. ProcessCommitment no longer verifies signatures and its signature was changed accordingly. Initialization now accepts -parbls to configure worker threads; CQuorumBlockProcessor starts/stops BLS worker threads. New constants DEFAULT_BLSCHECK_THREADS and MAX_BLSCHECK_THREADS were added.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/llmq/blockprocessor.cpp (1)

208-220: Consider adding error logging when signature verification fails.

The asynchronous signature verification implementation is correct and follows proper patterns. However, when queue_control.Wait() returns false, it would be helpful to log which commitment(s) failed verification for debugging purposes. Consider adding a log message before returning false:

```diff
 if (!queue_control.Wait()) {
     // at least one check failed
+    LogPrintf("[ProcessBlock] BLS signature verification failed for block at height %d\n", pindex->nHeight);
     return false;
 }
```

src/llmq/commitment.cpp (1)

31-95: Consider refactoring for improved modularity.

The implementation correctly handles both synchronous and asynchronous BLS signature verification. However, the method is quite long (64 lines) and handles multiple responsibilities. Consider extracting helper methods to improve readability and maintainability:

```cpp
bool CFinalCommitment::VerifySignatureAsync(CDeterministicMNManager& dmnman, CQuorumSnapshotManager& qsnapman,
                                            gsl::not_null<const CBlockIndex*> pQuorumBaseBlockIndex,
                                            CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    auto members = utils::GetAllQuorumMembers(llmqType, dmnman, qsnapman, pQuorumBaseBlockIndex);
    const auto& llmq_params_opt = Params().GetLLMQ(llmqType);
    if (!llmq_params_opt.has_value()) {
        LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] invalid llmqType=%d\n", quorumHash.ToString(), ToUnderlying(llmqType));
        return false;
    }
    const auto& llmq_params = llmq_params_opt.value();

    uint256 commitmentHash = BuildCommitmentHash(llmq_params.type, quorumHash, validMembers, quorumPublicKey, quorumVvecHash);
    LogMemberDetails(members, commitmentHash);

    if (!VerifyMemberSignatures(llmq_params, members, commitmentHash, queue_control)) {
        return false;
    }
    if (!VerifyQuorumSignature(commitmentHash, queue_control)) {
        return false;
    }
    return true;
}

private:
bool CFinalCommitment::VerifyMemberSignatures(const Consensus::LLMQParams& llmq_params,
                                              const std::vector<CDeterministicMNCPtr>& members,
                                              const uint256& commitmentHash,
                                              CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    // Implementation for member signature verification
}

bool CFinalCommitment::VerifyQuorumSignature(const uint256& commitmentHash,
                                             CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    // Implementation for quorum signature verification
}
```
🛑 Comments failed to post (1)
src/llmq/commitment.h (1)
34-37: ⚠️ Potential issue: Fix namespace formatting issue identified by pipeline.

The nested namespace declaration needs proper formatting according to clang-format requirements. Apply this diff to fix the formatting:

```diff
-namespace utils
-{
-struct BlsCheck;
-} // namespace utils
+namespace utils {
+struct BlsCheck;
+} // namespace utils
```
just to be sure that CCheckQueue works at all (and not just return true) I modified code a bit and it failed as expected:

```diff
diff --git a/src/llmq/commitment.cpp b/src/llmq/commitment.cpp
index c2cbe9b35c..b242b4c988 100644
--- a/src/llmq/commitment.cpp
+++ b/src/llmq/commitment.cpp
@@ -69,7 +69,12 @@ bool CFinalCommitment::VerifySignatureAsync(CDeterministicMNManager& dmnman, CQu
         strprintf("CFinalCommitment -- q[%s] invalid aggregated members signature", quorumHash.ToString())};
     if (queue_control) {
         std::vector<utils::BlsCheck> vChecks;
-        vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
+        static int counter{0};
+        if (++counter == 42) {
+            vChecks.emplace_back(quorumSig, memberPubKeys, commitmentHash, members_id_string);
+        } else {
+            vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
+        }
         queue_control->Add(vChecks);
     } else {
         if (!membersSig.VerifySecureAggregated(memberPubKeys, commitmentHash)) {
```

error: state is expected to not be set; same issue on
I squashed commits together because the commits changed the same piece of code several times and moved it between files. Also see #6724 and #6692 (comment) for extra tests done for this PR. @UdjinM6 please help with review, PR is ready now (assuming CI will succeed)
c9ef70a tests: add is_mature for quorum generation logs (Konstantin Akimov)
59060b5 fmt: order imports and fix gap in feature_llmq_dkgerrors.py (Konstantin Akimov)
58377f8 test: added functional tests for invalid CQuorumCommitment (Konstantin Akimov)
bb0b8b0 test: add serialization/deserialization of CFinalCommitmentPayload (Konstantin Akimov)

Pull request description:

## Issue being fixed or feature implemented
As I noticed implementing #6692, if BlsChecker works incorrectly it won't be caught by unit or functional tests. See also #6692 (comment) for how 6692 has been tested without this PR.

## What was done?
This PR introduces new functional tests to validate that `llmqType`, `membersSig`, `quorumSig` and `quorumPublicKey` are indeed validated by Dash Core as part of consensus.

## How Has This Been Tested?
See changes in `feature_llmq_dkgerrors.py`

## Breaking Changes
N/A

## Checklist:
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added or updated relevant unit/integration/functional/e2e tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone

ACKs for top commit:
  kwvg: utACK c9ef70a
  UdjinM6: utACK c9ef70a

Tree-SHA512: ad61f8c845f6681765105224b2a84e0b206791e2c9a786433b9aa91018ab44c1fa764528196fd079f42f08a55794756ba8c9249c6eb10af6fe97c33fa4757f44
This pull request has conflicts, please rebase.
✅ No Merge Conflicts Detected: This PR currently has no conflicts with other open PRs.
I see ~15% speedup for ~20k blocks on mainnet on m1pro (7 threads). Good job! 👍
light ACK 3aec7a5274721ccfea32f79bfbff32ad17464222
UdjinM6
left a comment
utACK c047c58d36eb0f66a9ac91aa52b648e523735d3d
Actionable comments posted: 2
⚠️ Outside diff range comments (1)
src/llmq/blockprocessor.cpp (1)
270-346: Fix incorrect format specifiers in LogPrint — can cause crashes.

Multiple LogPrint calls use %s to print integer counts (qc.CountSigners()), which is undefined behavior and can crash. Also, one message concatenates a value with text without a separator.
Apply the following changes:
```diff
-    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%s, validMembers=%d, quorumPublicKey=%s fJustCheck[%d] processing commitment from block.\n", __func__,
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%d, validMembers=%d, quorumPublicKey=%s fJustCheck[%d] processing commitment from block.\n", __func__,
         nHeight, ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers(), qc.quorumPublicKey.ToString(), fJustCheck);

-    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%s, validMembers=%d, quorumPublicKey=%s quorumHash is null.\n", __func__,
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%d, validMembers=%d, quorumPublicKey=%s quorumHash is null.\n", __func__,
         nHeight, ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers(), qc.quorumPublicKey.ToString());

-    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%s, validMembers=%d, quorumPublicKey=%s qc verify failed.\n", __func__,
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%d, validMembers=%d, quorumPublicKey=%s qc verify failed.\n", __func__,
         nHeight, ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers(), qc.quorumPublicKey.ToString());

-    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%s, validMembers=%dqc verifynull failed.\n", __func__,
-        nHeight, ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers());
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s height=%d, type=%d, quorumIndex=%d, quorumHash=%s, signers=%d, validMembers=%d qc verifynull failed.\n", __func__,
+        nHeight, ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers());

-    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s -- processed commitment from block. type=%d, quorumIndex=%d, quorumHash=%s, signers=%s, validMembers=%d, quorumPublicKey=%s\n", __func__,
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor::%s -- processed commitment from block. type=%d, quorumIndex=%d, quorumHash=%s, signers=%d, validMembers=%d, quorumPublicKey=%s\n", __func__,
         ToUnderlying(qc.llmqType), qc.quorumIndex, quorumHash.ToString(), qc.CountSigners(), qc.CountValidMembers(), qc.quorumPublicKey.ToString());
```
♻️ Duplicate comments (1)
src/llmq/blockprocessor.cpp (1)
225-229: Optional: add a concise log on queue failure to include block context.

While BlsCheck logs the failing commitment context, adding a single line here with the block hash can simplify triage.

```diff
-    if (!queue_control.Wait()) {
-        // at least one check failed
-        return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
-    }
+    if (!queue_control.Wait()) {
+        LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor: BLS verification failed for block %s\n", blockHash.ToString());
+        return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
+    }
```
🧹 Nitpick comments (3)
src/llmq/utils.cpp (1)
943-961: BLS check functor logic looks correct
- Correctly chooses VerifySecureAggregated for multiple pubkeys and VerifyInsecure for single pubkey.
- Returns false and logs when no pubkeys are provided, which is appropriate for a defensive check in the async worker.
Two minor polish suggestions:
- Consider making operator() a const method if BlsCheck’s members are immutable during verification.
- Optionally short-circuit on an obviously invalid signature (e.g., m_sig.IsValid()) to avoid unnecessary work.
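For illustration only, here is a minimal sketch of the functor with both nits applied. The member names (m_sig, m_pubkeys, m_msg_hash, m_id_string) follow the review discussion; the actual shape of the PR code may differ:

```cpp
// Hypothetical sketch: BlsCheck with a const operator() and an early
// short-circuit, per the two suggestions above. Not verbatim PR code.
struct BlsCheck {
    CBLSSignature m_sig;
    std::vector<CBLSPublicKey> m_pubkeys;
    uint256 m_msg_hash;
    std::string m_id_string;

    bool operator()() const // const is fine: members are immutable during verification
    {
        // Short-circuit on obviously invalid input before any expensive pairing work
        if (m_pubkeys.empty() || !m_sig.IsValid()) {
            LogPrint(BCLog::LLMQ, "%s\n", m_id_string);
            return false;
        }
        const bool ok = m_pubkeys.size() > 1
                            ? m_sig.VerifySecureAggregated(m_pubkeys, m_msg_hash)
                            : m_sig.VerifyInsecure(m_pubkeys.front(), m_msg_hash);
        if (!ok) {
            LogPrint(BCLog::LLMQ, "%s\n", m_id_string);
        }
        return ok;
    }
};
```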
src/llmq/commitment.cpp (2)
61-79: Reserve and batch Add() calls for fewer allocations and better queue efficiency
- Reserve memberPubKeys to at least CountSigners() to avoid multiple reallocations, especially for large LLMQs.
- Batch both the members and quorum checks into a single Add() call to reduce queue overhead.
```diff
-    std::vector<CBLSPublicKey> memberPubKeys;
+    std::vector<CBLSPublicKey> memberPubKeys;
+    memberPubKeys.reserve(CountSigners());
     for (const auto i : irange::range(members.size())) {
         if (!signers[i]) {
             continue;
         }
         memberPubKeys.emplace_back(members[i]->pdmnState->pubKeyOperator.Get());
     }
     std::string members_id_string{
         strprintf("CFinalCommitment -- q[%s] invalid aggregated members signature", quorumHash.ToString())};
     if (queue_control) {
-        std::vector<utils::BlsCheck> vChecks;
-        vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
-        queue_control->Add(vChecks);
+        std::vector<utils::BlsCheck> vChecks;
+        vChecks.reserve(2);
+        vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
+        // The quorumSig check is added below to the same batch
+        // (see matching change in the quorumSig branch);
+        // queue_control->Add(vChecks) will be called once after pushing both checks.
+        // NOTE: see matching change in lines 82-89
     } else {
         if (!membersSig.VerifySecureAggregated(memberPubKeys, commitmentHash)) {
             LogPrint(BCLog::LLMQ, "%s\n", members_id_string);
             return false;
         }
     }
```

And couple this with the change below (lines 82–89) to actually batch both checks.
82-95: Batch the quorumSig check together with memberSig in a single Add().

Currently, two Add() calls are performed. Combine them into one to reduce queue contention.

```diff
-    std::string qsig_id_string{strprintf("CFinalCommitment -- q[%s] invalid quorum signature", quorumHash.ToString())};
-    if (queue_control) {
-        std::vector<utils::BlsCheck> vChecks;
-        std::vector<CBLSPublicKey> public_keys;
-        public_keys.push_back(quorumPublicKey);
-        vChecks.emplace_back(quorumSig, public_keys, commitmentHash, qsig_id_string);
-        queue_control->Add(vChecks);
-    } else {
+    std::string qsig_id_string{strprintf("CFinalCommitment -- q[%s] invalid quorum signature", quorumHash.ToString())};
+    if (queue_control) {
+        // If we already prepared vChecks above (multi-member path), append here. Otherwise, create it now.
+        std::vector<utils::BlsCheck> vChecks;
+        std::vector<CBLSPublicKey> public_keys;
+        public_keys.push_back(quorumPublicKey);
+        vChecks.emplace_back(quorumSig, public_keys, commitmentHash, qsig_id_string);
+        queue_control->Add(vChecks);
+    } else {
         if (!quorumSig.VerifyInsecure(quorumPublicKey, commitmentHash)) {
             LogPrint(BCLog::LLMQ, "%s\n", qsig_id_string);
             return false;
         }
     }
```

If you adopt batching in the multi-member branch (previous comment), ensure vChecks and public_keys are kept in scope appropriately to push both checks and call Add() once.
📥 Commits
Reviewing files that changed from the base of the PR and between 9a6385b1525c4352c53c59920a50a449cd988c0d and 7999733d0078cd852e31bd99fd826fcfa1adb593.
📒 Files selected for processing (8)
- src/init.cpp (1 hunks)
- src/llmq/blockprocessor.cpp (5 hunks)
- src/llmq/blockprocessor.h (5 hunks)
- src/llmq/commitment.cpp (3 hunks)
- src/llmq/commitment.h (2 hunks)
- src/llmq/options.h (1 hunks)
- src/llmq/utils.cpp (1 hunks)
- src/llmq/utils.h (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- src/llmq/options.h
- src/llmq/commitment.h
- src/init.cpp
- src/llmq/utils.h
- src/llmq/blockprocessor.h
🔇 Additional comments (3)
src/llmq/commitment.cpp (1)
177-178: Delegation to VerifySignatureAsync looks good. Synchronous path now reuses the new async-capable implementation by passing a null queue. This keeps semantics unchanged when checkSigs is true.
src/llmq/blockprocessor.cpp (2)
57-71: Thread count initialization mirrors script-check logic — LGTM
- Autodetect and “leave N cores free” behavior is correct.
- Subtracting one for the main thread and capping by MAX_BLSCHECK_THREADS matches expectations.
- Logging the final thread count is helpful.
73-73: Graceful shutdown of BLS worker threads — LGTM. Stopping worker threads in the destructor prevents leaks and dangling threads.
src/llmq/blockprocessor.cpp
Outdated
Potential null dereference: LookupBlockIndex may return nullptr but is passed to gsl::not_null
In ProcessBlock() you call VerifySignatureAsync with pQuorumBaseBlockIndex constructed from LookupBlockIndex(qc.quorumHash). If LookupBlockIndex returns nullptr, constructing gsl::not_null inside VerifySignatureAsync is UB/assert-failing. This can happen before structural verification when processing untrusted blocks.
Skip queueing the check when the index isn’t found (it will be rejected later in ProcessCommitment), or invalidate the block immediately.
```diff
 for (const auto& [_, qc] : qcs) {
     if (qc.IsNull()) continue;
     const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
-    qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
+    if (!pQuorumBaseBlockIndex) {
+        // Defer to ProcessCommitment() for full validation; avoid passing nullptr to gsl::not_null
+        continue;
+    }
+    qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
 }
```

Alternatively (stricter): return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-quorum-hash") here.
📝 Committable suggestion:

```cpp
if (fBLSChecks) {
    CCheckQueueControl<utils::BlsCheck> queue_control(&m_bls_queue);
    for (const auto& [_, qc] : qcs) {
        if (qc.IsNull()) continue;
        const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
        if (!pQuorumBaseBlockIndex) {
            // Defer to ProcessCommitment() for full validation; avoid passing nullptr to gsl::not_null
            continue;
        }
        qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
    }
```
addressed in follow-up #6789, it's copy-paste code and out-of-scope to fix
@knst, thank you for the quick response and for being proactive in creating PR #6789 to address this issue. I understand that keeping the current PR focused on the BLS signature validation performance improvements makes sense to avoid scope creep, especially when dealing with existing code patterns.
Your approach of handling this in a separate follow-up PR is consistent with good engineering practices for maintaining focused changes.
src/llmq/commitment.cpp
Outdated
🛠️ Refactor suggestion
Guard against out-of-bounds access and remove noisy log
- When llmq_params.size == 1, the code indexes members[0] without checking members is non-empty. In ProcessBlock(), VerifySignatureAsync is called before structural validation, so a malformed commitment could yield an empty members list here and cause OOB access.
- LogPrintf of the operator public key on every single-member verify is noisy and can clutter logs.
Apply this diff to make the path safe and quiet:
```diff
-    if (llmq_params.size == 1) {
-        LogPrintf("pubkey operator: %s\n", members[0]->pdmnState->pubKeyOperator.Get().ToString());
-        if (!membersSig.VerifyInsecure(members[0]->pdmnState->pubKeyOperator.Get(), commitmentHash)) {
+    if (llmq_params.size == 1) {
+        if (members.empty()) {
+            LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] missing quorum members for size=1\n", quorumHash.ToString());
+            return false;
+        }
+        if (!membersSig.VerifyInsecure(members[0]->pdmnState->pubKeyOperator.Get(), commitmentHash)) {
             LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] invalid member signature\n", quorumHash.ToString());
             return false;
         }
     } else {
```

📝 Committable suggestion:
```cpp
if (llmq_params.size == 1) {
    if (members.empty()) {
        LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] missing quorum members for size=1\n", quorumHash.ToString());
        return false;
    }
    if (!membersSig.VerifyInsecure(members[0]->pdmnState->pubKeyOperator.Get(), commitmentHash)) {
        LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] invalid member signature\n", quorumHash.ToString());
        return false;
    }
} else {
```
I can't think of a case where members.empty() would hold. Anyway, llmq_params.size == 1 is used for regtest only, so it won't cause any public-net failure.
@knst, thank you for the context! You're absolutely right that if llmq_params.size == 1 is regtest-only, the practical risk is minimal since it's a controlled testing environment.
Given your domain expertise, I trust your assessment about when members.empty() could occur. The bounds check might be defensive programming overkill for this specific case, especially in a performance-focused PR.
However, what about the LogPrintf removal? That seems like it could still create noisy logs during regtest runs, regardless of the bounds issue. Would you consider removing just that line to keep the logs cleaner?
I understand if you prefer to keep the scope focused on the main performance improvements rather than these edge cases.
🐰💭 Contemplating the balance between defensive programming and practical risk assessment...
Force-pushed to resolve conflict with
UdjinM6
left a comment
utACK 7999733d0078cd852e31bd99fd826fcfa1adb593
This pull request has conflicts, please rebase.
It introduces a new command-line argument -parbls to set the number of parallel threads for BLS validation. The new parallel BlsChecker asynchronously validates quorumSig and membersSig in quorum commitments.
Co-authored-by: UdjinM6 <[email protected]>
Blocks can currently have 0, 2 or 32 commitments; further benchmarking is needed to find the balance point, but it is likely somewhere between 32 and 64, because each quorum commitment has 2 BLS signatures.
Force-pushed 7999733 to 7733740.
Rebased due to conflict after merging #6707
UdjinM6
left a comment
utACK 7733740
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/llmq/blockprocessor.cpp (1)
217-229: Guard against null pQuorumBaseBlockIndex when queueing async checks.

LookupBlockIndex can return nullptr for untrusted blocks. Passing nullptr into downstream code that constructs gsl::not_null (as noted previously) is UB. While you have a follow-up PR (#6789) for a similar spot, this location should be guarded too to avoid queuing invalid work. Applying a minimal guard (skip queueing the check; it will be rejected later) would be safe:

```diff
-    for (const auto& [_, qc] : qcs) {
-        if (qc.IsNull()) continue;
-        const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
-        qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
-    }
+    for (const auto& [_, qc] : qcs) {
+        if (qc.IsNull()) continue;
+        const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
+        if (!pQuorumBaseBlockIndex) {
+            // Defer validation to ProcessCommitment; avoid passing nullptr to code expecting not_null
+            continue;
+        }
+        qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
+    }
```
🧹 Nitpick comments (4)
src/llmq/blockprocessor.h (1)
48-49: Hard-coded CCheckQueue batch size (4) — consider making it configurable or aligning with script-check queue.
m_bls_queue{4} sets the queue batch size to 4. That value controls how many items a worker processes per batch and can materially affect throughput/latency. Consider:
- Using a named constant (e.g., DEFAULT_BLSCHECK_BATCH) for discoverability.
- Optionally exposing it (hidden) behind an arg or deriving it from hardware like we do for thread count.
Not a blocker; just a maintainability/perf knob to keep in mind.
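As a hedged sketch of the first bullet, this is what naming the knob could look like; the constant name and placement are placeholders, not code from the PR:

```cpp
// Hypothetical: replace the bare literal with a discoverable constant.
// The value 4 matches the current m_bls_queue{4} initializer.
static constexpr unsigned int DEFAULT_BLSCHECK_BATCH{4};

// In the CQuorumBlockProcessor member declaration:
// CCheckQueue<utils::BlsCheck> m_bls_queue{DEFAULT_BLSCHECK_BATCH};
```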
src/llmq/blockprocessor.cpp (2)
15-15: Redundant include (minor nit).
#include <checkqueue.h> is already present in the header; including it here too is harmless but redundant. You can keep it if it helps include hygiene, otherwise consider removing it to reduce compile-time dependencies.
217-229: Consider releasing cs_main before waiting on BLS worker threads (optional).

We queue checks under cs_main and then Wait() while still holding it. If VerifySignatureAsync fully materializes inputs (msg hash, pubkeys, sigs) before queuing, you could unlock cs_main prior to Wait() to reduce contention on the global chainstate lock during CPU-bound BLS verification. This would require a careful audit of any shared structures referenced by worker code (but BlsCheck itself only uses captured values). Not required for this perf PR; just a potential follow-up to improve parallelism.
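A rough sketch of the shape this could take, assuming every queued BlsCheck owns copies of its inputs; EnqueueAllBlsChecks is a hypothetical helper, not a function in the PR:

```cpp
// Hypothetical restructuring: queue checks under cs_main, then drop the
// lock for the CPU-bound wait. Only safe if no queued check references
// chainstate (BlsCheck holds captured values only).
{
    LOCK(cs_main);
    EnqueueAllBlsChecks(qcs, queue_control); // hypothetical helper
}
// cs_main released here; workers and this thread verify in parallel
if (!queue_control.Wait()) {
    return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
}
```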
src/llmq/utils.cpp (1)
943-962: BlsCheck functor implementation — LGTM, with minor nits optional.
- Correctly distinguishes aggregated vs single-key verify paths and logs a clear id string on failure.
- Early-return on empty pubkey vector is defensive and appropriate.
Optional nits:
- You could early-check m_pubkeys.empty() to reduce nesting, but current readability is fine.
- If you ever need richer diagnostics, consider including the size of m_pubkeys or a short hash of m_msg_hash in the log. Not necessary now since callers already prepare an informative m_id_string.
📥 Commits
Reviewing files that changed from the base of the PR and between 7999733d0078cd852e31bd99fd826fcfa1adb593 and 7733740.
📒 Files selected for processing (8)
- src/init.cpp (1 hunks)
- src/llmq/blockprocessor.cpp (5 hunks)
- src/llmq/blockprocessor.h (5 hunks)
- src/llmq/commitment.cpp (3 hunks)
- src/llmq/commitment.h (2 hunks)
- src/llmq/options.h (1 hunks)
- src/llmq/utils.cpp (1 hunks)
- src/llmq/utils.h (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- src/llmq/utils.h
- src/init.cpp
- src/llmq/commitment.h
- src/llmq/commitment.cpp
- src/llmq/options.h
🔇 Additional comments (6)
src/llmq/blockprocessor.h (2)
59-59: Destructor addition to stop worker threads — LGTM. Clean RAII teardown for the BLS queue.
85-86: ProcessCommitment signature change (removing fBLSChecks) — LGTM. Consistent with moving signature verification out of ProcessCommitment into the async phase.
src/llmq/blockprocessor.cpp (4)
57-71: Thread count autodetection and clamping mirror script-check semantics — LGTM.
- Supports -parbls=0 (autodetect), negative values (“leave n cores free”), subtracts 1 for the main thread, and clamps to MAX_BLSCHECK_THREADS.
- Clear log of the effective thread count and proper startup.
Matches prior feedback about mirroring the script-check path.
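For reference, a sketch of the clamping logic being described here, modeled on the -par script-check handling; exact variable names in the PR may differ:

```cpp
// Hypothetical sketch of the -parbls resolution described above.
int bls_threads = static_cast<int>(gArgs.GetIntArg("-parbls", DEFAULT_BLSCHECK_THREADS));
if (bls_threads <= 0) {
    // 0 means autodetect; a negative value means "leave that many cores free"
    bls_threads += GetNumCores();
}
// The calling thread counts as one verifier, so start one fewer worker,
// and never exceed the compile-time cap.
bls_threads = std::clamp(bls_threads - 1, 0, MAX_BLSCHECK_THREADS);
LogPrintf("Using %d additional threads for BLS verification\n", bls_threads);
m_bls_queue.StartWorkerThreads(bls_threads);
```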
73-74: RAII shutdown of BLS worker threads — LGTM. Ensures no worker leakage.
230-235: ProcessCommitment callsite update — LGTM. Call site aligns with the new signature (no fBLSChecks).
271-273: Review consensus safety for BLS checks in LLMQ block processing.

Confirmed call sites for CQuorumBlockProcessor::ProcessBlock:

- Only invoked in CSpecialTxProcessor::ProcessSpecialTxsInBlock (src/evo/specialtxman.cpp:571–575), passing fCheckCbTxMerkleRoots as the fBLSChecks flag.

Since there are no other callers of CQuorumBlockProcessor::ProcessBlock, ensure that:

- In all contexts where a block is validated for consensus (not just "just check"), fCheckCbTxMerkleRoots is always true.
- Any caller that passes fJustCheck=false must also always pass fCheckCbTxMerkleRoots=true, guaranteeing BLS signature verification.

Please confirm that ProcessSpecialTxsInBlock is never invoked with fCheckCbTxMerkleRoots=false during tip validation, or add an explicit check/assert to enforce fBLSChecks == true when fJustCheck == false.
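One possible way to encode that invariant, shown as a hypothetical one-liner near the top of ProcessBlock:

```cpp
// Hypothetical assertion: full (non-just-check) validation must never
// skip BLS signature verification.
assert(fBLSChecks || fJustCheck);
```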
… LookupBlockIndex

3b9e061 fix: handle possible nullptr in InitNewQuorum (Konstantin Akimov)
c7890b7 fix: avoid possible nullptr for unknown hash of qc after LookupBlockIndex (Konstantin Akimov)

Pull request description:

## Issue being fixed or feature implemented
Follow-up for #6692; discovered possible nullptr usage after looking up a non-existing block.

## What was done?
Added an extra nullptr check in 3 missing places in Dash-specific code:
- CDKGSessionHandler::HandleDKGRound
- CQuorumBlockProcessor::ProcessBlock
- CQuorumBlockProcessor::ProcessCommitment

## How Has This Been Tested?
N/A

## Breaking Changes
N/A

## Checklist:
- [x] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  UdjinM6: utACK 3b9e061

Tree-SHA512: 3e5fa925250efa0e272c25f39d833c8fd1e51f1b9a338a63e8940dadfb91c61933eb48196f588621fdda3106fb887bd311dd01c287549720adb6124503e26b79
Issue being fixed or feature implemented
During block validation, quorum commitments are processed in a single thread.
Some blocks have up to 32 commitments (blocks which contain rotation quorum commitments), and each quorum commitment has hundreds of public keys to validate and 2 signatures (the quorum signature and the members signature). This takes up to 30% of total indexing time and up to 1 second for heavy blocks.
What was done?
CCheckQueue, which is used for validating ECDSA signatures, is now also used for validating BLS signatures in quorum commitments. The quorum signature and members signatures are now validated simultaneously, which improves performance even for blocks that have only 1 commitment.
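Conceptually, the producer side follows the standard CCheckQueue pattern. A simplified sketch assembled from the snippets quoted in the review comments above (not the exact PR code):

```cpp
// Sketch: enqueue every commitment's BLS checks, then block until all
// worker threads report success or any check fails.
CCheckQueueControl<utils::BlsCheck> queue_control(&m_bls_queue);
for (const auto& [_, qc] : qcs) {
    if (qc.IsNull()) continue;
    const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
    if (!pQuorumBaseBlockIndex) continue; // rejected later in ProcessCommitment
    // Queues one BlsCheck for membersSig and one for quorumSig
    qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
}
if (!queue_control.Wait()) {
    // At least one BLS check failed on a worker thread
    return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
}
```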
How Has This Been Tested?
Invalidated + reconsidered 15k blocks (~1 month's worth).
This PR makes validation of quorum commitments 3.5x faster and overall indexing 25% faster on my 12-core environment.
PR: (benchmark screenshot)
develop: (benchmark screenshot)
Breaking Changes
N/A
Checklist: