Skip to content

Conversation

@UdjinM6
Copy link

@UdjinM6 UdjinM6 commented Dec 11, 2023

Issue being fixed or feature implemented

  1. scanQuorumsCache is a special one and we use it incorrectly.
  2. Platform doesn't really use anything that calls ScanQuorums() directly, they specify the exact quorum hash in RPCs so it's GetQuorum() that is used instead. The only place ScanQuorums() is used for Platform related stuff is StartCleanupOldQuorumDataThread() because we want to preserve quorum data used by GetQuorum(). But this can be optimised with its own (much more compact) cache.
  3. RPCs that use ScanQuorums() should in most cases be ok with smaller cache, for other use cases there is a note in help text now.

What was done?

pls see individual commits

How Has This Been Tested?

run tests, run a node (in progress looks stable)

Breaking Changes

n/a

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone

@UdjinM6 UdjinM6 added this to the 20.1 milestone Dec 11, 2023
@UdjinM6 UdjinM6 force-pushed the fix_quorum_caching_again branch from 6549784 to 27bef79 Compare December 13, 2023 17:31
@UdjinM6 UdjinM6 marked this pull request as ready for review December 14, 2023 14:06
ogabrielides
ogabrielides previously approved these changes Dec 16, 2023
Copy link

@ogabrielides ogabrielides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@PastaPastaPasta
Copy link
Member

Is this not a breaking change in the sense that the response of some RPCs may have been changed / reduced, in a way that is not fully backwards compatible? Maybe I am mis-understanding the scope of the potential RPC changes.

But I think if an RPC previously would have returned say 10 quorums but now only returns 5 that'd be a breaking change no?

@UdjinM6
Copy link
Author

UdjinM6 commented Dec 17, 2023

There should be no changes in RPC results, in 6200640 we simply warn users that fast results are only granted for active quorum sets. It's actually an improvement because keepOldKeys value means quorums, not blocks while scanQuorumsCache is caching quorum sets per block. So we were using cache for keepOldKeys of recently requested blocks only here. And even scanning back with the default count (signingActiveQuorumCount) was not using cache for the most part really (only platform quorums were using cache all the time thanks to the huge keepOldKeys).

@UdjinM6 UdjinM6 force-pushed the fix_quorum_caching_again branch from bb47181 to 947c626 Compare December 17, 2023 19:43
Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK for squash merge

@PastaPastaPasta PastaPastaPasta merged commit 6fe36cc into dashpay:develop Dec 20, 2023
PastaPastaPasta pushed a commit that referenced this pull request Dec 22, 2023
…ms (#5784)

## Issue being fixed or feature implemented
Cache population for old quorums is a cpu heavy operation and should be
avoided for inactive quorums _at least_ oin `ScanQuorums`. This issue is
critical for testnet and other small network because every mn
participate in almost every platform quorum and cache population for 2
months of quorums can easily block everything for 15+ minutes on a 4 cpu
node. On mainnet quorum distribution is much better but it's still a
small waste of cpu (or not so small for unlucky nodes).

#5761 follow-up

## What was done?
Do not start cache population for outdated quorums, improve logs in
`StartCachePopulatorThread` to make it easier to see what's going on.

## How Has This Been Tested?
run a mn on testnet

## Breaking Changes
n/a

## Checklist:
- [x] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e
tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone _(for repository
code-owners and collaborators only)_
@UdjinM6 UdjinM6 removed this from the 20.1 milestone Dec 24, 2023
@UdjinM6 UdjinM6 added this to the 20.0.3 milestone Dec 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants