Skip to content

Conversation

@furszy
Copy link

@furszy furszy commented Sep 1, 2020

-- Only running on regtest --

Covering the following points:

1) A new synchronization structure for the tier two network.

It has no redundant checks nor redundant requests (there by less network bloat, faster synchronization) and the code is by far more readable and maintainable than the previous version.

2) Tier two network regtest support.

Adding Masternodes and Governance capabilities to the regtest network.

3) Sync and masternode activation test.

Functional test validating the correct operation of the tier two network sync and the masternode broadcasting & activation process.

TODO:

  • Clean peer state on peer disconnection (connect signal)
  • Spam filter + spam functional test.
  • Not responding peers timeout test + spam functional test.
  • Sporks broadcasting end message (currently it's using the SPORK_INVALID as end message).
  • Data verification from several peers (currently running, only need to implement a sync threshold to move to the next level) + spam functional test.
  • Further decoupling from the old structure.
  • Previous version cleanup.

Have pushed it under WIP to be able to test other great PRs like #1826.

@furszy furszy self-assigned this Sep 1, 2020
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from 8e6b52a to c61c0bb Compare September 2, 2020 21:59
@furszy
Copy link
Author

furszy commented Sep 4, 2020

Added a commit expanding the test coverage for the following scenarios: masternode expiration, re initialization after expiration, removal, re initialization after removal and collateral spent.

This will give us higher security for #1826 and #1835 testing on the tier two modifications (hopefully speeding up the review as well).

Time wise: the test will take some more time due the different protocol rules/restrictions to configure the masternode, pass from one state to another, setup the node, shut it down etc. As manual testing sucks, and consumes the entire day (when you have to setup a whole network and enable a group of masternodes all by yourself..), don't care on how much time this functional test is taking, it's hours and hours that we going to win :) .

@furszy furszy force-pushed the 2020_masternodesSync_regtest branch 2 times, most recently from c2af741 to 118e9bd Compare September 5, 2020 07:48
@furszy
Copy link
Author

furszy commented Sep 5, 2020

long night :) , added another commit increasing further the test coverage.

First budget/governance area functional test, covering the following points:

  1. Group of masternodes setup/creation.
  2. Proposal creation.
  3. Vote creation.
  4. Proposal and vote broadcast.
  5. Proposal and vote sync.

Pending for future work (if anyone want, feel more than welcome to tackle any of the following points or code new ones on top of this PR):

Add test coverage for budget finalization, proposals being funded or not, proposals removal, budget projection, change vote.

Future future plus: it would be great if we can prepare a framework setup for the governance tests. Start with N amount of masternodes instead of have to set & initialize them in each test, maybe with proposals in the network too etc.

random-zebra added a commit that referenced this pull request Sep 7, 2020
30d5c66 net: correct addrman logging (Fuzzbawls)
8a2b7fe Don't send layer2 messages to peers that haven't completed the handshake (Fuzzbawls)
dc10100 [bugfix] Making tier two thread interruptable. (furszy)
2ae76aa Move CNode::addrLocal access behind locked accessors (Fuzzbawls)
470482f Move CNode::addrName accesses behind locked accessors (Fuzzbawls)
35365e1 Move [clean|str]SubVer writes/copyStats into a lock (Fuzzbawls)
d816a86 Make nServices atomic (Matt Corallo)
8a66add Make nStartingHeight atomic (Matt Corallo)
567c9b5 Avoid copying CNodeStats to make helgrind OK with buggy std::string (Matt Corallo)
aea5211 Make nTimeConnected const in CNode (Matt Corallo)
cf46680 net: fix a few races. (Fuzzbawls)
c916fcf net: add a lock around hSocket (Cory Fields)
cc8a93c net: rearrange so that socket accesses can be grouped together (Cory Fields)
6f731dc Do not add to vNodes until fOneShot/fFeeler/fAddNode have been set (Matt Corallo)
07c8d33 Ensure cs_vNodes is held when using the return value from FindNode (Matt Corallo)
110a44b Delete some unused (and broken) functions in CConnman (Matt Corallo)
08a12e0 net: log an error rather than asserting if send version is misused (Cory Fields)
cd8b82c net: Disallow sending messages until the version handshake is complete (Fuzzbawls)
54b454b net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields)
2be6877 net: deserialize the entire version message locally (Fuzzbawls)
444f599 Dont deserialize nVersion into CNode (Fuzzbawls)
f30f10e net: remove cs_vRecvMsg (Fuzzbawls)
5812f9e net: add a flag to indicate when a node's send buffer is full (Fuzzbawls)
5ec4db2 net: Hardcode protocol sizes and use fixed-size types (Wladimir J. van der Laan)
de87ea6 net: Consistent checksum handling (Wladimir J. van der Laan)
d4bcd25 net: push only raw data into CConnman (Cory Fields)
b79e416 net: add CVectorWriter and CNetMsgMaker (Cory Fields)
63c51d3 net: No need to check individually for disconnection anymore (Cory Fields)
07d8c7b net: don't send any messages before handshake or after fdisconnect (Cory Fields)
9adfc7f net: Set feelers to disconnect at the end of the version message (Cory Fields)
f88c06c net: handle version push in InitializeNode (Cory Fields)
04d39c8 net: construct CNodeStates in place (Cory Fields)
40a6c5d net: remove now-unused ssSend and Fuzz (Cory Fields)
681c62d drop the optimistic write counter hack (Cory Fields)
9f939f3 net: switch all callers to connman for pushing messages (Cory Fields)
8f9011d connman is in charge of pushing messages (Cory Fields)
f558bb7 serialization: teach serializers variadics (Cory Fields)
01ea667 net: Use deterministic randomness for CNode's nonce, and make it const (Cory Fields)
de1ad13 net: constify a few CNode vars to indicate that they're threadsafe (Cory Fields)
34050a3 Move static global randomizer seeds into CConnman (Pieter Wuille)
1ce349f net: add a flag to indicate when a node's process queue is full (Fuzzbawls)
5581b47 net: add a new message queue for the message processor (Fuzzbawls)
701b578 net: rework the way that the messagehandler sleeps (Fuzzbawls)
7e55dbf net: Add a simple function for waking the message handler (Cory Fields)
47ea844 net: record bytes written before notifying the message processor (Cory Fields)
ffd4859 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
8cee696 net: log bytes recv/sent per command (Fuzzbawls)
754400e net: set message deserialization version when it's time to deserialize (Fuzzbawls)
d2b8e0a net: make CMessageHeader a dumb storage class (Fuzzbawls)
cc24eff net: remove redundant max sendbuffer size check (Fuzzbawls)
32ab0c0 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
6e3f71b net: only disconnect if fDisconnect has been set (Cory Fields)
1b0beb6 net: make GetReceiveFloodSize public (Cory Fields)
229697a net: make vRecvMsg a list so that we can use splice() (Fuzzbawls)
d2d71ba net: fix typo causing the wrong receive buffer size (Cory Fields)
50bb09d Add test-before-evict discipline to addrman (Ethan Heilman)

Pull request description:

  This is a combination of multiple upstream PRs focused on optimizing the P2P networking flow after the introduction of CConnman encapsulation, and a few older PRs that were previously missed to support the later optimizations. The PRs are as follows:

  - bitcoin#9037 - net: Add test-before-evict discipline to addrman
  - bitcoin#5151 - make CMessageHeader a dumb storage class
  - bitcoin#6589 - log bytes recv/sent per command
  - bitcoin#8688 - Move static global randomizer seeds into CConnman
  - bitcoin#9050 - net: make a few values immutable, and use deterministic randomness for the localnonce
  - bitcoin#8708 - net: have CConnman handle message sending
  - bitcoin#9128 - net: Decouple CConnman and message serialization
  - bitcoin#8822 - net: Consistent checksum handling
  - bitcoin#9441 - Net: Massive speedup. Net locks overhaul
  - bitcoin#9609 - net: fix remaining net assertions
  - bitcoin#9626 - Clean up a few CConnman cs_vNodes/CNode things
  - bitcoin#9698 - net: fix socket close race
  - bitcoin#9708 - Clean up all known races/platform-specific UB at the time PR was opened
    - Excluded bitcoin/bitcoin@512731b and bitcoin/bitcoin@d8f2b8a, to be done in a separate PR

ACKs for top commit:
  furszy:
    code ACK 30d5c66 , testnet sync from scratch went well and tested with #1829 on top as well and all good.
  furszy:
     mainnet sync went fine, ACK 30d5c66 .
  random-zebra:
    ACK 30d5c66 and merging...

Tree-SHA512: 09689554f53115a45f810b47ff75d887fa9097ea05992a638dbb6055262aeecd82d6ce5aaa2284003399d839b6f2c36f897413da96cfa2cd3b858387c3f752c1
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from 118e9bd to e0d9ad6 Compare September 8, 2020 15:24
@furszy
Copy link
Author

furszy commented Sep 8, 2020

rebased on master

@random-zebra random-zebra added this to the 5.0.0 milestone Sep 22, 2020
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from e0d9ad6 to bd6b917 Compare September 24, 2020 23:34
@furszy furszy changed the title [WIP] Tier two network sync new architecture, regtest support + MN activation functional test. Tier two network sync new architecture, regtest support + MN activation functional test. Sep 25, 2020
Copy link

@random-zebra random-zebra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK, looking pretty good.
Aside from a couple notes left inline, have two minor talking points:

  • I'd rather remove commits 594fa7a924d028444687c789083c9e997385bdfd and 6f164f6775588a474225c0a50798694ae694f674 (and most sleep in the test).
    Time constants can remain the same as mainnet (if you use mocktime).
  • LogPrintf in ProcessSporkMsg could be very spammy. Maybe you could move to LogPrint with BCLog::NET or some other category.

@furszy
Copy link
Author

furszy commented Sep 30, 2020

I'd rather remove commits 594fa7a and 6f164f6 (and most sleep in the test).
Time constants can remain the same as mainnet (if you use mocktime).

i played with the mocktime implementation for a while and agree that would be cleaner with it but.. there still a missing piece there. MNs weren't getting enabled.

Better to merge this initial step and continue moving forward on top. So neither you or me are depending on each other's work and can continue moving on.

And btw, i actually think that 594fa7a and 6f164f6 are good outside of the mock time changes. I'm more on the side of keep them with a slightly modification in the upcoming work:
agree on removing the mainnet/regtest times division with the mock time changes (this will mean only touch masternode.cpp), keeping the static functions call migration.
So in a future, we can implement, a little bit easier, a TierTwoConsensusParams similar concept and not have all consensus constant definitions all around the tier two sources.

LogPrintf in ProcessSporkMsg could be very spammy. Maybe you could move to LogPrint with BCLog::NET or some other category.

Good catch 👍 , moving them.

@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from bd6b917 to 94b9619 Compare September 30, 2020 20:15
@furszy
Copy link
Author

furszy commented Sep 30, 2020

Updated per feedback.

furszy added a commit that referenced this pull request Oct 1, 2020
…t to obtain the collateral.

7f86031 wallet: GetMasternodeVinAndKeys was not locking cs_wallet to obtain the collateral tx. (furszy)

Pull request description:

  Fixing `CWallet::GetMasternodeVinAndKeys` accessing `mapWallet` without locking `cs_wallet`.
  Decoupled from #1829 .

ACKs for top commit:
  random-zebra:
    utACK 7f86031
  Fuzzbawls:
    utACK 7f86031

Tree-SHA512: 34995e3d65d6506d4a2465dd41b6d710d8a3e34f613a88691b6aa53018afa48a000faa444d03bbca0f418126fd5220a416d85a5a81d2053fd1412159b8cfb3c9
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from 94b9619 to b68ec43 Compare October 5, 2020 17:08
@furszy
Copy link
Author

furszy commented Oct 5, 2020

Rebased on master with some changes:

  1. replaced the ping update hack for [Masternodes] dead end over the activation process. #1886 correction.
  2. migrated masternodes_governance_basic.py to use mock time.
  3. renamed functional tests to be using the tiertwo_xxx prefix.

For me, it's good to go in this way (merging 1886 first). It's already fulfilling a lot of needs.

random-zebra added a commit that referenced this pull request Oct 8, 2020
2e3e313 Solving a dead end inconsistency over the masternode activation process. (furszy)

Pull request description:

  Solving a dead end inconsistency over the masternode activation process.

  Context:

  A valid masternode is created, mn start message is broadcasted and the entire network receives it. Peers moves the masternode status to pre_enable (active) and are waiting for a mn ping update to fulfill the min ping time requirement to move the status to enable.

  Issue:

  As the mnping update message work flow is not performing any action until the masternode is in an enable status (masternode.cpp, line 707), the masternode ping is never updated, there by the masternode will never be moved to enable status.

  Why this is working now:

  It's working because (1) the masternodes list is being requested, cleaned and re requested an insane amount of times, (2) some peers are broadcasting the start message twice and (3) peers are broadcasting a start message with an inner ping time in the future, fulfilling the min ping requirements.

  Final thoughts:

  This is most likely one of the reason of the different masternodes list views across peers in the network (over the recently active/enable MNs) and why the activation process takes in some occasions few days (the mn list needs to be refreshed)

  Extra data:

  1) The masternodes list request + re-request network bloat, as is mentioned in point 1 of the "Why is working now" section, will not happen anymore in the future moving forward with #1829 new tier two syncing process subsequent steps, the deterministic masternodes implementation and all of the PRs refactoring & cleaning the tier two sources (in other words, after the whole tier two sources revamp..).

  2) This PR makes #1829 4963953 hack unneeded.

ACKs for top commit:
  random-zebra:
    tested ACK 2e3e313
  Fuzzbawls:
    utACK 2e3e313

Tree-SHA512: bebd8fe67905b848acf66b72b7462b2fef38825214a4b7209a5f77745f3eeb7086012abe1248e1789a510d74325bfc577810198f14b10658b1eea577a920fa7d
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch 2 times, most recently from cb62628 to 4d9aa94 Compare October 8, 2020 15:51
@furszy furszy force-pushed the 2020_masternodesSync_regtest branch from 4d9aa94 to 9a1e5e2 Compare October 8, 2020 16:20
@furszy
Copy link
Author

furszy commented Oct 8, 2020

Rebased on top of master after #1886 merge.

Plus added two more commits to verify the budget finalization sync process, fulfilling the "governance_sync_basic" test purpose of validating the correct functioning of the tier two messages sync process.
With this changes, we are able to merge #1845 with some minimum checks for the synchronization process.

Ready for review and merge. Will not add any other change here.

Copy link

@random-zebra random-zebra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome first step 🥃 very much needed.
Left few more comments (mostly around the use of mocktime which seems a bit funky), but they can be addressed in a later PR, so ACK 9a1e5e2

for i in range(0, 2):
self.generate(1)
time.sleep(5)
set_node_times(self.nodes, self.mocktime + 15)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not increase the mocktime in successive calls.
Should do something like

self.mocktime += 15
set_node_times(self.nodes, self.mocktime)

and in the same fashion in the other places where you update the time, otherwise you are always adding seconds to the same, starting, mocktime value (e.g. see how generate_pos/generate_pow return the updated time to the caller).

Comment on lines 178 to 187
for i in range(0, len(self.nodes)):
node = self.nodes[i]
votesInfo = node.getbudgetvotes(proposalName)
assert(len(votesInfo) > 0)
voteInfo = votesInfo[0]
assert_equal(voteInfo["mnId"], mnCollateralHash)
assert_equal(voteInfo["Vote"], voteType)
found = False
for voteInfo in votesInfo:
if (voteInfo["mnId"] == mnCollateralHash) :
assert_equal(voteInfo["Vote"], voteType)
found = True
assert_true(found, "Error checking vote existence in node " + str(i))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could rewrite all this function in a single pythonic one-liner as:

assert_equal([x["Vote"] for node in self.nodes for x in node.getbudgetvotes(proposalName)
                      if x["mnId"] == mnCollateralHash], [voteType] * len(self.nodes))

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice if we could print in this magic line which node wasn't able to sync as well.

Comment on lines 38 to 45
listMNs = node.listmasternodes()
assert(len(listMNs) > 0)
found = False
for mnData in listMNs:
if (mnData["txhash"] == mnTxHash):
assert_equal(mnData["status"], status)
found = True
assert(found)
Copy link

@random-zebra random-zebra Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could rewrite all this function in a single pythonic one-liner as:

assert_equal([x["status"] for x in node.listmasternodes() if x["txhash"] == mnTxHash], [status])

@furszy furszy requested a review from Fuzzbawls October 11, 2020 23:38
Copy link
Collaborator

@Fuzzbawls Fuzzbawls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9a1e5e2

@furszy furszy merged commit 0adce3c into PIVX-Project:master Oct 14, 2020
random-zebra added a commit that referenced this pull request Nov 6, 2020
…ral fixes

2b5fb9b [BUG] Re adding missing CInv elements. "ix" and "txlvote" (furszy)
fcb9207 OperationResult class decoupled to its own file, so it can be used in several places. (furszy)
4983157 Masternode: Fixing bug on the disconnection and reconnection flow, happens when the controller needs to broadcast a new start message, the remote masternode was rejecting it because thinks that already has it enabled (when the MN is in un-capable state and it was kicked from the network list), loosing the reactivation opportunity and ending in a forever non-capable state. (furszy)
af01054 Masternode: locking internal cs for lastPing set. (furszy)
b82c0e6 masternodeman:CheckAndRemove moving to the next iterator with the map.erase return object. (furszy)
60a6f91 tiertwo_masternode_activation: fixing vin_spent check, collateral spent tx broadcast wasn't included un the block (furszy)
4ee8058 tier two sync shouldn't start until the chain is synced. (furszy)
bfdaf6d masternode ping checking for IBD state (furszy)
8656b2a Add IBD status to getblockchaininfo RPC call (furszy)
5a1e3af masternodeman::ProcessGetMNList vMasternodes for loop wasn't being guarded. (furszy)
8a7360f Cached the best block time in UpdateTip (same as we are caching best block hash) to be used on CMasternodeSync::IsBlockchainSynced() and not lock cs_main there. (furszy)
2dcb88c Strip out Misbehaving call from ProcessSporkMsg and move tier two sync process to the next step if the sporks are received before the tierTwoMan requested them (messageHandler requested them) (furszy)
572c871 tier two functional test: wait_until_mnsync_finished advancing the clock on every fail sync attempt. (furszy)
21826ea tiertwo sync on regtest, updating blockchain synced on every loop as in mainnet/testnet. Plus, not updating asset when the received asset isn't what we are looking for. (furszy)
48b4514 [Tests] Tier2 tests: minor fixes (random-zebra)
4f7bdf7 [Refactor] OperationResult: make optional error private (random-zebra)
835088a [Tests] TierTwo: send pings during wait_until (random-zebra)
5d61b58 [Tests][CI] Raise travis build timeout for tiertwo job (random-zebra)
cbd9c67 [Tests] Don't create PoW/PoS cache when running --tiertwo tests (random-zebra)
19f0797 [Tests] TierTwo: add masternode ping (no mocktime) functional test (random-zebra)
0ce9431 [Tests] Refactor tier2_masternode_activation (random-zebra)
2ac0e43 [Tests] Abstract tier2 init in PivxTier2TestFramework (random-zebra)
ec936f5 [Tests] tiertwo_governance_sync_basic fixes and improvements (random-zebra)
4d984ab test_runner: add --tiertwo option to run only the tier two functional test + travis job only for tier two tests created. (furszy)
4929919 Added budget finalization payment test case in tiertwo_governance_sync_basic.py. (furszy)
41f2639 mnfinalbudgetsuggest returning the budget hash if it was broadcasted successfully. (furszy)
6a5b2d1 initmasternode RPC command documentation. (furszy)
5f5b190 initMasternode returning the string error to the caller instead of only throwing a runtime_error. (furszy)
9159b67 Initializing MN port to default port in case of not finding it in the conf. (furszy)
7a49627 MasternodesConfig entries access guarded properly (furszy)
1259c36 tiertwo_governance_sync functional test, setup masternodes without shutting down the nodes. (furszy)
4755a8d introduce "reload_conf" argument in startmasternode "alias" RPC command. (furszy)
fbf7efb masternodeconfig files code styling cleanup. (furszy)
8a07010 Abstract out masternode initialization process. (furszy)

Pull request description:

  Follow up to #1829, focused on decrease the testing time and continue improving the tier two functional tests.

  Introduced the following changes:

  1) Implemented a new RPC command `initmasternode` to initialize the masternode at runtime, with direct impact in the functional test time. The tests don't need to shut the nodes down and restart them anymore in order to setup the masternode/s.

  2) Adapted `tiertwo_governance_sync_basic.py` to the new runtime initialization feature.

  3) Further tier two cleanup. `Masternodeconfig` code refactored properly.

  It's still in WIP (because it need some more work) but the results are very favorable for now:

  Test previous time:
  - 9:43 min

  Test time with this changes:
  - 3:21 min

ACKs for top commit:
  random-zebra:
    ACK 2b5fb9b

Tree-SHA512: 3f70e0ce42a8779ce7e67713cae59c50b28bc197575edb1e75105f8d76bc320007c6b60e171cc1f5ae0a8242a734f71397ec2afa141a3f737677f5a20cecb839
@furszy furszy deleted the 2020_masternodesSync_regtest branch November 29, 2022 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants