-
Notifications
You must be signed in to change notification settings - Fork 725
Tier two network sync new architecture, regtest support + MN activation functional test. #1829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tier two network sync new architecture, regtest support + MN activation functional test. #1829
Conversation
8e6b52a to
c61c0bb
Compare
|
Added a commit expanding the test coverage for the following scenarios: masternode expiration, re initialization after expiration, removal, re initialization after removal and collateral spent. This will give us higher security for #1826 and #1835 testing on the tier two modifications (hopefully speeding up the review as well). Time wise: the test will take some more time due the different protocol rules/restrictions to configure the masternode, pass from one state to another, setup the node, shut it down etc. As manual testing sucks, and consumes the entire day (when you have to setup a whole network and enable a group of masternodes all by yourself..), don't care on how much time this functional test is taking, it's hours and hours that we going to win :) . |
c2af741 to
118e9bd
Compare
|
long night :) , added another commit increasing further the test coverage. First budget/governance area functional test, covering the following points:
Pending for future work (if anyone want, feel more than welcome to tackle any of the following points or code new ones on top of this PR): Add test coverage for budget finalization, proposals being funded or not, proposals removal, budget projection, change vote. Future future plus: it would be great if we can prepare a framework setup for the governance tests. Start with N amount of masternodes instead of have to set & initialize them in each test, maybe with proposals in the network too etc. |
30d5c66 net: correct addrman logging (Fuzzbawls) 8a2b7fe Don't send layer2 messages to peers that haven't completed the handshake (Fuzzbawls) dc10100 [bugfix] Making tier two thread interruptable. (furszy) 2ae76aa Move CNode::addrLocal access behind locked accessors (Fuzzbawls) 470482f Move CNode::addrName accesses behind locked accessors (Fuzzbawls) 35365e1 Move [clean|str]SubVer writes/copyStats into a lock (Fuzzbawls) d816a86 Make nServices atomic (Matt Corallo) 8a66add Make nStartingHeight atomic (Matt Corallo) 567c9b5 Avoid copying CNodeStats to make helgrind OK with buggy std::string (Matt Corallo) aea5211 Make nTimeConnected const in CNode (Matt Corallo) cf46680 net: fix a few races. (Fuzzbawls) c916fcf net: add a lock around hSocket (Cory Fields) cc8a93c net: rearrange so that socket accesses can be grouped together (Cory Fields) 6f731dc Do not add to vNodes until fOneShot/fFeeler/fAddNode have been set (Matt Corallo) 07c8d33 Ensure cs_vNodes is held when using the return value from FindNode (Matt Corallo) 110a44b Delete some unused (and broken) functions in CConnman (Matt Corallo) 08a12e0 net: log an error rather than asserting if send version is misused (Cory Fields) cd8b82c net: Disallow sending messages until the version handshake is complete (Fuzzbawls) 54b454b net: don't run callbacks on nodes that haven't completed the version handshake (Cory Fields) 2be6877 net: deserialize the entire version message locally (Fuzzbawls) 444f599 Dont deserialize nVersion into CNode (Fuzzbawls) f30f10e net: remove cs_vRecvMsg (Fuzzbawls) 5812f9e net: add a flag to indicate when a node's send buffer is full (Fuzzbawls) 5ec4db2 net: Hardcode protocol sizes and use fixed-size types (Wladimir J. van der Laan) de87ea6 net: Consistent checksum handling (Wladimir J. van der Laan) d4bcd25 net: push only raw data into CConnman (Cory Fields) b79e416 net: add CVectorWriter and CNetMsgMaker (Cory Fields) 63c51d3 net: No need to check individually for disconnection anymore (Cory Fields) 07d8c7b net: don't send any messages before handshake or after fdisconnect (Cory Fields) 9adfc7f net: Set feelers to disconnect at the end of the version message (Cory Fields) f88c06c net: handle version push in InitializeNode (Cory Fields) 04d39c8 net: construct CNodeStates in place (Cory Fields) 40a6c5d net: remove now-unused ssSend and Fuzz (Cory Fields) 681c62d drop the optimistic write counter hack (Cory Fields) 9f939f3 net: switch all callers to connman for pushing messages (Cory Fields) 8f9011d connman is in charge of pushing messages (Cory Fields) f558bb7 serialization: teach serializers variadics (Cory Fields) 01ea667 net: Use deterministic randomness for CNode's nonce, and make it const (Cory Fields) de1ad13 net: constify a few CNode vars to indicate that they're threadsafe (Cory Fields) 34050a3 Move static global randomizer seeds into CConnman (Pieter Wuille) 1ce349f net: add a flag to indicate when a node's process queue is full (Fuzzbawls) 5581b47 net: add a new message queue for the message processor (Fuzzbawls) 701b578 net: rework the way that the messagehandler sleeps (Fuzzbawls) 7e55dbf net: Add a simple function for waking the message handler (Cory Fields) 47ea844 net: record bytes written before notifying the message processor (Cory Fields) ffd4859 net: handle message accounting in ReceiveMsgBytes (Cory Fields) 8cee696 net: log bytes recv/sent per command (Fuzzbawls) 754400e net: set message deserialization version when it's time to deserialize (Fuzzbawls) d2b8e0a net: make CMessageHeader a dumb storage class (Fuzzbawls) cc24eff net: remove redundant max sendbuffer size check (Fuzzbawls) 32ab0c0 net: wait until the node is destroyed to delete its recv buffer (Cory Fields) 6e3f71b net: only disconnect if fDisconnect has been set (Cory Fields) 1b0beb6 net: make GetReceiveFloodSize public (Cory Fields) 229697a net: make vRecvMsg a list so that we can use splice() (Fuzzbawls) d2d71ba net: fix typo causing the wrong receive buffer size (Cory Fields) 50bb09d Add test-before-evict discipline to addrman (Ethan Heilman) Pull request description: This is a combination of multiple upstream PRs focused on optimizing the P2P networking flow after the introduction of CConnman encapsulation, and a few older PRs that were previously missed to support the later optimizations. The PRs are as follows: - bitcoin#9037 - net: Add test-before-evict discipline to addrman - bitcoin#5151 - make CMessageHeader a dumb storage class - bitcoin#6589 - log bytes recv/sent per command - bitcoin#8688 - Move static global randomizer seeds into CConnman - bitcoin#9050 - net: make a few values immutable, and use deterministic randomness for the localnonce - bitcoin#8708 - net: have CConnman handle message sending - bitcoin#9128 - net: Decouple CConnman and message serialization - bitcoin#8822 - net: Consistent checksum handling - bitcoin#9441 - Net: Massive speedup. Net locks overhaul - bitcoin#9609 - net: fix remaining net assertions - bitcoin#9626 - Clean up a few CConnman cs_vNodes/CNode things - bitcoin#9698 - net: fix socket close race - bitcoin#9708 - Clean up all known races/platform-specific UB at the time PR was opened - Excluded bitcoin/bitcoin@512731b and bitcoin/bitcoin@d8f2b8a, to be done in a separate PR ACKs for top commit: furszy: code ACK 30d5c66 , testnet sync from scratch went well and tested with #1829 on top as well and all good. furszy: mainnet sync went fine, ACK 30d5c66 . random-zebra: ACK 30d5c66 and merging... Tree-SHA512: 09689554f53115a45f810b47ff75d887fa9097ea05992a638dbb6055262aeecd82d6ce5aaa2284003399d839b6f2c36f897413da96cfa2cd3b858387c3f752c1
118e9bd to
e0d9ad6
Compare
|
rebased on master |
e0d9ad6 to
bd6b917
Compare
random-zebra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK, looking pretty good.
Aside from a couple notes left inline, have two minor talking points:
- I'd rather remove commits 594fa7a924d028444687c789083c9e997385bdfd and 6f164f6775588a474225c0a50798694ae694f674 (and most
sleepin the test).
Time constants can remain the same as mainnet (if you use mocktime). LogPrintfinProcessSporkMsgcould be very spammy. Maybe you could move toLogPrintwithBCLog::NETor some other category.
i played with the mocktime implementation for a while and agree that would be cleaner with it but.. there still a missing piece there. MNs weren't getting enabled. Better to merge this initial step and continue moving forward on top. So neither you or me are depending on each other's work and can continue moving on. And btw, i actually think that 594fa7a and 6f164f6 are good outside of the mock time changes. I'm more on the side of keep them with a slightly modification in the upcoming work:
Good catch 👍 , moving them. |
bd6b917 to
94b9619
Compare
|
Updated per feedback. |
…t to obtain the collateral. 7f86031 wallet: GetMasternodeVinAndKeys was not locking cs_wallet to obtain the collateral tx. (furszy) Pull request description: Fixing `CWallet::GetMasternodeVinAndKeys` accessing `mapWallet` without locking `cs_wallet`. Decoupled from #1829 . ACKs for top commit: random-zebra: utACK 7f86031 Fuzzbawls: utACK 7f86031 Tree-SHA512: 34995e3d65d6506d4a2465dd41b6d710d8a3e34f613a88691b6aa53018afa48a000faa444d03bbca0f418126fd5220a416d85a5a81d2053fd1412159b8cfb3c9
94b9619 to
b68ec43
Compare
|
Rebased on master with some changes:
For me, it's good to go in this way (merging 1886 first). It's already fulfilling a lot of needs. |
2e3e313 Solving a dead end inconsistency over the masternode activation process. (furszy) Pull request description: Solving a dead end inconsistency over the masternode activation process. Context: A valid masternode is created, mn start message is broadcasted and the entire network receives it. Peers moves the masternode status to pre_enable (active) and are waiting for a mn ping update to fulfill the min ping time requirement to move the status to enable. Issue: As the mnping update message work flow is not performing any action until the masternode is in an enable status (masternode.cpp, line 707), the masternode ping is never updated, there by the masternode will never be moved to enable status. Why this is working now: It's working because (1) the masternodes list is being requested, cleaned and re requested an insane amount of times, (2) some peers are broadcasting the start message twice and (3) peers are broadcasting a start message with an inner ping time in the future, fulfilling the min ping requirements. Final thoughts: This is most likely one of the reason of the different masternodes list views across peers in the network (over the recently active/enable MNs) and why the activation process takes in some occasions few days (the mn list needs to be refreshed) Extra data: 1) The masternodes list request + re-request network bloat, as is mentioned in point 1 of the "Why is working now" section, will not happen anymore in the future moving forward with #1829 new tier two syncing process subsequent steps, the deterministic masternodes implementation and all of the PRs refactoring & cleaning the tier two sources (in other words, after the whole tier two sources revamp..). 2) This PR makes #1829 4963953 hack unneeded. ACKs for top commit: random-zebra: tested ACK 2e3e313 Fuzzbawls: utACK 2e3e313 Tree-SHA512: bebd8fe67905b848acf66b72b7462b2fef38825214a4b7209a5f77745f3eeb7086012abe1248e1789a510d74325bfc577810198f14b10658b1eea577a920fa7d
Only enabled for regtest for now. Variadic template for CMasternodeSync::RequestDataTo, cleaner code.
…ions, min ping time, min mnb time, expiration time, removal time.
Checking: 1) Masternode setup/creation. 2) Tier two network sync (masternode broadcasting). 3) Masternode activation.
…re initialization after expiration, removal, re initialization after removal and collateral spent.
1) Masternodes setup/creation. 2) Proposal creation. 3) Vote creation. 4) Proposal and vote broadcast. 5) Proposal and vote sync.
cb62628 to
4d9aa94
Compare
4d9aa94 to
9a1e5e2
Compare
|
Rebased on top of master after #1886 merge. Plus added two more commits to verify the budget finalization sync process, fulfilling the "governance_sync_basic" test purpose of validating the correct functioning of the tier two messages sync process. Ready for review and merge. Will not add any other change here. |
random-zebra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome first step 🥃 very much needed.
Left few more comments (mostly around the use of mocktime which seems a bit funky), but they can be addressed in a later PR, so ACK 9a1e5e2
| for i in range(0, 2): | ||
| self.generate(1) | ||
| time.sleep(5) | ||
| set_node_times(self.nodes, self.mocktime + 15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not increase the mocktime in successive calls.
Should do something like
self.mocktime += 15
set_node_times(self.nodes, self.mocktime)and in the same fashion in the other places where you update the time, otherwise you are always adding seconds to the same, starting, mocktime value (e.g. see how generate_pos/generate_pow return the updated time to the caller).
| for i in range(0, len(self.nodes)): | ||
| node = self.nodes[i] | ||
| votesInfo = node.getbudgetvotes(proposalName) | ||
| assert(len(votesInfo) > 0) | ||
| voteInfo = votesInfo[0] | ||
| assert_equal(voteInfo["mnId"], mnCollateralHash) | ||
| assert_equal(voteInfo["Vote"], voteType) | ||
| found = False | ||
| for voteInfo in votesInfo: | ||
| if (voteInfo["mnId"] == mnCollateralHash) : | ||
| assert_equal(voteInfo["Vote"], voteType) | ||
| found = True | ||
| assert_true(found, "Error checking vote existence in node " + str(i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could rewrite all this function in a single pythonic one-liner as:
assert_equal([x["Vote"] for node in self.nodes for x in node.getbudgetvotes(proposalName)
if x["mnId"] == mnCollateralHash], [voteType] * len(self.nodes))There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice if we could print in this magic line which node wasn't able to sync as well.
| listMNs = node.listmasternodes() | ||
| assert(len(listMNs) > 0) | ||
| found = False | ||
| for mnData in listMNs: | ||
| if (mnData["txhash"] == mnTxHash): | ||
| assert_equal(mnData["status"], status) | ||
| found = True | ||
| assert(found) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could rewrite all this function in a single pythonic one-liner as:
assert_equal([x["status"] for x in node.listmasternodes() if x["txhash"] == mnTxHash], [status])
Fuzzbawls
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 9a1e5e2
…ral fixes 2b5fb9b [BUG] Re adding missing CInv elements. "ix" and "txlvote" (furszy) fcb9207 OperationResult class decoupled to its own file, so it can be used in several places. (furszy) 4983157 Masternode: Fixing bug on the disconnection and reconnection flow, happens when the controller needs to broadcast a new start message, the remote masternode was rejecting it because thinks that already has it enabled (when the MN is in un-capable state and it was kicked from the network list), loosing the reactivation opportunity and ending in a forever non-capable state. (furszy) af01054 Masternode: locking internal cs for lastPing set. (furszy) b82c0e6 masternodeman:CheckAndRemove moving to the next iterator with the map.erase return object. (furszy) 60a6f91 tiertwo_masternode_activation: fixing vin_spent check, collateral spent tx broadcast wasn't included un the block (furszy) 4ee8058 tier two sync shouldn't start until the chain is synced. (furszy) bfdaf6d masternode ping checking for IBD state (furszy) 8656b2a Add IBD status to getblockchaininfo RPC call (furszy) 5a1e3af masternodeman::ProcessGetMNList vMasternodes for loop wasn't being guarded. (furszy) 8a7360f Cached the best block time in UpdateTip (same as we are caching best block hash) to be used on CMasternodeSync::IsBlockchainSynced() and not lock cs_main there. (furszy) 2dcb88c Strip out Misbehaving call from ProcessSporkMsg and move tier two sync process to the next step if the sporks are received before the tierTwoMan requested them (messageHandler requested them) (furszy) 572c871 tier two functional test: wait_until_mnsync_finished advancing the clock on every fail sync attempt. (furszy) 21826ea tiertwo sync on regtest, updating blockchain synced on every loop as in mainnet/testnet. Plus, not updating asset when the received asset isn't what we are looking for. (furszy) 48b4514 [Tests] Tier2 tests: minor fixes (random-zebra) 4f7bdf7 [Refactor] OperationResult: make optional error private (random-zebra) 835088a [Tests] TierTwo: send pings during wait_until (random-zebra) 5d61b58 [Tests][CI] Raise travis build timeout for tiertwo job (random-zebra) cbd9c67 [Tests] Don't create PoW/PoS cache when running --tiertwo tests (random-zebra) 19f0797 [Tests] TierTwo: add masternode ping (no mocktime) functional test (random-zebra) 0ce9431 [Tests] Refactor tier2_masternode_activation (random-zebra) 2ac0e43 [Tests] Abstract tier2 init in PivxTier2TestFramework (random-zebra) ec936f5 [Tests] tiertwo_governance_sync_basic fixes and improvements (random-zebra) 4d984ab test_runner: add --tiertwo option to run only the tier two functional test + travis job only for tier two tests created. (furszy) 4929919 Added budget finalization payment test case in tiertwo_governance_sync_basic.py. (furszy) 41f2639 mnfinalbudgetsuggest returning the budget hash if it was broadcasted successfully. (furszy) 6a5b2d1 initmasternode RPC command documentation. (furszy) 5f5b190 initMasternode returning the string error to the caller instead of only throwing a runtime_error. (furszy) 9159b67 Initializing MN port to default port in case of not finding it in the conf. (furszy) 7a49627 MasternodesConfig entries access guarded properly (furszy) 1259c36 tiertwo_governance_sync functional test, setup masternodes without shutting down the nodes. (furszy) 4755a8d introduce "reload_conf" argument in startmasternode "alias" RPC command. (furszy) fbf7efb masternodeconfig files code styling cleanup. (furszy) 8a07010 Abstract out masternode initialization process. (furszy) Pull request description: Follow up to #1829, focused on decrease the testing time and continue improving the tier two functional tests. Introduced the following changes: 1) Implemented a new RPC command `initmasternode` to initialize the masternode at runtime, with direct impact in the functional test time. The tests don't need to shut the nodes down and restart them anymore in order to setup the masternode/s. 2) Adapted `tiertwo_governance_sync_basic.py` to the new runtime initialization feature. 3) Further tier two cleanup. `Masternodeconfig` code refactored properly. It's still in WIP (because it need some more work) but the results are very favorable for now: Test previous time: - 9:43 min Test time with this changes: - 3:21 min ACKs for top commit: random-zebra: ACK 2b5fb9b Tree-SHA512: 3f70e0ce42a8779ce7e67713cae59c50b28bc197575edb1e75105f8d76bc320007c6b60e171cc1f5ae0a8242a734f71397ec2afa141a3f737677f5a20cecb839
-- Only running on regtest --
Covering the following points:
1) A new synchronization structure for the tier two network.
It has no redundant checks nor redundant requests (there by less network bloat, faster synchronization) and the code is by far more readable and maintainable than the previous version.
2) Tier two network regtest support.
Adding Masternodes and Governance capabilities to the regtest network.
3) Sync and masternode activation test.
Functional test validating the correct operation of the tier two network sync and the masternode broadcasting & activation process.
TODO:
Have pushed it under WIP to be able to test other great PRs like #1826.