Skip to content

Conversation

@sdaftuar
Copy link
Member

@sdaftuar sdaftuar commented Jun 4, 2021

If all our high-bandwidth compact block serving peers (BIP 152) stall block
download, then we can be denied a block for (potentially) a long time. As
inbound connections are much more likely to be adversarial than outbound
connections, mitigate this risk by never removing our last outbound HB peer if
it would be replaced by an inbound.

If all our high-bandwidth compact block serving peers (BIP 152) stall block
download, then we can be denied a block for (potentially) a long time. As
inbound connections are much more likely to be adversarial than outbound
connections, mitigate this risk by never removing our last outbound HB peer if
it would be replaced by an inbound.
@fanquake fanquake added the P2P label Jun 4, 2021
Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept and code review ACK 6efbcec

Running a node with this patch.

A test in p2p_compactblocks.py would be nice.

If this is implemented, should BIP152 "Implementation Notes" be updated with a note?

if (nodestate->m_is_inbound) {
// If we're adding an inbound HB peer, make sure we're not removing
// our last outbound HB peer in the process.
if (lNodesAnnouncingHeaderAndIDs.size() >= 3 && num_outbound_hb_peers == 1) {
Copy link
Member

@jonatack jonatack Jun 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, if retouch perhaps hoist the 3 here and in line 854 to a constexpr and maybe the following mini-wishlist:

         int num_outbound_hb_peers = 0;
         for (std::list<NodeId>::iterator it = lNodesAnnouncingHeaderAndIDs.begin(); it != lNodesAnnouncingHeaderAndIDs.end(); it++) {
             if (*it == nodeid) {
+                // Move this nodeid to the end of the list.
                 lNodesAnnouncingHeaderAndIDs.erase(it);
                 lNodesAnnouncingHeaderAndIDs.push_back(nodeid);
                 return;
@@ -1556,7 +1557,7 @@ void PeerManagerImpl::BlockChecked(const CBlock& block, const BlockValidationSta
              !m_chainman.ActiveChainstate().IsInitialBlockDownload() &&
              mapBlocksInFlight.count(hash) == mapBlocksInFlight.size()) {
         if (it != mapBlockSource.end()) {
-            MaybeSetPeerAsAnnouncingHeaderAndIDs(it->second.first);
+            MaybeSetPeerAsAnnouncingHeaderAndIDs(/*nodeid=*/ it->second.first);
         }

@Crypt-iQ
Copy link
Contributor

Crypt-iQ commented Jun 6, 2021

ACK 6efbcec

As jonatack mentioned, I think extending the existing functional test would be good.

@jnewbery
Copy link
Contributor

jnewbery commented Jun 7, 2021

Concept ACK. It seems reasonable to protect one outbound peer as an HB compact block source.

I agree with @jonatack and @Crypt-iQ that it'd be good to see a test for this new behaviour.

I've always found the MaybeSetPeerAsAnnouncingHeaderAndIDs() logic to be strangely convoluted and difficult to follow - is there a good reason to recursively call into ForNode() instead of just calling it serially to first switch off hb for one node and then enable it for the other? If we did that, then this logic would be a lot easier to read - we wouldn't need to std::swap elements in a list only to later pop off the front of that list.

@sdaftuar
Copy link
Member Author

sdaftuar commented Jun 7, 2021

I agree a test would be nice -- I'm not planning to write one myself, but if someone contributes one I'd be happy to include it here.

I've always found the MaybeSetPeerAsAnnouncingHeaderAndIDs() logic to be strangely convoluted and difficult to follow - is there a good reason to recursively call into ForNode() instead of just calling it serially to first switch off hb for one node and then enable it for the other? If we did that, then this logic would be a lot easier to read - we wouldn't need to std::swap elements in a list only to later pop off the front of that list.

I don't know for sure, but my guess is that it's easiest to reason about the logic if the substitution of going from one HB peer to another all happens within one lock acquisition? At any rate, I think the real problem with readability is having ForNode() at all; its existence implies that our model for how net and net_processing interact is broken (see eg #11604). I think the goal is to eventually refactor all the code like this so that we don't have ForNode() or ForEachNode() in our code anymore, so I don't think there's much value in refactoring the existing logic here just to have it rewritten in the future.

@sipa
Copy link
Member

sipa commented Jun 9, 2021

@sdaftuar
Copy link
Member Author

sdaftuar commented Jun 9, 2021

@sipa Thanks, I've cherry-picked that commit onto this branch.

@sdaftuar sdaftuar force-pushed the 2021-06-reserve-outbound-hb branch from 3aba631 to 30aee2d Compare June 10, 2021 14:23
def peer_info(self, from_node, to_node):
"""Query from_node for its getpeerinfo about to_node."""
for peerinfo in self.nodes[from_node].getpeerinfo():
if "(testnode%i)" % to_node in peerinfo['subver']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sipa sipa added this to the 22.0 milestone Jun 14, 2021
Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 30aee2d

Verified the test fails on master at line 81 (and at no other point) and passes on this branch; pprinting the value of status at that point shows the following differences between master and this change:

master

[True,  True, False, True ]
[True,  True, True,  False]
[False, True, True,  True ]   <--- assert fails at this point

branch

[True, True,  False, True ]
[True, True,  True,  False]
[True, False, True,  True ]
A few suggestions for the test, feel free to ignore

@@ -18,7 +18,7 @@ class CompactBlocksConnectionTest(BitcoinTestFramework):
     def peer_info(self, from_node, to_node):
         """Query from_node for its getpeerinfo about to_node."""
         for peerinfo in self.nodes[from_node].getpeerinfo():
-            if "(testnode%i)" % to_node in peerinfo['subver']:
+            if f"(testnode{to_node})" in peerinfo['subver']:
                 return peerinfo
         return None
 
@@ -73,14 +73,13 @@ class CompactBlocksConnectionTest(BitcoinTestFramework):
         # Now relay one block through peer 2 (outbound from node 1), so it should take HB status
         # from one of the inbounds.
         status = self.relay_block_through(2)
-        assert_equal(status[0], True)
-        assert_equal(sum(status), 3)
+        assert_equal(status, [True, False, True, True])
 
         # Now relay again through nodes 3,4,5. Since 2 is outbound, it should remain HB.
         for nodeid in range(3, 6):
             status = self.relay_block_through(nodeid)
-            assert status[0]
-            assert status[nodeid - 2]
+            assert_equal(status[0], True)
+            assert_equal(status[nodeid - 2], True)
             assert_equal(sum(status), 3)
 
         # Reconnect peer 2, and retry. Now the three inbounds should be HB again.
@@ -88,8 +87,8 @@ class CompactBlocksConnectionTest(BitcoinTestFramework):
         self.connect_nodes(1, 2)
         for nodeid in range(3, 6):
             status = self.relay_block_through(nodeid)
-            assert not status[0]
-            assert status[nodeid - 2]
+            assert_equal(status[0], False)
+            assert_equal(status[nodeid - 2], True)
         assert_equal(status, [False, True, True, True])

(Very nice test.)

(Out of curiousity, do people see inbounds selected as HB peers? Perhaps my peer topology has much better outbounds than inbounds, or not enough inbounds, but it's rare for even one inbound to provide a block and be selected for HB mode, much less three. Outbounds get the nod.)

@sipa
Copy link
Member

sipa commented Jun 15, 2021

My node at bitcoin.sipa.be currently has all 3 HB connections on inbounds (it has 250 connections).

@jonatack
Copy link
Member

@sipa thanks -- good to hear about different peer topologies. Am probably restarting too often (for testing pulls) to see anything close to that many inbounds.

@achow101
Copy link
Member

ACK 30aee2d

Reviewed code and checked that the test fails on master but passes with this PR.

@ariard
Copy link

ariard commented Jun 16, 2021

Code ACK 30aee2d

Our last oubound HB compact block peer could still be evicted under our CHAIN_SYNC_TIMEOUT/EXTRA_PEER_CHECK_INTERVAL logics if it appears slow on block-delivery. Hitting this eviction on our side would require from an attacker to control our peer's block-relay, assuming this peer's implementation has the same mitigations in place against adversarial peers, it sounds hard to achieve.

@naumenkogs
Copy link
Contributor

Concept ACK. I agree with protecting last outbound HB peer.

@laanwj laanwj merged commit f6a25be into bitcoin:master Jun 21, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 24, 2021
30aee2d tests: Add test for compact block HB selection (Pieter Wuille)
6efbcec Protect last outbound HB compact block peer (Suhas Daftuar)

Pull request description:

  If all our high-bandwidth compact block serving peers (BIP 152) stall block
  download, then we can be denied a block for (potentially) a long time. As
  inbound connections are much more likely to be adversarial than outbound
  connections, mitigate this risk by never removing our last outbound HB peer if
  it would be replaced by an inbound.

ACKs for top commit:
  achow101:
    ACK 30aee2d
  ariard:
    Code ACK 30aee2d
  jonatack:
    ACK 30aee2d

Tree-SHA512: 5c6c9326e3667b97e0864c371ae2174d2be9054dad479f4366127b9cd3ac60ffa01ec9707b16ef29cac122db6916cf56fd9985733390017134ace483278921d5
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Jun 27, 2021
If all our high-bandwidth compact block serving peers (BIP 152) stall block
download, then we can be denied a block for (potentially) a long time. As
inbound connections are much more likely to be adversarial than outbound
connections, mitigate this risk by never removing our last outbound HB peer if
it would be replaced by an inbound.

Github-Pull: bitcoin#22147
Rebased-From: 6efbcec
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Jun 27, 2021
gwillen pushed a commit to ElementsProject/elements that referenced this pull request Jun 1, 2022
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants