
Conversation

@sdaftuar (Member) commented Sep 14, 2018

This PR does two things:

  • Fixes a privacy issue where, if we relay a transaction to peer A (but not peer B), peer B can request the transaction and we'll provide it. Instead, we should wait until we've announced the transaction to peer B before providing it.

  • Responds to getdata requests for transactions that are ancestors of newly-announced transactions. Currently, our software will request the parents of an orphan transaction from the peer who announced the orphan, but if our peer is running a recent Bitcoin Core version, the request would typically be dropped. So this should improve propagation for transaction chains, particularly for nodes that are just starting up.

This PR also includes a test that transaction chains relay to peers that weren't connected when a parent was broadcast.

@DrahtBot (Contributor) commented Sep 15, 2018

No more conflicts as of last run.

@gmaxwell (Contributor):

concept ACK

Contributor:

nit: Comment doesn't seem to match the assert. Could just add...

assert_equal(len(self.nodes[0].listunspent()), 1)

@conscott (Contributor):

Tested ACK b723d95fa08c8058f4ab32cdd5329b4df3aac036

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

It doesn't seem obvious that std::set is worse here; why did you pick std::unordered_set?

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

nit, s/_map/_set?

Member Author:

Constant-time lookup seemed better to me than log(n) lookup, but I dunno; is there a reason you think std::set is better here? I suspect it doesn't really matter.

Contributor:

I dunno too. Considering the size is small I agree it doesn't matter.

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

From developer notes:

- By default, declare single-argument constructors `explicit`.

  - *Rationale*: This is a precaution to avoid unintended conversions that might
    arise when single-argument constructors are used as implicit conversion
    functions.

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

nit, could use emplace(k,v) since you are touching this line.

This is moved and updated in the next commit.

Contributor:

Commit "Add ancestors of announced transactions to mapRelay"

nit, could drop std::make_pair.

@maflcko (Member) commented Sep 18, 2018

utACK. I wrote a test in #14240 that should only pass with your first commit.

Contributor:

2018-09-22 21:02:10 cpplint(pr=14220): src/net_processing.cpp:158:  Single-parameter constructors should be marked explicit.  [runtime/explicit] [5]

@maflcko (Member) commented Sep 24, 2018

@sdaftuar Do you feel like splitting this up? I believe one of the fixes is a bugfix, the other a feature.

@sdaftuar (Member Author):

@MarcoFalke This PR is now just the bugfix along with the test you wrote (thanks!), I'll separately PR the new feature.

I believe I've also incorporated all the PR comments that are relevant for the bugfix commit.

@sdaftuar sdaftuar changed the title Transaction relay improvements Transaction relay privacy bugfix Sep 25, 2018
@sdaftuar sdaftuar mentioned this pull request Sep 25, 2018
1 task
@maflcko (Member) commented Sep 25, 2018

Tested ACK 736462c (I wrote the test, that fails without this patch)

@maflcko (Member) commented Sep 25, 2018

Is this for backport?

@sdaftuar (Member Author):

I don't think we need to backport this (in particular I wouldn't want to interfere with the 0.17.0 release); the bug has been present for so long that there's no urgency. And I tend to think that changes like this to the p2p behavior are better to sit in master for a while before appearing in a release anyway.

Any idea what is going on with the appveyor test run?

@sdaftuar (Member Author):

I guess it's worth pointing out as well, that this change without the improvement in #14318 may well worsen relay of transactions that are part of chains a bit. For instance, if a peer of ours asks for the parent of a transaction we announced, then as long as that parent is in mapRelay -- ie it was recently announced -- then we'd provide it, even if that peer wasn't connected to us at the time the parent transaction was announced.

I have no idea how common that sort of thing is, but rather than spend too much time wondering about it, I'd say it's better to not backport this change (and arguably I shouldn't have unbundled this PR from #14318).

@jamesob (Contributor) commented Sep 26, 2018

utACK 736462c

In plain English, I'd characterize this change as "don't respond to a GETDATA(txn) from a peer to whom you haven't previously sent INV(txn)."

# Where,
# * f_in is the pdf of the exponential distribution for inbound peers,
# with lambda_in = 1 / INVENTORY_BROADCAST_INTERVAL = 1/5
# * F_out is the cdf of the expon. distribuiton for outbound peers,
Contributor:

nit: distribuiton (spelling)

measured_leak /= REPEATS
self.log.info('Measured leak of {}'.format(measured_leak))

assert_greater_than(EXPECTED_MEASURED_LEAK, measured_leak)
Contributor:

This test, being stochastic, looks like it'll fail spuriously. I guess if it happens too often, we can up REPEATS or something.

Member:

Someone could run it for 1000 times and see if it fails at all

./test/functional/test_runner.py -j 15 $(for i in {0..999}; do echo p2p_leak_tx ; done)

@sdaftuar (Member Author), Sep 27, 2018:

Looks like it fails about 9% of the time for me. :(

@maflcko (Member), Sep 28, 2018:

Sorry for that. I haven't actually run the test substantially after writing it. It seems you could set EXPECTED_MEASURED_LEAK = .42 or so without making the test useless. (On master the leak is always 1.00, so setting the expected measured leak to any number less than 1 should be fine.)

[figure_1 (measured leak on x-axis)]

Member Author:

I bumped it up to 0.45, and it failed once in 1000 runs, so I bumped it up to 0.50 and ran it 5000 times with no failures. I have not really looked at the test or analyzed the statistics to verify that this makes sense.

@sdaftuar (Member Author):

Rebased, but I think we need to adjust the test so that it has a drastically lower error rate, or drop it from this PR. @MarcoFalke thoughts?

@maflcko (Member) left a comment:

re-ACK 47bf6e9622

Member:

nit: unrelated change?

Member Author:

fixed


@DrahtBot (Contributor):

Coverage    Change (pull 14220)    Reference (master)
Lines       +0.0094 %              87.0361 %
Functions   -0.0130 %              84.1130 %
Branches    +0.0047 %              51.5451 %

@maflcko (Member) commented Oct 1, 2018

re-ACK 320a85e3427685d4bed9d585c3a3c45e288d5a30

@jamesob (Contributor) left a comment:

re-utACK 320a85e

Contributor:

I don't understand what this test is supposed to be testing. Is it:

(i) that we don't leak txs to inbound peers that we haven't yet announced to (as stated in the docstring); or
(ii) that txs are probabilistically announced to outbound peers before inbound peers.

If (i), can't we just test that for every tx, either we receive a NOTFOUND message for the tx or we've already received an INV message for the tx?

The assert_greater_than() call here seems to be simply testing (ii). Or maybe I'm misunderstanding?

@maflcko (Member), Oct 8, 2018:

Makes sense! That makes the test easier to understand as well:

(to be applied on current HEAD commit of this branch)

diff --git a/test/functional/p2p_leak_tx.py b/test/functional/p2p_leak_tx.py
index 21d1d4ed2d..a0ceaedd67 100755
--- a/test/functional/p2p_leak_tx.py
+++ b/test/functional/p2p_leak_tx.py
@@ -13,10 +13,17 @@ from test_framework.util import (
     wait_until,
 )
 
+import time
+
+
+class P2PNode(P2PDataStore):
+    def on_inv(self, msg):
+        pass
+
 
 class P2PLeakTxTest(BitcoinTestFramework):
     def set_test_params(self):
-        self.num_nodes = 2
+        self.num_nodes = 1
 
     def skip_test_if_missing_module(self):
         self.skip_if_no_wallet()
@@ -26,67 +33,28 @@ class P2PLeakTxTest(BitcoinTestFramework):
         gen_node.generate(1)
         self.sync_all()
 
-        inbound_peer = self.nodes[0].add_p2p_connection(P2PDataStore())  # An "attacking" inbound peer
-        outbound_peer = self.nodes[1]  # Our outbound peer
+        inbound_peer = self.nodes[0].add_p2p_connection(P2PNode())  # An "attacking" inbound peer
 
-        # In an adversarial setting we can generally assume that inbound peers
-        # are more likely to spy on us than outbound peers. Thus, on average,
-        # we announce transactions first to outbound peers, then to (all)
-        # inbound peers. Inbound peers must not be able to successfully request a
-        # transaction if they haven't yet received the announcement for it.
-        #
-        # With only one outbound peer, we expect that a tx is first announced
-        # to (all) inbound peers (and thus present a potential leak) in 28.5% of
-        # the cases.
-        #
-        # Probability( time_ann_inbound < time_ann_outbound )                 =
-        # ∫ f_in(x)                           * F_out(x)                   dx =
-        # ∫ (lambda_in * exp(-lambda_in * x)) * (1 - exp(-lambda_out * x)) dx =
-        # 0.285714
-        #
-        # Where,
-        # * f_in is the pdf of the exponential distribution for inbound peers,
-        #   with lambda_in = 1 / INVENTORY_BROADCAST_INTERVAL = 1/5
-        # * F_out is the cdf of the expon. distribution for outbound peers,
-        #   with lambda_out = 1 / (INVENTORY_BROADCAST_INTERVAL >> 1) = 1/2
-        #
-        # Due to measurement delays, the actual monte-carlo leak is a bit
-        # higher. Assume a total delay of 0.6 s (Includes network delays and
-        # rpc delay to poll the raw mempool)
-        #
-        # Probability( time_ann_inbound < time_ann_outbound + 0.6 )           =
-        # ∫ f_in(x)                           * F_out(x + 0.6)             dx =
-        # ∫ (lambda_in * exp(-lambda_in * x)) * (1 - exp(-lambda_out * (x+.6))) dx =
-        # 0.366485
-        # EXPECTED_MEASURED_LEAK = .366485
-        # Because this test is empirical and our testing framework isn't set up
-        # to handle tests that fail with some expected likelihood, we bump this
-        # value up to decrease the false positive rate.
-        EXPECTED_MEASURED_LEAK = .50
-
-        REPEATS = 100
-        measured_leak = 0
-        self.log.info('Start simulation for {} repeats'.format(REPEATS))
+        REPEATS = 10
+        self.log.info('Start test for {} repeats'.format(REPEATS))
         for i in range(REPEATS):
             self.log.debug('Run {}/{}'.format(i, REPEATS))
             txid = gen_node.sendtoaddress(gen_node.getnewaddress(), 0.033)
+
+            time.sleep(5)
+
             want_tx = msg_getdata()
             want_tx.inv.append(CInv(t=1, h=int(txid, 16)))
-
-            wait_until(lambda: txid in outbound_peer.getrawmempool(), lock=mininode_lock)
             inbound_peer.send_message(want_tx)
             inbound_peer.sync_with_ping()
 
             if inbound_peer.last_message.get('notfound'):
+                self.log.debug('tx {} was not yet announced to us.'.format(txid))
                 assert_equal(inbound_peer.last_message['notfound'].vec[0].hash, int(txid, 16))
                 inbound_peer.last_message.pop('notfound')
             else:
-                measured_leak += 1
-
-        measured_leak /= REPEATS
-        self.log.info('Measured leak of {}'.format(measured_leak))
-
-        assert_greater_than(EXPECTED_MEASURED_LEAK, measured_leak)
+                self.log.debug('tx {} was announced to us.'.format(txid))
+                assert int(txid, 16) in [inv.hash for inv in inbound_peer.last_message['inv'].inv]
 
 
 if __name__ == '__main__':

Contributor:

Concept ACK the change to the test!

Can we add multiple P2P connections to the node to get more test repeats in parallel, or does the change to tx propagation in #13298 mean that we'll announce to all those peers at the same time and not get any additional benefit?

Member:

Currently we can only easily add inbound mininode peers (which are in the same bucket for tx propagation), so right now it couldn't run in parallel that way.

I am happy to adjust the test as soon as we can add outbound connections to mininode peers.

Member Author:

Thanks!

sdaftuar and others added 2 commits October 8, 2018 22:51
Prior to this commit, we'd respond with tx data for anything in mapRelay,
regardless of whether the requesting peer was one that we'd sent an INV to
for the transaction in question.

Close this privacy leak by maintaining a set of peers to which we've
relayed each transaction in mapRelay.
@sdaftuar (Member Author) commented Oct 9, 2018

Picked up Marco's latest test and rebased on master due to the test_runner.py conflict.

@sdaftuar (Member Author) commented Oct 9, 2018

Added a commit because of this:

./test/functional/p2p_leak_tx.py:8:1: F401 'test_framework.mininode.mininode_lock' imported but unused
./test/functional/p2p_leak_tx.py:10:1: F401 'test_framework.util.assert_greater_than' imported but unused
./test/functional/p2p_leak_tx.py:10:1: F401 'test_framework.util.wait_until' imported but unused

@maflcko (Member) commented Oct 9, 2018

Feel free to just squash these trivial fixups in

@jnewbery (Contributor) left a comment:

utACK the code change with a request for an additional comment.

The test appears to me to be broken. Perhaps we should split it out into a separate PR.

-if (mi != mapRelay.end()) {
-    connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second));
+if (mi != mapRelay.end() && mi->second.m_node_set.count(pfrom->GetId())) {
+    connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second.m_txref));
Contributor:

I think a comment here would be helpful (since mi->second.m_node_set.count(pfrom->GetId()) isn't immediately obvious when read outside the context of this PR). Something like:

// Only send the transaction if we've previously announced it to this peer

if (mi != mapRelay.end() && mi->second.m_node_set.count(pfrom->GetId())) {
connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second.m_txref));
push = true;
} else if (pfrom->timeLastMempoolReq) {
Contributor:

Note that we can leak mempool contents (and lose privacy) to a peer if it sends us a MEMPOOL message. That is unchanged by this PR, but the logic here is a bit confusing, and we should probably deprecate support for MEMPOOL in the future.

Member:

One relatively easy solution (I think) is to only announce transactions that have been in the mempool for longer than, say, 60s at the time the MEMPOOL message is received, and to enforce the same time restriction in this block of code (so you can't fetch transactions that were received less than 60s before the last MEMPOOL request).

@@ -0,0 +1,59 @@
#!/usr/bin/env python3
Contributor:

I've now reviewed this test. Sadly, it doesn't test the new code (and does not fail on master).

Since the node-under-test has only one peer, the transaction won't get added to mapRelay until we send the INV to that peer. That means in the if/else branches at the bottom of this test:

  • if the node has already sent us an INV for the tx, then it'll be in the mapRelay and the node will send us the TX.
  • if the node has not yet sent us an INV for the tx, then it won't be in the mapRelay and we'll fail mi != mapRelay.end() and not even reach && mi->second.m_node_set.count(pfrom->GetId())

To test the new code, we want the tx to be in mapRelay, but the peer to not be in the RelayEntry. That's not possible if the node has only one peer.

Perhaps we should remove the test from this PR and add it under a future PR. If you do so, please tag me in that PR and I'll commit to reviewing the test.

@jnewbery (Contributor):

One other question: do we know how large mapRelay is expected to grow? Do we know how much additional memory will be used by changing the CTransactionRef to a RelayEntry?

@jnewbery (Contributor):

I've written an alternative test here: https://github.com/jnewbery/bitcoin/tree/pr14220.1 which reliably tests the new behaviour and fails on master.

/** Relay map */
typedef std::map<uint256, CTransactionRef> MapRelay;
struct RelayEntry {
explicit RelayEntry(CTransactionRef &&tx) : m_txref(tx) {}
Member:

Use m_txref(std::move(tx)) here. In this context tx is not an rvalue (it has been bound to a variable, so as an expression it is an lvalue); it was merely initialized by binding to an rvalue reference.

Member Author:

I had no idea, thank you for pointing this out!

@sdaftuar (Member Author):

> One other question: do we know how large mapRelay is expected to grow? Do we know how much additional memory will be used by changing the CTransactionRef to a RelayEntry?

mapRelay is bounded by our rate-limited transaction relay algorithm, but on further thought, this approach may potentially use quite a bit more memory than I originally realized.

I'll close this PR for now and re-open when I have an alternative that I think is worth pursuing.

@sdaftuar sdaftuar closed this Oct 17, 2018
struct RelayEntry {
explicit RelayEntry(CTransactionRef &&tx) : m_txref(tx) {}
CTransactionRef m_txref;
std::unordered_set<NodeId> m_node_set;
Member:

There are at most 9 buckets, so a set of outbound NodeIds and a flag for inbound ones might demand less memory?

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021