
Conversation

@sdaftuar (Member) commented Sep 14, 2018

This PR does two things:

  • Fixes a privacy issue where, if we relay a transaction to peer A (but not peer B), peer B can request the transaction and we'll provide it. Instead, we should wait until we've announced the transaction to peer B before providing it.

  • Responds to getdata requests for transactions that are ancestors of newly-announced transactions. Currently, our software will request the parents of an orphan transaction from the peer who announced the orphan, but if our peer is running a recent Bitcoin Core version, the request would typically be dropped. So this should improve propagation for transaction chains, particularly for nodes that are just starting up.

This PR also includes a test that transaction chains relay to peers that weren't connected when a parent was broadcast.

@DrahtBot (Contributor) commented Sep 15, 2018

No more conflicts as of last run.

@gmaxwell (Contributor):

concept ACK

Contributor:

nit: Comment doesn't seem to match the assert. Could just add...

assert_equal(len(self.nodes[0].listunspent()), 1)

@conscott (Contributor):

Tested ACK b723d95fa08c8058f4ab32cdd5329b4df3aac036

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

It doesn't seem obvious that std::set is worse here; why did you pick std::unordered_set?

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

nit, s/_map/_set?

Member Author:

Constant-time lookup seemed better to me than log(n) lookup, but I dunno; is there a reason you think std::set is better here? I suspect it doesn't really matter.

Contributor:

I dunno too. Considering the size is small I agree it doesn't matter.

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

From developer notes:

- By default, declare single-argument constructors `explicit`.

  - *Rationale*: This is a precaution to avoid unintended conversions that might
    arise when single-argument constructors are used as implicit conversion
    functions.

Contributor:

Commit "Don't relay tx data to peers until after tx announcement"

nit, could use emplace(k,v) since you are touching this line.

This is moved and updated in the next commit.

Contributor:

Commit "Add ancestors of announced transactions to mapRelay"

nit, could drop std::make_pair.

@maflcko (Member) commented Sep 18, 2018

utACK. I wrote a test in #14240 that should only pass with your first commit.

Contributor:

2018-09-22 21:02:10 cpplint(pr=14220): src/net_processing.cpp:158:  Single-parameter constructors should be marked explicit.  [runtime/explicit] [5]

@maflcko (Member) commented Sep 24, 2018

@sdaftuar Do you feel like splitting this up? I believe one of the fixes is a bugfix, the other a feature.

@sdaftuar (Member Author):

@MarcoFalke This PR is now just the bugfix along with the test you wrote (thanks!), I'll separately PR the new feature.

I believe I've also incorporated all the PR comments that are relevant for the bugfix commit.

@sdaftuar sdaftuar changed the title Transaction relay improvements Transaction relay privacy bugfix Sep 25, 2018
@sdaftuar sdaftuar mentioned this pull request Sep 25, 2018
1 task
@maflcko (Member) commented Sep 25, 2018

Tested ACK 736462c (I wrote the test, that fails without this patch)

@maflcko (Member) commented Sep 25, 2018

Is this for backport?

@sdaftuar (Member Author):

I don't think we need to backport this (in particular I wouldn't want to interfere with the 0.17.0 release); the bug has been present for so long that there's no urgency. And I tend to think that changes like this to the p2p behavior are better to sit in master for a while before appearing in a release anyway.

Any idea what is going on with the appveyor test run?

@sdaftuar (Member Author):

I guess it's worth pointing out as well, that this change without the improvement in #14318 may well worsen relay of transactions that are part of chains a bit. For instance, if a peer of ours asks for the parent of a transaction we announced, then as long as that parent is in mapRelay -- ie it was recently announced -- then we'd provide it, even if that peer wasn't connected to us at the time the parent transaction was announced.

I have no idea how common that sort of thing is, but rather than spend too much time wondering about it, I'd say it's better to not backport this change (and arguably I shouldn't have unbundled this PR from #14318).

@jamesob (Contributor) commented Sep 26, 2018

utACK 736462c

In plain English, I'd characterize this change as "don't respond to a GETDATA(txn) from a peer to whom you haven't previously sent INV(txn)."

# Where,
# * f_in is the pdf of the exponential distribution for inbound peers,
# with lambda_in = 1 / INVENTORY_BROADCAST_INTERVAL = 1/5
# * F_out is the cdf of the expon. distribuiton for outbound peers,
Contributor:

nit: distribuiton (spelling)

measured_leak /= REPEATS
self.log.info('Measured leak of {}'.format(measured_leak))

assert_greater_than(EXPECTED_MEASURED_LEAK, measured_leak)
Contributor:

This test, being stochastic, looks like it'll fail spuriously. I guess if it happens too often, we can up REPEATS or something.

Member:

Someone could run it for 1000 times and see if it fails at all

./test/functional/test_runner.py -j 15 $(for i in {0..999}; do echo p2p_leak_tx ; done)

@sdaftuar (Member Author), Sep 27, 2018:

Looks like it fails about 9% of the time for me. :(

@maflcko (Member), Sep 28, 2018:

Sorry for that. I haven't actually run the test substantially after writing it. It seems you could set EXPECTED_MEASURED_LEAK = .42 or so without making the test useless. (On master the leak is always 1.00, so setting the expected measured leak to any number less than 1 should be fine.)

[figure_1 (measured leak on x-axis)]

Member Author:

I bumped it up to 0.45, and it failed once in 1000 runs, so I bumped it up to 0.50 and ran it 5000 times with no failures. I have not really looked at the test or analyzed the statistics to verify that this makes sense.

@sdaftuar (Member Author):

Rebased, but I think we need to adjust the test so that it has a drastically lower error rate, or drop it from this PR. @MarcoFalke thoughts?

@maflcko (Member) left a comment:

re-ACK 47bf6e9622

Member:

nit: unrelated change?

Member Author:

fixed


@DrahtBot (Contributor):

Coverage    Change (pull 14220)    Reference (master)
Lines       +0.0094 %              87.0361 %
Functions   -0.0130 %              84.1130 %
Branches    +0.0047 %              51.5451 %

@maflcko (Member) commented Oct 1, 2018

re-ACK 320a85e3427685d4bed9d585c3a3c45e288d5a30

@jamesob (Contributor) left a comment:

re-utACK 320a85e

Contributor:

I don't understand what this test is supposed to be testing. Is it:

(i) that we don't leak txs to inbound peers that we haven't yet announced to (as stated in the docstring); or
(ii) that txs are probabilistically announced to outbound peers before inbound peers.

If (i), can't we just test that for every tx, either we receive a NOTFOUND message for the tx or we've already received an INV message for the tx?

The assert_greater_than() call here seems to be simply testing (ii). Or maybe I'm misunderstanding?

@maflcko (Member), Oct 8, 2018:

Makes sense! That makes the test easier to understand as well:

(to be applied on current HEAD commit of this branch)

diff --git a/test/functional/p2p_leak_tx.py b/test/functional/p2p_leak_tx.py
index 21d1d4ed2d..a0ceaedd67 100755
--- a/test/functional/p2p_leak_tx.py
+++ b/test/functional/p2p_leak_tx.py
@@ -13,10 +13,17 @@ from test_framework.util import (
     wait_until,
 )
 
+import time
+
+
+class P2PNode(P2PDataStore):
+    def on_inv(self, msg):
+        pass
+
 
 class P2PLeakTxTest(BitcoinTestFramework):
     def set_test_params(self):
-        self.num_nodes = 2
+        self.num_nodes = 1
 
     def skip_test_if_missing_module(self):
         self.skip_if_no_wallet()
@@ -26,67 +33,28 @@ class P2PLeakTxTest(BitcoinTestFramework):
         gen_node.generate(1)
         self.sync_all()
 
-        inbound_peer = self.nodes[0].add_p2p_connection(P2PDataStore())  # An "attacking" inbound peer
-        outbound_peer = self.nodes[1]  # Our outbound peer
+        inbound_peer = self.nodes[0].add_p2p_connection(P2PNode())  # An "attacking" inbound peer
 
-        # In an adversarial setting we can generally assume that inbound peers
-        # are more likely to spy on us than outbound peers. Thus, on average,
-        # we announce transactions first to outbound peers, then to (all)
-        # inbound peers. Inbound peers must not be able to successfully request a
-        # transaction if they haven't yet received the announcement for it.
-        #
-        # With only one outbound peer, we expect that a tx is first announced
-        # to (all) inbound peers (and thus present a potential leak) in 28.5% of
-        # the cases.
-        #
-        # Probability( time_ann_inbound < time_ann_outbound )                 =
-        # ∫ f_in(x)                           * F_out(x)                   dx =
-        # ∫ (lambda_in * exp(-lambda_in * x)) * (1 - exp(-lambda_out * x)) dx =
-        # 0.285714
-        #
-        # Where,
-        # * f_in is the pdf of the exponential distribution for inbound peers,
-        #   with lambda_in = 1 / INVENTORY_BROADCAST_INTERVAL = 1/5
-        # * F_out is the cdf of the expon. distribution for outbound peers,
-        #   with lambda_out = 1 / (INVENTORY_BROADCAST_INTERVAL >> 1) = 1/2
-        #
-        # Due to measurement delays, the actual monte-carlo leak is a bit
-        # higher. Assume a total delay of 0.6 s (Includes network delays and
-        # rpc delay to poll the raw mempool)
-        #
-        # Probability( time_ann_inbound < time_ann_outbound + 0.6 )           =
-        # ∫ f_in(x)                           * F_out(x + 0.6)             dx =
-        # ∫ (lambda_in * exp(-lambda_in * x)) * (1 - exp(-lambda_out * (x+.6))) dx =
-        # 0.366485
-        # EXPECTED_MEASURED_LEAK = .366485
-        # Because this test is empirical and our testing framework isn't set up
-        # to handle tests that fail with some expected likelihood, we bump this
-        # value up to decrease the false positive rate.
-        EXPECTED_MEASURED_LEAK = .50
-
-        REPEATS = 100
-        measured_leak = 0
-        self.log.info('Start simulation for {} repeats'.format(REPEATS))
+        REPEATS = 10
+        self.log.info('Start test for {} repeats'.format(REPEATS))
         for i in range(REPEATS):
             self.log.debug('Run {}/{}'.format(i, REPEATS))
             txid = gen_node.sendtoaddress(gen_node.getnewaddress(), 0.033)
+
+            time.sleep(5)
+
             want_tx = msg_getdata()
             want_tx.inv.append(CInv(t=1, h=int(txid, 16)))
-
-            wait_until(lambda: txid in outbound_peer.getrawmempool(), lock=mininode_lock)
             inbound_peer.send_message(want_tx)
             inbound_peer.sync_with_ping()
 
             if inbound_peer.last_message.get('notfound'):
+                self.log.debug('tx {} was not yet announced to us.'.format(txid))
                 assert_equal(inbound_peer.last_message['notfound'].vec[0].hash, int(txid, 16))
                 inbound_peer.last_message.pop('notfound')
             else:
-                measured_leak += 1
-
-        measured_leak /= REPEATS
-        self.log.info('Measured leak of {}'.format(measured_leak))
-
-        assert_greater_than(EXPECTED_MEASURED_LEAK, measured_leak)
+                self.log.debug('tx {} was announced to us.'.format(txid))
+                assert int(txid, 16) in [inv.hash for inv in inbound_peer.last_message['inv'].inv]
 
 
 if __name__ == '__main__':

Contributor:

Concept ACK the change to the test!

Can we add multiple P2P connections to the node to get more test repeats in parallel, or does the change to tx propagation in #13298 mean that we'll announce to all those peers at the same time and not get any additional benefit?

Member:

Currently we can only easily add inbound mininode peers (which are in the same bucket for tx propagation), so right now it couldn't run in parallel that way.

I am happy to adjust the test as soon as we can add outbound connections to mininode peers.

Member Author:

Thanks!

sdaftuar and others added 2 commits October 8, 2018 22:51
Prior to this commit, we'd respond with tx data for anything in mapRelay,
regardless of whether the requesting peer was one that we'd sent an INV to
for the transaction in question.

Close this privacy leak by maintaining a set of peers to which we've
relayed each transaction in mapRelay.
@sdaftuar (Member Author) commented Oct 9, 2018

Picked up Marco's latest test and rebased on master due to the test_runner.py conflict.

@sdaftuar (Member Author) commented Oct 9, 2018

Added a commit because of this:

./test/functional/p2p_leak_tx.py:8:1: F401 'test_framework.mininode.mininode_lock' imported but unused
./test/functional/p2p_leak_tx.py:10:1: F401 'test_framework.util.assert_greater_than' imported but unused
./test/functional/p2p_leak_tx.py:10:1: F401 'test_framework.util.wait_until' imported but unused

@maflcko (Member) commented Oct 9, 2018

Feel free to just squash these trivial fixups in

@jnewbery (Contributor) left a comment:

utACK the code change with a request for an additional comment.

The test appears to me to be broken. Perhaps we should split it out into a separate PR.

-if (mi != mapRelay.end()) {
-    connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second));
+if (mi != mapRelay.end() && mi->second.m_node_set.count(pfrom->GetId())) {
+    connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second.m_txref));
Contributor:

I think a comment here would be helpful (since mi->second.m_node_set.count(pfrom->GetId()) isn't immediately obvious when read outside the context of this PR). Something like:

// Only send the transaction if we've previously announced it to this peer

if (mi != mapRelay.end() && mi->second.m_node_set.count(pfrom->GetId())) {
connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second.m_txref));
push = true;
} else if (pfrom->timeLastMempoolReq) {
Contributor:

Note that we can leak mempool contents (and lose privacy) to a peer if it sends us a MEMPOOL message. That is unchanged by this PR, but the logic here is a bit confusing, and we should probably deprecate support for MEMPOOL in the future.

Member:

One relatively easy solution (I think) is to only announce transactions that have been in the mempool for longer than, say, 60s at the time the MEMPOOL message is received, and to enforce the same time restriction in this block of code (so you can't fetch transactions that were received less than 60s before the last MEMPOOL request).

@@ -0,0 +1,59 @@
#!/usr/bin/env python3
Contributor:

I've now reviewed this test. Sadly, it doesn't test the new code (and does not fail on master).

Since the node-under-test has only one peer, the transaction won't get added to mapRelay until we send the INV to that peer. That means in the if/else branches at the bottom of this test:

  • if the node has already sent us an INV for the tx, then it'll be in the mapRelay and the node will send us the TX.
  • if the node has not yet sent us an INV for the tx, then it won't be in the mapRelay and we'll fail mi != mapRelay.end() and not even reach && mi->second.m_node_set.count(pfrom->GetId())

To test the new code, we want the tx to be in mapRelay, but the peer to not be in the RelayEntry. That's not possible if the node has only one peer.

Perhaps we should remove the test from this PR and add it under a future PR. If you do so, please tag me in that PR and I'll commit to reviewing the test.

@jnewbery (Contributor):

One other question: do we know how large mapRelay is expected to grow? Do we know how much additional memory will be used by changing the CTransactionRef to a RelayEntry?

@jnewbery (Contributor):

I've written an alternative test here: https://github.com/jnewbery/bitcoin/tree/pr14220.1 which reliably tests the new behaviour and fails on master.

/** Relay map */
typedef std::map<uint256, CTransactionRef> MapRelay;
struct RelayEntry {
explicit RelayEntry(CTransactionRef &&tx) : m_txref(tx) {}
Member:

Use m_txref(std::move(tx)) here. In this context tx is not an rvalue (it has been bound to a variable, so as an expression it is an lvalue); it was merely initialized by binding to an rvalue reference.

Member Author:

I had no idea, thank you for pointing this out!

@sdaftuar (Member Author):

> One other question: do we know how large mapRelay is expected to grow? Do we know how much additional memory will be used by changing the CTransactionRef to a RelayEntry?

mapRelay is bounded by our rate-limited transaction relay algorithm, but on further thought, this approach may potentially use quite a bit more memory than I originally realized.

I'll close this PR for now and re-open when I have an alternative that I think is worth pursuing.

@sdaftuar sdaftuar closed this Oct 17, 2018
struct RelayEntry {
explicit RelayEntry(CTransactionRef &&tx) : m_txref(tx) {}
CTransactionRef m_txref;
std::unordered_set<NodeId> m_node_set;
Member:

There are at most 9 buckets, so a set of outbound NodeIds and a flag for inbound ones might demand less memory?

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021