
Conversation

@sr-gi
Member

@sr-gi sr-gi commented May 8, 2023

Currently, non-announced transactions can be served after a BIP35 MEMPOOL message has been received from a peer. By following this pattern, we are potentially allowing peers to fingerprint transactions that may have originated in our node and to perform timing analysis on the propagation of transactions that enter our mempool.

This simply removes the relay bypass so only announced transactions are served to peers.

@DrahtBot
Contributor

DrahtBot commented May 8, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK glozow, willcl-ark
Approach NACK vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27675 (net_processing: Drop m_recently_announced_invs bloom filter by ajtowns)
  • #27625 (p2p: Stop relaying non-mempool txs by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot added the P2P label May 8, 2023
@sr-gi
Member Author

sr-gi commented May 8, 2023

@glozow glozow changed the title from "net: avoid serving non-announced txs as a result of a MEMPOOL message" to "net processing: avoid serving non-announced txs as a result of a MEMPOOL message" May 8, 2023
Contributor


Doesn't this just change the fingerprinting vector from GETDATA txid to FILTERCLEAR; FILTERADD whatever; MEMPOOL ? In which case we're just as easily fingerprinted, but we spend more cpu while being fingerprinted?

Member Author


Do you mean keep adding what you're interested in to the filter so it triggers an INV from the node and, therefore, we can request it (or, similarly, keep resetting the filter and just add the one single transaction that we want to probe)? If so, isn't that equivalent to having no filter?

I think in both of the aforementioned cases (which boil down to the same logic, if I'm not mistaken) the patch prevents the attacker from requesting unannounced transactions straight away; they will have to wait for m_next_inv_send_time to go off.

Contributor


Ah, I see what you mean. That makes sense.

Next point on those lines though: adding many things to m_recently_announced_invs can overflow the bloom filter, either causing false positives (leading to privacy bugs), or since it's a rolling bloom filter, causing overflow leading to false negatives (leading to functionality bugs where you can't get the txs from the MEMPOOL response). Wouldn't it be better to make the insertion conditional on txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY ? Even then this approach seems like it can trigger false positives fairly easily, since the rolling bloom filter capacity is only 3500 txs?
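The overflow concern above can be illustrated with a toy filter. This is a minimal, standalone Bloom filter written for demonstration only; it is not Core's CRollingBloomFilter, and the sizes and names are made up for the example:

```python
import hashlib

class TinyBloom:
    """Minimal fixed-size Bloom filter, for illustration only
    (not Bitcoin Core's CRollingBloomFilter)."""
    def __init__(self, bits=1024, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.bitset = 0

    def _positions(self, item):
        # Derive `hashes` bit positions from salted SHA256 digests
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:4], "little") % self.bits

    def insert(self, item):
        for p in self._positions(item):
            self.bitset |= 1 << p

    def contains(self, item):
        return all(self.bitset >> p & 1 for p in self._positions(item))

def false_positive_rate(n_inserted, n_probes=1000):
    bf = TinyBloom()
    for i in range(n_inserted):
        bf.insert(f"tx{i}")
    # Probe with items that were never inserted
    return sum(bf.contains(f"other{i}") for i in range(n_probes)) / n_probes

# Once insertions far exceed the filter's design capacity, almost every
# lookup becomes a false positive (which, for m_recently_announced_invs,
# would mean serving transactions that were never actually announced).
assert false_positive_rate(100) < 0.1
assert false_positive_rate(5000) > 0.9
```

The same saturation effect is what the rolling variant avoids by cycling generations, at the cost of the false negatives mentioned above: old entries are forgotten once the filter rolls over.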

Member Author

@sr-gi sr-gi May 10, 2023


Wouldn't it be better to make the insertion conditional on txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY ?

Yeah, I think that makes perfect sense, given there is no point in adding something to the filter if we are going to unconditionally serve it anyway.

Even then this approach seems like it can trigger false positives fairly easily, since the rolling bloom filter capacity is only 3500 txs?

It may indeed. I haven't got my head around how to properly compute the odds of this happening though, given it feels like it depends on how often MEMPOOL messages are sent to us, which may be unbounded (how many MEMPOOL requests can we receive in an UNCONDITIONAL_RELAY_DELAY period, and how full may our mempool be?). In any case, this feels detrimental to the peer implementing this behavior rather than to us (?)

Something I'm really wondering is why this is actually bounded by INVENTORY_MAX_RECENT_RELAY (3500) instead of by MAX_PEER_TX_ANNOUNCEMENTS (5000). It feels quite inconsistent that we are able to accept up to 5000 transaction announcements from a peer but only able to remember the last 3500 that we have announced to someone else.

I think what may make even more sense would be to reply to mempool messages only with transactions that fall into the txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY check, that way we don't really have to add transactions to m_recently_announced_invs here, and we are also discouraging peers bloating us with MEMPOOL messages. As long as they have a filter set, any transaction that matches their filter will be announced to them anyway (h/t @sdaftuar). I guess this could be done as a followup if it makes sense.
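That follow-up idea can be sketched roughly as follows. This is illustrative Python, not Core's API: `build_mempool_reply` and its arguments are hypothetical names, while UNCONDITIONAL_RELAY_DELAY matches the 2-minute constant in net_processing.cpp:

```python
# Sketch of the follow-up idea: only include transactions in the MEMPOOL
# reply once they are old enough to be served unconditionally, so nothing
# needs to be inserted into m_recently_announced_invs at all.
UNCONDITIONAL_RELAY_DELAY = 2 * 60  # seconds, as in net_processing.cpp

def build_mempool_reply(entries, now):
    """entries: iterable of (txid, m_time) pairs (hypothetical shape)."""
    return [txid for txid, m_time in entries
            if m_time <= now - UNCONDITIONAL_RELAY_DELAY]

now = 10_000
entries = [("tx_old", now - 300), ("tx_fresh", now - 10)]
# Only the sufficiently old transaction makes it into the reply; the fresh
# one would be announced later through the normal INV trickle anyway.
assert build_mempool_reply(entries, now) == ["tx_old"]
```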

Member Author


Wouldn't it be better to make the insertion conditional on txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY ?

Also wouldn't that also apply here? https://github.com/bitcoin/bitcoin/blob/27c40e3dcf280a137bcfaa787674c26781d911b8/src/net_processing.cpp#L5688

Member


Whoops. I was incorrectly assuming that a reply to a mempool message was rate-limited and size-limited. (Which, on second thought, probably doesn't make sense to do.) So a single reply can hold up to 50'000 inv entries, which will certainly overflow. Sorry for the distraction.

Member

@maflcko maflcko May 15, 2023


Side note: The 50k max inv size also seems to overflow, which is probably another source for false negatives, depending on the use case.

Image stolen from comment #27426 (comment) by 0xB10C.

Member Author


Side note: The 50k max inv size also seems to overflow, which is probably another source for false negatives, depending on the use case.

You lost me there. Do you mean this may be a possible source of false negatives for the current path, or for both?

In this patch, I can see that potentially being an issue (that's where AJ's original concern comes from), but I don't see how it is an issue for master, given nothing is added to m_recently_announced_invs when building these huge INVs as a result of a MEMPOOL message.

Member


You lost me there.

Apologies, I typed something incorrect, again.

Member Author


All good, I was just wondering if I may have been missing something.

Member

@glozow glozow left a comment


Concept ACK

The issue on master that I believe this addresses is being able to learn exactly when a transaction enters your mempool. You send mempool and get the response, then put out the tx, then getdata getdata getdata getdata... until the node tells you the tx instead of notfound. Also in general I think it's quite sensible to only serve transactions you announced.

@sr-gi sr-gi force-pushed the mempool-not-inved branch from c04f145 to 27c40e3 on May 9, 2023 19:32
@sr-gi
Member Author

sr-gi commented May 9, 2023

Addressed comments by @MarcoFalke and fixed a couple of logs from the tests

@willcl-ark
Member

Concept ACK

@darosior
Member

I'm only a casual lurker on the net_processing side but i was under the impression we were thinking of the BIP35 as permissioned anyways. If we are already trusting the peer that sends us a MEMPOOL message, why not just include everything we've got?
A practical counter-argument to this could be - yeah but at this time around 20% to 30% of reachable nodes set peerbloomfilters so it's still worth closing an obvious fingerprinting vector.

Another thing is whether it's breaking any usecase. Since the message is from a trusted peer a client may be relying on getting the whole mempool. For instance BTCPay/NBXplorer needs whitelist=mempool iirc. What's the proportion of announced vs non-announced transactions? How much would be left out with this patch?

@willcl-ark
Member

i was under the impression we were thinking of the BIP35 as permissioned anyways.
...
For instance BTCPay/NBXplorer needs whitelist=mempool iirc.

Just on these points, currently the MEMPOOL message will be responded to either if you have the whitelist permission mempool on the peer or if you have enabled NODE_BLOOM (with -peerbloomfilters). In the latter case I don't think we can accurately describe these as "trusted peers". See the changes in 4581a68 (#27559) which clarify this.

@vasild
Contributor

vasild commented May 11, 2023

The title "avoid serving non-announced txs as a result of a MEMPOOL message" is confusing - it gives the impression that this PR will change the response to MEMPOOL messages, but it does not do that. Before and after the PR we would send the full mempool in a reply to mempool (correct me if I got it wrong).

@glozow
Member

glozow commented May 11, 2023

What's the proportion of announced vs non-announced transactions? How much would be left out with this patch?

The PR (as it is right now) still announces everything in response to mempool. This is saying "I haven't even announced this to you, why are you requesting it from me?" A normal client, who presumably does not already know what it wants to request until it receives an announcement (i.e. using mempool to learn about transactions), shouldn't be affected at all.

@vasild
Contributor

vasild commented May 11, 2023

Is it the case that this PR will change the behavior in the following two cases (from the PR title and description I got the impression that it is wider scope):

Case 1

  • the node receives MEMPOOL request and replies to it
  • in the same second a new transaction enters the mempool (after the MEMPOOL message)
  • the node receives GETDATA for that transaction

In master it would send the transaction even though it never advertised it with INV due to the txinfo.m_time <= mempool_req check (both values would be equal).

In this PR it would claim "not found".

This can be resolved by increasing the precision of TxRelay::m_last_mempool_req and TxMempoolInfo::m_time or by changing the condition to <.
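Case 1 hinges on second-resolution timestamps; a minimal illustration (illustrative Python, with arbitrary example timestamps):

```python
# With 1-second timestamp resolution, a transaction that enters the
# mempool *after* the MEMPOOL request but within the same second compares
# as if it had been part of the reply.
mempool_req = 1683800000   # second in which the MEMPOOL request was processed
tx_time     = 1683800000   # tx arrived afterwards, but in the same second

assert tx_time <= mempool_req        # master's check: the tx gets served
assert not (tx_time < mempool_req)   # a strict '<' would correctly reject it
```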

Case 2

  • the node receives MEMPOOL request and replies to it, but some transaction is omitted from the reply because it is not in TxRelay::m_bloom_filter.
  • the node receives GETDATA for that transaction

In master it would send the transaction even though it never advertised it with INV because PeerManagerImpl::FindTxForGetData() does not check the bloom filter.

In this PR it would claim "not found".

Is this a problem at all? If yes, then it can be resolved by checking the bloom filter from PeerManagerImpl::FindTxForGetData().

@sr-gi
Member Author

sr-gi commented May 11, 2023

I'm only a casual lurker on the net_processing side but i was under the impression we were thinking of the BIP35 as permissioned anyways. If we are already trusting the peer that sends us a MEMPOOL message, why not just include everything we've got? A practical counter-argument to this could be - yeah but at this time around 20% to 30% of reachable nodes set peerbloomfilters so it's still worth closing an obvious fingerprinting vector.

We still do. This is not modifying the behavior of the announcements that result from a mempool message, but the behavior of the node to the getdata messages that may follow. Before the patch, a node was able to send us a mempool message and, from that point on, request any transaction from our mempool (even if we haven't announced that to them at all).

Another thing is whether it's breaking any usecase. Since the message is from a trusted peer a client may be relying on getting the whole mempool. For instance BTCPay/NBXplorer needs whitelist=mempool iirc. What's the proportion of announced vs non-announced transactions? How much would be left out with this patch?

As of 27c40e3 we still do, even though I would argue that, by doing this, we are encouraging peers to spam us with mempool messages instead of waiting for us to announce the transactions that match their filters in due time.

@sr-gi
Member Author

sr-gi commented May 11, 2023

The title "avoid serving non-announced txs as a result of a MEMPOOL message" is confusing - it gives the impression that this PR will change the response to MEMPOOL messages, but it does not do that. Before and after the PR we would send the full mempool in a reply to mempool (correct me if I got it wrong).

Maybe it should read "after receiving a MEMPOOL message" instead of "as a result of a MEMPOOL message".

What this PR is trying to achieve is to prevent us from responding to getdata requests about unannounced transactions after receiving a mempool message.

@sr-gi
Member Author

sr-gi commented May 11, 2023

Is it the case that this PR will change the behavior in the following two cases (from the PR title and description I got the impression that it is wider scope):

Case 1

  • the node receives MEMPOOL request and replies to it
  • in the same second a new transaction enters the mempool (after the MEMPOOL message)
  • the node receives GETDATA for that transaction

In master it would send the transaction even though it never advertised it with INV due to the txinfo.m_time <= mempool_req check (both values would be equal).

In this PR it would claim "not found".

This can be resolved by increasing the precision of TxRelay::m_last_mempool_req and TxMempoolInfo::m_time or by changing the condition to <.

This is exactly what we are trying to prevent, but I don't think your approach fixes the issue. This condition is checking that our inv (txinfo) is more recent than the last mempool_req, meaning that once a mempool request has been processed the peer can request any transaction from our mempool. Why is this bad? It makes transactions that enter our mempool easily probeable. The node can create a transaction (tx), send it through any other link, and spam us with getdata messages until we reply with it.

Case 2

  • the node receives MEMPOOL request and replies to it, but some transaction is omitted from the reply because it is not in TxRelay::m_bloom_filter.
  • the node receives GETDATA for that transaction

In master it would send the transaction even though it never advertised it with INV because PeerManagerImpl::FindTxForGetData() does not check the bloom filter.

In this PR it would claim "not found".

Is this a problem at all? If yes, then it can be resolved by checking the bloom filter from PeerManagerImpl::FindTxForGetData().

Arguably this is the expected behavior, master's is not.

The response to the mempool message didn't include an INV for that specific transaction (neither in the patch nor in master) because it didn't match the filter set by the peer; therefore, the peer should not be interested in it. As for why we claim not found after the peer requests the transaction (which is what makes this different from master), I think we can default to the same reasoning as in Case 1.

@vasild
Contributor

vasild commented May 11, 2023

txinfo.m_time <= mempool_req

This condition is checking that our inv (txinfo) is more recent than the last mempool_req, meaning that once a mempool request has been processed the peer can request any transaction from our mempool.

I don't get it. Is it not that the condition is checking whether txinfo is older than the last mempool_req? In that case the transaction has been part of the MEMPOOL reply, so we have sent an INV about it.

@glozow
Member

glozow commented May 11, 2023

txinfo.m_time <= mempool_req

This condition is checking that our inv (txinfo) is more recent than the last mempool_req, meaning that once a mempool request has been processed the peer can request any transaction from our mempool.

I don't get it. Is it not that the condition is checking whether txinfo is older than the last mempool_req? In that case the transaction has been part of the MEMPOOL reply, so we have sent an INV about it.

Ah, I think you're right! So this changes Case 2, not Case 1.

@sr-gi
Member Author

sr-gi commented May 11, 2023

txinfo.m_time <= mempool_req

This condition is checking that our inv (txinfo) is more recent than the last mempool_req, meaning that once a mempool request has been processed the peer can request any transaction from our mempool.

I don't get it. Is it not that the condition is checking whether txinfo is older than the last mempool_req? In that case the transaction has been part of the MEMPOOL reply, so we have sent an INV about it.

Ah, I think you're right! So this changes Case 2, not Case 1.

He is indeed. The test for Case 1 was passing because both requests were processed within the same second 😅

@vasild
Contributor

vasild commented May 12, 2023

No, wait! This PR changes both Case1 and Case2, but Case1 is only about things happening in the same second.

What I tried yesterday:

  • Run the newly added tests, they were passing, of course. Cool!
  • Revert the net_processing.cpp changes; both tests ("Request the irrelevant transaction even though it has not being offered to us" and "Create a third transaction (that was therefore not part of the mempool message) and try to...") failed. Good! To reach the second test I disabled the assert from the first.

But I couldn't explain why the second test would fail (without net_processing.cpp changes aka on master). Then I suspected things happen in the same second in the test and added time.sleep(1) before creating the third transaction and then the test passed.

Concept ACK. I agree that we should not respond to GETDATA asking for a transaction which we did not INV to that peer and this PR does that.

However, I agree with @ajtowns's concern raised above about blowing up m_recently_announced_invs. That bloom filter has size 3500 and if we put 300k elements in it, then it is probably not going to work as expected.
Approach NACK due to that.

So, to achieve the goal of not replying to GETDATA without a prior INV we have to remember all INVs sent to the peer, but that is kind of "hard" considering a MEMPOOL reply sends 300k INVs...

@vasild
Contributor

vasild commented May 12, 2023

Looking at the comment of CRollingBloomFilter if m_recently_announced_invs is changed from 3500 to 300000 then its size would increase from 37KB to 3MB. I guess that is too much and maybe was the reason why MEMPOOL replies don't use m_recently_announced_invs and instead use the m_last_mempool_req timing mechanism.
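For reference, those size figures are consistent with a rule of thumb of roughly 1.8 bytes per element per factor of 10 of false-positive rate (m_recently_announced_invs uses a 0.000001 rate, i.e. six factors). A quick back-of-the-envelope check, with the rule-of-thumb constant as the only assumption:

```python
def rolling_bloom_bytes(n_elements, fp_rate_factors=6):
    """Rough size estimate: ~1.8 bytes per element per factor of 0.1 of
    false-positive rate. fp_rate_factors=6 corresponds to the 0.000001
    rate used by m_recently_announced_invs."""
    return 1.8 * fp_rate_factors * n_elements

# Matches the figures quoted above
assert round(rolling_bloom_bytes(3500) / 1024) == 37        # ~37 KB
assert round(rolling_bloom_bytes(300_000) / 1024**2) == 3   # ~3 MB
```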

@sr-gi
Member Author

sr-gi commented May 12, 2023

No, wait! This PR changes both Case1 and Case2, but Case1 is only about things happening in the same second.

I meant regarding the time checks in master: the checks were indeed protecting against querying something that was in the mempool before the mempool message was received. Meaning that, while this patch fixes the issues around requesting data after a mempool message is received, those issues are not as severe as initially assumed.

However, I agree with @ajtowns's concern raised above about blowing up m_recently_announced_invs. That bloom filter has size 3500 and if we put 300k elements in it, then it is probably not going to work as expected. Approach NACK due to that.

I think at this point there are two easy solutions to the problem (with their corresponding tradeoffs, of course):

  1. Do things as stated in the patch but guarding addition to the m_recently_announced_invs rolling filter by checking that the data we are adding is not older than two minutes, as @ajtowns originally suggested. This will prevent us from adding data that will be served unconditionally anyway, reducing the odds of overflowing the filter. However, I'm not sure how AJ computed the odds of the filter overflowing due to high traffic, so I'm not able to convince myself this is enough.

  2. Keep the m_recently_announced_invs bypass as it is in master, but do check the bloom filter when deciding whether to relay something that may have been announced to a peer as a response to a mempool message. This has two main drawbacks:

@sr-gi sr-gi force-pushed the mempool-not-inved branch from 27c40e3 to 8d640ad on May 12, 2023 19:53
@sr-gi
Member Author

sr-gi commented May 12, 2023

Addresses @willcl-ark's comments and fixes the tests to make sure the MEMPOOL message and the second GETDATA message are not processed within the same second, as per @vasild's comment.

@sr-gi
Member Author

sr-gi commented May 22, 2023

Adds invs to m_recently_announced_invs conditional on them not being unconditionally relayable, and rebases on master.

@DrahtBot
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

@luke-jr
Member

luke-jr commented Jun 23, 2023

I wonder if there should be a noprivacy permission flag...

@maflcko
Member

maflcko commented Aug 17, 2023

Are you still working on this?

@sr-gi
Member Author

sr-gi commented Aug 17, 2023

Are you still working on this?

I wonder if this is still meaningful now that #27675 has landed

@maflcko
Member

maflcko commented Aug 17, 2023

Maybe the test? Though, I haven't looked at it recently.

@sr-gi
Member Author

sr-gi commented Aug 17, 2023

Maybe the test? Though, I haven't looked at it recently.

That may make sense. I'll take a look and rebase

@sr-gi
Member Author

sr-gi commented Sep 14, 2023

Maybe the test? Though, I haven't looked at it recently.

The test can be adapted to fit the logic after #27675, but the patch itself does not seem relevant anymore. Would it be worth reworking this PR or just closing it and opening one for the test?

@maflcko
Member

maflcko commented Sep 14, 2023

Thanks for taking a look. Seems fine to open a new test-only pull

@sr-gi
Member Author

sr-gi commented Sep 14, 2023

I'm closing this one then

@sr-gi sr-gi closed this Sep 14, 2023
@bitcoin bitcoin locked and limited conversation to collaborators Sep 13, 2024

9 participants