-
Notifications
You must be signed in to change notification settings - Fork 38.7k
net processing: avoid serving non-announced txs as a result of a MEMPOOL message #27602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Motivation: this fix was discussed over IRC by @willcl-ark @glozow @sipa and @darosior |
src/net_processing.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this just change the fingerprinting vector from GETDATA txid to FILTERCLEAR; FILTERADD whatever; MEMPOOL ? In which case we're just as easily fingerprinted, but we spend more cpu while being fingerprinted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean keep adding what you're interested in into the filter so it triggers an INV from the node and, therefore, we can request it (or, similarly, keep resetting the filter and just add the one single transaction that we want to probe)? If so, isn't that equivalent to having no filter?
I think in both of the aforementioned cases (which boil down to the same logic if I'm not mistaken) the patch prevents the attacker from requesting unannounced transactions straight-away, they will have to wait for m_next_inv_send_time to go off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see what you mean. That makes sense.
Next point on those lines though: adding many things to m_recently_announced_invs can overflow the bloom filter, either causing false positives (leading to privacy bugs), or since it's a rolling bloom filter, causing overflow leading to false negatives (leading to functionality bugs where you can't get the txs from the MEMPOOL response). Wouldn't it be better to make the insertion conditional on txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY ? Even then this approach seems like it can trigger false positives fairly easily, since the rolling bloom filter capacity is only 3500 txs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to make the insertion conditional on
txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY?
Yeah, I think that makes perfect sense, given there is no point in adding something to the filter if we are going to unconditionally serve it anyway.
Even then this approach seems like it can trigger false positives fairly easily, since the rolling bloom filter capacity is only 3500 txs?
It may indeed. I haven't got my head around how to properly compute what are the odds of this happening though, given it feels like it depends on how often MEMPOOL messages are sent to us, which may be unbounded (how many MEMPOOL requests can we receive in a UNCONDITIONAL_RELAY_DELAY period, and how full our mempool may be?). In any case, this feels detrimental for the peer implementing this behavior instead of for us (?)
Something I'm really wondering is why this is actually bound by INVENTORY_MAX_RECENT_RELAY (3500) instead of by MAX_PEER_TX_ANNOUNCEMENTS (5000). It feels quite inconsistent that we are able to accept up to 5000 transaction announcements from someone but we are only able to remember the last 3500 that we have announced to someone(else).
I think what may make even more sense would be to reply to mempool messages only with transactions that fall into the txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY check, that way we don't really have to add transactions to m_recently_announced_invs here, and we are also discouraging peers bloating us with MEMPOOL messages. As long as they have a filter set, any transaction that matches their filter will be announced to them anyway (h/t @sdaftuar). I guess this could be done as a followup if it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to make the insertion conditional on
txinfo.m_time <= now - UNCONDITIONAL_RELAY_DELAY?
Also wouldn't that also apply here? https://github.com/bitcoin/bitcoin/blob/27c40e3dcf280a137bcfaa787674c26781d911b8/src/net_processing.cpp#L5688
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I was incorrectly assuming that a reply to a mempool message was rate-limited and size-limited. (Which on a second thought, probably doesn't make sense to do). So a single reply can hold up to 50'000 inv entries, which will certainly overflow. Sorry for the distraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: The 50k max inv size also seems to overflow, which is probably another source for false negatives, depending on the use case.
Image stolen from the comment #27426 (comment) by 0xB10C .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: The 50k max inv size also seems to overflow, which is probably another source for false negatives, depending on the use case.
You lost me there. Do you mean this may be a possible source of false negatives for the current path, or for both?
In this patch, I can see that potentially being an issue (that's what the original concern from AJ's comes from), but I don't see how that is an issue for master, given nothing is added to m_recently_announced_invs when building this huge INVs as a result of a MEMPOOL message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You lost me there.
Apologies, I typed something incorrect, again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, I was just wondering if I may have been missing something.
glozow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
The issue on master that I believe this addresses is being able to learn exactly when a transaction enters your mempool. You send mempool and get the response, then put out the tx, then getdata getdata getdata getdata... until the node tells you the tx instead of notfound. Also in general I think it's quite sensible to only serve transactions you announced.
|
Addressed comments by @MarcoFalke and fixed a couple of logs from the tests |
|
Concept ACK |
|
I'm only a casual lurker on the net_processing side but i was under the impression we were thinking of the BIP35 as permissioned anyways. If we are already trusting the peer that sends use MEMPOOL message, why not just include everything we've got? Another thing is whether it's breaking any usecase. Since the message is from a trusted peer a client may be relying on getting the whole mempool. For instance BTCPay/NBXplorer needs |
Just on these points, currently the MEMPOOL message will be responded to either if you have the whitelist permission |
|
The title "avoid serving non-announced txs as a result of a MEMPOOL message" is confusing - it gives the impression that this PR will change the response to |
The PR (as it is right now) still announces everything in response to mempool. This is saying "I haven't even announced this to you, why are you requesting it from me?" A normal client, who presumably does not already know what it wants to request until it receives an announcement (i.e. using |
|
Is it the case that this PR will change the behavior in the following two cases (from the PR title and description I got the impression that it is wider scope): Case 1
In In this PR it would claim "not found". This can be resolved by increasing the precision of Case 2
In In this PR it would claim "not found". Is this a problem at all? If yes, then it can be resolved by checking the bloom filter from |
We still do. This is not modifying the behavior of the announcements that result from a
As of 27c40e3 we still do, even though I would argue that, by doing this, we are encouraging peers to spam us with mempool messages instead of waiting for us to announce the transactions that match their filters in due time. |
Maybe it should read "after receiving a MEMPOOL message" instead of "as a result of a MEMPOOL message". What this PR is trying to achieve is to prevent us from responding to getdata requests about unannounced transactions after receiving a mempool message. |
This is exactly what we are trying to prevent, but I don't think your approach fixes the issue. This condition is checking that our inv (
Arguably this is the expected behavior, master's is not. The response to the mempool message didn't include an INV from that specific transaction (both in the patch nor in master's) because it didn't match the filter set by the peer, therefore, the peer should not be interested in it. With respect to why claiming |
I don't get it. Is it not that the condition is checking whether |
Ah, I think you're right! So this changes Case 2, not Case 1. |
He is indeed. The test for Case 1 was passing because both requests were processed within the same second 😅 |
|
No, wait! This PR changes both Case1 and Case2, but Case1 is only about things happening in the same second. What I tried yesterday:
But I couldn't explain why the second test would fail (without Concept ACK. I agree that we should not respond to However, I agree with @ajtowns's concern raised above about blowing up So, to achieve the goal of not replying to |
|
Looking at the comment of |
I meant regarding the time checks in master, the checks were indeed protecting against querying something that was in the mempool before the mempool message was received. Meaning that while this patch fixes the issues around requesting data after a mempool message is received, they are not as severe as initially assumed.
I think at this point there are two easy solutions to the problem (with their corresponding tradeoffs, of course):
|
|
Addresses @willcl-ark comments and fixes tests to make sure the |
|
Adds invs to |
Currently, non-announced transactions can be served after a BIP35 MEMPOOl message has been received from a peer. By following this pattern, we are potentially allowing peers to fingerprint transactions that may have originated in our node and to perform timing analysis on the propagation of transactions that enter our mempool. This simply removes the relay bypass so only announced transactions are served to peers.
|
🐙 This pull request conflicts with the target branch and needs rebase. |
|
I wonder if there should be a |
|
Are you still working on this? |
I wonder if this is still meaningful now that #27675 has landed |
|
Maybe the test? Though, I haven't looked at it recently. |
That may make sense. I'll take a look and rebase |
The test can be adapted to fit the logic after #27675, but the patch itself does not seem relevant anymore. Would it be worth reworking this PR or just closing it and opening one for the test? |
|
Thanks for taking a look. Seems fine to open a new test-only pull |
|
I'm closing this one then |

Currently, non-announced transactions can be served after a BIP35 MEMPOOl message has been received from a peer. By following this pattern, we are potentially allowing peers to fingerprint transactions that may have originated in our node and to perform timing analysis on the propagation of transactions that enter our mempool.
This simply removes the relay bypass so only announced transactions are served to peers.