Address mempool information leak and resource wasting attacks. #7093
Conversation
For some time there has been some party connecting to every node on the network (sometimes multiple times) and fetching the mempool over and over again at high speed. This wastes a ton of the nodes' bandwidth. These fetches are also an information leak, as the frequent polling bypasses the trickle logic (and I believe that is their actual motivation, based on other behavior by the same IP).

Jgarzik suggested removing the mempool command entirely, but it's used by some lite clients to display unconfirmed transactions that arrived before they connected. This change is more polite, preserving the use case for lite wallets without making the call an attractive nuisance for third parties to waste a node's bandwidth with various information gathering attacks.

We should also look to see if we've created other trickle-bypassing bugs (e.g. responding to a getdata on a tx we've never INVed), and consider limiting the maximum size of the mempool response. Now that we get the result back in feerate order, it would be pretty reasonable to return only the next couple blocks' worth. Anyone know a reason we shouldn't?
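A minimal sketch of that size cap, assuming the entries are already available in descending feerate order; `Entry`, `TopTwoBlocks`, and the field names are illustrative stand-ins, not the actual mempool types:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-in for a mempool entry; the real CTxMemPoolEntry
// is richer, this only carries what the cutoff logic needs.
struct Entry {
    std::string txid;
    size_t size;  // serialized transaction size in bytes
};

// Given entries already sorted by descending feerate, keep only the
// txids that would fit into roughly the next two blocks.
std::vector<std::string> TopTwoBlocks(const std::vector<Entry>& sorted)
{
    const size_t kMaxBlockSize = 1000000;  // 1 MB consensus limit at the time
    std::vector<std::string> out;
    size_t bytes = 0;
    for (const Entry& e : sorted) {
        if (bytes + e.size > 2 * kMaxBlockSize) break;
        bytes += e.size;
        out.push_back(e.txid);
    }
    return out;
}
```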
src/rpcblockchain.cpp
Outdated
We've been using std::numeric_limits<int64_t>::max() elsewhere (also INT64_MAX seems to fail on arm?)
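For context, one plausible reason `INT64_MAX` can bite (a guess, not stated in the thread): pre-C++11, the C99 limit macros were only defined in C++ if `__STDC_LIMIT_MACROS` was defined before including the header, so the portable spelling is:

```cpp
#include <cstdint>
#include <limits>

// Portable across pre-C++11 toolchains: no reliance on the
// __STDC_LIMIT_MACROS opt-in that INT64_MAX historically needed.
int64_t nStartTime = std::numeric_limits<int64_t>::max();
```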
Pulltester fails because of known inadvisable stuff it does with the mempool command. I'm not sure what to do about that. I updated the patch so that the filter commands would reset the starting time to zero, allowing re-filtering of already filtered stuff. I did this after observing breadwallet making multiple mempool requests (after keys were added). The downside of this is that it now leaves open a DoS vector when bloom filtering is enabled, but there are so many of those that I don't know if it's worth worrying about.
Well, the proper fix would be that tests that want to probe the mempool should use the privileged RPC commands instead of the P2P message. Not sure how much work that would be in practice.
src/txmempool.cpp
Outdated
I'm not sure I understand the rationale behind not using the index because it would leak information. We're still querying on mi->GetTime() so 'leaking' the same information?
If the order of the result is the problem, we could just sort the result by hash before returning.
Right, the order is the problem. It's true that it could sort, but that's another expensive operation, and the result being in fee-sorted order is quite useful... e.g. if the peer wanted to grab just the top of the mempool.
I'm not sure how a post-sort of the result would stack up against bypassing a linear scan when an index exists anyway. It depends on the input and output sizes.
At the least, this compromise is more subtle than the comment makes it look.
We can also use any of the other indexes we have, like the feerate one?
But the query is by time range - so wouldn't the time be the only index that helps speed this up?
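For what a time index buys here, a self-contained sketch using boost::multi_index (the container underlying the mempool in that era); the `TxEntry` type and field names are illustrative:

```cpp
#include <cstdint>
#include <string>
#include <vector>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>

struct TxEntry {
    std::string txid;
    int64_t nTime;  // arrival time
};

namespace bmi = boost::multi_index;
typedef bmi::multi_index_container<
    TxEntry,
    bmi::indexed_by<
        bmi::ordered_non_unique<
            bmi::member<TxEntry, int64_t, &TxEntry::nTime> >
    >
> TimeIndexedPool;

// A time-range query becomes two O(log n) lookups plus iteration over
// just the matching entries, instead of a scan of the whole pool.
std::vector<std::string> InRange(const TimeIndexedPool& pool,
                                 int64_t lo, int64_t hi)
{
    std::vector<std::string> out;
    auto it = pool.lower_bound(lo);
    const auto end = pool.upper_bound(hi);
    for (; it != end; ++it)
        out.push_back(it->txid);
    return out;
}
```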
Concept ACK
So I think I have two mutually exclusive approaches to suggest: one is the current PR code that has a start and quantized end time. The purpose of the start is to limit how much subsequent requests return. The alternative is the quantized end time and a max depth, where it returns the top X of the feerate-sorted mempool (but only txn which are before the end time). This one might even work with the pull tester (if the delays are set suitably). I think the latter may be more resource friendly.

The reason the two cannot be efficiently combined is that a tx might not be sent because it's too deep, but then a block happens and everything moves up... and then it won't be transmitted because it's too early. No great harm; the duplicate removal is mostly to avoid using lots of bandwidth, and limiting would accomplish the same.

The primary downside of the latter is that if you're interested in some transaction deep in the mempool, you won't learn about it until it's close to getting mined. The advantage is that it's simpler behavior that is easier to understand and explain, and also more resource friendly.
A third option would be to maintain a limited-size set of responses already sent, and do the only-top-N-of-the-mempool trick. That would combine both benefits (the CPU benefit from considering only the top of the mempool, and the bandwidth benefit from not being willing to just send the same data multiple times).
Switched to that third option, using setInventoryKnown and only considering the top 8192 entries in the mempool.
TIL the default size of setInventoryKnown is only 1000. It was 10000 before the default send buffer size was reduced.
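For readers unfamiliar with it, setInventoryKnown is an mruset: a set with a size cap that forgets the oldest insertions first, which is why the cap directly bounds how long a peer "remembers" what it has been INVed. A minimal sketch of that structure (the real mruset.h differs in detail):

```cpp
#include <cstddef>
#include <deque>
#include <set>

// Bounded set with FIFO eviction: once full, inserting a new element
// erases the oldest one, so membership is only remembered for the
// last `cap` distinct insertions.
template <typename T>
class MruSet {
    std::set<T> items;
    std::deque<T> order;  // oldest element at the front
    size_t cap;
public:
    explicit MruSet(size_t cap_) : cap(cap_) {}
    bool count(const T& v) const { return items.count(v) != 0; }
    void insert(const T& v) {
        if (!items.insert(v).second) return;  // already present
        order.push_back(v);
        if (order.size() > cap) {
            items.erase(order.front());
            order.pop_front();
        }
    }
};
```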
@dgenr8 ouch. I don't think it makes sense to link it to max_sendbuffer at all. I've unlinked it and set it to 50,000, so that we can at least take a whole inv on it. (I created another PR that changes it to a rolling bloomfilter; if that goes forward I'll rebase this on that.)
Currently an attacker can poll mempool frequently to bypass trickling logic and trace a transaction through the network. This also wastes lots of bandwidth by causing the same huge invs to be sent over and over.

This change makes mempool not return results with an arrival time greater than the current time minus 16, rounded down to a multiple of 16 seconds. This is a 16 to 32 second delay. It also makes mempool calls return only responses which have not been INVed before (using the known inv mruset) and limits the mempool command to considering only the top 8192 entries in the mempool.

This also introduces the constant MAX_SETINVENTORYKNOWN_SZ and sets it to 50,000 (large enough to hold the inventory for a maximum sized INV), which is a substantial increase from the current default of 1000.
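The quantized cutoff described above is simple integer arithmetic; a sketch (the function name is illustrative):

```cpp
#include <cstdint>

// "Current time minus 16, rounded down to a multiple of 16": only
// transactions with arrival time <= cutoff are returned, so a tx
// becomes visible between 16 and 32 seconds after it arrives,
// depending on where "now" falls within the 16-second window.
int64_t MempoolCutoff(int64_t now)
{
    return ((now - 16) / 16) * 16;
}
```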
From 1000 to 50,000 is quite an increase. How does that (approximately) affect worst-case memory usage per connected node?
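For rough scale (a back-of-envelope estimate, not a figure from the thread): a CInv is a 4-byte type plus a 32-byte hash, so 50,000 entries carry about 1.8 MB of payload per peer; the node overhead of the backing std::set (pointers and color per red-black tree node) plus the insertion-order queue can plausibly push that to a few megabytes per connection, versus roughly a hundred kilobytes at the old default of 1000.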
Wouldn't this pull (designed to make access to mempool information more difficult) damage future fee markets? I thought one of the points of RBF was to make it so we could keep small blocks and just pay higher necessary fees to have a transaction included in a block. If there are tens of thousands of transactions clamoring to get included in blocks, then those wallets could alter their transaction and rebroadcast it with a higher fee.

It's like a few water fountains in the desert that can only support 7 drinkers per second, but tens of thousands of people increasing their bids for access to the water before they die (I believe the current transaction time-out is 72 hours?). If they don't have enough to bid extra, then they didn't deserve access anyway.

For a proper fee market, shouldn't you be able to hear what other people are bidding for access at that moment? If this pull goes through, wouldn't it make it difficult for services to determine the most up-to-date pricing information on what fee is likely to get included in a block? You can get information from the confirmed blocks, but sometimes the 10 minute average between blocks results in spans of over an hour between blocks. Without the real-time information, it won't be possible to adjust fees to try to be on top of the pile of transactions that'll confirm in the next block.
Needs rebase, and to pass travis
@DeftNerd mempool messages are not suitable for that; you'll hear the transactions through the normal transfer mechanism anyway. If anything this pull makes it more suitable, since it's not sending you megabytes of IDs a dozen blocks deep in the history that you've already received from it. The delay it adds is only a few seconds, which isn't that large compared to the operation of the network.

@laanwj K. Pull tester will be fun; the blocktester misuses the mempool command for operation checking.
Oh the blocktester... bleh I can't help you with that.
We could also deprecate the blocktester if we feel that its task has been sufficiently surpassed by the new test framework. But I'm not sure of this (in large part because I've never had a good grasp on what it actually does, it being external software, and in Java for that matter).
@TheBlueMatt Would it be possible to reduce the blocktester to only do consensus checks, and not mempool checks?
If your intention is to add the top sorted txs by fee rate, then you need to iterate over index 3.
We should keep in mind the other uses that index takes on (like this one, plus utxo cache priming and fee estimation), because the intention had been to change that index to sort by ancestor fee-rate packages, which isn't necessarily what's wanted for the other use cases.
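"Index 3" is positional access into the mempool's boost::multi_index container. A compact illustration of that access pattern under an assumed, simplified index layout (the real container has more indexes; only the get<N>() call and the 8192 cap come from this thread):

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>

struct PoolEntry {
    std::string txid;
    double feeRate;  // illustrative: satoshis per byte
};

namespace bmi = boost::multi_index;
typedef bmi::multi_index_container<
    PoolEntry,
    bmi::indexed_by<
        // index 0: some primary ordering (here, by txid)
        bmi::ordered_unique<
            bmi::member<PoolEntry, std::string, &PoolEntry::txid> >,
        // index 1: descending fee rate -- stands in for "index 3"
        // in the real mempool layout
        bmi::ordered_non_unique<
            bmi::member<PoolEntry, double, &PoolEntry::feeRate>,
            std::greater<double> >
    >
> Pool;

// Walk the fee-rate index, stopping after the top N entries
// (the PR caps this at 8192).
void TopByFeeRate(const Pool& pool, size_t n)
{
    const auto& byFee = pool.get<1>();  // get<3>() in the real layout
    size_t count = 0;
    for (auto it = byFee.begin(); it != byFee.end() && count < n; ++it, ++count) {
        // e.g. queue it->txid for the mempool reply
    }
}
```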
To be clear: this is blocked by #4545
Related (but not a replacement): #7840
#7840 moved forward enough that I'll close this for now.