Skip to content

Conversation

@ashleyholman
Copy link
Contributor

This patch changes CTxMemPool::mapTx from a std::map to a boost::multi_index_container.

The container has three indexes:

  • Tx Hash
  • Fee Rate (fee-per-kb) - ordered from highest to lowest
  • Priority (at current chain height) - ordered from highest to lowest

The idea is that these indices can later be used for:

  • Creating blocks
  • Mempool eviction

The priority stuff is a bit ugly, so suggestions on alternate designs are welcome. I would also like to add another test case that tests the ConnectTip/DisconnectTip triggers, but I need to do some more work to figure out how to construct such a test.

Indexes on:
- Tx Hash
- Fee Rate (fee-per-kb)
- Priority (at current chain height)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation style is 4 spaces per level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest commit

@sipa
Copy link
Member

sipa commented Jun 24, 2015

Untested ACK. This is one of the few places where I think boost shines.

@sipa
Copy link
Member

sipa commented Jun 24, 2015

It would be nice to see how much the block creation code could be sped up and simplified with this, though.

src/main.cpp Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to know priorities aimed for the next block's height, as that is where the transactions are expected to end up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest commit. Also moved the recalcPriorities call to happen after UpdateTip(), because I didn't think the priority recalc was important enough to delay updating the tip.

- Rename ordered_transaction_set to indexed_transaction_set and move typedef
  inside CTxMemPool
- Use chain height + 1 for mempool priority calculation (also moved the
  calculation to happen after updating the tip, so it doesn't delay the block
  being added to the active chain)
- Fix indentation of test case
@ashleyholman
Copy link
Contributor Author

If this gets merged, the next thing I would look at is changing CreateNewBlock to use the indexes.

But first I need to fix the bug from #6292 because the CreateNewBlock code currently calculates priorities correctly, whereas these indexes use CTxMemPoolEntry::GetPriority which gives incorrect results.

@luke-jr
Copy link
Member

luke-jr commented Jun 27, 2015

@ashleyholman One thing to consider is that these things are policy, so need to be easily modified by the end user - especially for miners. Obviously we're far from that with the current code, but I wouldn't want to make the situation worse (I'm not saying this PR does, but it's something to keep in mind).

@dgenr8
Copy link
Contributor

dgenr8 commented Jul 8, 2015

Multiple indexes are great but, the building-block for eviction policy will probably take the form of an enhanced "priority" calc that depends on fees, and anything else that's required.

Separate indexes leave unsolved the problem of how to combine them to find the next tx to evict.

@sipa
Copy link
Member

sipa commented Jul 8, 2015 via email

@dgenr8
Copy link
Contributor

dgenr8 commented Jul 8, 2015

@sipa Completely agree. At least 2 indexes are needed - txid and "priority".

EDIT: @morcos by quoted "priority" I mean the same thing as "effective fee" or fees with fee deltas applied. Another name for it is an objective function (that which is to be maximized).

@morcos
Copy link
Contributor

morcos commented Jul 8, 2015

Continuing to support transaction priority (as defined by coin age) seems more trouble that its worth. However, by combining this with #6357, it'll be easy and efficient to keep the existing policies in place. I did some timing tests on the first 7 days of July and there was some small increase in the average of 250ms to connect a block. Calling recalcPriorities took about 2ms. In the busy spam period of July 7th, recalcPriorities was averaging about 8ms out of 500ms to connect a block. Those seem like acceptable performance hits to me. #6357 did not seem to adversely affect that performance and causes the priorities to actually be correct.

Where the default policies will break is when we start adding mempool eviction. We'd need to maintain two separate maps of transactions, one for those destined to be included in the fee section and one for those to be included in the priority section of blocks and evict from them independently. I'd argue that instead, we should accept that high priority but low fee transactions may get evicted from the mempool if we hit mempool saturation. Alternatively we could implement @petertodd's exponentially increasing minRelayTxFee and not actually evict things. Either way those are next steps and shouldn't hold up this getting merged.

utACK at this point, I'm going to keep testing.

@dgenr8
Copy link
Contributor

dgenr8 commented Jul 8, 2015

@mikehearn insight: mempool DoS attack may be mitigated through some reliance on coin age priority, as coin age is not something that can be trivially bought on the spot, unlike fee priority.

@ashleyholman
Copy link
Contributor Author

@morcos thanks for measuring the recalcPriorities performance

@jgarzik
Copy link
Contributor

jgarzik commented Jul 10, 2015

Drop the priority index, IMO

@sipa
Copy link
Member

sipa commented Jul 10, 2015

@wumpus @jgarzik @pstratem @gmaxwell @morcos @ashleyholman @jtimon @luke-jr Ping to get attention of the people involved here.

There are a bunch of related but not quite compatible changes being made here (better indexes, limited mempool, removing priority, accounting memory usage). I think we need to figure out what to do here at a higher level.

  1. Remain compatible with the existing priorities system. This means the recomputations of priority as implemented in this PR, and retaining the relatively complex multi-faceted block creation code. This is likely the least controversial change, but implementing correct mempool limiting becomes pretty hard.

  2. Simplify the priority mechanism to become single-faceted. This would mean defining a (potentially policy-dependent) single scoring function for transactions that can give a benefit to specific transactions. Something similar to priority could be implemented by mapping the "zero-fee" transactions onto a negative number -1/priority. A single ordered index would suffice, and mempool limiting and block creation would be simplified significantly, but re-evaluating the ordering would be necessary after every block.

  3. The same as (2), but only compute the policy-dependent scoring once only - when entering the mempool. You can approximate something similar to priority by counting old/large inputs as a virtual small increase in fee. It would not get updated for new blocks, simplifying code for replacement/limiting further.

  4. Just have a single feerate index, essentially limiting ourselves to a single mempool policy. This is arguably the most economic thing to do for miners, and has the best predictability for fee estimation. It's perhaps the most controversial change.

I would prefer 2 or 3.

@jtimon
Copy link
Contributor

jtimon commented Jul 10, 2015

I think my preference is a mix between 3 and 4.
I would prefer a single heuristic function that doesn't need to be recalculated with every block.
My preference: priority(tx, utxo) = fee(tx) / (size(tx) + delta_utxo_size(tx, utxo))
Note that delta_utxo_size(tx, utxo) can be negative but size(tx) + delta_utxo_size(tx, utxo) shouldn't.
In fact, we could make a consensus_size(tx, utxo) = size(tx) + delta_utxo_size(tx, utxo) a consensus rule, but I guess that's a topic for another PR.

essentially limiting ourselves to a single mempool policy

Note that this limitation disappears when we have a CPolicy interface class and a -policy= command line option.

@sipa
Copy link
Member

sipa commented Jul 10, 2015

@jtimon I wasn't even talking about specific policy we would like to see implemented, but rather about what type the "framework" should support. (3) and (4) cannot do the same thing as policy now. (4) doesn't have any policy in the mix at all for prioritization of transactions.

@morcos
Copy link
Contributor

morcos commented Jul 10, 2015

I think I'd vote for 1. Not because I really care about keeping priority, but because I don't want the controversy of removing it to delay the other needed improvements. I can see two downsides to 1. There is some additional memory footprint and complication by maintaining correct priorities, but I think the efficiency hit is tolerable. And the issue about defining "correct" mempool limiting. However, I think that issue exists regardless of whether we maintain priority. @jgarzik's comments about booting oldest first make a lot of sense for relay nodes which are incapable of perfectly guessing the composite miner transaction inclusion policies, but don't make sense for miners who should boot based on their own inclusion policies. Also if we combine a floating relay minimum with these other things, we might be able to reach a situation where limiting the mempool is an edge case safety measure.

@sipa
Copy link
Member

sipa commented Jul 10, 2015

@morcos My largest objection to 1 is that it makes implementing limiting the mempool harder, and (2) can be made to do the (almost?) same thing, but needs a larger code change.

I don't think we can make mempool limiting an edge case. Adjusting the relayfee works preventive, while mempool limiting works on-the-fly.

@jtimon
Copy link
Contributor

jtimon commented Jul 10, 2015

@jtimon I wasn't even talking about specific policy we would like to see implemented, but rather about what type the "framework" should support.

Ok, then I'd like a unified Priority(const CTransaction& tx, const CCoinsViewCache& mapInputs) function and a single unified space: no separated space for low fee transactions. If I understood you correctly, that's the same as 3.

(3) and (4) cannot do the same thing as policy now.

I don't think I understand this. Why can't Priority(const CTransaction& tx, const CCoinsViewCache& mapInputs) be implemented in policy/policy or policy/fees.

@sipa
Copy link
Member

sipa commented Jul 10, 2015

@jtimon (3) and (4) cannot update the score after acceptance into the mempool. Priority is time-dependent (needs to be recomputed after every block).

@ashleyholman ashleyholman force-pushed the mempool_multiindex branch 2 times, most recently from c7e0cbb to 362bf76 Compare July 11, 2015 09:04
@ashleyholman
Copy link
Contributor Author

@sipa I have removed the priority index. The fee-rate index previously used priority as a secondary ordering, so I have now changed that to use nTime instead. Ie. when the fee-rate is equal, the index will favour older transactions.

@jtimon I can add an index on time, but first I would like to understand the reasoning for CTxMemPool::removeOldTransactions. I've posted a comment in #6281 to clarify.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@ashleyholman sure, that it's something that could be always added later anyway (for example, in #6281 ).

What about replacing CompareTxMemPoolEntryByFee with something that calls an independent function instead of calling CTxMemPoolEntry::GetFeeRate() ?
It can be functionally identical as what you have (ie: the independent function also looks returns the feerate), but it will make it easier to support multiple policies with different heuristic functions later (specially if you define the function in policy/policy.cpp). The new priority function can return just an integer instead of a double, and it should take const CTransaction& as parameter instead of a CTxMemPoolEntry like CTxMemPoolEntry::GetFeeRate() [as this "parameter"] IMO.

@sipa
Copy link
Member

sipa commented Jul 11, 2015

@jtimon You might as well just replace the feerate field in CTxMemPoolEntry with a score field, and call a policy function to compute the score when creating the entry, and then simply compare the scores. Alternatively, you definitely have to pass the CTxMemPoolEntry and not CTransaction, as you can't compute the feerate from a CTransaction.

@petertodd
Copy link
Contributor

Concept ACK @sipa's unified priority ideas.

Done right this should make implementing CPFP reasonably easy, as the child can have a high priority, meaning "mine this tx, and whatever txs it depends on" Block creation then needs to be taught to to keep a list of already included txs; if it has a list of already spent txouts we can easily implement CPFP RBF scorched-earth by simply not removing double-spent txs immediately.

I did some work on that idea awhile back; didn't have time/funding to finish it, but it seemed like a reasonable approach at least.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@ashleyholman What I'm saying above, corrected and in code (on top of fad00fb ): f86244e
Also, I think https://github.com/bitcoin/bitcoin/pull/6331/files#diff-8304b3e94624036c3673f31eeb7e9de0R83 removes the need for the time index.

@sipa Yes, there were some wrong things in what I said. I decided that it would be faster for me to communicate with code. Please, let me know what you think about that. Most of the arguments are not used, but they could be useful for some other policies. I believe this is the most policy-flexible thing we can do without disturbing anything.

@sipa
Copy link
Member

sipa commented Jul 11, 2015

@jtimon I fully agree that the sorting order should be something controlled by policy. I think there are more urgent issues than that now, though.

@sipa
Copy link
Member

sipa commented Jul 11, 2015

@ashleyholman See rebased version of this PR on top of #6410 inside #6421.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@sipa The point is that my proposal is functionally equivalent and almost diff-free (compared to the total PR diff) if it is done in this PR. If it is not done in this PR, it can be done later with a less clean history and more work as additional costs. In terms of git history bike-shedding, this is the perfect time and place to do this, precisely because it's almost free to do so (nothing is depending on the new index yet).
Note that my 19+7- (compared to the PR, not as a delta(PR, PR+jtimon_nits) which is even smaller) proposal doesn't have any dependencies besides this PR in itself. It is intended to be squashed in this PR or left for later.
What is the disadvantage of squashing my suggestion?

@sipa
Copy link
Member

sipa commented Jul 11, 2015

@jtimon I'm sorry, but I have no time to deal with that right now.

@ashleyholman
Copy link
Contributor Author

@jtimon looks good to me. The only thing I wonder about is the nHeight ordering which is still inside the comparator. I wonder if there's a nice way to move that inside the policy as well. Maybe the comparator function has to be provided by the policy class?

BTW I'm not sure how to fetch your commit down into my local repository, because github isn't telling me which repo/branch it's from.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

Since @sipa is worried that introducing policy may slow this down, here's an even simpler fixup commit ( 1bae675 ) that would solve my concerns, which I will try to summarize.
I'm fine with using the feeRate alone as the heuristics/score for the new index, but let's not get married to it, we don't know what the score will depend on in the future: all we know it's that we want it to be calculated once and the information we have available in CTxMemPoolEntry's constructor.
Please, let's calculate the feeRate score in an independent function that doesn't make any additional assumptions. I don't think I'm asking for that much (specially when it's almost free total-diff-wise).

@petertodd
Copy link
Contributor

Keep in mind that some of @sipa's fixes may wind up getting backported to v0.10.x as a emergency hot fix, so it pays to keep them as minimal as possible.

On 11 July 2015 10:53:14 GMT-04:00, "Jorge Timón" [email protected] wrote:

Since @sipa is worried that introducing policy may slow this down,
here's an even simpler fixup commit (
1bae675
) that would solve my concerns, which I will try to summarize.
I'm fine with using the feeRate alone as the heuristics/score for the
new index, but let's not get married to it, we don't know what the
score will depend on in the future: all we know it's that we want it to
be calculated once and the information we have available in
CTxMemPoolEntry's constructor.
Please, let's calculate the feeRate score in an independent function
that doesn't make any additional assumptions. I don't think I'm asking
for that much (specially when it's almost free total-diff-wise).


Reply to this email directly or view it on GitHub:
#6331 (comment)

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@petertodd I believe my current suggestion is very minimal and portable (no less than the current PR, since it only adds a single-line function that can be easily pasted in previous versions).
My suggestion isn't actually not functionally equivalent but superior: the PR as it stands is dividingmultiplying the score by 1000 for no good reason.

@petertodd
Copy link
Contributor

Sure, so make it a separate pull-req to make that easy to evaluate; easier to cherry pick if its separate.

On 11 July 2015 11:05:50 GMT-04:00, "Jorge Timón" [email protected] wrote:

@petertodd I believe my current suggestion is very minimal and portable
(no less than the current PR, since it only adds a single-line function
that can be easily pasted in previous versions).
My suggestion isn't actually not functionally equivalent but superior:
the PR as it stands is dividing the score by 1000 for no good reason.


Reply to this email directly or view it on GitHub:
#6331 (comment)

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@petertodd No, it is better to cherry pick this one if ashleyholman squashes my commit.

@sipa
Copy link
Member

sipa commented Jul 11, 2015

@jtimon The code in that commit looks perfectly fine to me, but I don't think there's a benefit to including it now.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@sipa the only benefit is that it won't have to be introduced in the future (when it won't be free total-diff-of-this-PR-wise [well, it's actually +4 for the new function but "just now" is as cheap as it can get diff-wise]).
What is the disadvantage? That's totally what I'm missing.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

IMO the question is why would CTxMemPoolEntry::feeRate ever need to be replaced by CTxMemPoolEntry::nScore if CTxMemPoolEntry::feeRate hasn't been introduced yet, forget about the independent function if you want:

    nScore = _nFee / nTxSize;

or even (to be fully-funcationally equivalent to the current PR)

    nScore = _nFee * 1000 / nTxSize;

is enough for me: fully diff-free if included in this PR.

EDIT: @ashleyholman My latest proposal is a596abe ashleyholman/bitcoin@mempool_multiindex...jtimon:pr-6331-0.11.99 +7-7 with respect to your PR and +0-0 total-diff-wise. You tell me if what I'm asking is that crazy.

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

Nits commit rebased.

@ashleyholman
Copy link
Contributor Author

@jtimon I see your point, but I think that keeping each patch understandable and isolated is just as important as minimizing the lines changed.

This patch is more readable if it is just introducing an index on fee rate without trying to genericize the policy at the same time. A future patch could externalize the policy while introducing the score variable(s).

@jtimon
Copy link
Contributor

jtimon commented Jul 11, 2015

@ashleyholman The code (that doesn't contain anything related to policy) is here https://github.com/jtimon/bitcoin/commits/pr-6331-0.11.99

@sipa whatever call it nRevenueScore instead of nScore or even feeRate, but there's not reason to use CFeeRate instead of int64_t.
What is the disadvantage of using int64_t instead of CFeeRate. Or, what is the advantage of doing the opposite?

The main advantage of not using CFeeRate right now is not having to fix it later.

@mikehearn
Copy link
Contributor

Priority cannot be removed without breaking things for current users and programs that rely on it. In particular, priority is the main reason that a transaction that undershoots on fee will still confirm eventually, which lends significant robustness. And priority is probably very useful for ensuring that at least some legitimate transactions confirm even if someone is burning fee money to flood the network with DoS transactions. Please do not remove it.

@ashleyholman
Copy link
Contributor Author

This work has been rolled into #6421 so I'll close this off now.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants