Conversation

@sipa
Member

@sipa sipa commented Oct 5, 2018

This is a simple improvement to the performance of the duplicate input check in CheckTransaction.

It includes the benchmark from #14387, on which it shows about a 2x speedup over master (#14387 itself shows a 4.5x speedup, but with a lot more complexity).
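The sort-based approach described here can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `Outpoint` is a hypothetical stand-in for Bitcoin Core's `COutPoint`, and `HasDuplicateInputs` is an assumed helper name.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Sort-based duplicate check. Same O(N log N) as inserting into a
// std::set, but it works on a flat vector, so there are no per-element
// node allocations and comparisons stay in contiguous memory.
bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    std::sort(outpoints.begin(), outpoints.end());
    for (size_t i = 1; i < outpoints.size(); ++i) {
        if (outpoints[i - 1] == outpoints[i]) return true;
    }
    return false;
}
```

Taking the vector by value lets the function sort a scratch copy without disturbing the transaction's input order.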

@sipa sipa force-pushed the 201810_fastduplicate branch from fb8edd0 to 6d1779d Compare October 5, 2018 06:47
@JeremyRubin
Contributor

cr-ack 6d1779d -- one of my earlier versions looked a lot like that.

I'd also suggest another optimization to make the vector a prevector with say 20 outputs so we can avoid the allocation on most txns.

May I suggest using adjacent_find (whose default predicate is equality) instead of your own loop?
https://en.cppreference.com/w/cpp/algorithm/adjacent_find
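The suggestion above amounts to replacing the hand-written adjacent-equality scan with `std::adjacent_find`. A sketch, again using a hypothetical `Outpoint` stand-in for `COutPoint`:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Sort, then let std::adjacent_find locate any pair of equal neighbours.
// adjacent_find's default predicate is operator==, so no custom
// comparator is needed once the vector is sorted.
bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    std::sort(outpoints.begin(), outpoints.end());
    return std::adjacent_find(outpoints.begin(), outpoints.end()) != outpoints.end();
}
```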

@jnewbery
Contributor

jnewbery commented Oct 5, 2018

I prefer this to #14387, but I have the same question for @sipa as I asked Jeremy in that PR:

This makes block propagation faster, but do we have an understanding of how much these milliseconds matter? Is there a way we can determine whether the increased complexity introduced is a reasonable price to pay for the performance win?

Granted, the complexity added here is minimal compared to 14387, but I'd still like to understand the cost/benefit analysis.

@sipa
Member Author

sipa commented Oct 5, 2018

@jnewbery Yeah, that's fair. I just wanted to show an alternative which is a smaller improvement but at much lower review complexity. I'm not convinced we need either approach.

@jamesob
Contributor

jamesob commented Oct 5, 2018

For what it's worth, bitcoinperf didn't show any significant performance regression for IBD or reindex after the merge of #14247, so it's not clear that either changeset is needed.

@maflcko
Member

maflcko commented Oct 5, 2018

@jamesob I think this is not so much about IBD but rather block propagation.

@JeremyRubin
Contributor

@MarcoFalke correct.

It's something like 13ms slower.

@DrahtBot
Contributor

DrahtBot commented Oct 5, 2018

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #14400 (Add Benchmark to test input de-duplication worst case by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@gmaxwell
Contributor

gmaxwell commented Oct 5, 2018

Even block propagation is no longer critical for this: when the prior optimization went in, we didn't have relay-before-validate. I'm not opposed to this change-- it's pretty straightforward to review-- but since this particular validity criterion is pure (a function of the tx itself with no dependency on external state), effort might be better spent reorganizing things so that this test ends up covered by the wtxid validity caching, which, although complementary, would likely give a bigger improvement.

@ryanofsky
Contributor

Is there an explanation of what purpose the duplicate input check will serve if the connect block code is fixed to check for valid spends?

@gmaxwell
Contributor

gmaxwell commented Oct 7, 2018

@ryanofsky Making block connection alone check for duplicate inputs isn't sufficient: The whole reason this code was initially added in PR #443 was to keep duplicate inputs out of the mempool-- block connection did originally prevent the dupes. If both block-connection and mempool processing prevented duplicate inputs, then, indeed, there should be no reason for this code.

@jl2012
Contributor

jl2012 commented Oct 10, 2018

I think both set+insert (the existing one) and sort (this PR) are O(N log N). So sort is faster because it is better optimized in practice?

@sipa sipa force-pushed the 201810_fastduplicate branch from 6d1779d to d5573dc Compare October 12, 2018 23:35
@sipa
Member Author

sipa commented Oct 12, 2018

@jl2012 You get better memory locality and lower allocation overhead by representing things as a vector rather than a set. A set permits much faster incremental updates, but that's not something we need here.
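For contrast with the sort-based version, the pre-existing set-based check looks roughly like this (a sketch with a hypothetical `Outpoint` stand-in for `COutPoint`, not the actual CheckTransaction code). Each `insert` does a heap allocation for a tree node and each comparison chases pointers, which is where the constant-factor cost comes from despite the identical O(N log N) bound:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Set-based duplicate check: insert() returns {iterator, bool}; a false
// second member means the element was already present, i.e. a duplicate.
bool HasDuplicateInputsSet(const std::vector<Outpoint>& outpoints)
{
    std::set<Outpoint> seen;
    for (const auto& op : outpoints) {
        if (!seen.insert(op).second) return true;
    }
    return false;
}
```

One advantage of this form is that it exits as soon as the first duplicate is seen, whereas the sort-based version always pays the full sort; on honest (duplicate-free) transactions both scan everything.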

@sipa sipa force-pushed the 201810_fastduplicate branch from d5573dc to 7a9ac18 Compare October 12, 2018 23:39
}

// Check for duplicate inputs - note that this check is slow so we skip it in CheckBlock
if (fCheckDuplicateInputs) {

Tx with 1 input is quite common. Would it be faster if we skip those?
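The early-exit suggested above would be a one-line guard. A sketch (hypothetical `Outpoint` and helper name as before); note that sorting a one-element vector is already essentially free, so the saving, if any, is only the call and copy overhead:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    // A transaction with at most one input cannot contain a duplicate,
    // so skip the sort entirely in the common single-input case.
    if (outpoints.size() <= 1) return false;
    std::sort(outpoints.begin(), outpoints.end());
    for (size_t i = 1; i < outpoints.size(); ++i) {
        if (outpoints[i - 1] == outpoints[i]) return true;
    }
    return false;
}
```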

maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Nov 25, 2018
… case

e4eee7d Add Benchmark to test input de-duplication worst case (Jeremy Rubin)

Pull request description:

  Because there are now 2 PRs referencing this benchmark commit, we may as well add it independently, as it is worth landing the benchmark even if neither patch is accepted.

  bitcoin#14397
  bitcoin#14387

Tree-SHA512: 4d947323c02297b0d8f5871f9e7cc42488c0e1792a8b10dc174a25f4dd53da8146fd276949a5dbacf4083f0c6a7235cb6f21a8bc35caa499bc2508f8a048b987
@DrahtBot
Contributor

Needs rebase

@maflcko
Member

maflcko commented Apr 19, 2019

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
