Conversation

@sipa
Member

@sipa sipa commented Oct 5, 2018

This is a simple improvement to the performance of the duplicate input check in CheckTransaction.

It includes the benchmark from #14387, on which it shows about a 2x speedup over master (#14387 itself shows a 4.5x speedup, but with a lot more complexity).
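The sort-based approach described here can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `Outpoint` is a hypothetical stand-in for Bitcoin Core's `COutPoint`, and `HasDuplicateInputs` is an assumed helper name.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Sort-based duplicate check. Same O(N log N) as inserting into a
// std::set, but it works on a flat vector, so there are no per-element
// node allocations and comparisons stay in contiguous memory.
bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    std::sort(outpoints.begin(), outpoints.end());
    for (size_t i = 1; i < outpoints.size(); ++i) {
        if (outpoints[i - 1] == outpoints[i]) return true;
    }
    return false;
}
```

Taking the vector by value lets the function sort a scratch copy without disturbing the transaction's input order.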

@sipa sipa force-pushed the 201810_fastduplicate branch from fb8edd0 to 6d1779d Compare October 5, 2018 06:47
@JeremyRubin
Contributor

cr-ack 6d1779d -- one of my earlier versions looked a lot like that.

I'd also suggest another optimization to make the vector a prevector with say 20 outputs so we can avoid the allocation on most txns.

May I suggest using adjacent_find (whose default predicate is equality) instead of your own loop?
https://en.cppreference.com/w/cpp/algorithm/adjacent_find
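The suggestion above amounts to replacing the hand-written adjacent-equality scan with `std::adjacent_find`. A sketch, again using a hypothetical `Outpoint` stand-in for `COutPoint`:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Sort, then let std::adjacent_find locate any pair of equal neighbours.
// adjacent_find's default predicate is operator==, so no custom
// comparator is needed once the vector is sorted.
bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    std::sort(outpoints.begin(), outpoints.end());
    return std::adjacent_find(outpoints.begin(), outpoints.end()) != outpoints.end();
}
```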

@jnewbery
Contributor

jnewbery commented Oct 5, 2018

I prefer this to #14387, but I have the same question for @sipa as I asked Jeremy in that PR:

This makes block propagation faster, but do we have an understanding of how much these milliseconds matter? Is there a way we can determine whether the increased complexity introduced is a reasonable price to pay for the performance win?

Granted, the complexity added here is minimal compared to 14387, but I'd still like to understand the cost/benefit analysis.

@sipa
Member Author

sipa commented Oct 5, 2018

@jnewbery Yeah, that's fair. I just wanted to show an alternative which is a smaller improvement but at much lower review complexity. I'm not convinced we need either approach.

@jamesob
Contributor

jamesob commented Oct 5, 2018

For what it's worth, bitcoinperf didn't show any significant performance regression for IBD or reindex after the merge of #14247, so it's not clear that either changeset is needed.

@maflcko
Member

maflcko commented Oct 5, 2018

@jamesob I think this is not so much about IBD but rather block propagation.

@JeremyRubin
Contributor

@MarcoFalke correct.

It's something like 13ms slower.

@DrahtBot
Contributor

DrahtBot commented Oct 5, 2018

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #14400 (Add Benchmark to test input de-duplication worst case by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@gmaxwell
Contributor

gmaxwell commented Oct 5, 2018

Even block propagation is no longer critical for this: when the prior optimization went in, we didn't have relay-before-validate. I'm not opposed to this change-- it's pretty straightforward to review-- but since this particular validity criterion is pure (a function of the tx itself with no dependency on external state), effort might be better spent reorganizing things so that this test ends up covered by the wtxid validity caching, which, although complementary, would likely give a bigger improvement.

@ryanofsky
Contributor

Is there an explanation of what purpose the duplicate input check will serve if the connect block code is fixed to check for valid spends?

@gmaxwell
Contributor

gmaxwell commented Oct 7, 2018

@ryanofsky Making block connection alone check for duplicate inputs isn't sufficient: The whole reason this code was initially added in PR #443 was to keep duplicate inputs out of the mempool-- block connection did originally prevent the dupes. If both block-connection and mempool processing prevented duplicate inputs, then, indeed, there should be no reason for this code.

@jl2012
Contributor

jl2012 commented Oct 10, 2018

I think both set+insert (the existing one) and sort (this PR) are O(N log N). So sort is faster because it is better optimized in practice?

@sipa sipa force-pushed the 201810_fastduplicate branch from 6d1779d to d5573dc Compare October 12, 2018 23:35
@sipa
Member Author

sipa commented Oct 12, 2018

@jl2012 You get better memory locality and lower allocation overhead by representing things as a vector rather than a set. A set permits much faster incremental updates, but that's not something we need here.
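For contrast with the sort-based version, the pre-existing set-based check looks roughly like this (a sketch with a hypothetical `Outpoint` stand-in for `COutPoint`, not the actual CheckTransaction code). Each `insert` does a heap allocation for a tree node and each comparison chases pointers, which is where the constant-factor cost comes from despite the identical O(N log N) bound:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

// Set-based duplicate check: insert() returns {iterator, bool}; a false
// second member means the element was already present, i.e. a duplicate.
bool HasDuplicateInputsSet(const std::vector<Outpoint>& outpoints)
{
    std::set<Outpoint> seen;
    for (const auto& op : outpoints) {
        if (!seen.insert(op).second) return true;
    }
    return false;
}
```

One advantage of this form is that it exits as soon as the first duplicate is seen, whereas the sort-based version always pays the full sort; on honest (duplicate-free) transactions both scan everything.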

@sipa sipa force-pushed the 201810_fastduplicate branch from d5573dc to 7a9ac18 Compare October 12, 2018 23:39
}

// Check for duplicate inputs - note that this check is slow so we skip it in CheckBlock
if (fCheckDuplicateInputs) {

Tx with 1 input is quite common. Would it be faster if we skip those?
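The early-exit suggested above would be a one-line guard. A sketch (hypothetical `Outpoint` and helper name as before); note that sorting a one-element vector is already essentially free, so the saving, if any, is only the call and copy overhead:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for COutPoint: a (txid, output index) pair.
using Outpoint = std::pair<std::string, uint32_t>;

bool HasDuplicateInputs(std::vector<Outpoint> outpoints)
{
    // A transaction with at most one input cannot contain a duplicate,
    // so skip the sort entirely in the common single-input case.
    if (outpoints.size() <= 1) return false;
    std::sort(outpoints.begin(), outpoints.end());
    for (size_t i = 1; i < outpoints.size(); ++i) {
        if (outpoints[i - 1] == outpoints[i]) return true;
    }
    return false;
}
```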

maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Nov 25, 2018
… case

e4eee7d Add Benchmark to test input de-duplication worst case (Jeremy Rubin)

Pull request description:

  Because there are now 2 PRs referencing this benchmark commit, we may as well add it independently, as it is worth landing the benchmark even if neither patch is accepted.

  bitcoin#14397
  bitcoin#14387

Tree-SHA512: 4d947323c02297b0d8f5871f9e7cc42488c0e1792a8b10dc174a25f4dd53da8146fd276949a5dbacf4083f0c6a7235cb6f21a8bc35caa499bc2508f8a048b987
@DrahtBot
Contributor

Needs rebase

@maflcko
Member

maflcko commented Apr 19, 2019

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
