Skip to content

Conversation

@jtimon
Copy link
Contributor

@jtimon jtimon commented May 4, 2017

The other day @gmaxwell pointed out on IRC that we're calculating the block hash many more times than we need to.
This attempts to improve that and should result in better performance but I haven't run any benchmark.

Contains: #9717

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use blockHash?

Copy link
Contributor Author

@jtimon jtimon May 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I saw this and then forgot about it

return true;
}

bool CheckBlockHeader(const CBlockHeader& block, CValidationState& state, const Consensus::Params& consensusParams, bool fCheckPOW)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this while you're at it. CheckBlockHeader is a terrible name if basically just compares 2 values :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, nevermind. I see that's been suggested in #9717

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion there was more about fully removing it and inlining it, which I prefer not to do for now. Just renaming is fine and as you point out it's free to do it now. Renaming to CheckProof (not with CheckProofOfWork to avoid confusing it with the version without CValidationState).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hash doesn't match the one that's ultimately reconstructed.

@jtimon jtimon force-pushed the b15-optimization-blockhash branch from ce6d1b2 to d11d277 Compare May 4, 2017 21:04
@jtimon
Copy link
Contributor Author

jtimon commented May 4, 2017

Fixed 3 nits by @theuni added a commit for TestBlockValidity, with a tiny optimization on getblocktemplate, and not much extra disruption.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckProof doesnt say anything to me? I have no idea what "Proof" it would be checking, probably just remove the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, CheckProof also makes sense for the blocksigning branch #9177 (rewrite and rebase in progress), since you could be checking proof of work or the block script. But that name change can be done later, after all this function is being only called from 2 places at this point.
Feel free to propose another name, maybe we can also move it to pow.o while at it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets just remove it and call CheckProofOfWork directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussion on #9717 and #10339 (comment)
It doesn't seem to make much sense to remove a wrapper of CheckProofOfWork that takes the CValidationState to just add it again on another PR doesn't seem to make much sense. But if we don't do the blocksigning stuff or something we can remove the wrapper later.
In any case, renamed CheckProof to CheckProofOfWork, the name collision doesn't matter because the parameters are different, but maybe some other name is better.

@NicolasDorier
Copy link
Contributor

I think a better approach would be to cache it. A bit like it is already done in CTransaction versus CMulableTransaction

@jtimon
Copy link
Contributor Author

jtimon commented May 7, 2017

I think a better approach would be to cache it. A bit like it is already done in CTransaction versus CMulableTransaction

What is better about it?
Anyway, this doesn't preclude from caching it too.

Needs rebase

@NicolasDorier
Copy link
Contributor

better in the sense it does not require one more parameter and that it is coherent with what we are doing for Transaction.

@jtimon jtimon force-pushed the b15-optimization-blockhash branch from 21ed5e3 to 2069f92 Compare May 7, 2017 19:25
@jtimon
Copy link
Contributor Author

jtimon commented May 7, 2017

Rebased

EDIT: @NicolasDorier I think even if we do that it's better that some functions take the hash directly, maybe not all the the functions here. But I'm happy to review such a PR.

@gmaxwell
Copy link
Contributor

gmaxwell commented May 7, 2017

Hm. if the block were made immutable, caching the result would be trivial. The only place where a block is not immutable in the code is in the creation of a block (where we don't care about the hash). These changes require carrying around a hash where with an immutable block it would just be a method to return it.

So while these changes don't preclude caching it, I think the caching it patch would be smaller and simpler. Thoughts?

@ryanofsky
Copy link
Contributor

utACK 2069f9282cebb1c08b7e12c8155de3403dda39b4. Nice cleanup.

If a small caching implementation is possible, that would probably give a preferable end result. But this PR very simple and an obvious improvement over the status quo.

@jtimon jtimon force-pushed the b15-optimization-blockhash branch from 2069f92 to 46f5a9b Compare May 9, 2017 16:59
@jtimon
Copy link
Contributor Author

jtimon commented May 9, 2017

Needed rebase.
Yes, making the block immutable and using a cache would be more complete and also likely less disruptive. But I don't think I will have time to try that next week. If somebody else writes that alternative, please ping for review.

@sipa
Copy link
Member

sipa commented May 9, 2017

I prefer this approach for reducing rehashing first. Caching is always possible later, but brings its own complications (like either eager precomputation, possibly hashing in unnecessary places, or locking issues).

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor code comment from 2 days ago (forgot to hit submit button).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "Pow: Optimization: Pass const uint256& blockHash to CheckBlock"

Some places you are using const uint256 and other places you are using const uint256& for the blockHash local. Both ways should be functionality identical, but it might be nice to stick with a consistent style.

@ryanofsky
Copy link
Contributor

utACK 46f5a9b090d8084edd0160411f634c6140ac983d. Changes since previous review were just resolving chainParams merge conflict, renaming CheckProofOfWork, and making it static in an earlier commit.

Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK with tiny nits

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to stick to variable names that look like Hungarian style (which is certainly not a requirement, IMO), I do prefer it to be consistent. hashBlock would be correct here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does hashBlock even qualify as "hungarian style"? Anyway, it wasn't my intention, I just cleed it blockHash because it's the block's hash. I don't care to change it, but I really don't think s/blockHash/hashBlock/ is worth it here at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nit: this may be slightly confusing as it may make it look like nothing new is computed, so I prefer const uint256 hashBlock = here instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't see how you suggestion for changing the name improves anything or gives more clarity, but if you think so, I can change it everywhere (including all function arguments) I guess...

@jtimon
Copy link
Contributor Author

jtimon commented May 31, 2017

Before bikeshedding blockHash, please let me squash. Nobody seems to have complained about individual functions, which was the main point of keeping it separated. I think it would would be cleaner in a single commit on top of #9717 (or maybe squash that too if it's not going to be merged independently).

@jtimon jtimon force-pushed the b15-optimization-blockhash branch from 46f5a9b to 04f6ce7 Compare May 31, 2017 01:50
@jtimon
Copy link
Contributor Author

jtimon commented May 31, 2017

Squashed to 2 commits on top of #9717 for now as explained in the previous comment.

EDIT: also scripted-diff: s/blockHash/block_hash/ on top

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 49342ca8ad5be47119a74a9989109f8eacf4dd68. Only changes since last review were squashing last 6 commits down into 1, and adding the new scripted diff commit to rename blockHashes.

@ryanofsky
Copy link
Contributor

@luke-jr do you have suggestions for this PR given your objection to #9717, which this is built on? Would you also disagree with changes in this PR? Or would you prefer to keep some changes here and discard others?

@luke-jr
Copy link
Member

luke-jr commented Jun 3, 2017

I don't see why it would require #9717. At least here, it renames the function, although it looks like it's doing some confusing overloading now :/

@jtimon
Copy link
Contributor Author

jtimon commented Jun 5, 2017

I don't see why it would require #9717. At least here, it renames the function, although it looks like it's doing some confusing overloading now :/

It is not strictly necessary, I can do this without any renaming and while keeping the stupid boolean argument in CheckBlockHeader if that is preferred. Or squash the first 2 commits, or rename it again to something different from "CheckProofOfWork" (but not "CheckProof" since that has been discarded already.
The only thing I don't want to do is remove CheckBlockHeader here or in #9717 . Can we please move forward and leave that decision (remove CheckBlockHeader or not [or whatever is called after rename]) for later?
I thought we agreed on #9717 that it was an improvement even if some people want to go further and remove CheckBlockHeader and others aren't convinced.

@ryanofsky
Copy link
Contributor

I acked both #9717 and this PR, but in my opinion the situation would be less confusing if this PR were modified not to depend on #9717, and then merged, and then there was an attempt to come to an agreement on #9717.

@TheBlueMatt
Copy link
Contributor

Yea, I think the disagreement in #9717 merits this being rebased on master directly instead.

@jtimon
Copy link
Contributor Author

jtimon commented Jun 7, 2017

Needs rebase Agreed, I will do it without #9717 and also without renaming CheckBlockHeader (which I believe was only asked because of #9717). My fault, I really expected #9717 to be fully uncontroversial and an obvious (although small) win to everyone, but let's just separate concerns and move forward.

…ore functions

- Pow: Pass const uint256& block_hash to s/CheckBlockHeader/CheckProofOfWork/
  Optimization because AcceptBlockHeader had the hash calculated already
- Pass const uint256& block_hash to CheckBlock
  Optimization because now ConnectBlock calls CBlock::GetHash() less times
- Pass const uint256& block_hash to ConnectBlock
  Optimization because TestBlockValidity had calculated block_hash already
- Pass const uint256& block_hash to AcceptBlock
  Optimization because now AcceptBlock reuses CBlock::GetHash() calculation from ProcessNewBlock
- Pass const uint256& block_hash to ProcessNewBlock
- Call CBlock::GetHash() less in net_processing::ProcessMessage()
- Pass const uint256& block_hash to TestBlockValidity
  Optimization because getblocktemplate had the hash calculated already
@jtimon jtimon force-pushed the b15-optimization-blockhash branch from 49342ca to aa881b2 Compare June 7, 2017 21:55
@jtimon
Copy link
Contributor Author

jtimon commented Jun 7, 2017

Rebased without #9717 as discussed and squashed into 1 commit.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK aa881b2. Change is straightforward, though I didn't try to confirm whether it actually does result in fewer block hashes (potentially some parts of this change might compute block hashes in cases where they aren't needed).

@laanwj
Copy link
Member

laanwj commented Jun 8, 2017

I'm going to do some measurements on this.

@laanwj
Copy link
Member

laanwj commented Jun 8, 2017

Aim: Block 1..300000 -reindex-chainstate. Added instrumentation to CBlockHeader::GetHash():

  • With patch: 1665111 block hashing operations (up to block 301215)
  • Without patch: 2260763 block hashing operations (up to block 300085)

There's a slight bit of imprecision because -stopatheight=300000 didn't work, so I had to use my reflexes. But I'm fairly sure the ~26% saving in block hash operations doesn't result from only those 1130 blocks. Oh wait, it's the other way around even :)

@achow101 achow101 mentioned this pull request Jun 9, 2017
@jtimon
Copy link
Contributor Author

jtimon commented Jun 17, 2017

Let's say I claim
"In a rather naive benchmark (reindex-chainstate up to block 284k, with
-assumevalid=0 and a very large dbcache), this connected blocks ~1.05x faster."
similarly to what @TheBlueMatt in #10192, would someone check this claim (did anyone checked for #10192)?
If true, would that be enough for making this interesting to those who isn't?
Is anyone working on doing that kind of benchmark automatically?

I assume he did it by looking at the logs. I haven't tried that for wither this PR or #10192, but if 1.05x is not enough...this probably gives less than that.
It just seems strange that this is disregarded as "the performance improvement is not worth it" without anyone actually meassured it besides laanwj's tests which only counts the number of hashes (not saying that's not helpful, but that doesn't seem to be enough).

@ryanofsky
Copy link
Contributor

I think this change is more good than bad. I generally think it's better to pass data explicitly where it needs to go than to sneak it through side channels or recompute it unnecessarily, and I don't see that much cost in adding function arguments. But I also agree with others in
https://botbot.me/freenode/bitcoin-core-dev/msg/86999458/ who pointed out that there is some cost to adding function arguments, and that it would be nicer if hashes could be obtained more efficiently from CBlocks or from another block class.

@jtimon, from my recollection of what people were saying two weeks ago, if this change actually did connect blocks "~1.05x faster," that would be enough to overcome objections. The thought was more that hashing a block is so much simpler than connecting a block, that you would not actually see any measurable increase in runtime performance from this change.

@sipa
Copy link
Member

sipa commented Jun 21, 2017

I would be very surprised if this has a measurable impact at all. A 26% reduction in block hashing operations would completely vanish compared to Merkle tree computations which are also done for each blocks (and need 1000s of hashes). From @laanwj's numbers I would expect a 1s CPU time reduction for a full sync on a modern system.

@sdaftuar
Copy link
Member

If we don't think this will have a measurable impact, I'd prefer that we not make this change -- seems to me that this makes the consensus code more brittle.

@laanwj
Copy link
Member

laanwj commented Jun 24, 2017

So while these changes don't preclude caching it, I think the caching it patch would be smaller and simpler. Thoughts?

Implicit caching, although the explicit code impact is smaller, is always somewhat error-prone. I think I prefer explicitly passing values in.

But if this isn't worth doing, we should simply not do it. Unnecessary changes are only a risk.
Closing.

@laanwj laanwj closed this Jun 24, 2017
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.