Split headers-only versions off CheckBlock/AcceptBlock #3884

sipa · 2014-03-16T20:17:03Z

There is a small semantic change here, namely that blocks with invalid transaction data (eg duplicate hashes, non-first coinbase, ...) will have their headers accepted into the tree (but never considered for connecting).

That's an inevitable consequence of moving towards headers-first, as we'll do header connectivity checkinng before transaction checking anyway (the transaction data won't be available until later, so we rather do more extensive checking on the headers that are available first).

This has very low DoS risk, as it still requires faking PoW.

luke-jr · 2014-03-16T20:30:58Z

This doesn't change the actual behaviour of the pre-existing functions, correct? So, eg, unit test assumptions remain valid..

sipa · 2014-03-16T20:34:58Z

A better summary of the changes:

before: standalone headers/tx checks, (orphan store), tree headers/tx checks, store block, connect
after: standalone headers checks, tree headers checks, store headers, standalone tx checks, (orphan store), tree tx checks, store block, connect

In a later pull request, the orphan storage will be removed entirely, as blocks can be stored with just headers validated but not transactions (potentially resulting in out-of-order block storagre).

sipa · 2014-03-17T19:06:41Z

Please take your time, and ask for clarification as necessary.

sipa · 2014-03-29T16:16:46Z

There is a bug in this code, which prevents syncing. Closing until fixed.

sipa · 2014-04-08T00:14:46Z

\o/ segfault

sipa · 2014-04-08T18:22:16Z

Bug fixed.

@luke-jr I don't think any assumptions that are used in unit tests change. The only real change is that now CBlockIndex entries in the block tree may exist that do not have actual block data associated with them.

mikehearn · 2014-04-10T15:08:41Z

I did a review, comments ended up on the origin commit rather than here - oops.

Overall looks good, just would appreciate some more comments and helper methods in a few places to make the logic easier to follow and reduce the chances of typos with all the bit twiddling.

Diapolo · 2014-04-11T12:42:37Z

src/main.cpp

Shouldn't these be constants perhaps in CChainParams?

Agree, but let's do that in another PR.

rdponticelli · 2014-04-16T16:43:40Z

There seems to be some bug on the connection logic in this pull. I got a node stuck after it got shutdown during syncing. After launching it again, all the remaining blocks got marked as orphans, and it never resumed syncing. Shutting down and launching again several times didn't help.

sipa · 2014-04-16T17:24:54Z

@rdponticelli With a recent version of this code (less than 8 days old?)

rdponticelli · 2014-04-16T17:35:03Z

@sipa Yes, last version. Culprit seems to be some kind of race condition with the low disk space code. Some logs:

2014-04-16 14:18:49 UpdateTip: new best=0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866 height=295933 log2_work=77.988128 tx=36873027 date=2014-04-15 07:18:51 progress=0.993805
2014-04-16 14:18:49 ProcessBlock: ACCEPTED
2014-04-16 14:18:49 *** Error: Disk space is low!
2014-04-16 14:19:03 ERROR: AcceptBlock() : FindBlockPos failed
2014-04-16 14:19:03 ERROR: ProcessBlock() : AcceptBlock FAILED
2014-04-16 14:19:03 Requesting shutdown
2014-04-16 14:19:03 ProcessBlock: ORPHAN BLOCK 0, prev=0000000000000000a89d601cd42705cfe8a25b6546ed892309c67e5a197e3fdb
2014-04-16 14:19:03 ProcessBlock: ORPHAN BLOCK 1, prev=000000000000000005ea1ca8a944a574826bd856b83a29871f46c8e2a2323deb
2014-04-16 14:19:03 Running Shutdown in thread

After that all the chain is detected as orphan:

2014-04-16 14:58:04 Initialization result: 1
2014-04-16 14:58:14 ProcessBlock: ORPHAN BLOCK 0, prev=0000000000000000a89d601cd42705cfe8a25b6546ed892309c67e5a197e3fdb
2014-04-16 14:58:15 ProcessBlock: ORPHAN BLOCK 1, prev=000000000000000005ea1ca8a944a574826bd856b83a29871f46c8e2a2323deb
2014-04-16 14:58:17 ProcessBlock: ORPHAN BLOCK 2, prev=00000000000000001179ce5490d2ff833c0afb1799b6beb0981096c9800002c8
2014-04-16 14:58:17 ProcessBlock: ORPHAN BLOCK 3, prev=00000000000000001b4fd2aaec801ac068ea2a39a4b9f9eb16569a69c6713d6f
2014-04-16 14:58:18 ProcessBlock: ORPHAN BLOCK 4, prev=000000000000000023524d33b27b7ea7135332b32853a7582cf504e20644ea48
2014-04-16 14:58:19 ProcessBlock: ORPHAN BLOCK 5, prev=0000000000000000a5344e5239e17ae8995d3c7c1417b38d0dc1e047595a3807

rdponticelli · 2014-04-16T18:18:40Z

Running the code in master in this state I got:

2014-04-16 17:10:31 init message: Done loading
2014-04-16 17:10:31 Initialization result: 1
2014-04-16 17:10:41 Pre-allocating up to position 0x7000000 in blk00132.dat
2014-04-16 17:10:47 ERROR: ReadBlockFromDisk : OpenBlockFile failed
2014-04-16 17:10:47 *** Failed to read block
2014-04-16 17:10:51 ERROR: AcceptBlock() : AddToBlockIndex failed
2014-04-16 17:10:51 ERROR: ProcessBlock() : AcceptBlock FAILED
2014-04-16 17:10:51 ERROR: ReadBlockFromDisk : OpenBlockFile failed
2014-04-16 17:10:51 *** Failed to read block
2014-04-16 17:10:51 ERROR: AcceptBlock() : AddToBlockIndex failed
2014-04-16 17:10:51 ERROR: ProcessBlock() : AcceptBlock FAILED
2014-04-16 17:10:51 Requesting shutdown
2014-04-16 17:10:51 Running Shutdown in thread
2014-04-16 17:10:51 ERROR: ReadBlockFromDisk : OpenBlockFile failed
2014-04-16 17:10:51 *** Failed to read block
2014-04-16 17:10:51 addcon thread interrupt
2014-04-16 17:10:51 net thread interrupt
2014-04-16 17:10:51 dumpaddr thread stop
2014-04-16 17:10:51 opencon thread interrupt
2014-04-16 17:10:53 ERROR: AcceptBlock() : AddToBlockIndex failed
2014-04-16 17:10:53 ERROR: ProcessBlock() : AcceptBlock FAILED
2014-04-16 17:10:53 msghand thread interrupt
2014-04-16 17:10:53 Shutdown : In progress...

And after that the new code again:

2014-04-16 17:45:19 init message: Done loading
2014-04-16 17:45:19 Initialization result: 1
2014-04-16 17:45:28 ProcessBlock: ACCEPTED
2014-04-16 17:45:31 ProcessBlock: ACCEPTED
2014-04-16 17:45:31 CheckForkWarningConditions: Warning: Found invalid chain at least ~6 blocks longer than our best chain.
Chain state database corruption likely.
2014-04-16 17:45:31 ProcessBlock: ACCEPTED
2014-04-16 17:45:31 CheckForkWarningConditions: Warning: Large valid fork found
forking the chain at height 295933 (0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866)
lasting to height 295941 (00000000000000002cda160c4452e7ad5252453a969dc6ff9b6e9e8bb584093f).
Chain state database corruption likely.
2014-04-16 17:45:31 ProcessBlock: ACCEPTED
2014-04-16 17:45:33 CheckForkWarningConditions: Warning: Large valid fork found
forking the chain at height 295933 (0000000000000000937bdca5485b8a02da4cbfea74fe982903a2da1ba8537866)
lasting to height 295942 (0000000000000000adfe890f4c33801b26fac05207747652d8c311244e59a6d0).
Chain state database corruption likely.

And it's still going like this....

sipa · 2014-04-17T11:21:54Z

@rdponticelli Thanks for testing!

So the problem was that if the block data failed to write, we ended up with a block header in the index, but no corresponding block data in the block file. At next startup, this would mean it would need to download it again, but the block download logic can't deal with the in-index-but-not-on-disk case yet, which will be added later.

I've (hopefully) fixed it by only writing the block header to the index after the block data is present.

rdponticelli · 2014-04-17T19:19:42Z

Now I'm getting this on another node. It seems that it's breaking when it changes the block file:

2014-04-17 19:02:56 ProcessBlock: ACCEPTED
2014-04-17 19:03:07 UpdateTip: new best=0000000000000000551b394408e467f83cd5d6f36612f8e33643d188f0d4498f height=293947 log2_work=77.722856 tx=36140885 date=2014-04-03 11:35:26 progress=0.931634
2014-04-17 19:03:07 ProcessBlock: ACCEPTED
2014-04-17 19:03:18 Pre-allocating up to position 0x300000 in rev00129.dat
2014-04-17 19:03:19 UpdateTip: new best=000000000000000055c9d3b2a816877d3b1a9ca6048170ed7cee1c8629255e0e height=293948 log2_work=77.72298 tx=36141736 date=2014-04-03 11:56:14 progress=0.931702
2014-04-17 19:03:19 ProcessBlock: ACCEPTED
2014-04-17 19:03:21 Pre-allocating up to position 0x2000000 in blk00129.dat
2014-04-17 19:03:52 UpdateTip: new best=0000000000000000b2a0b9c58aa86994aa85385d0d8a8676945f6d3b642b27a9 height=293949 log2_work=77.723105 tx=36142897 date=2014-04-03 12:08:09 progress=0.931744
2014-04-17 19:03:53 ProcessBlock: ACCEPTED
2014-04-17 19:03:53 ProcessBlock: ORPHAN BLOCK 0, prev=00000000000000006c3f5ca15b21e43e7328c6ea033a25671f08c396cfd027ca
2014-04-17 19:03:59 UpdateTip: new best=00000000000000006e97b96972f7f31f0847fcf35e723193c0949dad4bbc730c height=293950 log2_work=77.723229 tx=36143254 date=2014-04-03 12:14:21 progress=0.931765
2014-04-17 19:03:59 ProcessBlock: ACCEPTED
2014-04-17 19:04:03 UpdateTip: new best=0000000000000000cf179e61ecbe56eb19561396ef72650f2a8b3e7d29d21cde height=293951 log2_work=77.723353 tx=36143569 date=2014-04-03 12:20:27 progress=0.931785
2014-04-17 19:04:03 ProcessBlock: ACCEPTED
2014-04-17 19:04:05 UpdateTip: new best=000000000000000039022b02422dfee754c2dd91cf8dac8289c956b30da37b80 height=293952 log2_work=77.723478 tx=36143600 date=2014-04-03 12:20:35 progress=0.931785
2014-04-17 19:04:05 ProcessBlock: ACCEPTED
2014-04-17 19:04:05 ProcessBlock: ORPHAN BLOCK 1, prev=00000000000000007f49a436045b4190c4164adef4479ebc290ae1379826b40a
2014-04-17 19:04:07 UpdateTip: new best=00000000000000001c9b145536ce2f3ffd529742b0ab58a4bdee1ebd985e4c69 height=293953 log2_work=77.723602 tx=36143765 date=2014-04-03 12:24:37 progress=0.931799
2014-04-17 19:04:07 ProcessBlock: ACCEPTED
2014-04-17 19:04:07 ProcessBlock: ORPHAN BLOCK 2, prev=0000000000000000c0b4bede793045606417892efcbb83c3d5c52b9aa74b9086
2014-04-17 19:04:08 UpdateTip: new best=0000000000000000c9cbe25307641906939f3e767d94dd7d2553b860dad7a835 height=293954 log2_work=77.723726 tx=36143895 date=2014-04-03 12:27:15 progress=0.931808

sipa · 2014-04-17T19:49:07Z

Just seeing orphans while syncing is perfectly expected (not intentional though, but it will require more to fix that).

sipa · 2014-04-18T21:42:40Z

Rebased and addressed @mikehearn's comments.

Also modify some connection logic to deal with non-full blocks in the index.

sipa · 2014-04-24T22:36:46Z

Rebased.

BitcoinPullTester · 2014-04-24T23:00:56Z

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/942b33a19d3aa96326acc044c1834c48beff777c for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

mikehearn · 2014-04-29T21:42:24Z

@gavinandresen How does this look to you?

leofidus · 2014-05-02T13:38:56Z

Tested this on the Dogecoin blockchain. Both initial sync and keeping up with the blockchain worked as normal, both restarting the client and waking up from hibernation worked fine and I didn't spot anything unusual in the logs.

rebroad · 2014-05-09T00:09:55Z

I'm currently testing this. It's still incredibly slow due to the peer selection for the block download. I really think it should take the speed of download into consideration for selecting a peer, and I still don't understand why it selects only one peer - why can't it select several? After all, most download tools these days choose to download from multiple sources to reduce the load on any one source.

gmaxwell · 2014-05-09T00:14:00Z

@rebroad This is an incremental reorganization. All that stuff you are talking about was implemented months ago in sipa's headers first branch but it was too risky a change to take at once and wasn't seeing adequate testing. This pull (and others) are breaking up the steps to move to it incrementally in a way which is easier to verify.

rebroad · 2014-05-09T01:48:09Z

@gmaxwell yes, I know. The stuff I am talking about hasn't been implemented yet. This is why I am mentioning it.

gmaxwell · 2014-05-09T02:00:54Z

The stuff I am talking about hasn't been implemented yet.

Pulling from multiple peers and taking speed into account were both implemented in the headers first branch. (actually it does something which is usually better than take speed into account it tracks when peers are stalling the process and kicks out the straggler(s)).

laanwj · 2014-05-09T05:46:27Z

@rebroad That's just rude. Please keep discussion in issues about the goal of the issue. @sipa does not claim that this change solves the problems that you mention, so you should not be disappointed that they are not solved by it.

sipa · 2014-05-09T10:25:24Z

@rebroad Selecting multiple peers to download from would just bring duplicate block downloads back, unless it happens in a coordinated way. No such coordination exists, because we don't know in advance which peers have which blocks - we just download what they tell us they have.

Nobody claims we can't download from multiple peers at once - that is one of the goals of headers-first synchronization (which this is incremental work towards, but it will take a few more pull requests before we're actually there).

mikehearn · 2014-05-09T11:00:06Z

How do we decide when there's been enough testing and review? Seems like the path forward here isn't clear.

laanwj · 2014-05-09T11:31:36Z

@mikehearn What part is unclear? This is tagged '0.10' so will be merged after the branch-off of 0.9.2.

mikehearn · 2014-05-09T11:46:03Z

This may be a discussion for the mailing list, but I'm not sure it's healthy to keep pull requests that are "done" queued up where they have to be constantly rebased, waiting to enter master, missing out on testing that people might do. If you want to have a branch for stabilisation, why not branch for 0.9.2 now and cherry pick fixes into it until you're confident it's ready?

laanwj · 2014-05-09T12:48:00Z

I know that - the plan is to split off 0.9.2 this week.

mikehearn · 2014-05-09T12:59:22Z

Okie dokie.

rebroad · 2014-05-09T13:18:11Z

@gmaxwell ah, so it's not in the master branch? Am I able to submit pull requests to the branch you mention? What is the purpose of this branch and will it as some point be merged into master?

My main reason for mentioning these concerns is that the current master as it is doesn't work well, and gets 300 or so orphan blocks piling up without catching up on the blockchain, often requiring a restart of bitcoind to get it downloading again, at which point the hundreds of orphan blocks already downloaded are then lost (since they are not saved to disk).

It's also frustrating to see this, considering I raised a pull request about 2 years ago which downloaded blocks far faster and more efficiently than it currently does, and which did detect stuck downloads, unlike the current master branch.

sipa · 2014-05-09T13:57:53Z

@rebroad Gregory talking about my personal headers-first branch (#2964),
which was never merged because of multiple reasons, and later abandoned.
I've since (slowly) been working on several pull requests with pieces of
that functionality. This pull request is just some refactoring work towards
that.

942b33a Split AcceptBlockHeader from AcceptBlock. (Pieter Wuille) f457347 Split up CheckBlock in a block and header version (Pieter Wuille)

sipa mentioned this pull request Mar 18, 2014

Speed limit / throttle network usage #273

Closed

laanwj mentioned this pull request Mar 29, 2014

Block-header-only, faster startup client #7

Closed

sipa closed this Mar 29, 2014

sipa reopened this Apr 7, 2014

laanwj added this to the 0.10.0 milestone Apr 11, 2014

Diapolo reviewed Apr 11, 2014
View reviewed changes

sipa added 2 commits April 25, 2014 00:33

Split up CheckBlock in a block and header version

f457347

Split AcceptBlockHeader from AcceptBlock.

942b33a

Also modify some connection logic to deal with non-full blocks in the index.

laanwj mentioned this pull request May 1, 2014

Headers-based synchronization and parallel block download #2964

Closed

sipa mentioned this pull request May 7, 2014

Push cs_main locks down in ProcessBlock #4148

Merged

laanwj added the P2P label May 9, 2014

laanwj merged commit 942b33a into bitcoin:master May 9, 2014

laanwj added a commit that referenced this pull request May 9, 2014

Merge pull request #3884

d54985f

942b33a Split AcceptBlockHeader from AcceptBlock. (Pieter Wuille) f457347 Split up CheckBlock in a block and header version (Pieter Wuille)

laanwj mentioned this pull request May 9, 2014

Remove dependency of main on OpenSSL bignum #4076

Merged

jnewbery mentioned this pull request Jan 6, 2021

[code review] Isn't this check redundant? #19223

Closed

bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021

Split headers-only versions off CheckBlock/AcceptBlock #3884

Split headers-only versions off CheckBlock/AcceptBlock #3884

Uh oh!

Conversation

sipa commented Mar 16, 2014

Uh oh!

luke-jr commented Mar 16, 2014

Uh oh!

sipa commented Mar 16, 2014

Uh oh!

sipa commented Mar 17, 2014

Uh oh!

sipa commented Mar 29, 2014

Uh oh!

sipa commented Apr 8, 2014

Uh oh!

sipa commented Apr 8, 2014

Uh oh!

mikehearn commented Apr 10, 2014

Uh oh!

Diapolo Apr 11, 2014

Choose a reason for hiding this comment

Uh oh!

sipa Apr 11, 2014

Choose a reason for hiding this comment

Uh oh!

rdponticelli commented Apr 16, 2014

Uh oh!

sipa commented Apr 16, 2014

Uh oh!

rdponticelli commented Apr 16, 2014

Uh oh!

rdponticelli commented Apr 16, 2014

Uh oh!

sipa commented Apr 17, 2014

Uh oh!

rdponticelli commented Apr 17, 2014

Uh oh!

sipa commented Apr 17, 2014

Uh oh!

sipa commented Apr 18, 2014

Uh oh!

sipa commented Apr 24, 2014

Uh oh!

BitcoinPullTester commented Apr 24, 2014

Uh oh!

mikehearn commented Apr 29, 2014

Uh oh!

leofidus commented May 2, 2014

Uh oh!

rebroad commented May 9, 2014

Uh oh!

gmaxwell commented May 9, 2014

Uh oh!

rebroad commented May 9, 2014

Uh oh!

gmaxwell commented May 9, 2014

Uh oh!

laanwj commented May 9, 2014

Uh oh!

sipa commented May 9, 2014

Uh oh!

mikehearn commented May 9, 2014

Uh oh!

laanwj commented May 9, 2014

Uh oh!

mikehearn commented May 9, 2014

Uh oh!

laanwj commented May 9, 2014

Uh oh!

mikehearn commented May 9, 2014

Uh oh!

rebroad commented May 9, 2014

Uh oh!

sipa commented May 9, 2014

Uh oh!

Reviewers

Assignees

Labels

Projects