Skip to content

Conversation

@sipa
Copy link
Member

@sipa sipa commented Nov 7, 2014

There are 3 pieces of data that are maintained on disk. The actual block and undo data, the block index (which can refer to positions on disk), and the chainstate (which refers to the best block hash).

Earlier, there was no guarantee that blocks were written to disk before block index entries referring to them were written. This commit introduces dirty flags for block index data, and delays writing it until the actual block data is flushed.

With this stricter ordering in writes, it is now safe to not always flush after every block, so there is no need for the IsInitialBlockDownload() check there - instead we just write whenever enough time has passed or the cache size grows too large.

In addition, only do a write inside the block processing loop if necessary (because of cache size exceeded). Otherwise, move the writing to a point after processing is done, after relaying.

This should improve block relay speed, by not writing chainstate and index entries before announcing the new hash.

Future work: move block and undo data writing into FlushStateToDisk, and move it to a background thread.

@sipa
Copy link
Member Author

sipa commented Nov 12, 2014

Added wallet tip updating and block/undo file flushing to the FlushStateToDisk() method. Very preliminary testing on my VPS (with very slow I/O) shows a ~100ms decrease in block processing time.

@sipa sipa force-pushed the noflush branch 2 times, most recently from 46cec70 to 56e8308 Compare November 12, 2014 21:48
src/main.cpp Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parametrization on multiple bool parameters leads to hard-to-read code; I'd prefer to pass an enum, or bit field

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I'll add a commit for changing it.

@laanwj
Copy link
Member

laanwj commented Nov 13, 2014

Concept ACK, will test

@laanwj
Copy link
Member

laanwj commented Nov 16, 2014

Found a small issue. When AppInit2 exits prematurely, bitcoind will crash in shutdown with the following:

#0  CCoinsViewCache::GetCacheSize (this=0x0) at coins.cpp:196
#1  0x54afe528 in FlushStateToDisk (state=..., fast=fast@entry=false, forceWrite=forceWrite@entry=true) at main.cpp:1774
#2  0x54afe906 in FlushStateToDisk () at main.cpp:1811
#3  0x54ad5c16 in Shutdown () at init.cpp:153
#4  0x54aca9d6 in AppInit (argc=6, argv=<optimized out>) at bitcoind.cpp:170
#5  0x54acad20 in main (argc=6, argv=0x7efff714) at bitcoind.cpp:182

You need a check pcoinsTip!=0 in Shutdown() before calling FlushStateToDisk.

@sipa
Copy link
Member Author

sipa commented Nov 16, 2014

Fixed.

@laanwj
Copy link
Member

laanwj commented Nov 17, 2014

What about the pcoinstip->Flush() in gettxoutsetinfo https://github.com/bitcoin/bitcoin/blob/master/src/rpcblockchain.cpp#L322 - does this need to be some form of FlushStateToDisk?

@sipa
Copy link
Member Author

sipa commented Nov 17, 2014

@laanwj Ah right, it does. That's sort of inconvienient but inevitable until there's a way to iterate the UTXO entries in a CCoinsView. Writing pcoinsTip to disk means a reference to the tip block hash in the index, so that needs to be written, which on itself refers to disk positions, so blocks need to be written too...

@sipa
Copy link
Member Author

sipa commented Nov 17, 2014

Fixed.

@sipa
Copy link
Member Author

sipa commented Nov 20, 2014

Tested by kill -KILL'ing a node while it was flushing. No problems.

@gmaxwell
Copy link
Contributor

This seems to greatly improves our handling of unclean shutdowns during the IBD. Previously we'd reliably corrupt the database, requring the user to manually delete things. I've tested this with a bunch of killing while syncing and not been able to break it.

@rdponticelli
Copy link
Contributor

I performed a thorough synchronization of a testnet node from scratch using this patch, sending term or kill signals randomly every 180-640 seconds. It worked like a charm. Such test easily triggers #5156 without it.

@gmaxwell
Copy link
Contributor

I think we should seriously consider this for 0.10. It's a big reliablity improvement during the initial sync, beyond its performance implications.

@laanwj
Copy link
Member

laanwj commented Nov 24, 2014

I've subjected this to some horrible crashes, and a full sync w/ DEBUG_LOCKORDER. ACK commithash 9dd92a12fe92b4a81e612acf1c6245b85fa12a73 for 0.10.
https://dev.visucore.com/bitcoin/acks/5241

sipa added 2 commits November 24, 2014 15:15
There are 3 pieces of data that are maintained on disk. The actual block
and undo data, the block index (which can refer to positions on disk),
and the chainstate (which refers to the best block hash).

Earlier, there was no guarantee that blocks were written to disk before
block index entries referring to them were written. This commit introduces
dirty flags for block index data, and delays writing entries until the actual
block data is flushed.

With this stricter ordering in writes, it is now safe to not always flush
after every block, so there is no need for the IsInitialBlockDownload()
check there - instead we just write whenever enough time has passed or
the cache size grows too large. Also updating the wallet's best known block
is delayed until this is done, otherwise the wallet may end up referring to an
unknown block.

In addition, only do a write inside the block processing loop if necessary
(because of cache size exceeded). Otherwise, move the writing to a point
after processing is done, after relaying.
@sipa
Copy link
Member Author

sipa commented Nov 24, 2014

Sorry, needed to rebase after BIP23 block proposals.

@laanwj laanwj added this to the 0.10.0 milestone Nov 24, 2014
@gmaxwell
Copy link
Contributor

ACK.

@laanwj laanwj merged commit a206950 into bitcoin:master Nov 25, 2014
laanwj added a commit that referenced this pull request Nov 25, 2014
a206950 Introduce separate flushing modes (Pieter Wuille)
51ce901 Improve chainstate/blockindex disk writing policy (Pieter Wuille)
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants