-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Allow to optional specify the directory for the blocks storage #12653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
560f507 to
8496b7e
Compare
randolf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the size of the blockchain continues to increase, this will facilitate the need to store the blocks on a different hard drive, hence this feature serves a probable future need. There are likely other uses for this feature too.
|
The argument for keeping the block index in the data dir would be that in the vast majority of cases it means the block index will be on the same filesystem as the utxo database, and therefore we can rely on regular filesystem semantics regarding data ordering. If the block index and utxo database are on different filesystems then it's not possible to guarantee that one is ahead of the other without fully serializing the operations with fsync. I will try to look through the code more and think about this more closely. I think this change is fine the way the code currently works because the flushes are basically serialized anyway (and we fsync for block storage and block index updates), but this might become annoying long term. |
8496b7e to
34ec8e3
Compare
|
Added a commit (basically a one-liner) that will make the block index database stick in datadir. |
|
utACK Once you fix the python lint failure. |
|
Concept ACK. |
src/util.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering why this should be net specific, considering that we have the same concept for wallets and the GetWalletDir is not net specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the blocks directory by definition net specific?
test/functional/feature_blocksdir.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove ;.
test/functional/feature_blocksdir.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add
blocksdir = os.path.join(self.options.tmpdir, "blocksdir")and reuse here and below.
src/txdb.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"blockindex"? Should be "blocks" / "index"?
Maybe add GetBlockIndexDir()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this seems wrong.
I tried providing my existing blocks directory:
$ src/bitcoind -testnet -blocksdir=/store2/tmp/bitcoin/testnet3/blocks -printtoconsole
2018-03-15T19:29:04Z Opening LevelDB in /.../.bitcoin/testnet3/blockindex
2018-03-15T19:29:04Z Opened LevelDB successfully
...
2018-03-15T19:29:05Z : Error initializing block database.
Please restart with -reindex or -reindex-chainstate to recover.
: Error initializing block database.
Please restart with -reindex or -reindex-chainstate to recover.
This fails. Shouldn't it use the index directory within the blocksdir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the same directory looks most obvious correct, though, if you switch the blocks dir to a different path (with -blocksdir=<path>) users may run into troubles since the old blocks dir may still be in the datadir...
But I agree, lets keep "block" / "Index" for now...
fixed.
|
Concept ACK, this would avoid some symlinking gymnastics I'm currently doing. |
|
ACK 34ec8e3, once lint issues are fixed. Does what it says on the tin, seems like a useful feature. |
|
I might have to retract my earlier statement, as @sipa says here that the blocks index and block files should be on the same filesystem, contradicting my earlier statement. I'm not sure if I understand that reasoning though, as we fsync after writing the block and before committing to the block index, which should be fine across filesystems. If it's really true that they need to be on the same filesystem that would be problematic in the future if we ever wanted to merge the block index and the chainstate index. |
|
|
|
utACK |
34ec8e3 to
4855b9e
Compare
4855b9e to
a192636
Compare
@sipa: can you elaborate your concerns of having the index outside of the blocks directory? |
|
@jonasschnelli @eklitzke I don't think there's an actual problem with having the blocks and the block index on separate filesystems. I was worried about them going out of sync due to management (you copy one but not the other, etc). |
|
In that case, utACK a192636 |
…storage a192636 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli) f38e4fd QA: Add -blocksdir test (Jonas Schnelli) 386a6b6 Allow to optional specify the directory for the blocks storage (Jonas Schnelli) Pull request description: Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks). This PR adds a `-blocksdir` option that allows one to keep the blockfiles and the blockindex external from the data directory (instead of creating symlinks). I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files. Tree-SHA512: f8b9e1a681679eac25076dc30e45e6e12d4b2d9ac4be907cbea928a75af081dbcb0f1dd3e97169ab975f73d0bd15824c00c2a34638f3b284b39017171fce2409
To get the current blocksdir is valuable for debug purposes after merging bitcoin#12653.
To get the current blocksdir is valuable for debug purposes after merging bitcoin#12653.
e4a0c35 Improve blocksdir functional test. (Hennadii Stepanov) c3f1821 Make blockdir always net specific (Hennadii Stepanov) Pull request description: The blocks directory is net specific by definition. Also this prevents the side effect of calling `GetBlocksDir(false)` in the non-mainnet environment. Currently a new node creates an unused `blocks\` directory in the root of the data directory when `-testnet` or `-regtest` is specified. Refs: - #12653 - #12653 (comment) by @laanwj - #14595 (comment) Tree-SHA512: c9957a68a4a200ebd2010823a56db7e61563afedcb7c9828e86b13f3af2990e07854b622c1f3374756f94574acb3ea32de7d2a399eef6c0623f0e11265155627
| } | ||
|
|
||
| CBlockTreeDB::CBlockTreeDB(size_t nCacheSize, bool fMemory, bool fWipe) : CDBWrapper(GetDataDir() / "blocks" / "index", nCacheSize, fMemory, fWipe) { | ||
| CBlockTreeDB::CBlockTreeDB(size_t nCacheSize, bool fMemory, bool fWipe) : CDBWrapper(gArgs.IsArgSet("-blocksdir") ? GetDataDir() / "blocks" / "index" : GetBlocksDir() / "index", nCacheSize, fMemory, fWipe) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this conditional for -- why not always just GetDataDir() / "blocks" / "index" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -blocksdir parameter lets you put blocks in a completely different directory, e.g. on a slow cheap magnetic harddrive, keeping the rest on your datadir on SSD. blocks/index also needs to be on a fast drive. Ideally tough I think datadir/blocks/index should move to datadir/index/blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I just mean that no matter whether the gArgs.IsArgSet("-blocksdir") ? conditional here is True / False, either way it results in using "DATADIR/blocks/index". So the conditional seems like just extra code ... though this makes me suspect I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gArgs.IsArgSet("-blocksdir") ? GetDataDir() / "blocks" / "index" : GetBlocksDir() / "index"
If launched without -blocksdir:
false -> GetBlocksDir() / "index" -> ~/.bitcoin/blocks/index
If launched with with -blocksdir=/mnt/magnetic/blocks:
true -> GetDataDir() / "blocks" / "index" -> /mnt/magnetic/blocks/index
Note that GetBlocksDir() internally also checks the -blocksdir argument, and create a directory if needed, so it won't use DATADIR in that is set.
Lines 607 to 634 in a192636
| const fs::path &GetBlocksDir(bool fNetSpecific) | |
| { | |
| LOCK(csPathCached); | |
| fs::path &path = fNetSpecific ? g_blocks_path_cache_net_specific : g_blocks_path_cached; | |
| // This can be called during exceptions by LogPrintf(), so we cache the | |
| // value so we don't have to do memory allocations after that. | |
| if (!path.empty()) | |
| return path; | |
| if (gArgs.IsArgSet("-blocksdir")) { | |
| path = fs::system_complete(gArgs.GetArg("-blocksdir", "")); | |
| if (!fs::is_directory(path)) { | |
| path = ""; | |
| return path; | |
| } | |
| } else { | |
| path = GetDataDir(false); | |
| } | |
| if (fNetSpecific) | |
| path /= BaseParams().DataDir(); | |
| path /= "blocks"; | |
| fs::create_directories(path); | |
| return path; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I meant to say either way it uses GetBlocksDir()/"blocks"/"index" but wrote DATADIR. Anyway though as you say, the directory check&create does make it slightly different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If launched with with
-blocksdir=/mnt/magnetic/blocks:true -> GetDataDir() / "blocks" / "index" -> /mnt/magnetic/blocks/index
This is incorrect, right @Sjors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's like this: -blocksdir=/mnt/magnetic will put blocks in /mnt/magnetic/blocks. The block index will remain at ~/.bitcoin/blocks/index and ~/.bitcoin/blocks will otherwise be empty.
…blocks storage a192636 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli) f38e4fd QA: Add -blocksdir test (Jonas Schnelli) 386a6b6 Allow to optional specify the directory for the blocks storage (Jonas Schnelli) Pull request description: Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks). This PR adds a `-blocksdir` option that allows one to keep the blockfiles and the blockindex external from the data directory (instead of creating symlinks). I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files. Tree-SHA512: f8b9e1a681679eac25076dc30e45e6e12d4b2d9ac4be907cbea928a75af081dbcb0f1dd3e97169ab975f73d0bd15824c00c2a34638f3b284b39017171fce2409
… Allow to specify blocks storage directory b0e0d55 doc: add '-blocksdir' option to release-notes (furszy) 50e66c2 Turn TryCreateDirectory() into TryCreateDirectories() (Marko Bencun) 035d3c2 util: Simplify path argument for CBlockTreeDB ctor (Hennadii Stepanov) a571d24 Fail if either disk space check fails (Ben Woosley) e567003 Improve blocksdir functional test. (Hennadii Stepanov) ff0ae45 Make blockdir always net specific (Hennadii Stepanov) 13a4119 doc: Clarify -blocksdir usage (Daniel McNally) a4ff899 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli) 3054076 QA: Add -blocksdir test (Jonas Schnelli) 39b8127 Allow to optional specify the directory for the blocks storage (Jonas Schnelli) a821b0b BugFix: Move TestingSetup::ECCVerifyHandle member to BasicTestingSetup. (furszy) 5844452 Remove unneeded GetTempPath, only use temp directory in unit tests (furszy) 7fc893b [Test] BasicTestingSetup constructor receiving the network param. (furszy) Pull request description: Grouped two different, zero risk, modifications here to not have to create two/three small and straightforward PRs: #### 1) Unit Tests: * Fixing basic unit test setup issue, was not initializing the libsecp256k1 context. Thus why we have been forced to use the extended unit testing setup for almost every test case (which includes CConnman, scheduler, validation interface, etc.. that are not needed so often). * Refactor: `BasicTestingSetup` receiving the network in the constructor. #### 2) Back ported the ability to specify a custom directory for the blocks storage. > Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks). This PR adds a -blocksdir option that allows one to keep the blockfiles external from the data directory (instead of creating symlinks). * bitcoin#9895. * bitcoin#12653. * bitcoin#14364. * bitcoin#14409. * bitcoin#15124. * bitcoin#17059. ACKs for top commit: Fuzzbawls: re-ACK b0e0d55 random-zebra: utACK b0e0d55 and merging... Tree-SHA512: 289e5b826c8fc23a5933a5b30e6c043f82bad827b10f3093cbc5869a9e278d7ad0a325dd19b5ad3c97f23a3dee69f5b0bee3869e486eeed5e8d8342a39919cf5
ba8c8b2 Fail if either disk space check fails (Ben Woosley) Pull request description: Rather than both. Introduced in 386a6b6, bitcoin#12653 Tree-SHA512: 24765dd3c62b742c491d7d9a751917c2ce6f3819a8764a7725ce84910ef69bffca07f4c0dfbeed8c4f978a12c4b04a2ac3b8c2ff59602330a8a3e8a68878c41b
… specific e4a0c35 Improve blocksdir functional test. (Hennadii Stepanov) c3f1821 Make blockdir always net specific (Hennadii Stepanov) Pull request description: The blocks directory is net specific by definition. Also this prevents the side effect of calling `GetBlocksDir(false)` in the non-mainnet environment. Currently a new node creates an unused `blocks\` directory in the root of the data directory when `-testnet` or `-regtest` is specified. Refs: - bitcoin#12653 - bitcoin#12653 (comment) by @laanwj - bitcoin#14595 (comment) Tree-SHA512: c9957a68a4a200ebd2010823a56db7e61563afedcb7c9828e86b13f3af2990e07854b622c1f3374756f94574acb3ea32de7d2a399eef6c0623f0e11265155627
2ab9140 Add tooltips for both datadir and blocksdir (Hennadii Stepanov) 3045704 Add "Blocksdir" to Debug window (Hennadii Stepanov) Pull request description: To get the current `blocksdir` is valuable for debug purposes after merging bitcoin#12653.  Tree-SHA512: a93f2c00ee19cf6acb499d3bd9bccf4be8ef01c53c44d917ad401aa4797db02cbccb71a9c24e05262ea09345e15f9299381367fdc6951f21dd3788a4a58d2132
2ab9140 Add tooltips for both datadir and blocksdir (Hennadii Stepanov) 3045704 Add "Blocksdir" to Debug window (Hennadii Stepanov) Pull request description: To get the current `blocksdir` is valuable for debug purposes after merging bitcoin#12653.  Tree-SHA512: a93f2c00ee19cf6acb499d3bd9bccf4be8ef01c53c44d917ad401aa4797db02cbccb71a9c24e05262ea09345e15f9299381367fdc6951f21dd3788a4a58d2132
Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks).
This PR adds a
-blocksdiroption that allows one to keep the blockfilesand the blockindexexternal from the data directory (instead of creating symlinks).I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files.