Skip to content

Conversation

@jonasschnelli
Copy link
Contributor

@jonasschnelli jonasschnelli commented Mar 9, 2018

Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks).

This PR adds a -blocksdir option that allows one to keep the blockfiles and the blockindex external from the data directory (instead of creating symlinks).

I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files.

@jonasschnelli jonasschnelli force-pushed the 2018/03/blocksdir branch 2 times, most recently from 560f507 to 8496b7e Compare March 9, 2018 06:36
Copy link
Contributor

@randolf randolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the size of the blockchain continues to increase, this will facilitate the need to store the blocks on a different hard drive, hence this feature serves a probable future need. There are likely other uses for this feature too.

@eklitzke
Copy link
Contributor

The argument for keeping the block index in the data dir would be that in the vast majority of cases it means the block index will be on the same filesystem as the utxo database, and therefore we can rely on regular filesystem semantics regarding data ordering. If the block index and utxo database are on different filesystems then it's not possible to guarantee that one is ahead of the other without fully serializing the operations with fsync.

I will try to look through the code more and think about this more closely. I think this change is fine the way the code currently works because the flushes are basically serialized anyway (and we fsync for block storage and block index updates), but this might become annoying long term.

@jonasschnelli
Copy link
Contributor Author

Added a commit (basically a one-liner) that will make the block index database stick in datadir.

@eklitzke
Copy link
Contributor

utACK Once you fix the python lint failure.

@promag
Copy link
Contributor

promag commented Mar 11, 2018

Concept ACK.

src/util.h Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering why this should be net specific, considering that we have the same concept for wallets and the GetWalletDir is not net specific.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the blocks directory by definition net specific?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove ;.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add

blocksdir = os.path.join(self.options.tmpdir, "blocksdir")

and reuse here and below.

src/txdb.cpp Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"blockindex"? Should be "blocks" / "index"?

Maybe add GetBlockIndexDir()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this seems wrong.
I tried providing my existing blocks directory:

$ src/bitcoind -testnet -blocksdir=/store2/tmp/bitcoin/testnet3/blocks -printtoconsole
2018-03-15T19:29:04Z Opening LevelDB in /.../.bitcoin/testnet3/blockindex
2018-03-15T19:29:04Z Opened LevelDB successfully
...
2018-03-15T19:29:05Z : Error initializing block database.
Please restart with -reindex or -reindex-chainstate to recover.
: Error initializing block database.
Please restart with -reindex or -reindex-chainstate to recover.

This fails. Shouldn't it use the index directory within the blocksdir?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the same directory looks most obvious correct, though, if you switch the blocks dir to a different path (with -blocksdir=<path>) users may run into troubles since the old blocks dir may still be in the datadir...
But I agree, lets keep "block" / "Index" for now...
fixed.

@laanwj
Copy link
Member

laanwj commented Mar 13, 2018

Concept ACK, this would avoid some symlinking gymnastics I'm currently doing.
(ref; #10922)

@laanwj laanwj self-assigned this Mar 13, 2018
@hkjn
Copy link
Contributor

hkjn commented Mar 13, 2018

ACK 34ec8e3, once lint issues are fixed.

Does what it says on the tin, seems like a useful feature.

@eklitzke
Copy link
Contributor

I might have to retract my earlier statement, as @sipa says here that the blocks index and block files should be on the same filesystem, contradicting my earlier statement. I'm not sure if I understand that reasoning though, as we fsync after writing the block and before committing to the block index, which should be fine across filesystems. If it's really true that they need to be on the same filesystem that would be problematic in the future if we ever wanted to merge the block index and the chainstate index.

@laanwj
Copy link
Member

laanwj commented Mar 15, 2018

I think this is almost ready. Will test. However, python-nanny wants you to fix:

./test/functional/feature_blocksdir.py:20:63: E703 statement ends with a semicolon
^---- failure generated from contrib/devtools/lint-python.sh

@donaloconnor
Copy link
Contributor

donaloconnor commented Mar 15, 2018

utACK

@laanwj
Copy link
Member

laanwj commented Mar 19, 2018

I might have to retract my earlier statement, as @sipa says here that the blocks index and block files should be on the same filesystem

Wouldn't this imply that we're better off without commit 4855b9e? That would keep the block index with the blocks, which is @sipa's preference.

@jonasschnelli
Copy link
Contributor Author

I might have to retract my earlier statement, as @sipa says here that the blocks index and block files should be on the same filesystem

Wouldn't this imply that we're better off without commit 4855b9e? That would keep the block index with the blocks, which is @sipa's preference.

@sipa: can you elaborate your concerns of having the index outside of the blocks directory?

@sipa
Copy link
Member

sipa commented Mar 20, 2018

@jonasschnelli @eklitzke I don't think there's an actual problem with having the blocks and the block index on separate filesystems. I was worried about them going out of sync due to management (you copy one but not the other, etc).

@jonasschnelli
Copy link
Contributor Author

Thanks @sipa.
I think then we should keep the last commit (a192636).

@laanwj
Copy link
Member

laanwj commented Mar 27, 2018

In that case, utACK a192636

@laanwj laanwj merged commit a192636 into bitcoin:master Mar 27, 2018
laanwj added a commit that referenced this pull request Mar 27, 2018
…storage

a192636 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli)
f38e4fd QA: Add -blocksdir test (Jonas Schnelli)
386a6b6 Allow to optional specify the directory for the blocks storage (Jonas Schnelli)

Pull request description:

  Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks).

  This PR adds a `-blocksdir` option that allows one to keep the blockfiles and the blockindex external from the data directory (instead of creating symlinks).

  I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files.

Tree-SHA512: f8b9e1a681679eac25076dc30e45e6e12d4b2d9ac4be907cbea928a75af081dbcb0f1dd3e97169ab975f73d0bd15824c00c2a34638f3b284b39017171fce2409
jfhk pushed a commit to jfhk/bitcoin that referenced this pull request Nov 14, 2018
To get the current blocksdir is valuable for debug purposes after
merging bitcoin#12653.
HashUnlimited pushed a commit to HashUnlimited/chaincoin that referenced this pull request Nov 26, 2018
To get the current blocksdir is valuable for debug purposes after
merging bitcoin#12653.
@Empact
Copy link
Contributor

Empact commented Jan 7, 2019

#15124

laanwj added a commit that referenced this pull request Jan 9, 2019
ba8c8b2 Fail if either disk space check fails (Ben Woosley)

Pull request description:

  Rather than both.

  Introduced in 386a6b6, #12653

Tree-SHA512: 24765dd3c62b742c491d7d9a751917c2ce6f3819a8764a7725ce84910ef69bffca07f4c0dfbeed8c4f978a12c4b04a2ac3b8c2ff59602330a8a3e8a68878c41b
laanwj added a commit that referenced this pull request Jan 16, 2019
e4a0c35 Improve blocksdir functional test. (Hennadii Stepanov)
c3f1821 Make blockdir always net specific (Hennadii Stepanov)

Pull request description:

  The blocks directory is net specific by definition.

  Also this prevents the side effect of calling `GetBlocksDir(false)` in the non-mainnet environment.
  Currently a new node creates an unused `blocks\` directory in the root of the data directory when `-testnet` or `-regtest` is specified.

  Refs:
  - #12653
  - #12653 (comment) by @laanwj
  - #14595 (comment)

Tree-SHA512: c9957a68a4a200ebd2010823a56db7e61563afedcb7c9828e86b13f3af2990e07854b622c1f3374756f94574acb3ea32de7d2a399eef6c0623f0e11265155627
}

CBlockTreeDB::CBlockTreeDB(size_t nCacheSize, bool fMemory, bool fWipe) : CDBWrapper(GetDataDir() / "blocks" / "index", nCacheSize, fMemory, fWipe) {
CBlockTreeDB::CBlockTreeDB(size_t nCacheSize, bool fMemory, bool fWipe) : CDBWrapper(gArgs.IsArgSet("-blocksdir") ? GetDataDir() / "blocks" / "index" : GetBlocksDir() / "index", nCacheSize, fMemory, fWipe) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this conditional for -- why not always just GetDataDir() / "blocks" / "index" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -blocksdir parameter lets you put blocks in a completely different directory, e.g. on a slow cheap magnetic harddrive, keeping the rest on your datadir on SSD. blocks/index also needs to be on a fast drive. Ideally tough I think datadir/blocks/index should move to datadir/index/blocks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just mean that no matter whether the gArgs.IsArgSet("-blocksdir") ? conditional here is True / False, either way it results in using "DATADIR/blocks/index". So the conditional seems like just extra code ... though this makes me suspect I'm missing something.

Copy link
Member

@Sjors Sjors May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gArgs.IsArgSet("-blocksdir") ? GetDataDir() / "blocks" / "index" : GetBlocksDir() / "index"

If launched without -blocksdir:

false -> GetBlocksDir() / "index" -> ~/.bitcoin/blocks/index

If launched with with -blocksdir=/mnt/magnetic/blocks:

true -> GetDataDir() / "blocks" / "index" ->   /mnt/magnetic/blocks/index

Note that GetBlocksDir() internally also checks the -blocksdir argument, and create a directory if needed, so it won't use DATADIR in that is set.

bitcoin/src/util.cpp

Lines 607 to 634 in a192636

const fs::path &GetBlocksDir(bool fNetSpecific)
{
LOCK(csPathCached);
fs::path &path = fNetSpecific ? g_blocks_path_cache_net_specific : g_blocks_path_cached;
// This can be called during exceptions by LogPrintf(), so we cache the
// value so we don't have to do memory allocations after that.
if (!path.empty())
return path;
if (gArgs.IsArgSet("-blocksdir")) {
path = fs::system_complete(gArgs.GetArg("-blocksdir", ""));
if (!fs::is_directory(path)) {
path = "";
return path;
}
} else {
path = GetDataDir(false);
}
if (fNetSpecific)
path /= BaseParams().DataDir();
path /= "blocks";
fs::create_directories(path);
return path;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I meant to say either way it uses GetBlocksDir()/"blocks"/"index" but wrote DATADIR. Anyway though as you say, the directory check&create does make it slightly different.

Copy link
Contributor

@promag promag Oct 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If launched with with -blocksdir=/mnt/magnetic/blocks:

true -> GetDataDir() / "blocks" / "index" ->   /mnt/magnetic/blocks/index

This is incorrect, right @Sjors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's like this: -blocksdir=/mnt/magnetic will put blocks in /mnt/magnetic/blocks. The block index will remain at ~/.bitcoin/blocks/index and ~/.bitcoin/blocks will otherwise be empty.

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Oct 14, 2020
…blocks storage

a192636 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli)
f38e4fd QA: Add -blocksdir test (Jonas Schnelli)
386a6b6 Allow to optional specify the directory for the blocks storage (Jonas Schnelli)

Pull request description:

  Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks).

  This PR adds a `-blocksdir` option that allows one to keep the blockfiles and the blockindex external from the data directory (instead of creating symlinks).

  I fist had an option to keep the blockindex within the datadir, but seems to make no sense since accessing the index will (always) lead to access (r/w) the block files.

Tree-SHA512: f8b9e1a681679eac25076dc30e45e6e12d4b2d9ac4be907cbea928a75af081dbcb0f1dd3e97169ab975f73d0bd15824c00c2a34638f3b284b39017171fce2409
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request May 1, 2021
… Allow to specify blocks storage directory

b0e0d55 doc: add '-blocksdir' option to release-notes (furszy)
50e66c2 Turn TryCreateDirectory() into TryCreateDirectories() (Marko Bencun)
035d3c2 util: Simplify path argument for CBlockTreeDB ctor (Hennadii Stepanov)
a571d24 Fail if either disk space check fails (Ben Woosley)
e567003 Improve blocksdir functional test. (Hennadii Stepanov)
ff0ae45 Make blockdir always net specific (Hennadii Stepanov)
13a4119 doc: Clarify -blocksdir usage (Daniel McNally)
a4ff899 -blocksdir: keep blockindex leveldb database in datadir (Jonas Schnelli)
3054076 QA: Add -blocksdir test (Jonas Schnelli)
39b8127 Allow to optional specify the directory for the blocks storage (Jonas Schnelli)
a821b0b BugFix: Move TestingSetup::ECCVerifyHandle member to BasicTestingSetup. (furszy)
5844452 Remove unneeded GetTempPath, only use temp directory in unit tests (furszy)
7fc893b [Test] BasicTestingSetup constructor receiving the network param. (furszy)

Pull request description:

  Grouped two different, zero risk, modifications here to not have to create two/three small and straightforward PRs:

  #### 1) Unit Tests:

  * Fixing basic unit test setup issue, was not initializing the libsecp256k1 context. Thus why we have been forced to use the extended unit testing setup for almost every test case (which includes CConnman, scheduler, validation interface, etc.. that are not needed so often).
  * Refactor: `BasicTestingSetup` receiving the network in the constructor.

  #### 2) Back ported the ability to specify a custom directory for the blocks storage.

  > Since the actual block files taking up more and more space, it may be desirable to have them stored in a different location then the data directory (use case: SSD for chainstate, etc., HD for blocks).
  This PR adds a -blocksdir option that allows one to keep the blockfiles external from the data directory (instead of creating symlinks).

  * bitcoin#9895.
  * bitcoin#12653.
  * bitcoin#14364.
  * bitcoin#14409.
  * bitcoin#15124.
  * bitcoin#17059.

ACKs for top commit:
  Fuzzbawls:
    re-ACK b0e0d55
  random-zebra:
    utACK b0e0d55 and merging...

Tree-SHA512: 289e5b826c8fc23a5933a5b30e6c043f82bad827b10f3093cbc5869a9e278d7ad0a325dd19b5ad3c97f23a3dee69f5b0bee3869e486eeed5e8d8342a39919cf5
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 28, 2021
ba8c8b2 Fail if either disk space check fails (Ben Woosley)

Pull request description:

  Rather than both.

  Introduced in 386a6b6, bitcoin#12653

Tree-SHA512: 24765dd3c62b742c491d7d9a751917c2ce6f3819a8764a7725ce84910ef69bffca07f4c0dfbeed8c4f978a12c4b04a2ac3b8c2ff59602330a8a3e8a68878c41b
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 28, 2021
… specific

e4a0c35 Improve blocksdir functional test. (Hennadii Stepanov)
c3f1821 Make blockdir always net specific (Hennadii Stepanov)

Pull request description:

  The blocks directory is net specific by definition.

  Also this prevents the side effect of calling `GetBlocksDir(false)` in the non-mainnet environment.
  Currently a new node creates an unused `blocks\` directory in the root of the data directory when `-testnet` or `-regtest` is specified.

  Refs:
  - bitcoin#12653
  - bitcoin#12653 (comment) by @laanwj
  - bitcoin#14595 (comment)

Tree-SHA512: c9957a68a4a200ebd2010823a56db7e61563afedcb7c9828e86b13f3af2990e07854b622c1f3374756f94574acb3ea32de7d2a399eef6c0623f0e11265155627
Munkybooty pushed a commit to Munkybooty/dash that referenced this pull request Jul 21, 2021
2ab9140 Add tooltips for both datadir and blocksdir (Hennadii Stepanov)
3045704 Add "Blocksdir" to Debug window (Hennadii Stepanov)

Pull request description:

  To get the current `blocksdir` is valuable for debug purposes after
  merging bitcoin#12653.

  ![screenshot from 2018-10-02 23-16-52](https://user-images.githubusercontent.com/32963518/46374770-2ef6f580-c69a-11e8-85c2-44a49fa36b28.png)

Tree-SHA512: a93f2c00ee19cf6acb499d3bd9bccf4be8ef01c53c44d917ad401aa4797db02cbccb71a9c24e05262ea09345e15f9299381367fdc6951f21dd3788a4a58d2132
Munkybooty pushed a commit to Munkybooty/dash that referenced this pull request Jul 23, 2021
2ab9140 Add tooltips for both datadir and blocksdir (Hennadii Stepanov)
3045704 Add "Blocksdir" to Debug window (Hennadii Stepanov)

Pull request description:

  To get the current `blocksdir` is valuable for debug purposes after
  merging bitcoin#12653.

  ![screenshot from 2018-10-02 23-16-52](https://user-images.githubusercontent.com/32963518/46374770-2ef6f580-c69a-11e8-85c2-44a49fa36b28.png)

Tree-SHA512: a93f2c00ee19cf6acb499d3bd9bccf4be8ef01c53c44d917ad401aa4797db02cbccb71a9c24e05262ea09345e15f9299381367fdc6951f21dd3788a4a58d2132
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.