
[Performance] LevelDB options.max_open_files = 64 parameter (Windows 10) #12123

@donaloconnor

Observations

Bitcoind startup performance (Fully synced node)

While running procmon during bitcoind.exe startup, I noticed millions of file open/read/close events on the chainstate LevelDB directory. The high-frequency open/close events occurred during this call in init.cpp:

if (!ActivateBestChain(state, chainparams)) ..

Investigating this further led me to LevelDB's LRUCache. We use a value of 64 for max_open_files:

options.max_open_files = 64; in static leveldb::Options GetOptions(size_t nCacheSize)

As far as I know, and from what I have read online, the LevelDB default is 1000.

I've noticed what I consider significant performance improvements from increasing max_open_files to the default of 1000. This avoids the overhead of many thousands of open/close operations per second on the files in the chainstate dir. It also avoids the unnecessary high-frequency heap allocations made each time a file is opened (LevelDB's Win32RandomAccessFile objects).
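To illustrate why a small handle cache can cause this kind of churn, here is a toy model of an LRU cache of open files. It is only a sketch, not LevelDB's actual cache: `HandleCache` and `CountOpens` are hypothetical names, and the round-robin access pattern is an assumption chosen because it is the worst case for LRU. Real chainstate access patterns will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Toy LRU cache of "open file handles" (hypothetical, not LevelDB code).
class HandleCache {
public:
    explicit HandleCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Simulates a read from file `id`; counts an "open" on every cache miss.
    void Access(int id) {
        auto it = index_.find(id);
        if (it != index_.end()) {
            // Hit: mark as most recently used, no open needed.
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        ++opens_;
        if (lru_.size() == capacity_) {
            // Cache full: "close" the least recently used handle.
            index_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(id);
        index_[id] = lru_.begin();
    }

    std::size_t opens() const { return opens_; }

private:
    std::size_t capacity_;
    std::size_t opens_ = 0;
    std::list<int> lru_;  // front = most recently used
    std::unordered_map<int, std::list<int>::iterator> index_;
};

// Reads `num_files` files round-robin for `passes` passes and
// returns how many times a file had to be opened.
std::size_t CountOpens(std::size_t capacity, int num_files, int passes) {
    HandleCache cache(capacity);
    for (int p = 0; p < passes; ++p) {
        for (int id = 0; id < num_files; ++id) {
            cache.Access(id);
        }
    }
    return cache.opens();
}
```

In this model, once the number of files being cycled through exceeds the cache capacity, every access becomes a miss. For example, `CountOpens(64, 200, 10)` opens a file on each of the 2000 accesses, while `CountOpens(1000, 200, 10)` opens each of the 200 files only once.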

I am not a LevelDB expert, but from what I can gather this value must not exceed the maximum number of file handles the process can have. Even so, 64 seems a bit on the low side.
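One way to respect that constraint would be to clamp the requested value against the handle limit. The sketch below is only an assumption about how this could look: `ClampMaxOpenFiles`, `fd_limit`, and `reserved` are hypothetical names, and how the limit is obtained on each platform is left out.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: choose a max_open_files value that leaves `reserved`
// handles for everything else in the process (sockets, logs, other databases).
// `fd_limit` is whatever per-process handle limit the platform reports.
int ClampMaxOpenFiles(int requested, int fd_limit, int reserved) {
    assert(requested > 0 && fd_limit > 0 && reserved >= 0);
    const int kFallback = 64;  // the value currently hard-coded in GetOptions
    const int available = fd_limit - reserved;
    // If there is too little headroom, fall back to the current value of 64
    // rather than going lower than what is used today.
    return std::max(kFallback, std::min(requested, available));
}
```

With illustrative numbers, `ClampMaxOpenFiles(1000, 2048, 150)` returns 1000, `ClampMaxOpenFiles(1000, 512, 150)` returns 362, and `ClampMaxOpenFiles(1000, 100, 80)` falls back to 64.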

Results

Here are some of my results from 5 iterations each with max_open_files = 64 and 1000. The timings measure the ActivateBestChain function call (using a high-resolution timer).
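The timer used for these measurements is not shown. For anyone wanting to reproduce them, a minimal sketch of timing a single call, assuming `std::chrono::steady_clock` as the high-resolution timer, could look like this (`TimeMillis` is a hypothetical helper name):

```cpp
#include <chrono>
#include <utility>

// Runs `fn` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is monotonic, so the result is not affected by clock adjustments.
template <typename Fn>
double TimeMillis(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

The call under test would be wrapped in a lambda, e.g. `TimeMillis([&] { ActivateBestChain(state, chainparams); })`, and repeated across fresh process starts so each run begins with the same state.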

[Image: ActivateBestChain timings for max_open_files = 64 and 1000, 5 iterations each]

Questions:

  1. Why did we choose 64 as the fixed global value of max_open_files?
  2. Should we expose max_open_files via a command line option, or should we be smarter about this value, since it has performance benefits (perhaps even for initial chain sync)?

My system:
Dell XPS 17 9560, i7-7700HQ CPU @ 2.8 GHz (2801 MHz), 4 cores, 16 GB RAM, 512 GB PCIe SSD, Windows 10

Release build (MSVC optimizations on /O2)

It would be interesting if someone can try some tests on a Linux machine. It could be related to some overhead with Windows' CreateFile while opening the files.

If it's accepted that this is a performance bottleneck, then I am happy to propose a solution or do more research on this parameter. At a minimum, it could be exposed as a setting or command line option.

Thanks,
Donal
