StorageEmbeddedRocksDB#15073
|
How to make |
|
The script which is executed for fast builds is here: If cmake for Rocksdb will respect ClickHouse/docker/test/fasttest/run.sh Line 86 in 4cd7de1 See also #9226 |
It would be better to allow an arbitrary set of columns and a PRIMARY KEY specification. It would greatly improve the usability of this feature!
Allowing extra options to be passed would be very useful (there are tons of them: compression, threads, bloom filters, hash index), and not only rocksdb options but also options for the column family. There are APIs that create options from a map:
- rocksdb::GetDBOptionsFromMap
- rocksdb::GetColumnFamilyOptionsFromMap
And a separate option for TTL, i.e. rocksdb::DBWithTTL::Open.
Plus rocksdb exports tons of statistics:
- rocksdb::CreateDBStatistics
- plus there are histograms (rocksdb::StatsLevel::kExceptHistogramOrTimers), but they are a little bit heavy
It is worth exporting them via some metrics or system.rocksdb
We can split the task to two steps and add support for options in the second step.
Ok, since there are lots of options in rocksdb, I'd like to continue with options support in another PR.
|
@alexey-milovidov |
As far as I understand, rocksdb uses sync_file_range, while there is no support for it in the glibc-compatibility layer, so you can:
@sundy-li BTW, do you have any performance numbers?
* unique the keys
* add inputstream && outputstream
It's strange that the PR view on GitHub shows lots of changes after I pull with rebase from origin/master, but it looks normal when I run git diff origin/master locally. Maybe it's cached by GitHub.
Ok.
I don't know the reason but don't worry... |
M(550, CONDITIONAL_TREE_PARENT_NOT_FOUND) \
M(551, ILLEGAL_PROJECTION_MANIPULATOR) \
M(552, UNRECOGNIZED_ARGUMENTS) \
M(553, ROCKSDB_ERROR) \
@alexey-milovidov Error code is duplicated (after syncing with upstream.)
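To illustrate how a duplicated code can slip into an X-macro list like the one in the diff above, here is a minimal sketch. It assumes a shortened, hypothetical list (APPLY_FOR_EXAMPLE_ERROR_CODES) in which the last entry deliberately reuses 552, as might happen after a merge with upstream; ErrorCodeEntry, listErrorCodes and findDuplicates are illustrative names, not part of ClickHouse.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, shortened list written in the same M(code, NAME) style as the diff.
// The last entry deliberately reuses 552 to simulate a conflict after a merge.
#define APPLY_FOR_EXAMPLE_ERROR_CODES(M) \
    M(550, CONDITIONAL_TREE_PARENT_NOT_FOUND) \
    M(551, ILLEGAL_PROJECTION_MANIPULATOR) \
    M(552, UNRECOGNIZED_ARGUMENTS) \
    M(552, ROCKSDB_ERROR)

struct ErrorCodeEntry
{
    int code;
    std::string name;
};

// Expand every M(code, NAME) into one {code, "NAME"} entry, in list order.
std::vector<ErrorCodeEntry> listErrorCodes()
{
#define M(VALUE, NAME) {VALUE, #NAME},
    return {APPLY_FOR_EXAMPLE_ERROR_CODES(M)};
#undef M
}

// Return the names of entries whose numeric code was already taken
// by an earlier entry in the list.
std::vector<std::string> findDuplicates(const std::vector<ErrorCodeEntry> & entries)
{
    std::map<int, std::string> seen;
    std::vector<std::string> duplicates;
    for (const auto & entry : entries)
    {
        // emplace() does not overwrite; .second is false when the code is already present.
        if (!seen.emplace(entry.code, entry.name).second)
            duplicates.push_back(entry.name);
    }
    return duplicates;
}
```

Calling findDuplicates(listErrorCodes()) on this list reports ROCKSDB_ERROR, because 552 is already taken by UNRECOGNIZED_ARGUMENTS. A check of this kind could run in a unit test so that a collision is caught before review.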
|
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add StorageEmbeddedRocksdb Engine
Detailed description / Documentation draft:
StorageEmbeddedRocksdb. As the name suggests, it introduces rocksdb as an embedded database within ClickHouse. This is just a simple implementation.
Motivations for adding this feature
- If we lower the setting index_granularity_bytes, ClickHouse may work somewhat like a key-value LSM-Tree database (with keys and values in separate files), yet a pure key-value store avoids much more extra read IO than a column store.
- At bytedance, bigo, and suning, rocksdb is used for caching the MergeTreeParts. It significantly reduces the loading time for large sets of tables, because sequential IO reads perform better than random reads on HDD.
- At bigo, we introduced pilosa to ClickHouse; large bitmapState values will be stored in rocksdb. It performs better than count distinct in UV metrics calculation, and we can join the results with normal OLAP queries.
Why not StorageRocksdb instead of StorageEmbeddedRocksdb? I think either name is OK.