Conversation
|
Workflow [PR], commit [2ca6b0e] Summary: ❌
|
Ergus
left a comment
There was a problem hiding this comment.
Reviewed up to src/Storages/MergeTree/MergeTreeReaderTextIndex.cpp
src/Processors/QueryPlan/Optimizations/optimizeDirectReadFromTextIndex.cpp
Outdated
Show resolved
Hide resolved
| out.query_string = std::make_unique<GinQueryString>(); | ||
| const auto & value = const_value.safeGet<String>(); | ||
| token_extractor->stringToGinFilter(value.data(), value.size(), *out.query_string); | ||
| // const auto & value = const_value.safeGet<String>(); |
There was a problem hiding this comment.
These functions are not supported, so initializing query_string is pointless for them I think.
| } | ||
| } | ||
|
|
||
| if (!index_for_step.empty() && !non_index_column.empty()) |
There was a problem hiding this comment.
IIUC If this happens is already too late and far from the problem source . Could we prevent the issue earlier or at least add a comment on where to look?
There was a problem hiding this comment.
Yes, we can do this before returning from getReadTaskColumns.
This comment was marked as resolved.
This comment was marked as resolved.
7197270 to
00bfa6f
Compare
| static const String ARGUMENT_TOKENIZER = "tokenizer"; | ||
| static const String ARGUMENT_NGRAM_SIZE = "ngram_size"; | ||
| static const String ARGUMENT_SEPARATORS = "separators"; | ||
| static const String ARGUMENT_DICTIONARY_BLOCK_SIZE = "dictionary_block_size"; |
There was a problem hiding this comment.
Once ready for merge, may I ask to update docs/en/engines/table-engines/mergetree-family/invertedindexes.md
- remove mentions of deleted index parameter
segment_digestion_threshold_bytes - add mentino of the two new index parameters (can be brief, so we at least don't forget).
There was a problem hiding this comment.
I forgot to adjust the documentation, let's do it in a separate PR.
|
Stateless tests (amd_debug, AsyncInsert, s3 storage, parallel):
Stateless tests (amd_tsan, sequential, 2/2):
Integration tests (arm_binary, distributed plan, 1/4):
Stress test (amd_tsan): Performance Comparison (amd_release, master_head, 1/3):
|
2fdffe4
| } | ||
| } | ||
|
|
||
| storage_snapshot = std::make_shared<StorageSnapshot>( |
There was a problem hiding this comment.
This will likely break enable_shared_storage_snapshot_in_query = 1 #79471 . I ran into the same issue when developing projection index, and I also haven’t found a good solution yet.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
The inverted text index was reworked from scratch to be scalable for datasets that don't fit into RAM.
Resolves #87003.