Allow data in square brackets in JSONAsString format #25633
nikitamikhaylov merged 7 commits into ClickHouse:master
/// Some input formats can have non trivial readPrefix() and readSuffix(),
/// so in some cases there is no possibility to use parallel parsing.
/// The checker should return true if parallel parsing should be disabled.
using NonTrivialPrefixAndSuffixChecker = std::function<bool(ReadBuffer & buf)>;

if (buf.eof() || *buf.position() == '[')
    parallel_parsing = false; /// Disable it for JSONEachRow if data is in square brackets (see JSONEachRowRowInputFormat)
const auto & non_trivial_prefix_and_suffix_checker = getCreators(name).non_trivial_prefix_and_suffix_checker;
/// Disable parallel parsing for input formats with non-trivial readPrefix() and readSuffix().
Can we just make it a bool false flag? The number of crutches here is becoming unmanageable. Instead, we should parse the prefix properly when parsing in parallel (this is a topic for a separate PR).
Disabling it for JSONEachRow would technically be a performance regression, but maybe it'll give us a chance to implement proper parallel parsing already.
Added here #8958
> Can we just make it a bool false flag?
If you mean non_trivial_prefix_and_suffix_checker, then I guess not, because we need a function that checks the prefix for a specific format.
> but maybe it'll give us a chance to implement proper parallel parsing already
It would be great.
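For readers following the thread, here is a minimal sketch of why a per-format function is used rather than a plain boolean: the decision depends on the first byte of the actual input. Only the `NonTrivialPrefixAndSuffixChecker` name and the `eof() || == '['` condition come from the diff; `PeekableInput`, `startsWithArrayBracket`, and `CheckerRegistry` are hypothetical stand-ins, not ClickHouse classes.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a read buffer: it only lets us peek at the next byte.
struct PeekableInput
{
    std::string_view data;
    std::size_t pos = 0;

    bool eof() const { return pos >= data.size(); }
    char peek() const { return data[pos]; }
};

// Returns true if parallel parsing should be disabled for this input.
using NonTrivialPrefixAndSuffixChecker = std::function<bool(const PeekableInput &)>;

// Mirrors the condition from the diff: empty input or a leading '['.
inline bool startsWithArrayBracket(const PeekableInput & in)
{
    return in.eof() || in.peek() == '[';
}

// Hypothetical registry mapping a format name to its optional checker.
class CheckerRegistry
{
public:
    void registerChecker(const std::string & format, NonTrivialPrefixAndSuffixChecker checker)
    {
        checkers[format] = std::move(checker);
    }

    // Parallel parsing stays enabled unless a registered checker objects.
    bool canParseInParallel(const std::string & format, const PeekableInput & in) const
    {
        auto it = checkers.find(format);
        if (it == checkers.end() || !it->second)
            return true;
        return !it->second(in);
    }

private:
    std::map<std::string, NonTrivialPrefixAndSuffixChecker> checkers;
};
```

With a checker registered for a format, the same format can still be parsed in parallel when the data is not wrapped in square brackets, which a static flag could not express.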
Internal documentation ticket: DOCSUP-13988
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Support the case when the data is enclosed in an array in the JSONAsString input format. Closes #25517.
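To illustrate the behaviour described in the changelog entry, the sketch below turns each top-level JSON object into one raw string, whether or not the objects are wrapped in `[` and `]`. It is a simplified illustration under stated assumptions, not the ClickHouse implementation: `splitObjectsAsStrings` is a hypothetical helper, it does not validate the JSON inside each object, and it treats commas between objects as optional.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: returns every top-level JSON object as a raw string.
// Accepts both `{...} {...}` and `[{...}, {...}]`.
inline std::vector<std::string> splitObjectsAsStrings(std::string_view input)
{
    std::vector<std::string> rows;
    std::size_t i = 0;
    auto skip_ws = [&] {
        while (i < input.size() && std::isspace(static_cast<unsigned char>(input[i])))
            ++i;
    };

    skip_ws();
    const bool in_array = i < input.size() && input[i] == '[';
    bool closed = !in_array;
    if (in_array)
        ++i; // consume the opening bracket (the "prefix")

    while (true)
    {
        skip_ws();
        if (i >= input.size())
            break;
        if (in_array && input[i] == ']') { ++i; closed = true; break; }
        if (input[i] == ',') { ++i; continue; } // separator between objects
        if (input[i] != '{')
            throw std::runtime_error("expected '{'");

        // Scan one object, tracking brace depth and skipping string contents.
        const std::size_t start = i;
        int depth = 0;
        bool in_string = false;
        for (; i < input.size(); ++i)
        {
            const char c = input[i];
            if (in_string)
            {
                if (c == '\\') ++i;              // skip the escaped character
                else if (c == '"') in_string = false;
            }
            else if (c == '"') in_string = true;
            else if (c == '{') ++depth;
            else if (c == '}' && --depth == 0) { ++i; break; }
        }
        if (depth != 0)
            throw std::runtime_error("unterminated object");
        rows.emplace_back(input.substr(start, i - start));
    }

    if (!closed)
        throw std::runtime_error("missing closing ']'");
    return rows;
}
```

Braces inside string values are ignored by the scanner, so an object such as `{"b":"x}y"}` is kept whole.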