Skip to content

Allow data in square brackets in JSONAsString format#25633

Merged
nikitamikhaylov merged 7 commits intoClickHouse:masterfrom
Avogar:json-as-string
Aug 30, 2021
Merged

Allow data in square brackets in JSONAsString format#25633
nikitamikhaylov merged 7 commits intoClickHouse:masterfrom
Avogar:json-as-string

Conversation

@Avogar
Copy link
Copy Markdown
Member

@Avogar Avogar commented Jun 23, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Support the case when the data is enclosed in array in JSONAsString input format. Closes #25517.

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Jun 23, 2021
/// Some input formats can have non trivial readPrefix() and readSuffix(),
/// so in some cases there is no possibility to use parallel parsing.
/// The checker should return true if parallel parsing should be disabled.
using NonTrivialPrefixAndSuffixChecker = std::function<bool(ReadBuffer & buf)>;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Ok

if (buf.eof() || *buf.position() == '[')
parallel_parsing = false; /// Disable it for JSONEachRow if data is in square brackets (see JSONEachRowRowInputFormat)
const auto & non_trivial_prefix_and_suffix_checker = getCreators(name).non_trivial_prefix_and_suffix_checker;
/// Disable parallel parsing for input formats with non-trivial readPrefix() and readSuffix().
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just make it a bool false flag? The amount of crutches here is becoming unmanageable. Instead we should parse the prefix properly when parsing in parallel (this is a topic for a separate PR).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling it for JSONEachRow would technically be a performance regression, but maybe it'll give us a chance to implement proper parallel parsing already.
Added here #8958

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just make it a bool false flag?

If you are about non_trivial_prefix_and_suffix_checker then I guess no, because we need a function that checks the prefix for a specific format.

but maybe it'll give us a chance to implement proper parallel parsing already

It would be great.

@Avogar
Copy link
Copy Markdown
Member Author

Avogar commented Aug 5, 2021

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 5, 2021

Command update: success

Branch has been successfully updated

@nikitamikhaylov
Copy link
Copy Markdown
Member

@Mergifyio update

@nikitamikhaylov nikitamikhaylov self-assigned this Aug 12, 2021
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 12, 2021

Command update: success

Branch has been successfully updated

@nikitamikhaylov
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 26, 2021

Command update: success

Branch has been successfully updated

@nikitamikhaylov nikitamikhaylov merged commit fb66ab7 into ClickHouse:master Aug 30, 2021
@sevirov
Copy link
Copy Markdown
Contributor

sevirov commented Aug 30, 2021

Internal documentation ticket: DOCSUP-13988

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JSONAsString does not support the case when the data is enclosed in array.

5 participants