Skip to content

Add JSONExtractKeys function#30056

Merged
kitaisreal merged 4 commits intoClickHouse:masterfrom
orloffv:orloffv.JSONExtractKeys
Oct 14, 2021
Merged

Add JSONExtractKeys function#30056
kitaisreal merged 4 commits intoClickHouse:masterfrom
orloffv:orloffv.JSONExtractKeys

Conversation

@orloffv
Copy link
Copy Markdown
Contributor

@orloffv orloffv commented Oct 12, 2021

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added new JSONExtractKeys function
...

Detailed description / Documentation draft:
JSONExtractKeys(json[, indices_or_keys]…)
Returns an array with keys of JSON.
Example:

SELECT JSONExtractKeys('{"a": "hello", "b": [-100, 200.0, 300]}') = ('a','b');

By adding documentation, you'll allow users to try your new feature immediately, not when someone else will have time to document it later. Documentation is necessary for all features that affect user experience in any way. You can add brief documentation draft above, or add documentation right into your patch as Markdown files in docs folder.

If you are doing this for the first time, it's recommended to read the lightweight Contributing to ClickHouse Documentation guide first.

Information about CI checks: https://clickhouse.tech/docs/en/development/continuous-integration/

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Oct 12, 2021
@kitaisreal kitaisreal self-assigned this Oct 12, 2021
auto object = element.getObject();

auto & col_arr = assert_cast<ColumnArray &>(dest);
auto & col_tuple = assert_cast<ColumnTuple &>(col_arr.getData());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you have Array(Tuple(String))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the correct type is Array(String).


static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &)
{
return std::make_unique<DataTypeString>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But here it's just String.

@orloffv orloffv force-pushed the orloffv.JSONExtractKeys branch from 8ef7001 to b101412 Compare October 13, 2021 05:39
@orloffv orloffv force-pushed the orloffv.JSONExtractKeys branch from b101412 to 1623bfa Compare October 13, 2021 06:55
@orloffv
Copy link
Copy Markdown
Contributor Author

orloffv commented Oct 13, 2021

My business case is:
get only keys in json for suggest

perfomance test

CREATE TABLE json
(
    `json` String
)
ENGINE = MergeTree
ORDER BY json
INSERT INTO json
SELECT '{"dateTime": "2021-10-13 16:13:36", "additional": {"message_formatted": "Request timeout: search request deadline reached: remaining 2991 ms, exception: ", "thrown_message": "java.util.concurrent.TimeoutException: Request timeout to unknown_host/2a02:31:0:34f00:0:0126:8099 after 1000 ms", "logger": "com.google.test.suggest", "thrown_name": "java.util.concurrent.ExecutionException", "yd_stage": "production-preferences-production-alpha"}}' as json
FROM system.numbers
limit 100000000

JSONExtractKeysAndValuesRaw with arrayMap

SELECT DISTINCT arrayJoin(arrayMap(x -> tupleElement(x, 1), JSONExtractKeysAndValuesRaw(json)))
FROM json
2 rows in set. Elapsed: 12.162 sec. Processed 100.00 million rows, 44.90 GB (8.22 million rows/s., 3.69 GB/s.)

JSONExtractKeys

SELECT DISTINCT arrayJoin(JSONExtractKeys(json))
FROM json
2 rows in set. Elapsed: 3.306 sec. Processed 100.00 million rows, 44.90 GB (30.25 million rows/s., 13.58 GB/s.)

@orloffv
Copy link
Copy Markdown
Contributor Author

orloffv commented Oct 13, 2021

@alexey-milovidov do you know better query for my case?

@orloffv orloffv closed this Oct 14, 2021
@orloffv orloffv reopened this Oct 14, 2021
@kitaisreal kitaisreal merged commit 6c6dd7a into ClickHouse:master Oct 14, 2021
@alexey-milovidov
Copy link
Copy Markdown
Member

@orloffv This query is Ok.
PS. The result is awesome!

@kitaisreal Why don't add it both queries to perf test?

@gyuton
Copy link
Copy Markdown
Contributor

gyuton commented Oct 15, 2021

Internal documentation ticket: DOCSUP-16608.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants