Improve JSONEachRow reading by ignoring the keys case#61750
Improve JSONEachRow reading by ignoring the keys case#61750al13n321 merged 7 commits intoClickHouse:masterfrom
Conversation
|
This is an automated comment for commit 0990a82 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
7159bec to
37f028d
Compare
|
I like the idea, but I'd prefer a few things differently:
|
|
done @Avogar |
39949fa to
93de14c
Compare
|
Could this pr be merged ? and support for other formats and modes can be later @divanik @alexey-milovidov |
|
It does not build, and the tests don't pass.
Not later. The code should be complete and consistent. The user experience should be flawless. |
|
OK |
ef86407 to
70d6a62
Compare
|
Dear @divanik, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
|
This is a good change - please resolve conflicts and merge. |
|
Test failures look unrelated:
|
| {"output_format_csv_serialize_tuple_into_separate_columns", true, true, "A new way of how interpret tuples in CSV format was added."}, | ||
| {"input_format_csv_deserialize_separate_columns_into_tuple", true, true, "A new way of how interpret tuples in CSV format was added."}, | ||
| {"input_format_csv_try_infer_strings_from_quoted_tuples", true, true, "A new way of how interpret tuples in CSV format was added."}, | ||
| {"input_format_json_ignore_key_case", false, false, "Ignore json key case while read json field from string."}, |
What changes were proposed in this pull request? (Please fill in changes proposed in this fix) (Fixes: #6262) this pr is depend on clickhouse pr: ClickHouse/ClickHouse#61750 How was this patch tested? test by ut
|
It is implemented not like we needed. Let's revert. |
| Block::NameMap name_map; | ||
|
|
||
| /// Hash table match `lower_case field name -> field name in the block`. | ||
| std::unordered_map<String, StringRef> lower_case_name_map; |
| { | ||
| String field_name_str = field_name.toString(); | ||
| std::transform(field_name_str.begin(), field_name_str.end(), field_name_str.begin(), | ||
| [](unsigned char c) { return std::tolower(c); }); |
|
This will be reverted, because the review is not addressed: #61750 (comment) |
Revert #61750 "Improve JSONEachRow reading by ignoring the keys case"
… due to ClickHouse/ClickHouse#67484 revert ClickHouse/ClickHouse#61750 This reverts commit 2be6c86.
… due to ClickHouse/ClickHouse#67484 revert ClickHouse/ClickHouse#61750 This reverts commit 2be6c86.
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240808) * Fix build due to ClickHouse/ClickHouse#66279 * Revert "[GLUTEN-6262][CH]Json input format ignore key case #6263" due to ClickHouse/ClickHouse#67484 revert ClickHouse/ClickHouse#61750 This reverts commit 2be6c86. * fix gtest due to #6722 --------- Co-authored-by: kyligence-git <[email protected]> Co-authored-by: Chang Chen <[email protected]>
|
would you open another pr to slove this problem? @alexey-milovidov |
|
We should do it, but I didn't plan it for the next days. It will be very helpful if you resumbit. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Allow matching column names in a case insensitive manner when reading json files (
input_format_json_case_insensitive_column_matching).So we introduce a config to ignore the case problem in such situation.
Documentation entry for user-facing changes
Modify your CI run:
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Set desired options before CI starts or re-push after updates
Run only:
CI options:
Only specified batches in multi-batch jobs: