Merged
Member
FYI @al13n321, this can remove an additional memcpy while reading Parquet files.
Contributor
Workflow [PR], commit [4fa0665] Summary: ❌
8fd1cff to
5c6cbc5
Compare
Collaborator
It would be perfect to land this before #82850, so that the new serialization layout can be better aligned and optimized.
Member
Author
|
I've sped up |
This reverts commit 856ed0f.
Avogar added a commit to Avogar/ClickHouse that referenced this pull request (Aug 26, 2025)
github-merge-queue bot pushed a commit that referenced this pull request (Aug 26, 2025)
Fix use-of-unitialized-value and crash introduced in #85063
robot-clickhouse added a commit that referenced this pull request (Aug 26, 2025)
This was referenced Aug 27, 2025
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Remove zero byte. Closes #85062. A few minor bugs were fixed. The functions `structureToProtobufSchema` and `structureToCapnProtoSchema` didn't correctly write a zero-terminating byte and used a newline instead. This led to a missing newline in the output and could lead to buffer overflows in other functions that depend on the zero byte (such as `logTrace`, `demangle`, `extractURLParameter`, `toStringCutToZero`, and `encrypt`/`decrypt`). The `regexp_tree` dictionary layout didn't support processing strings with zero bytes. The `formatRowNoNewline` function, called with the `Values` format or with any other format without a newline at the end of rows, erroneously cut the last character of the output. The `stem` function contained an exception-safety error that could lead to a memory leak in a very rare scenario. The `initcap` function worked incorrectly for `FixedString` arguments: it didn't recognize the start of a word at the start of the string if the previous string in the block ended with a word character. Fixed a security vulnerability in the Apache `ORC` format, which could lead to the exposure of uninitialized memory. Changed the behavior of the function `replaceRegexpAll` and its alias `REGEXP_REPLACE`: it can now produce an empty match at the end of the string even if the previous match consumed the whole string, as with `^a*|a*$` or `^|.*`. This corresponds to the semantics of JavaScript, Perl, Python, PHP, and Ruby, but differs from the semantics of PostgreSQL. The implementation of many functions has been simplified and optimized. Documentation for several functions was wrong and has now been fixed. Keep in mind that the output of `byteSize` for String columns, and for complex types that consist of String columns, has changed (from 9 bytes per empty string to 8 bytes per empty string); this is expected.
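To illustrate the new `replaceRegexpAll` semantics, here is a minimal sketch using Python's `re.sub` (Python 3.7 or later), which the entry above names as one of the languages with matching behavior. It is not ClickHouse code; the helper name `replace_all` and the replacement string `X` are chosen only for this example.

```python
import re


def replace_all(pattern: str, replacement: str, text: str) -> str:
    """Replace every match of `pattern` in `text` (hypothetical helper)."""
    return re.sub(pattern, replacement, text)


# The first alternative, ^a*, consumes the whole string "aaa".
# The second alternative, a*$, then still matches the empty string
# at the very end, so two replacements are made in total.
result = replace_all(r"^a*|a*$", "X", "aaa")
print(result)  # XX
```

Under the previous behavior described in the entry, the trailing empty match would presumably have been skipped, so only one replacement would be made for this input.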