Solve direct read issue with low cardinality text columns #87994
Merged
Ergus merged 14 commits into ClickHouse:master on Oct 21, 2025
When the input column is low cardinality, hasToken and similar functions are also reported with a low cardinality result type in the DAG. This is strictly correct, but for direct read the column will contain only 0 and 1, so it is simpler to set the type to UInt8 explicitly in all cases. Maybe one day we will have boolean support in columns (storing a bitmap internally) and can use that specialization.
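To make the type mismatch concrete, here is a minimal, self-contained sketch in C++. It is a toy model only: `ToyType`, `inferredPredicateType`, `directReadResultType` and `directReadMatches` are hypothetical names invented for illustration and are not ClickHouse's actual classes or the code changed in this PR. It assumes only what the description states: the predicate over a low cardinality argument is reported as low cardinality, while the direct-read column holds nothing but 0 and 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a column type (hypothetical, not ClickHouse's IDataType).
struct ToyType
{
    std::string name;      // e.g. "String" or "UInt8"
    bool low_cardinality;  // true if wrapped in LowCardinality(...)

    std::string toString() const
    {
        return low_cardinality ? "LowCardinality(" + name + ")" : name;
    }
};

// Models the behaviour described above: a predicate such as hasToken over a
// low cardinality argument is reported with a low cardinality result type.
ToyType inferredPredicateType(const ToyType & argument)
{
    return ToyType{"UInt8", argument.low_cardinality};
}

// Models the approach described above for direct read: the result type is
// plain UInt8 regardless of the argument type.
ToyType directReadResultType(const ToyType & /*argument*/)
{
    return ToyType{"UInt8", false};
}

// Hypothetical direct read: builds one 0/1 flag per row from a list of
// matching row numbers (assumed to be known already), so the column never
// needs a dictionary encoding.
std::vector<std::uint8_t> directReadMatches(std::size_t rows, const std::vector<std::size_t> & matching_rows)
{
    std::vector<std::uint8_t> flags(rows, 0);
    for (std::size_t row : matching_rows)
        if (row < rows)
            flags[row] = 1;
    return flags;
}
```

In this model, a `LowCardinality(String)` argument makes `inferredPredicateType` report `LowCardinality(UInt8)`, whereas `directReadResultType` always reports `UInt8`, which matches the flat 0/1 column that `directReadMatches` produces.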
Workflow [PR], commit [4ad1a78] Summary: ❌
rschu1ze approved these changes Oct 1, 2025
ahmadov reviewed Oct 2, 2025
rschu1ze reviewed Oct 5, 2025
Update test 02346_text_index_bug87887.sql
ahmadov approved these changes Oct 6, 2025
Merged via the queue into ClickHouse:master with commit db4efe2 on Oct 21, 2025
230 of 361 checks passed
Resolves: #87887
Resolves: #88119
Changelog category (leave one):