Optimize inverse dictionary lookup#88971
Conversation
|
Workflow [PR], commit [4369634] Summary: ❌
|
It is OK, that it's under setting, because there is certain problem that a lot of default dictionary layouts quite bad on access not by key column. When people manually do this kind of optimization, sometimes they do it not against dictionary, but "SOURCE" table of it. Another, more complex alternative, is to build cache of color_id -> result of expression: dictGet('colors','name', xxx) = 'red' |
Actually, I think the complex alternative that you mentioned is happening under the hood with this optimization. For example, for this query: EXPLAIN PLAN SELECT color_id, payload FROM t WHERE dictGetString('colors', 'name', color_id) = 'red' ORDER BY color_id, payload;The output is: CreatingSets (Create sets before main query execution)
Expression (Project names)
Sorting (Sorting for ORDER BY)
Expression ((Before ORDER BY + Projection))
Expression ((WHERE + Change column names to column identifiers))
ReadFromMergeTree (default.t)My understanding is that it means |
Co-authored-by: Dmitry Novik <[email protected]>
Co-authored-by: Dmitry Novik <[email protected]>
c607c00
|
@nihalzp hi, please look at this performance run, perf degradation seems to be related |
|
@fm4v Most likely not caused by this. The optimization in this PR is only applied for It is confirmed by |

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Optimize repeated inverse dictionary lookups by doing faster lookups into a precomputed set of possible key values. Closes #7968.
Documentation entry for user-facing changes
Directly edit documentation source files in the "docs" folder with the same pull-request as code changes
or
Add a user-readable short description of the changes that should be added to docs.clickhouse.com below.
At a minimum, the following information should be added (but add more as needed).
Motivation: Why is this function, table engine, etc. useful to ClickHouse users?
Parameters: If the feature being added takes arguments, options or is influenced by settings, please list them below with a brief explanation.
Example use: A query or command.
Details
Optimize
WHERE dictGetFamily(..., ATTR_COL, COL) = CONSTEXPRintoWHERE COL IN (SELECT ... FROM dictionary WHERE ATTR_COL = CONSTEXPR)Supports:
dictGetfamilies likedictGetString,dictGetInt32, etc.Example:
Result:
Performance:
Before:
Elapsed: 0.198 sec.
Now:
Elapsed: 0.056 sec.
Resolves #7968