Insertion deduplication on retries for materialised views #61601
Co-authored-by: Kseniia Sumarokova <[email protected]>
03172_error_log_table_not_empty is flaky; fixing it in #66093.
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240705) * Fix build due to ClickHouse/ClickHouse#61601 --------- Co-authored-by: kyligence-git <[email protected]> Co-authored-by: Chang Chen <[email protected]>
It is interesting
Upgrade Check -- 01275_parallel_mv -- flakes. I did not see it in CI here!
You only run a partial CI, not full.
That is sad. I'm reverting this change.
Implements ideas from #60008
Docs in progress: ClickHouse/clickhouse-docs#2394
I improved deduplication by enhancing the annotation of chunks at the pipeline level.
Each chunk can now carry several attached structures that share the base class `ChunkInfo` and differ by derived type. The annotations travel with the chunks through the `Processors`. See `Chunk::ChunkInfoCollection` and `CollectionOfDerivedItems<ChunkInfo>`.
The deduplication token for each chunk is written as a `TokenInfo` (a class derived from `ChunkInfo`) with `SetInitialTokenTransform`. The token can be updated after that. See `DeduplicationToken::TokenInfo::BuildingStage`.
The initial value of `TokenInfo` is taken either from the `insert_deduplication_token` setting or calculated as a hash of the inserted data.
To distinguish equal blocks that should not be deduplicated, `TokenInfo` is updated with more detailed information about the source of the data, such as the names of the MVs on the way to the table.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
This PR changes how deduplication for MV works.
Fixed a lot of cases like:
The setting `update_insert_deduplication_token_in_dependent_materialized_views` is deprecated. The deduplication token for inserted blocks in MVs is now always calculated from the source data.