Another fix for cluster copier #46433

Merged
robot-clickhouse-ci-1 merged 1 commit into master from fix-cluster-copier on Feb 16, 2023

Conversation

@antonio2368
Member

@antonio2368 antonio2368 commented Feb 15, 2023

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

It seems #46120 didn't fix the problem.
After checking the logs, the test simply exhausts its retries.
The first problem is that the copy fault uses up 2 of the 3 attempts we make per partition piece number.
We also retry the entire partition, so we should have 3 * 3 attempts. But because of a mistake, a successful piece that came after a failed one in the same partition would mark the entire partition as successful and break the partition loop (because it thinks the partition is finished).
After the partition loop we check the partitions, find an incorrect piece, and retry the entire table. After 3 table retries we fail.
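
To illustrate the flag mistake described above, here is a minimal Python sketch. It is not the actual cluster copier code (which is C++); `process_partition`, `make_flaky_copier`, `MAX_TRIES` and the `buggy` switch are hypothetical names made up for this example, and the table-level retry is left out. It only shows how letting the last piece's status overwrite an earlier failure ends the partition loop too early.

```python
from typing import Callable, Sequence, Tuple

MAX_TRIES = 3  # assumption: 3 tries per level, as mentioned in the description


def process_partition(
    pieces: Sequence[int],
    copy_piece: Callable[[int], bool],
    buggy: bool,
) -> Tuple[bool, int]:
    """Retry a partition up to MAX_TRIES times.

    Returns (all pieces copied, number of partition-level tries used).
    """
    copied = set()
    tries = 0
    for _ in range(MAX_TRIES):
        tries += 1
        finished = True
        for piece in pieces:
            if piece in copied:
                continue  # already copied in an earlier try
            ok = copy_piece(piece)
            if ok:
                copied.add(piece)
            if buggy:
                # Mistake: a later success hides an earlier failure.
                finished = ok
            else:
                # Fixed: the partition is finished only if every piece succeeded.
                finished = finished and ok
        if finished:
            break  # the loop believes the partition is done
    return len(copied) == len(pieces), tries


def make_flaky_copier() -> Callable[[int], bool]:
    """Hypothetical copier: piece 0 fails on its first attempt only."""
    calls = {}

    def copy_piece(piece: int) -> bool:
        calls[piece] = calls.get(piece, 0) + 1
        return not (piece == 0 and calls[piece] == 1)

    return copy_piece


print(process_partition([0, 1], make_flaky_copier(), buggy=True))
print(process_partition([0, 1], make_flaky_copier(), buggy=False))
```

In the buggy variant the loop stops after one try with piece 0 still missing, so the remaining partition-level tries are never used and the failure is only caught by the check after the loop. In the fixed variant the second partition try picks up piece 0.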

Close #30399 (final time I hope)

cc @nikitamikhaylov

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@robot-ch-test-poll robot-ch-test-poll added the pr-not-for-changelog label (This PR should not be mentioned in the changelog) Feb 15, 2023
@alexey-milovidov alexey-milovidov self-assigned this Feb 15, 2023
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 merged commit 2fbed70 into master Feb 16, 2023
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 deleted the fix-cluster-copier branch February 16, 2023 08:40

Labels

pr-not-for-changelog: This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: test_cluster_copier

5 participants