Fix test test_backup_restore_on_cluster/test_disallow_concurrency#67336

Merged
alexey-milovidov merged 3 commits into ClickHouse:master from vitlibar:fix-test_disallow_concurrency
Jul 30, 2024

Conversation

@vitlibar
Member

@vitlibar vitlibar commented Jul 29, 2024

Changelog category (leave one):

  • Not for changelog

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix integration test test_backup_restore_on_cluster/test_disallow_concurrency.

Example of failure: https://s3.amazonaws.com/clickhouse-test-reports/0/670413a69d3ae9971d804914f52f399c702f6df7/integration_tests__asan__[3_4].html

This PR fixes #42719 (comment)

@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-build Pull request with build/testing/packaging improvement label Jul 29, 2024
@robot-clickhouse
Member

robot-clickhouse commented Jul 29, 2024

This is an automated comment for commit 634c513 with description of existing statuses. It's updated for the latest CI running

✅ Click here to open a full report in a separate page

Successful checks
| Check name | Description | Status |
| --- | --- | --- |
| Builds | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| ClickBench | Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table | ✅ success |
| Compatibility check | Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success |
| Docker keeper image | The check to build and optionally push the mentioned image to docker hub | ✅ success |
| Docker server image | The check to build and optionally push the mentioned image to docker hub | ✅ success |
| Docs check | Builds and tests the documentation | ✅ success |
| Fast test | Normally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here | ✅ success |
| Flaky tests | Checks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc | ✅ success |
| Install packages | Checks that the built packages are installable in a clear environment | ✅ success |
| Integration tests | The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests | ✅ success |
| Performance Comparison | Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests | ✅ success |
| Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc | ✅ success |
| Style check | Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report | ✅ success |
| Unit tests | Runs the unit tests for different release types | ✅ success |
| Upgrade check | Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts | ✅ success |

Member Author

@vitlibar vitlibar Jul 29, 2024

The main part of the fix is here. The test was flaky because it expected the error message

Exception(ErrorCodes::BACKUP_ALREADY_EXISTS, "Backup {} already exists", ...);

whereas the code sometimes throws a different message with the same error code:

Exception(ErrorCodes::BACKUP_ALREADY_EXISTS, "Backup {} is being written already", ...);

So it is more reliable to check for the error code BACKUP_ALREADY_EXISTS than for the message text.
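
A minimal sketch of this idea, not the actual test code: the hypothetical helper `is_backup_already_exists` matches on the error code name rather than on the message text, so both messages quoted above are accepted. The sample strings are illustrative and assume that the code name appears in the error text.

```python
from typing import Optional


def is_backup_already_exists(error: Optional[str]) -> bool:
    """Return True if the error text carries the BACKUP_ALREADY_EXISTS code."""
    # Match on the code name only, never on the human-readable message.
    return error is not None and "BACKUP_ALREADY_EXISTS" in error


# Illustrative error strings (assumed format, not captured server output).
sample_errors = [
    "DB::Exception: Backup 'b1' already exists. (BACKUP_ALREADY_EXISTS)",
    "DB::Exception: Backup 'b1' is being written already. (BACKUP_ALREADY_EXISTS)",
]

for message in sample_errors:
    assert is_backup_already_exists(message)
```

Checking the code keeps the test stable if the wording of either message changes later.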

Member Author

The test used to start 10 nodes, although only 2 were actually necessary.

Member Author

@vitlibar vitlibar Jul 29, 2024

File ./_gen/cluster_for_concurrency_test.xml is used by a different test named test_backup_restore_on_cluster/test_concurrency. It's safer to use a different file name.
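
As a rough illustration of these two review points (fewer nodes, a test-specific generated file), the sketch below writes a two-replica cluster definition to its own file. The function `write_cluster_config`, the host names `node0` and `node1`, the port, and the file name `cluster_for_disallow_concurrency_test.xml` are all assumptions made for the example, not taken from the PR.

```python
import os
import tempfile


def write_cluster_config(path: str, num_nodes: int) -> None:
    """Write a single-shard cluster definition with num_nodes replicas."""
    lines = ["<clickhouse>", "  <remote_servers>", "    <cluster>", "      <shard>"]
    for i in range(num_nodes):
        lines += [
            "        <replica>",
            f"          <host>node{i}</host>",  # hypothetical host names
            "          <port>9000</port>",
            "        </replica>",
        ]
    lines += ["      </shard>", "    </cluster>", "  </remote_servers>", "</clickhouse>"]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


# Hypothetical per-test file name, so this test never overwrites the file
# generated by test_concurrency. A temp dir stands in for ./_gen here.
gen_dir = tempfile.mkdtemp()
config_path = os.path.join(gen_dir, "cluster_for_disallow_concurrency_test.xml")
write_cluster_config(config_path, num_nodes=2)
```

Giving each test its own generated file means that neither test can overwrite the cluster definition the other one relies on.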

Member Author

The following two assertions were wrong: the first one was wrong if the error is Null, and the second one was wrong if the error is not Null. I fixed both assertions in this test and in other tests, and also added comments and helper functions to make these tests easier to read.
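
One way such a helper could look is sketched below. It is not the code from this PR: the name `check_backup_result` and the status values `BACKUP_CREATED` and `BACKUP_FAILED` are assumptions for illustration. The point is that the no-error case and the error case are handled in separate branches, so neither assertion is evaluated against the wrong kind of value.

```python
from typing import Optional


def check_backup_result(status: str, error: Optional[str]) -> None:
    """Accept success with no error, or failure with the expected error code."""
    if error is None:
        # Guard first: a substring check on None would raise TypeError.
        assert status == "BACKUP_CREATED", f"unexpected status {status!r}"
    else:
        assert status == "BACKUP_FAILED", f"unexpected status {status!r}"
        assert "BACKUP_ALREADY_EXISTS" in error, f"unexpected error {error!r}"


# Both outcomes of a concurrent backup attempt pass the check.
check_backup_result("BACKUP_CREATED", None)
check_backup_result(
    "BACKUP_FAILED",
    "DB::Exception: Backup 'b1' is being written already. (BACKUP_ALREADY_EXISTS)",
)
```

A single helper like this also keeps the accepted outcomes in one place when several tests share the same expectation.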

@vitlibar vitlibar added the 🍃 green ci 🌿 Fixing flaky tests in CI label Jul 29, 2024
@vitlibar
Member Author

vitlibar commented Jul 29, 2024

The integration tests flaky check (asan) shows that test_backup_restore_on_cluster/test.py::test_system_functions fails when it is run multiple times. I'll fix the cleanup in that test too.

@vitlibar vitlibar force-pushed the fix-test_disallow_concurrency branch from 67c2b05 to 634c513 Compare July 29, 2024 11:19
@thevar1able thevar1able changed the title Fix test test_backup_restore_on_cluster/test_disallow_concurrency Fix test test_backup_restore_on_cluster/test_disallow_concurrency Jul 29, 2024
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Jul 30, 2024
Merged via the queue into ClickHouse:master with commit 5bb20f4 Jul 30, 2024
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jul 30, 2024
@vitlibar vitlibar deleted the fix-test_disallow_concurrency branch July 30, 2024 12:58

Labels

  • 🍃 green ci 🌿: Fixing flaky tests in CI
  • pr-build: Pull request with build/testing/packaging improvement
  • pr-synced-to-cloud: The PR is synced to the cloud repo

Development

Successfully merging this pull request may close these issues.

Flaky test test_backup_restore_on_cluster/test_concurrency.py

5 participants