Skip to content

Fix data race on shutdown_called in DatabaseMaterializedPostgreSQL#97554

Merged
alexey-milovidov merged 3 commits intomasterfrom
fix-materialized-postgresql-shutdown-race
Feb 23, 2026
Merged

Fix data race on shutdown_called in DatabaseMaterializedPostgreSQL#97554
alexey-milovidov merged 3 commits intomasterfrom
fix-materialized-postgresql-shutdown-race

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

Summary

  • Fix data race on shutdown_called in DatabaseMaterializedPostgreSQL — was a plain bool written by TCPHandler thread during DROP DATABASE (in shutdown) and read concurrently by BackgroundSchedulePool thread (in tryStartSynchronization)
  • TSan detected this race and aborted the server, causing all subsequent test_postgresql_replica_database_engine/test_3.py tests to fail with "Connection refused"
  • Changed to std::atomic<bool>

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97541&sha=9c54f0ea0a9a17ee5cf1bd5077ef581f6419b5c2&name_0=PR&name_1=Integration%20tests%20%28amd_tsan%2C%204%2F6%29

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix data race on shutdown_called in DatabaseMaterializedPostgreSQL.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

🤖 Generated with Claude Code

`shutdown_called` was a plain `bool` written by TCPHandler thread during
`DROP DATABASE` (in `shutdown`) and read concurrently by
BackgroundSchedulePool thread (in `tryStartSynchronization`). TSan
detected this race and aborted the server, causing all subsequent
`test_postgresql_replica_database_engine/test_3.py` tests to fail with
"Connection refused".

Changed to `std::atomic<bool>`.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97541&sha=9c54f0ea0a9a17ee5cf1bd5077ef581f6419b5c2&name_0=PR&name_1=Integration%20tests%20%28amd_tsan%2C%204%2F6%29

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 21, 2026

Workflow [PR], commit [b231049]

Summary:

job_name test_name status info comment
AST fuzzer (amd_debug) failure
Logical error: Bad cast from type A to B (STID: 0250-3d48) FAIL cidb, issue

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Feb 21, 2026
alexey-milovidov added a commit that referenced this pull request Feb 21, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
alexey-milovidov added a commit that referenced this pull request Feb 21, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
alexey-milovidov added a commit that referenced this pull request Feb 21, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@Algunenano Algunenano self-assigned this Feb 21, 2026
Copy link
Copy Markdown
Member

@Algunenano Algunenano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed the TSan report — data race between TCPHandler thread writing shutdown_called in shutdown() and BgSchPool thread reading it in tryStartSynchronization(). The std::atomic<bool> fix is correct and minimal.

@alexey-milovidov alexey-milovidov force-pushed the fix-materialized-postgresql-shutdown-race branch from 3f86a61 to f982a7b Compare February 22, 2026 22:45
@alexey-milovidov alexey-milovidov merged commit fb4b1ac into master Feb 23, 2026
146 of 148 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-materialized-postgresql-shutdown-race branch February 23, 2026 15:38
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Feb 23, 2026
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
ClickHouse#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
ClickHouse#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
… link

The "Finish workflow" check requires Bug Fix PRs to include new or
changed tests. This is too strict for PRs that fix CI failures, where
the existing CI already serves as the test. Allow the check to pass
if the PR description contains a link to a CI report
(`s3.amazonaws.com/clickhouse-test-reports`), e.g. as in
ClickHouse#97554

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
…postgresql-shutdown-race

Fix data race on shutdown_called in DatabaseMaterializedPostgreSQL
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants