Skip to content

Unpoison fiber stack before each resume to fix ASan false positives#100281

Merged
alexey-milovidov merged 1 commit intomasterfrom
workaround-asan-fiber-false-positive
Mar 21, 2026
Merged

Unpoison fiber stack before each resume to fix ASan false positives#100281
alexey-milovidov merged 1 commit intomasterfrom
workaround-asan-fiber-false-positive

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

ASan's use-after-scope detection poisons real stack memory when local variables go out of scope. On fiber stacks managed by makecontext/swapcontext, this poisoning persists across context switches (yield/resume), causing false positives when the same stack addresses are reused by new frames in subsequent fiber resumes — notably during exception unwinding in _Unwind_Resume.

The fix: call ASAN_UNPOISON_MEMORY_REGION on the entire fiber stack before each Fiber::resume, clearing any stale scope poisoning. This is complementary to the __sanitizer_start_switch_fiber / __sanitizer_finish_switch_fiber annotations in boost::context, which only manage the fake stack (for use-after-return), not real stack poisoning (for use-after-scope).

See google/sanitizers#189

Example failure: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100041&sha=e3d5fc0647f84f316414376a48aa9ffbf8565b07&name_0=PR&name_1=AST%20fuzzer%20%28arm_asan_ubsan%29

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

ASan's use-after-scope detection poisons real stack memory when local
variables go out of scope. On fiber stacks managed by
`makecontext`/`swapcontext`, this poisoning persists across context
switches (yield/resume), causing false positives when the same stack
addresses are reused by new frames in subsequent fiber resumes -
notably during exception unwinding in `_Unwind_Resume`.

The fix: call `ASAN_UNPOISON_MEMORY_REGION` on the entire fiber stack
before each `Fiber::resume`, clearing any stale scope poisoning. This
is complementary to the `__sanitizer_start_switch_fiber` /
`__sanitizer_finish_switch_fiber` annotations in boost::context, which
only manage the fake stack (for use-after-return), not real stack
poisoning (for use-after-scope).

See google/sanitizers#189

Example: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100041&sha=e3d5fc0647f84f316414376a48aa9ffbf8565b07&name_0=PR&name_1=AST%20fuzzer%20%28arm_asan_ubsan%29

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 21, 2026

Workflow [PR], commit [1f870bb]

Summary:


AI Review

Summary

This PR addresses ASan false positives on fiber stacks by unpoisoning the allocated fiber stack before each Fiber::resume. The change is focused, ASan-scoped, and consistent with the documented sanitizer limitation around use-after-scope on swapcontext-style stacks. I did not find high-confidence correctness, safety, or performance issues requiring changes.

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time

Final Verdict

  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-ci label Mar 21, 2026
Copy link
Copy Markdown
Member Author

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 21, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 83.90% 84.00% +0.10%
Functions 24.70% 24.60% -0.10%
Branches 76.50% 76.50% +0.00%

PR changed lines: PR changed-lines coverage: 100.00% (15/15, 0 noise lines excluded)
Diff coverage report
Uncovered code

@alexey-milovidov alexey-milovidov self-assigned this Mar 21, 2026
@alexey-milovidov alexey-milovidov merged commit 569e7e9 into master Mar 21, 2026
151 of 152 checks passed
@alexey-milovidov alexey-milovidov deleted the workaround-asan-fiber-false-positive branch March 21, 2026 17:25
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 21, 2026
Comment on lines +50 to +51
void * stack_base = nullptr;
size_t stack_allocation_size = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting why it does not trigger that the variable is not used warning

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants