Check MADV_DONTNEED (for jemalloc), maybe broken under qemu#15590
Merged
alexey-milovidov merged 6 commits intoClickHouse:masterfrom Oct 5, 2020
Merged
Check MADV_DONTNEED (for jemalloc), maybe broken under qemu#15590alexey-milovidov merged 6 commits intoClickHouse:masterfrom
alexey-milovidov merged 6 commits intoClickHouse:masterfrom
Conversation
f3551dd to
3301dde
Compare
46c2ac2 to
a139a59
Compare
jemalloc relies on working MADV_DONTNEED (that fact that after
madvise(MADV_DONTNEED) returns success, after subsequent access to those
pages they will be zeroed).
However qemu does not support this, yet [1], and you can get very tricky
assert if you will run clickhouse-server under qemu:
<jemalloc>: ../contrib/jemalloc/src/extent.c:1195: Failed assertion: "p[i] == 0"
[1]: https://patchwork.kernel.org/patch/10576637/
But after this patch you will get pretty error:
$ qemu-x86_64-static programs/clickhouse
MADV_DONTNEED does not zeroed page. jemalloc will be broken
261d3e2 to
0a9268c
Compare
0a9268c to
9e3ff34
Compare
programs/main.cpp
Outdated
| } | ||
| } | ||
| /// Macros to avoid using strlen(), since it may fail if SSE is not supported. | ||
| #define writeError(data) \ |
Member
There was a problem hiding this comment.
There is an idiom to write do {} while (0) to avoid issues when someone will write something like if (...) writeError(data);. I'll correct it just in case...
alexey-milovidov
approved these changes
Oct 5, 2020
Contributor
|
Do I understand correctly that this PR was intentionally merged in a broken state? |
Member
Author
azat
added a commit
to azat/ClickHouse
that referenced
this pull request
Dec 19, 2020
qemu does not support MADV_DONTNEED, and by not support it simply ignore it (i.e. return 0 -- no error). This issue has been "fixed" in ClickHouse#15590, however it just terminates the process, and completely breaks clickhouse under qemu (see also ClickHouse#15174). But there is no need in such strong protection, we can stop using madvise in jemalloc if MADV_DONTNEED does not work properly. And this is what ClickHouse#18169 was tried to do (by override madvise), however this will break sanitizers, at least TSAN and UBSAN. The problem there is that sanitizers initialization code uses madvise (and there is no way to turn this off with TSAN_OPTIONS) and overwritten madvise function will have sanitizers traits (__tsan_func_entry), while TSAN is not ready for this, and eventually it SIGSEGV. Interesting thing is that in the recent clang-12, madvise was replaced with direct syscall [1]. [1]: llvm/llvm-project@9f8c403 But it is better to make clickhouse compatible with clang < 12 too, so instead of override madvise completely, the runtime check was moved into the jemalloc code [2]. [2]: ClickHouse/jemalloc#1
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Check MADV_DONTNEED (for jemalloc), maybe broken under qemu
Detailed description / Documentation draft:
jemalloc relies on working MADV_DONTNEED (that fact that after
madvise(MADV_DONTNEED) returns success, after subsequent access to those
pages they will be zeroed).
However qemu does not support this, yet 1, and you can get very tricky
assert if you will run clickhouse-server under qemu:
But after this patch you will get pretty error:
P.S. I also checked clickhouse with patched qemu, that returns ENOSYS in the linux-user layer, and jemalloc assert no more triggered.
Refs: https://gist.github.com/azat/12ba2c825b710653ece34dba7f926ece
Details
HEAD:
bugprone-macro-parentheses)