MADV_DONTNEED check in runtime for qemu (via patching jemalloc)#18238
Conversation
Does the build hung? Or it will be started eventually? |
qemu does not support MADV_DONTNEED, and by not support it simply ignore it (i.e. return 0 -- no error). This issue has been "fixed" in ClickHouse#15590, however it just terminates the process, and completely breaks clickhouse under qemu (see also ClickHouse#15174). But there is no need in such strong protection, we can stop using madvise in jemalloc if MADV_DONTNEED does not work properly. And this is what ClickHouse#18169 was tried to do (by override madvise), however this will break sanitizers, at least TSAN and UBSAN. The problem there is that sanitizers initialization code uses madvise (and there is no way to turn this off with TSAN_OPTIONS) and overwritten madvise function will have sanitizers traits (__tsan_func_entry), while TSAN is not ready for this, and eventually it SIGSEGV. Interesting thing is that in the recent clang-12, madvise was replaced with direct syscall [1]. [1]: llvm/llvm-project@9f8c403 But it is better to make clickhouse compatible with clang < 12 too, so instead of override madvise completely, the runtime check was moved into the jemalloc code [2]. [2]: ClickHouse/jemalloc#1
d34f0bf to
e379b80
Compare
|
Performance: 7340839 |
It's marked as failed due to "passed: 0", actually it's Ok. |
Don't care. |
|
@azat I think it is ready to merge. |
@alexey-milovidov this should be a correct fix, however I want to wait for the feedback from maintainers, so if i.e. there will some wish to implement it in a different way or some cosmetic changes would rebase the backported patch (to avoid further conflicts and similar issues) So I'm suggesting wait for at least couple of work days (ideally week), and merge |
|
A link to upstream PR: jemalloc/jemalloc#2005 |
|
@azat Let's add a reference to the description of upstream PR, like this:
|
Sure, done (although this does not show the whole problem - any program with jemalloc under qemu is broken now) |
We can always change our mind later. |
qemu does not support MADV_DONTNEED, and by not support it simply ignore
it (i.e. return 0 -- no error).
This issue has been "fixed" in #15590, however it just
terminates the process, and completely breaks clickhouse under qemu
(see also #15174).
But there is no need in such strong protection, we can stop using
madvise in jemalloc if MADV_DONTNEED does not work properly.
And this is what #18169 was tried to do (by override madvise), however
this will break sanitizers, at least TSAN and UBSAN.
The problem there is that sanitizers initialization code uses madvise
(and there is no way to turn this off with TSAN_OPTIONS) and overwritten
madvise function will have sanitizers traits (__tsan_func_entry), while
TSAN is not ready for this, and eventually it SIGSEGV.
Interesting thing is that in the recent clang-12, madvise was replaced
with direct syscall 1.
But it is better to make clickhouse compatible with clang < 12 too, so
instead of override madvise completely, the runtime check was moved into
the jemalloc code 2.
Changelog category (leave one):
Cc: @alexey-milovidov
P.S. I will keep this as draft for a couple of days, to wait for feedback from jemalloc maintainers