Add support for DWARF-5 (resubmit, in attempt to get core dump) #40772
Closed
azat wants to merge 12 commits into ClickHouse:master
Signed-off-by: Azat Khuzhin <[email protected]> (cherry picked from commit 444acb9)
I have to do this, since I have libraries compiled with DWARF5 (i.e. glibc).
ClickHouse changes:
- use camel_case
- add NOLINT
- avoid using folly:: (use std:: instead)
- avoid using boost:: (use std:: instead)
Refs: facebook/folly@490b287
Signed-off-by: Azat Khuzhin <[email protected]>
(cherry picked from commit ee5696b)
(cherry picked from commit e03870b)
Signed-off-by: Azat Khuzhin <[email protected]>
Signed-off-by: Azat Khuzhin <[email protected]>
gcore is a gdb command that internally uses gdb to dump the core. However, with proper configuration of limits (core_dump.size_limit) it should not be required, although some issues are possible:
- non-standard kernel.core_pattern
- sanitizers
So yes, gcore is more "universal", but it is ad hoc; let's try to switch to a more native way.
Signed-off-by: Azat Khuzhin <[email protected]>
Actually even gdb does not work reliably with DWARF-5:
And that's why there are no core files:
Signed-off-by: Azat Khuzhin <[email protected]>
Merge this patch to preserve coredump w/o gdb, since that version of gdb does not work with DWARF 5
* ci/core-dumps-rework:
  Rework core collecting on CI (eliminate gcore usage)
azat added a commit to azat/ClickHouse that referenced this pull request on Sep 4, 2022
gcore is a gdb command that internally uses gdb to dump the core. However, with proper configuration of limits (core_dump.size_limit) it should not be required, although some issues are possible:
- non-standard kernel.core_pattern
- sanitizers
So yes, gcore is more "universal" (you don't need to configure any `kernel.core_pattern`), but it is ad hoc, and it has a drawback: **it does not work when gdb fails**. For example, gdb may fail with `Dwarf Error: DW_FORM_strx1 found in non-DWO CU` in the case of DWARF-5 [1].
[1]: ClickHouse#40772 (comment)
Let's try to switch to a more native way.
Signed-off-by: Azat Khuzhin <[email protected]>
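The "more native way" above relies on the kernel writing the core itself, which depends on the core size limit and on `kernel.core_pattern`. Below is a minimal Python sketch of how one might inspect both before relying on kernel-written cores. The helper names `describe_core_pattern`, `core_limit_allows_file_dump` and `current_settings` are hypothetical and not part of the ClickHouse CI scripts, and it is an assumption here that `core_dump.size_limit` ends up as the process's `RLIMIT_CORE`.

```python
import resource


def describe_core_pattern(pattern: str) -> str:
    """Classify a kernel.core_pattern value (illustrative helper, see core(5))."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The kernel pipes the core to a user-space program instead of a file.
        return "pipe"
    if pattern.startswith("/"):
        # The core is written to an absolute path template.
        return "absolute"
    # Otherwise the name is resolved against the crashing process's cwd.
    return "relative"


def core_limit_allows_file_dump(soft_limit: int) -> bool:
    """True if the RLIMIT_CORE soft limit permits writing a core file.

    A soft limit of 0 disables core files; RLIM_INFINITY means no cap.
    """
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


def current_settings(path: str = "/proc/sys/kernel/core_pattern") -> dict:
    """Read the live settings of the current process (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    with open(path) as f:
        pattern = f.read()
    return {
        "pattern_kind": describe_core_pattern(pattern),
        "file_dump_allowed": core_limit_allows_file_dump(soft),
    }
```

A "pipe" or "relative" result is the kind of non-standard setup the commit message warns about: the core may end up somewhere the CI script does not look.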
Core dumps did not help either, but:
This was referenced Sep 6, 2022
Closed
azat added a commit to azat/ClickHouse that referenced this pull request on Sep 11, 2022
ClickHouse changes to the folly parser:
- use camel_case
- add NOLINT
- avoid using folly:: (use std:: instead)
- avoid using boost:: (use std:: instead)
But note that it is not enabled by default for now (as it was
initially), because you may need a recent debugger to support DWARF-5
correctly; to keep debugging easy, let's enable it later.
A good example is gdb 10: even though it looks like it should support
DWARF-5, it still produces some errors, like here [1]:
Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /usr/bin/clickhouse]
[1]: ClickHouse#40772 (comment)
And not only does it complain, apparently it can also "activate" SDT probes
(replace "nop" with "int3"), and I believe this is what happens here
[2].
[2]: ClickHouse#41063 (comment)
There you can see int3 in the run where ClickHouse got SIGTRAP:
<details>
```
0x7f494705e093 <+1139>: jne 0x7f494705e450 ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
0x7f494705e099 <+1145>: testl %r13d, %r13d
0x7f494705e09c <+1148>: je 0x7f494705e09f ; <+1151> at dl-open.c:744:6
0x7f494705e09e <+1150>: int3
-> 0x7f494705e09f <+1151>: movl -0x54(%rbp), %eax
0x7f494705e0a2 <+1154>: testl %eax, %eax
0x7f494705e0a4 <+1156>: jne 0x7f494705e410 ; <+2032> at dl-open.c:745:5
But if I repeat the query it does not:
0x7ffff7fe5093 <+1139>: jne 0x7ffff7fe5450 ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
0x7ffff7fe5099 <+1145>: testl %r13d, %r13d
0x7ffff7fe509c <+1148>: je 0x7ffff7fe509f ; <+1151> at dl-open.c:744:6
0x7ffff7fe509e <+1150>: nop
-> 0x7ffff7fe509f <+1151>: movl -0x54(%rbp), %eax
0x7ffff7fe50a2 <+1154>: testl %eax, %eax
0x7ffff7fe50a4 <+1156>: jne 0x7ffff7fe5410 ; <+2032> at dl-open.c:745:5
```
</details>
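The two listings differ in a single instruction at offset +1150: `int3` versus `nop`. On x86-64 these are the one-byte opcodes 0xCC and 0x90, so a probe site that has been "activated" shows up as a one-byte difference between the on-disk code and the live process. A small Python sketch of that comparison follows; `find_patched_nops` is a hypothetical helper, and the byte strings in the usage note are made up for illustration, not taken from the actual glibc binary.

```python
from typing import List

NOP = 0x90   # x86 one-byte no-op
INT3 = 0xCC  # x86 one-byte breakpoint, raises SIGTRAP when executed


def find_patched_nops(original: bytes, live: bytes) -> List[int]:
    """Return offsets where a nop in `original` is an int3 in `live`."""
    if len(original) != len(live):
        raise ValueError("code regions must have the same length")
    return [
        offset
        for offset, (before, after) in enumerate(zip(original, live))
        if before == NOP and after == INT3
    ]
```

For example, `find_patched_nops(bytes([0x85, 0xC0, 0x90]), bytes([0x85, 0xC0, 0xCC]))` reports offset 2. If nothing ever restores the original byte, executing it would deliver SIGTRAP, which would be consistent with the failure described here.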
Test command was:
clickhouse local --stacktrace -q "select * from file('data.capnp', 'CapnProto', 'val1 char') settings format_schema='nonexist:Message'"
*P.S. I did this because I have libraries compiled with DWARF5 (e.g. glibc), and the DWARF parser simply fails in my dev env.*
Refs: facebook/folly@490b287
(cherry picked from commit ee5696b)
(cherry picked from commit e03870b)
Signed-off-by: Azat Khuzhin <[email protected]>
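As background for why a DWARF-4-only parser fails on DWARF-5 input: the compilation unit header in `.debug_info` changed layout between the two versions (DWARF-5 adds a `unit_type` byte and moves `address_size` ahead of `debug_abbrev_offset`). The Python sketch below illustrates the difference under the assumption of the 32-bit DWARF format and little-endian data. `parse_cu_header` and `CUHeader` are hypothetical names and are not the folly or ClickHouse implementation.

```python
import struct
from typing import NamedTuple, Optional


class CUHeader(NamedTuple):
    version: int
    unit_type: Optional[int]  # only present in DWARF-5
    address_size: int
    abbrev_offset: int


def parse_cu_header(data: bytes) -> CUHeader:
    """Parse the start of a .debug_info unit (32-bit DWARF, little-endian)."""
    unit_length, version = struct.unpack_from("<IH", data, 0)
    if unit_length == 0xFFFFFFFF:
        raise NotImplementedError("64-bit DWARF is out of scope for this sketch")
    if version == 5:
        # DWARF-5: unit_type, address_size, debug_abbrev_offset
        unit_type, address_size, abbrev_offset = struct.unpack_from("<BBI", data, 6)
        return CUHeader(version, unit_type, address_size, abbrev_offset)
    if 2 <= version <= 4:
        # DWARF-2..4: debug_abbrev_offset, address_size
        abbrev_offset, address_size = struct.unpack_from("<IB", data, 6)
        return CUHeader(version, None, address_size, abbrev_offset)
    raise ValueError(f"unsupported DWARF version: {version}")
```

A parser that always assumes the older layout would read the `unit_type` and `address_size` bytes as part of the abbreviation offset, so everything after the header is misinterpreted.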
The last attempt did not help; anyway, I've submitted a patch that only adds DWARF-5 support to the parser, w/o enabling it: #41193
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add support for DWARF-5
This is a resubmit of #40710, since I cannot reproduce the SIGTRAP issue locally.