Add support for DWARF-5 (without emitting them in binaries) #41193

Merged
alexey-milovidov merged 1 commit into ClickHouse:master from azat:DWARF-5-parser-v3
Sep 16, 2022

azat commented Sep 11, 2022

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add support for DWARF-5 (without emitting them in binaries)

ClickHouse changes to the folly parser:

  • use camel_case
  • add NOLINT
  • avoid using folly:: (use std:: instead)
  • avoid using boost:: (use std:: instead)
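
The last two bullets amount to swapping library types for their standard equivalents. A minimal sketch of what that could look like for an attribute value, assuming the upstream code used `boost::variant` and `folly::StringPiece` at this spot; the `AttrValue` alias and the `describe` helper are hypothetical names for illustration, not taken from this PR:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <type_traits>
#include <variant>

// Hypothetical alias: std::variant and std::string_view stand in for
// boost::variant and folly::StringPiece (an assumption about the upstream types).
using AttrValue = std::variant<uint64_t, std::string_view>;

// Render either alternative as text; std::visit takes the place of a
// boost-style static visitor.
std::string describe(const AttrValue & value)
{
    return std::visit(
        [](const auto & v) -> std::string
        {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, uint64_t>)
                return "int:" + std::to_string(v);
            else
                return "str:" + std::string(v);
        },
        value);
}
```

The point of such a swap is only to drop the extra dependencies; the behaviour of the parser is meant to stay the same.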

Note that DWARF-5 is not enabled by default now (as it was initially), because handling it correctly may require a recent debugger; to keep debugging easy, let's enable it later.

A good example is gdb 10: even though it looks like it should support DWARF-5, it still produces errors such as this one [1]:

Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /usr/bin/clickhouse]
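
For context, `DW_FORM_strx1` is a DWARF 5 attribute form (code 0x25) whose operand is a 1-byte index into the string offsets table in `.debug_str_offsets`; the entry found there is an offset into `.debug_str`. A rough sketch of that lookup for 32-bit DWARF follows. The function name `resolveStrx` and its parameters are hypothetical, and this is not the code added by this PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

// Resolve a strx index for a 32-bit DWARF unit (4-byte table entries).
// str_offsets:      contents of .debug_str_offsets
// str_offsets_base: value of the unit's DW_AT_str_offsets_base
// debug_str:        contents of .debug_str
std::optional<std::string_view> resolveStrx(
    std::string_view str_offsets, uint64_t str_offsets_base,
    std::string_view debug_str, uint64_t index)
{
    const uint64_t pos = str_offsets_base + index * 4;
    if (pos + 4 > str_offsets.size())
        return std::nullopt;

    uint32_t offset = 0;
    std::memcpy(&offset, str_offsets.data() + pos, sizeof(offset)); // host byte order assumed
    if (offset >= debug_str.size())
        return std::nullopt;

    // Strings in .debug_str are NUL-terminated.
    const size_t end = debug_str.find('\0', offset);
    if (end == std::string_view::npos)
        return std::nullopt;
    return debug_str.substr(offset, end - offset);
}
```

A consumer that does not implement this indirection for regular (non-split) units has no way to read such strings, which is consistent with the gdb message above.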

And not only does it complain: apparently it can also "activate" SDT probes (replacing "nop" with "int3"), and I believe this is what happens here [2].

There you get int3 in the case where ClickHouse received SIGTRAP:

Details
    0x7f494705e093 <+1139>: jne    0x7f494705e450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7f494705e099 <+1145>: testl  %r13d, %r13d
    0x7f494705e09c <+1148>: je     0x7f494705e09f            ; <+1151> at dl-open.c:744:6
    0x7f494705e09e <+1150>: int3
->  0x7f494705e09f <+1151>: movl   -0x54(%rbp), %eax
    0x7f494705e0a2 <+1154>: testl  %eax, %eax
    0x7f494705e0a4 <+1156>: jne    0x7f494705e410            ; <+2032> at dl-open.c:745:5

But if I repeat the query, the int3 is not there and the nop is intact:

    0x7ffff7fe5093 <+1139>: jne    0x7ffff7fe5450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7ffff7fe5099 <+1145>: testl  %r13d, %r13d
    0x7ffff7fe509c <+1148>: je     0x7ffff7fe509f            ; <+1151> at dl-open.c:744:6
    0x7ffff7fe509e <+1150>: nop
->  0x7ffff7fe509f <+1151>: movl   -0x54(%rbp), %eax
    0x7ffff7fe50a2 <+1154>: testl  %eax, %eax
    0x7ffff7fe50a4 <+1156>: jne    0x7ffff7fe5410            ; <+2032> at dl-open.c:745:5
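
The only difference between the two listings is the single byte at `<+1150>`: `nop` (0x90) in one run, `int3` (0xCC) in the other. A toy simulation of that patch on a plain byte buffer is sketched below. `activateProbe` is a hypothetical name; a real tracer or debugger would write into the target process's memory instead:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t X86_NOP = 0x90;   // one-byte no-op, the probe's inactive state
constexpr uint8_t X86_INT3 = 0xCC;  // breakpoint instruction, raises SIGTRAP when executed

// Simulate "activating" a probe on a copy of the code bytes:
// overwrite the nop at probe_offset with int3.
// Returns false if the offset is out of range or the byte is not a nop.
bool activateProbe(std::vector<uint8_t> & code, size_t probe_offset)
{
    if (probe_offset >= code.size() || code[probe_offset] != X86_NOP)
        return false;
    code[probe_offset] = X86_INT3;
    return true;
}
```

If nothing is attached to handle the resulting trap, executing the patched byte delivers SIGTRAP to the process, which matches the symptom described above.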

Test command was:

clickhouse local --stacktrace -q "select * from file('data.capnp', 'CapnProto', 'val1 char') settings format_schema='nonexist:Message'"

P.S. I did this because I have libraries compiled with DWARF-5 (e.g. glibc), and the DWARF parser simply fails in my dev environment.
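
One difference a DWARF 4 parser can trip over in DWARF 5 input, among others, is the unit header layout: DWARF 5 adds a `unit_type` byte and moves `address_size` ahead of `debug_abbrev_offset`. A minimal sketch of version-aware header parsing for 32-bit DWARF follows. `UnitHeader`, `readAt` and `parseUnitHeader` are hypothetical names, and this is not the parser code from this PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

struct UnitHeader
{
    uint16_t version = 0;
    uint8_t unit_type = 0;      // only present in DWARF 5; left as 0 otherwise
    uint8_t address_size = 0;
    uint32_t abbrev_offset = 0;
};

// Read a trivially copyable value at `pos` and advance it (host byte order assumed).
template <typename T>
static bool readAt(std::string_view data, size_t & pos, T & out)
{
    if (pos + sizeof(T) > data.size())
        return false;
    std::memcpy(&out, data.data() + pos, sizeof(T));
    pos += sizeof(T);
    return true;
}

std::optional<UnitHeader> parseUnitHeader(std::string_view debug_info)
{
    size_t pos = 0;
    uint32_t unit_length = 0;
    UnitHeader h;
    // 0xffffffff marks 64-bit DWARF, which this sketch does not handle.
    if (!readAt(debug_info, pos, unit_length) || unit_length == 0xffffffffu)
        return std::nullopt;
    if (!readAt(debug_info, pos, h.version))
        return std::nullopt;

    if (h.version >= 5)
    {
        // DWARF 5 order: unit_type, address_size, debug_abbrev_offset
        if (!readAt(debug_info, pos, h.unit_type) || !readAt(debug_info, pos, h.address_size)
            || !readAt(debug_info, pos, h.abbrev_offset))
            return std::nullopt;
    }
    else
    {
        // DWARF 2-4 order: debug_abbrev_offset, address_size
        if (!readAt(debug_info, pos, h.abbrev_offset) || !readAt(debug_info, pos, h.address_size))
            return std::nullopt;
    }
    return h;
}
```

Reading a DWARF 5 header with the older field order yields a nonsensical abbreviation offset and address size, so every later step of parsing goes wrong.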

Refs: facebook/folly@490b287 (cherry picked from commit ee5696b) (cherry picked from commit e03870b)

Refs: #41063
Refs: #40710 (cc @alexey-milovidov )
Refs: #40747 (cc @tavplubix )
Refs: #41191

  [1]: ClickHouse#40772 (comment)
  [2]: ClickHouse#41063 (comment)

Signed-off-by: Azat Khuzhin <[email protected]>
robot-ch-test-poll added the pr-not-for-changelog label (This PR should not be mentioned in the changelog) Sep 11, 2022

azat commented Sep 12, 2022

Stress test (asan) — Cannot start clickhouse-server

2022.09.12 02:04:45.060536 [ 805010 ] {} <Error> Application: Caught exception while loading metadata: Code: 479. DB::Exception: Part `all_1_1_0` (`data/test_25/test/all_1_1_0/`) was found on disk `local_cache` which is not defined in the storage policy: Cannot attach table `test_25`.`test` from metadata file /var/lib/clickhouse/metadata/test_25/test.sql from query ATTACH TABLE test_25.test (`a` Int32, `b` Int32) ENGINE = ReplacingMergeTree ORDER BY a SETTINGS index_granularity = 8192. (UNKNOWN_DISK), Stack trace (when copying this message, always include the lines below):

https://s3.amazonaws.com/clickhouse-test-reports/41193/7e130aeb69d156398bb2ac4d7a42759d9ea69ca6/stateless_tests__release__s3_storage_.html

2022-09-12 09:29:20 [0d6981f71a34] 2022.09.12 09:27:57.154422 [ 2572 ] {645e5dcd-5854-4599-9bf4-f97cb4f8f4a1}  AWSClient: AWSErrorMarshaller: Encountered AWSError 'NoSuchKey': The specified key does not exist.
2022-09-12 09:29:20 [0d6981f71a34] 2022.09.12 09:27:57.154521 [ 2572 ] {645e5dcd-5854-4599-9bf4-f97cb4f8f4a1}  AWSClient: HTTP response code: 404

https://s3.amazonaws.com/clickhouse-test-reports/41193/7e130aeb69d156398bb2ac4d7a42759d9ea69ca6/integration_tests__tsan__[3/4].html

  • test_dns_cache/test.py::test_host_is_drop_from_cache_after_consecutive_failures - I saw other failures

alexey-milovidov merged commit 31ebd34 into ClickHouse:master Sep 16, 2022
azat deleted the DWARF-5-parser-v3 branch September 17, 2022 06:12