Skip to content

Fix a suspected off-by-one error in parseFDEInstructions()#27

Merged
alexey-milovidov merged 2 commits intomasterfrom
cfa-loc-off-by-one
Jun 23, 2024
Merged

Fix a suspected off-by-one error in parseFDEInstructions()#27
alexey-milovidov merged 2 commits intomasterfrom
cfa-loc-off-by-one

Conversation

@al13n321
Copy link
Copy Markdown
Member

@al13n321 al13n321 commented Jun 14, 2024

We get segfaults in libunwind often, especially on aarch64. From reproducing and investigating ClickHouse/ClickHouse#64915 , the cause of the crash, in that specific instance, is what this PR changes - off-by-one error (?) when determining which unwind info entry's address range covers the current instruction address. Details in the next section.

But this is very suspicious because this is hot code, and not ARM-specific. And there seems to be another bug in how initialState is used (it can be used-after-free if a register is saved/loaded on different iterations of the outer loop). I would've expected unwinding to crash or produce garbage much more often if this were really broken. Maybe compilers emit these ranges incorrectly on x86 but correctly on ARM (always or sometimes)? Maybe the code was supposed to add/subtract one somewhere else (but I checked and it doesn't look so)? Maybe I'm reading the wrong standard (.eh_frame vs .debug_frame?), and the two standards are mostly the same but are off-by-one from each other (that would be so weird, but DWARF is weird)? Maybe the off-by-one is much more likely to crash the program on ARM because return address works differently (there's lr register)? I'm confused.

I guess let's run this in CI and see if it super-breaks.


Details:

From https://dwarfstd.org/doc/DWARF5.pdf :

6.4.3 Call Frame Instruction Usage
To determine the virtual unwind rule set for a given location (L1), search through the
FDE headers looking at the initial_location and address_range values to see if L1
is contained in the FDE. If so, then:

  1. Initialize a register set by reading the initial_instructions field of the associated
    CIE. Set L2 to the value of the initial_location field from the FDE header.
  2. Read and process the FDE’s instruction sequence until a DW_CFA_advance_loc,
    DW_CFA_set_loc, or the end of the instruction stream is encountered.
  3. If a DW_CFA_advance_loc or DW_CFA_set_loc instruction is encountered, then
    compute a new location value (L2). If L1 ≥ L2 then process the instruction and go
    back to step 2.
  4. The end of the instruction stream can be thought of as a DW_CFA_set_loc
    (initial_location + address_range) instruction. Note that the FDE is
    ill-formed if L2 is less than L1.

The rules in the register set now apply to location L1.
For an example, see Appendix D.6 on page 325.

(Notice the '≥', while the code had effectively '>'.)
(The example in Appendix D.6 looks consistent with this understanding as well.)

In this case, the address is 0xbc3b924, CIE and FDE are:

00d4f37c 000000000000001c 00000000 CIE
  Version:               1
  Augmentation:          "zPLR"
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 30
  Augmentation data:     9c 31 84 a1 0f 00 00 00 00 1c 1b
  DW_CFA_def_cfa: r31 (sp) ofs 0

0100b1cc 0000000000000034 002bbe54 FDE cie=00d4f37c pc=000000000bc3b8c0..000000000bc3b998
  Augmentation data:     67 5f 76 fb ff ff ff ff
  DW_CFA_advance_loc: 4 to 000000000bc3b8c4
  DW_CFA_def_cfa_offset: 80
  DW_CFA_advance_loc: 12 to 000000000bc3b8d0
  DW_CFA_def_cfa: r29 (x29) ofs 32
  DW_CFA_offset: r19 (x19) at cfa-8
  DW_CFA_offset: r20 (x20) at cfa-16
  DW_CFA_offset: r30 (x30) at cfa-24
  DW_CFA_offset: r29 (x29) at cfa-32
  DW_CFA_remember_state
  DW_CFA_advance_loc1: 72 to 000000000bc3b918
  DW_CFA_def_cfa: r31 (sp) ofs 80
  DW_CFA_advance_loc: 12 to 000000000bc3b924
  DW_CFA_def_cfa_offset: 0
  DW_CFA_restore: r19 (x19)
  DW_CFA_restore: r20 (x20)
  DW_CFA_restore: r30 (x30)
  DW_CFA_restore: r29 (x29)
  DW_CFA_advance_loc: 4 to 000000000bc3b928
  DW_CFA_restore_state
  DW_CFA_nop

Code is:

Dump of assembler code for function _ZN2DB12_GLOBAL__N_18readULEBERNSt3__117basic_string_viewIcNS1_11char_traitsIcEEEERhS7_:
   0x000000000bc3b8c0 <+0>:     sub     sp, sp, #0x50

   0x000000000bc3b8c4 <+4>:     stp     x29, x30, [sp, #48]
   0x000000000bc3b8c8 <+8>:     stp     x20, x19, [sp, #64]
   0x000000000bc3b8cc <+12>:    add     x29, sp, #0x30

   0x000000000bc3b8d0 <+16>:    mov     x8, xzr
   0x000000000bc3b8d4 <+20>:    strb    wzr, [x1]
   0x000000000bc3b8d8 <+24>:    ldr     x9, [x0, #8]
   0x000000000bc3b8dc <+28>:    cbz     x9, 0xbc3b928 <_ZN2DB12_GLOBAL__N_18readULEBERNSt3__117basic_string_viewIcNS1_11char_traitsIcEEEERhS7_+104>
   0x000000000bc3b8e0 <+32>:    ldr     x10, [x0]
   0x000000000bc3b8e4 <+36>:    sub     x9, x9, #0x1
   0x000000000bc3b8e8 <+40>:    ldrb    w11, [x10], #1
   0x000000000bc3b8ec <+44>:    stp     x10, x9, [x0]
   0x000000000bc3b8f0 <+48>:    strb    w11, [x2]
   0x000000000bc3b8f4 <+52>:    and     x11, x11, #0x7f
   0x000000000bc3b8f8 <+56>:    ldrb    w9, [x1]
   0x000000000bc3b8fc <+60>:    add     w10, w9, #0x7
   0x000000000bc3b900 <+64>:    lsl     x9, x11, x9
   0x000000000bc3b904 <+68>:    strb    w10, [x1]
   0x000000000bc3b908 <+72>:    ldrsb   w10, [x2]
   0x000000000bc3b90c <+76>:    orr     x8, x9, x8
   0x000000000bc3b910 <+80>:    tbnz    w10, #31, 0xbc3b8d8 <_ZN2DB12_GLOBAL__N_18readULEBERNSt3__117basic_string_viewIcNS1_11char_traitsIcEEEERhS7_+24>
   0x000000000bc3b914 <+84>:    mov     x0, x8

   0x000000000bc3b918 <+88>:    ldp     x20, x19, [sp, #64]
   0x000000000bc3b91c <+92>:    ldp     x29, x30, [sp, #48]
   0x000000000bc3b920 <+96>:    add     sp, sp, #0x50

   0x000000000bc3b924 <+100>:   ret

   0x000000000bc3b928 <+104>:   mov     x20, x0
   0x000000000bc3b92c <+108>:   mov     w0, #0x178                      // #376
   0x000000000bc3b930 <+112>:   bl      0xe5b4d54 <__AArch64ADRPThunk___cxa_allocate_exception>
   0x000000000bc3b934 <+116>:   adrp    x8, 0x20db000
   0x000000000bc3b938 <+120>:   add     x8, x8, #0x1fc
   0x000000000bc3b93c <+124>:   mov     w9, #0x2a                       // #42
   0x000000000bc3b940 <+128>:   stp     x8, x9, [sp, #16]
   0x000000000bc3b944 <+132>:   mov     x19, x0
   0x000000000bc3b948 <+136>:   stp     x8, x9, [sp, #32]
   0x000000000bc3b94c <+140>:   ldr     x9, [x20, #8]
   0x000000000bc3b950 <+144>:   mov     w8, #0x1                        // #1
   0x000000000bc3b954 <+148>:   stp     x9, x8, [sp]
   0x000000000bc3b958 <+152>:   add     x2, sp, #0x10
   0x000000000bc3b95c <+156>:   add     x3, sp, #0x8
   0x000000000bc3b960 <+160>:   mov     x4, sp
   0x000000000bc3b964 <+164>:   mov     w1, #0x1d1                      // #465
   0x000000000bc3b968 <+168>:   bl      0xbc0c8a0 <_ZN2DB9ExceptionC2IJmmEEEi22FormatStringHelperImplIJDpNSt3__113type_identityIT_E4typeEEEDpOS5_>
   0x000000000bc3b96c <+172>:   adrp    x1, 0x153e7000 <_ZN12_GLOBAL__N_123clickhouse_applicationsE+448>
   0x000000000bc3b970 <+176>:   add     x1, x1, #0x158
   0x000000000bc3b974 <+180>:   adrp    x2, 0x7aa2000 <_ZNK2DB24FunctionStringOrArrayToTINS_12_GLOBAL__N_118CRCFunctionWrapperIN12_GLOBAL__N_113CRC32ZLibImplEEES4_jLb1EE11executeImplERKNSt3__16vectorINS_21ColumnWithTypeAndNameENS7_9allocatorIS9_EEEERKNS7_10shared_ptrIKNS_9IDataTypeEEEm.3e570ba5df68adedf72ec05448f83d40+1120>
   0x000000000bc3b978 <+184>:   add     x2, x2, #0xfc0
   0x000000000bc3b97c <+188>:   mov     x0, x19
   0x000000000bc3b980 <+192>:   bl      0xe5b4d58 <__AArch64ADRPThunk___cxa_throw>
   0x000000000bc3b984 <+196>:   mov     x20, x0
   0x000000000bc3b988 <+200>:   sub     x0, x19, #0x80
   0x000000000bc3b98c <+204>:   bl      0xe5b51cc <__AArch64ADRPThunk__ZN10__cxxabiv128__aligned_free_with_fallbackEPv>
   0x000000000bc3b990 <+208>:   mov     x0, x20
   0x000000000bc3b994 <+212>:   bl      0xe5b4c80 <__AArch64ADRPThunk__Unwind_Resume>
End of assembler dump.

Clearly the unwind commands

  DW_CFA_def_cfa_offset: 0
  DW_CFA_restore: r19 (x19)
  DW_CFA_restore: r20 (x20)
  DW_CFA_restore: r30 (x30)
  DW_CFA_restore: r29 (x29)

are supposed to apply when instruction pointer is at bc3b924 (i.e. the add sp, sp, #0x50 has happened, but ret hasn't). I stepped through the code and checked that libunwind doesn't apply these commands because of the >.

@al13n321
Copy link
Copy Markdown
Member Author

Would it be better to maintain our patches as a series of commits on top of upstream, and update by rebasing?

@al13n321 al13n321 force-pushed the cfa-loc-off-by-one branch from 92d7bd0 to b2d4e24 Compare June 21, 2024 23:38
@al13n321 al13n321 marked this pull request as ready for review June 21, 2024 23:38
@al13n321
Copy link
Copy Markdown
Member Author

It didn't break CI in ClickHouse/ClickHouse#65257 , so I think we should merge this.

@alexey-milovidov alexey-milovidov self-assigned this Jun 23, 2024
@alexey-milovidov alexey-milovidov merged commit 8f28e64 into master Jun 23, 2024
@al13n321 al13n321 deleted the cfa-loc-off-by-one branch June 27, 2024 21:49
al13n321 added a commit to ClickHouse/ClickHouse that referenced this pull request Jun 27, 2024
@azat
Copy link
Copy Markdown
Member

azat commented Jul 8, 2024

@al13n321 have you considered backporting it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants