Skip to content

Make backtraces aware of inlining#247

Merged
mshinwell merged 14 commits intoocaml:trunkfrom
let-def:inline-backtrace
May 25, 2016
Merged

Make backtraces aware of inlining#247
mshinwell merged 14 commits intoocaml:trunkfrom
let-def:inline-backtrace

Conversation

@let-def
Copy link
Contributor

@let-def let-def commented Sep 26, 2015

This pull request makes the backtrace infrastructure robust to inlining.
It already improves the developer experience, but will be even more important with the stronger inlining that is coming.

While it should be usable as it is, I would like to discuss two points to improve before considering a merge.

Printexc API

Printexc allows the user to observe a backtrace mainly by offering random access to "slots".
An "O(1)" cost is implicitly assumed, but not really practical: for instance with inlining, there are more frames than slots in the backtrace buffer.
Flattening the array is required before indexing, which in turns require to distinguish between normal frames and inlined frames in the raw_backtrace, etc.

The patch tries to hide all differences, but the implementation could be significantly simpler with a few changes.
As @gasche already suggested that we could cleanup Printexc, my idea would be to:

  • deprecate random access, and either make it only observe the 'physical backtrace' (without inlined frames), or explicitly not "O(1)"
  • use iter-like functions as the main way to observe the trace.

Also, coming up with better names and a better 'story' than (raw) backtrace/slot could be helpful for users; though I don't know what other opinions are and I have nothing to suggest :).

Representation of backtraces

In the mainline OCaml compiler, debug information are stored in a packed way at the end of each frame descriptor.
Debug information fits in a word, using relative adressing to store the filename and bit twiddling to pack coordinates.

My understanding is that this was done to maximize locality, efficient traversal of frame table being critical for GC performance.
However I would like to simplify representation by adding an indirection (just a pointer) to access the debug data.
These are not needed on critical path, so indirection shouldn't affect performance. Frame descriptor size is unchanged.

Debug information would be a few bytes larger, but the representation would be simpler and more uniform (in this case, debug info are extended with a linked list of inlined frames).

@gasche
Copy link
Member

gasche commented Sep 26, 2015

I won't have time to review the patch short-term (nor am I the relevant person to do it), but here are the requirements for the backtrace API:

  1. getting the raw trace should be as fast as possible
  2. the non-raw API should be as convenient as possible to users
  3. the API should have leeway for runtime implementors to make changes in the future

Of course (3) is divination work, at which I failed with the previous slot API.

@xavierleroy
Copy link
Contributor

Concerning the in-memory representation of debugging info, I'm OK with the proposed change. The additional indirection makes the data (frame table + debug info) bigger, but the frame table (the "hot" part of this data) is smaller, as a single pointer replaces two words of packed debug info.

This leaves the big, "philosophical" discussion: should inlined functions appear in stack backtraces? I'm on the fence, On the one hand it's more friendly to the programmer. On the other hand, the backtraces thus obtained will significantly differ from those obtained within GDB or any other DWARF-based debugger.

@let-def
Copy link
Contributor Author

let-def commented Sep 29, 2015

Good, I'll push the new representation. It would be nice to squash the history before a possible merge (I can do the cleanup).

About the philosophical question... The inlined frame can be marked as such (the current code append " (inlined)"), so I see no downside to exposing the developer more information.

As for GDB, I'm looking forward for @mshinwell work to bring lots of improvements. Inline information in backtraces will be part of it: I'm really expecting flambda traces without inlining information to be a complete puzzle, but I am maybe pessimistic.
With a compiler directly producing precise DWARF, it should be possible to generate inlining information.

@whitequark
Copy link
Member

Finally! This is so great.

@xavierleroy
Copy link
Contributor

I would be pleasantly surprised if DWARF supports inlined functions in backtraces, but I am too lazy to check the DWARF spec. At any rate, this is certainly not a strong objection to the current proposal.

Concerning backtraces that get unusable in the presence of heavy inlining, I don't know what the user experience is with other languages. E.g. idiomatic C++ has a lot of inline functions and methods; how do programmers cope with that in GDB or other debuggers? Just curious.

@let-def
Copy link
Contributor Author

let-def commented Oct 1, 2015

@whitequark
Copy link
Member

@xavierleroy Sure, DWARF supports it. E.g. see this option of addr2line:

  -i --inlines           Unwind inlined functions

Overall, inlines are not hard to get right in backtraces, since adding the relevant debug info is straightforward. The hard part is tracing variables through the optimizer; avoiding <value optimized out>, the bane of the C++ programmer's existence.

DWARF is actually turing-complete, so you can, in principle, record the instructions for the debugger so that it can reconstruct any value at all, but in practice emitting those is fiendishly difficult. LLVM mostly doesn't even bother to try.

@gasche
Copy link
Member

gasche commented Oct 25, 2015

I still won't have time to review the change, but I remark that there is no testsuite coverage; we could and should have tests with inlined functions appearing in the trace.

@gasche
Copy link
Member

gasche commented Nov 18, 2015

We discussed this at the developer meeting today. The consensus is that while the implementation details are not yet clear enough to get into 4.03, we should try to get API included, or at least restrict the existing API enough so that the new features can be proposed later without conflicts.

@def-lkb , do you think a change in the current API would be beneficial, and if yes could you propose it? The feature freeze is in one month, so it should be ready before that.

@let-def let-def force-pushed the inline-backtrace branch 4 times, most recently from 3aebc94 to 1193745 Compare November 28, 2015 21:26
@let-def
Copy link
Contributor Author

let-def commented Nov 28, 2015

In the latest version (still WIP), I removed inline information from raw_backtrace.
It means the random access API on a raw_backtrace can only access "physical slots", those which really appeared on stack.
A new primitive next_inlined_slot : raw_backtrace_slot -> raw_backtrace_slot option allows manual traversal of inlined callsites.
The higher-level non-raw backtrace API exposes all slots, inlined or not.

A nice benefit is that raw_backtraces are not changed at all: the cost of capturing and using the backtrace is the same as before inline information were introduced.

In parallel, I measured the cost in bytes of those information when buildling merlin with debug symbols:

compiler  |  size |  delta
--------------------------
(w/o -g)  | 7132k | - 354k
trunk     | 7485k |
old-repr  | 7566k |  + 80k
new-repr  | 7866k | + 380k

old-repr is when we use the packed representation for debuginfo.
new-repr is when using the simpler representation with one more indirection.
If I am not mistaken (I didn't reread the code...) the costs are:

  • with old-repr, we pay 16 bytes per inlined site (minus sharing).
  • with new-repr, we pay 16 bytes per inlined site + 8 bytes for all other debug information.

@let-def
Copy link
Contributor Author

let-def commented Nov 29, 2015

We discussed this at the developer meeting today. The consensus is that while the implementation details are not yet clear enough to get into 4.03, we should try to get API included, or at least restrict the existing API enough so that the new features can be proposed later without conflicts.

That was definitely the case for the "reraise" heuristic, but for this PR it's not what I remembered?!

@gasche
Copy link
Member

gasche commented Nov 29, 2015

Indeed, re-reading my notes it is also not quite what I noted. My more precise summary was

API changes okay, support for inlining okay, the rest to be discussed during the next week.

The proverbial week is now gone, and you have until mid-december to converge on inlining support.

@damiendoligez damiendoligez added this to the 4.03.0 milestone Dec 21, 2015
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you precise words -> 32-bit words. It is not two words on 64bit.

@let-def
Copy link
Contributor Author

let-def commented Jan 4, 2016

@bobot: I am finishing port to flambda then I'll look at your suggestions.

@let-def
Copy link
Contributor Author

let-def commented Jan 9, 2016

Here is my wip let-def/ocaml@flambda_trunk...def-lkb:flambda-inline-backtrace .
It is not yet completely cleaned up, but the main change is that a rawbacktrace is now equivalent to a C backtrace_buffer, so that reinstantiating a backtrace as in @bobot 's proposal is almost immediate.

There is still encoding of pointers as OCaml integers to satisfy GC needs, but Printexc no longer makes assumptions on actual representation, rawbacktrace type is just abstract.

@gasche
Copy link
Member

gasche commented Jan 9, 2016

There is still encoding of pointers as OCaml integers to satisfy GC needs

Could we encode it as just a custom block to a statically-allocated C array?

@let-def
Copy link
Contributor Author

let-def commented Jan 9, 2016

Yes, the code is ready for that.
I only changed the interface and not the representation, but this is not much work.

I also keep in mind storing the exception values in the backtraces for better debugging (Raised (Unify _)... Reraised as (Error _)), which are OCaml values so the custom block will not be enough for that, but that's another story.

@let-def
Copy link
Contributor Author

let-def commented Jan 13, 2016

@gasche I plan to switch the implementation from allocating OCaml array with tagged integers to allocating a custom block and memcpying the (copy of) backtrace_buffer. Is that what you meant?

I don't see the purpose of the statically-allocated C array.

@gasche
Copy link
Member

gasche commented Jan 13, 2016

I was thinking of a custom block pointing to a C array, but a custom block with the entries inlined is also fine. I was never very happy with the pointer-encoding trick, so I'm glad to see it removed anyway.

Note that, if I was a release manager, I would feel that changes to the backtrace codepath are now too late for 4.03. We need to see with @damiendoligez but I would personally plan for 4.04-time landing.

@let-def
Copy link
Contributor Author

let-def commented Jan 15, 2016

Thanks to explanation from @chambart I was able to extend support for inlining debuginfo to flambda middle-end.
@gasche : this version sticks to the array representation for reified backtraces.
Current version is on this branch, design is mostly the final one imho: let-def/ocaml@flambda_trunk...def-lkb:flambda-inline-backtrace .

Todo:

  1. user-friendly API for accessing inlined information (right now inlined information are printed when using high-level API, otherwise there is a primitive exposing a linked-list like traversal)
  2. extend testsuites
  3. tail calls are managed differently in the two backends : Clambda will not inline debuginfo when inlining a tail-call (in an attempt to look like bytecode backtraces) while Flambda will always inline debuginfo.

For point 1. and 3. I am unsure what is expected from users.
I simply plan for an API iterating and folding over debuginfo, including inlined ones.
As for tail-calls, I wonder if flambda rewriting might introduce more tail-calls, making backtraces less usable if they get removed?

Misc point 4. : the "slot" terminology is rather vague, why not also use debuginfo when referring to actual locations in Printexc?

@damiendoligez
Copy link
Member

This is a rather big patch to integrate this late in the cycle. Could we have the minimal API changes in a separate PR and leave the other changes (i.e this PR) for 4.04 ?

When I set the milestone, it means "must be resolved before 4.03", not "must be merged before 4.03".

@let-def let-def force-pushed the inline-backtrace branch from ce4da3a to 61a92ac Compare May 23, 2016 22:56
@let-def
Copy link
Contributor Author

let-def commented May 23, 2016

Done, I rebased and added tests.

@let-def let-def force-pushed the inline-backtrace branch from d4a88cf to e51a15b Compare May 25, 2016 10:50
@mshinwell
Copy link
Contributor

I think this is in a good enough state now, so I will merge it.
Xavier has received a signed CLA from @let-def .

@mshinwell mshinwell merged commit 28dc832 into ocaml:trunk May 25, 2016
@gasche
Copy link
Member

gasche commented May 25, 2016

Xavier has received a signed CLA from @let-def .

About time!

@gasche
Copy link
Member

gasche commented May 25, 2016

It seems that some test fail on INRIA's CI, currently only on the ARM and OpenBSD 32-bit machines:

On OpenBSD the following tests fail:

    tests/backtrace/'inline_test.ml' with ocamlopt
    tests/backtrace/'inline_traversal_test.ml' with ocamlopt
    tests/backtrace/'inline_test.ml' with ocamlc
    tests/backtrace/'inline_traversal_test.ml' with ocamlopt -O3
    tests/backtrace/'inline_test.ml' with ocamlopt -O3
    tests/backtrace/'inline_traversal_test.ml' with ocamlc

On ARM the following:


    tests/backtrace/'backtrace.ml' with ocamlopt and argument ''
    tests/backtrace/'backtraces_and_finalizers.ml' with ocamlopt
    tests/backtrace/'backtrace_deprecated.ml' with ocamlopt
    tests/backtrace/'pr6920_why_swallow.ml' with ocamlopt
    tests/backtrace/'pr6920_why_at.ml' with ocamlopt
    tests/backtrace/'raw_backtrace.ml' with ocamlopt
    tests/backtrace/'inline_test.ml' with ocamlopt
    tests/backtrace/'backtrace3.ml' with ocamlopt
    tests/backtrace/'backtrace2.ml' with ocamlopt
    tests/backtrace/'backtrace.ml' with ocamlopt and argument 'd'
    tests/backtrace/'backtrace.ml' with ocamlopt and argument 'c'
    tests/backtrace/'backtrace.ml' with ocamlopt and argument 'b'
    tests/backtrace/'inline_test.ml' with ocamlopt -O3

(Unfortunately I don't think I have access to the corresponding .result files.)

@mshinwell or @let-def , could you maybe have a look at this?

@xavierleroy
Copy link
Contributor

I'm looking at the ARM32 failure right now. It's a segfault in caml_debuginfo_next. Stay tuned.

@xavierleroy
Copy link
Contributor

xavierleroy commented May 27, 2016

I think I tracked down the problem. In ARMv7, code pointers have the low bit set to denote Thumb2 code, or cleared to denote classic ARM code. Thus, the assembler plays with the low bit of symbols when they are given type %function. This causes a wrong frame table to be generated:

camlBacktrace__frametable:
        .word   16
        .type   .L141, %function
        .word   .L141
        .short  9
        .short  0
        .align  2
        .type   .L142, %function
        .word   .L142
[...]
        .align  2
.L142:
        .word   .L157 - . + 0x64000000
        .long   0x12090
        .word   0

Here, .L141 is a code pointer, so it is correct to declare it %function. However, .L142 is a data pointer, hence declaring it %function causes the assembler to add one to it. When dereferenced in the first invocation of caml_debuginfo_next, the wrong data is accessed, a bad pointer is created, and the second invocation of caml_debuginfo_next accesses unallocated memory.

The fix is to ensure that labels such as .L142 are not declared %function. An explicit %object declaration would be good, but no declaration at all should work as well, assuming %object is the default.

@let-def
Copy link
Contributor Author

let-def commented May 27, 2016

As for OpenBSD failure, the grep command line was not compatible with BSD grep.
Following patch fixes it, if someone can merge it: https://gist.github.com/let-def/9c868554e120a651300c82727a024dd3 .

@gasche
Copy link
Member

gasche commented May 27, 2016

I would be happy to merge but I would prefer a proper patch rather than just a diff, so that there is attribution information, a proper commit message, etc. (For example git format-patch HEAD~1 --stdout will produce such a patch from the last commit)

@let-def
Copy link
Contributor Author

let-def commented May 27, 2016

@gasche
Copy link
Member

gasche commented May 27, 2016

Thanks! I merged, and the CI tests now pass as expected.

@xavierleroy
Copy link
Contributor

Tentative fix (for the ARM issue) in commit [trunk 5d02ca6]

@gasche
Copy link
Member

gasche commented Jun 27, 2016

@xavierleroy the CI machines are all passing now (the zsystems one was fixed by dad5809), congratulations.

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Feb 20, 2020
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants