Skip to content

Proper backtraces upon compiler failure#358

Closed
mshinwell wants to merge 2 commits intoocaml:trunkfrom
mshinwell:flambda_prereq-sys_dot_remove
Closed

Proper backtraces upon compiler failure#358
mshinwell wants to merge 2 commits intoocaml:trunkfrom
mshinwell:flambda_prereq-sys_dot_remove

Conversation

@mshinwell
Copy link
Contributor

The current situation where compiler failure can lead to unhelpful backtraces can be frustrating. This patch aims to prevent code in the compiler from destroying backtraces on their way out.

I understand that a previous version of this patch involved an unsatisfactory fix to Misc.remove_file. This version provides a fix that does not suffer from any (extra) race conditions.

Tested with a forced assertion failure in translcore.ml.

@gasche
Copy link
Member

gasche commented Dec 18, 2015

Looks ok to me.

@gasche
Copy link
Member

gasche commented Dec 20, 2015

Looking more at this, I feel that this PR does not completely solve the problem and could be improved. I would still be in favor of merging it in its current state as I believe that it is already a good step forward.

My first frustration was about not being told in the PR what was this "earlier version of the patch" mentioned: it is MPR#6574, and the present PR should contain a Changes entry about it as merging it would let us close this issue report.

My second frustration was looking at the code of the new primitive added: it is extremely similar to the existing sys_remove_file and I would recommend factoring the two codebases.

My third issue with the proposed PR is that it feels like a bucket of exceptionless water in a see of reraises. Running git grep -C5 " raise [a-z]" -- *.ml will show you plenty of raise sites in the compiler that can destroy traces. Some of the happen to only use remove_file before reraising, but many of them also use close_{in,out} as well, which suffers the same issue of potentially raising an exception. What is the measure of success for this patch, is it a particular failure case that matters, do some potentially-exception-raising functions matter in practice (why is remove_file bad?) and others don't (maybe close_* always succeed in practice?).

Is this even the right approach to get high-quality backtraces? Should we rewrite the {sys,io}.c primitives to not raise exceptions but return failure codes, and only raise exceptions from the OCaml side? (This is made difficult by the fact that the Sys mli sadly exports extern declarations instead of val). Another idea, for example, would be to have a global flag that disables all cleanup handlers, or on the contrary asks for the backtrace in the first cleanup handler and prints it on stderr for example.

@mshinwell
Copy link
Contributor Author

@chambart will have to comment on the details as to why Sys.remove_file in particular seemed to be the place to fix in order to help solve this problem. However, I don't think we should delay for such comment, as he is away for two weeks.

This patch is a good incremental improvement to a significant deficiency in the compiler that hinders day-to-day development. It may not be perfect, but in practice it seems to be adequate. I think we should be careful not to fall into the trap of overthinking issues like this: a certain amount of mediocrity will mean that we have more time to spend on more important things.

@ghost
Copy link

ghost commented Dec 21, 2015

instead of duplicating C stubs, we could have this in Misc:

let no_backtrace f =
  let status = Printexc.backtrace_status () in
  Printexc.record_backtrace false;
  let result =
    match f () with
    | x -> Ok x
    | exception e -> Error e
  in
  Printexc.record_backtrace status;
  result

let ignore_errors f = ignore (no_backtrace f : (unit, exn) result)

@bobot
Copy link
Contributor

bobot commented Dec 21, 2015

@diml I think that will not work since modifying record_backtrace reinitialize the backtrace data:

CAMLprim value caml_record_backtrace(value vflag)
{
  int flag = Int_val(vflag);

  if (flag != caml_backtrace_active) {
    caml_backtrace_active = flag;
    caml_backtrace_pos = 0;
    caml_backtrace_last_exn = Val_unit;
    /* Note: lazy initialization of caml_backtrace_buffer in
       caml_stash_backtrace to simplify the interface with the thread
       libraries */
  }
  return Val_unit;
}

I forgot to write a reduced version of #293 that just add caml_reraise_raw_backtrace. But with this one I think we can modify cleanup code (eg. Misc.finalize) so that they handle well the case when the cleanup code use exception.

@bobot
Copy link
Contributor

bobot commented Dec 27, 2015

The reduced version of #293 is at #374. It doesn't subsume the addition of reraise in report_exception_rec, but with it we don't need a version of remove_file that doesn't use exceptions.

@mshinwell
Copy link
Contributor Author

I argue that we should merge this smaller change for 4.03 and consider larger changes for the next release.

@gasche
Copy link
Member

gasche commented Dec 31, 2015

After looking at the details of MPR#6574 again, I think that its implementation was in fact fine. It would be an even simpler change to merge before the stronger long-term solution of #374 and #378.

@mshinwell
Copy link
Contributor Author

I don't really see how it can be argued that MPR#6574 is more satisfactory than this patch (which was actually written in conjunction with the original author of MPR#6574).

@damiendoligez
Copy link
Member

After looking at MPR#6574, this PR, #374, and #378, I think we should merge MPR#6574 now, as it doesn't interfere with #374 and #378 anyway.

@damiendoligez
Copy link
Member

I've merged MPR#6574.

anmolsahoo25 pushed a commit to anmolsahoo25/ocaml that referenced this pull request Aug 25, 2020
poechsel pushed a commit to poechsel/ocaml that referenced this pull request Mar 22, 2021
chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Jan 4, 2022
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants