Asm_directives: new abstraction layer for assembly directive emission by mshinwell · Pull Request #756 · ocaml/ocaml

mshinwell · 2016-08-11T11:47:14Z

Emission of DWARF debugging information necessitates support for various kinds of assembly directives, some of which are quite tricky with regard to their types (mainly widths of various integers, some of which may depend on the target machine, and some of which may not). This new abstraction layer provides a uniform interface for the emission of assembly directives.

This is currently sufficient for testing the debugging support on x86, but is not sufficient for merging. I think pieces missing at present are:

testing on 32-bit x86 (64-bit x86 has been tested fairly well);
an Asm_directives implementation for backends where there is no "assembly DSL" (i.e. everything except x86) -- there is some commented-out code from an old version of this, but it needs mostly rewriting and checking thoroughly against the current asm_directives.mli interface which has extensively changed since that old code was written;
the hooks in the other backends, following the change in amd64/emit.mlp;
build system support for building the correct Asm_directives_... implementation.

This pull request has #755 as a prerequisite.

If anyone would like to volunteer to help bring this pull request up to scratch, it would be much appreciated. If not then I will get to it in due course. There are unfortunately some tricky, assembler-dependent details.

mshinwell · 2016-10-17T09:51:02Z

This patch has been significantly reworked. By removing the non-x86-specific directives from X86_dsl / X86_ast I've been able to make a uniform assembly directives emission layer that should be usable by all of the backends and the DWARF emitter.

@alainfrisch I believe you have some code that depends on the "directive" type (now Asm_directives.Directive.t with the x86-specific ones still in X86_dsl.D) -- could you check that these modifications won't cause a problem? I don't think they should.

I need to do some more tidying up in the x86-64 backend and then make all of the other backends use Asm_directives. Review comments are welcome now though.

mshinwell · 2016-10-17T09:51:30Z

(This patch also includes #753 for the moment.)

mshinwell · 2016-10-17T11:25:14Z

I've also now tried to tidy up the terminology surrounding "labels" and "symbols". "Labels" are now always of type Linearize.label in the context of assembly code generation; "symbols" are textual entities. (At some point we should consider using the middle-end type Symbol.t, or an equivalent, for the latter.) All mangling of label numbers and symbol names is done in Asm_directives now.

@alainfrisch Could you possibly check that the typ arguments to define_symbol and define_label are correct in amd64/emit.mlp? Only one of these was specified before. I've made the argument non-optional so we have to think about it. Or is this just not important?

alainfrisch · 2016-11-12T22:52:11Z

I've only looked very quickly to this PR. It's not completely clear to which extend the notion of "directive" can be made general. Even if there are many similarities in different backends, isn't it inherently tied to the target system and the specific assembly language syntax? Perhaps it is a bit theoretical, but an ocamlopt backend could use a very different directive syntax than gas or masm, so putting concrete syntax emitters in Asm_directives (which is linked for all backends) does not seem right. Similarly, Target_system, in its current form, seems rather specific to Intel backends.

The rationale for this big refactoring would be much clearer (to me) if non-Intel backends were ported to the new abstraction layer as well. Otherwise it is hard to know how general this layer will be. As long as it remains used only by Intel backends, I don't see the benefits.

could you check that these modifications won't cause a problem?

Thanks for thinking about it, and no, indeed, I don't foresee any problem to adapt our code.

alainfrisch · 2016-11-12T23:01:28Z

asmcomp/amd64/emit.mlp

-  let lbl = emit_symbol lbl in
-  D.global lbl;
-  _label lbl
+  D.define_label lbl ~typ:NONE;


I don't know if this matters but all instances of _label should be translated to ~typ:QWORD, no?

I was hoping you were going to tell me the answer to that :)
I don't know exactly what masm uses this parameter for, but could it always be the natural machine width in our case? Maybe we could delete the parameter in our code.

but could it always be the natural machine width in our case

I'll do some experiments... Stay tuned.

So, simply generating "foo LABEL DWORD/QWORD" does not work (the assembler complains); at least for labels in code (jump targets), one needs to use "foo:". I've tried using "foo:" in code section and "foo LABEL ?DWORD" in data section, and a good fraction of the testsuite passes, except for some non-deterministic segfaults (perhaps depending on where the image is loaded).

alainfrisch · 2016-11-12T23:03:23Z

asmcomp/amd64/emit.mlp

-      _label lbl;
+      D.switch_to_section Jump_tables;
+      D.align ~bytes:4;
+      D.define_label lbl ~typ:NONE;


alainfrisch · 2016-11-12T23:04:18Z

asmcomp/amd64/emit.mlp

  List.iter emit_call_gc !call_gc_sites;
  emit_call_bound_errors ();
+  let end_of_function_label = Cmm.new_label () in
+  D.define_label end_of_function_label ~typ:NONE;


What's the use of this new label?

This is needed for DWARF emission, it shouldn't be in this patch.

alainfrisch · 2016-11-12T23:05:05Z

asmcomp/amd64/emit.mlp

-      efa_string = (fun s -> D.bytes (s ^ "\000"))
+          D.between_this_and_label_offset_32bit ~upper:lbl
+            ~offset_upper:(Targetint.of_int32 ofs));
+      efa_def_label = (fun l -> D.define_label l ~typ:NONE);


Same remark (_label -> ~typ:QWORD).

alainfrisch · 2016-11-12T23:10:24Z

Keeping the two kinds of NewLabel looks confusing. Previously, the "typ" could only be NONE/DWORD/QWORD (and it is only used for the masm syntax). Couldn't this info be kept in the common layer (so as to get rid of the NewLabel in the intel-specific part)?

mshinwell · 2016-11-14T08:28:27Z

Thanks for looking at this. I don't really see the problem with having the concrete syntax emitters in Asm_directives; it seems fairly unlikely that we're going to be targeting a platform in the near future that doesn't use gas, gas-like syntax or masm.

This layer is actually used by all platforms for the DWARF emitters. I agree that we should port the other backends to use it throughout, but that can be done as a separate piece of work.

xavierleroy · 2017-08-14T10:20:55Z

I tried to review and failed because this huge collection of semi-related changes defies reviewing.

A first step would be for @mshinwell to write down a high-level description of what's in this PR, which goes way beyond than factoring out the concrete syntax of directives, but also includes a bunch of other refactorings, new uses of the Targetint interface, many new uses of a Linkage_name interface I'm not familiar with, and probably more.

I also noticed abstractions that are not (abstractions). For example the Target_system interface needs (by construction) to be extended for every new target architecture of CompCert. So, it doesn't belong in asmcomp/. Likewise I'm skeptical that we gain anything by factoring bits and pieces of emit_function in Emitaux.

The game is not to have the smallest possible emitters for each target. The game is to enforce a modular structure where code emitters hide target-related details from the rest of the system. Examples of such details that (I think) this PR leaks outside the emitters are 1- the need to set local symbols to work around MacOS X limitations, and 2- the PPC64 oddities.

mshinwell · 2017-08-14T10:40:07Z

Some of the changes can probably now be pulled out into separate pull requests which will make things easier. I shall try to do this for the Linkage_name and Target_system changes first.

One point on the other comments: things such as the need for set on Mac OS X is exactly what Asm_directives is supposed to abstract from. (This might arguably not be the case for the PPC ELFv1 oddities.) In particular the DWARF emitting code shouldn't need to know about this.

…nto asm_directives

xavierleroy · 2017-08-14T13:41:17Z

I'm not yet asking for separate PRs. As a first step I'd like to see an up to date and comprehensive description of what's in this PR.

mshinwell · 2017-08-14T15:03:16Z

I think I'm going to do separate PRs for some of this anyway, since thinking about some of the changes in isolation has already raised a few issues, which will need thinking about.

As for the whole, this contains:

The new layer which abstracts platform-specific details of assembly directive emission and provides the necessary "measuring" functions needed for DWARF emission.
Changes to use the new layer instead of direct emission of strings.
Factoring out of a large amount of common code into Emitaux from the emitters. This was done because the work in point 2 made substantial chunks of the code in the emitters very similar or the same. This work showed that there existed various inconsistencies and at least one bug (as I recall in CFI offset adjustment).
To match up with the new abstraction layer Asm_directives and hopefully to improve matters, use of specialised types in the backend rather than generic types. There are two examples: the first is extending the use of Targetint; the second is using Linkage_name instead of string.
A Target_system module which is intended to improve upon the current mechanisms for communicating various platform-specific details to the backend. Some tidying up of rather old names in the configure script comes along with this change. Something that I haven't discussed yet (cf. caml-devel discussion) is that I think keeping this module properly in sync with all supported platforms and ensuring that we have CI machines for all such platforms (subject to reasonable restrictions on officially-supported operating systems on such platforms) has the potential to significantly improve our QA.

The extended use of Targetint works towards a goal of having the whole compiler pipeline handle cross-compilation correctly. In this pull request this conversion only started from Mach but I'm working on the whole thing from the middle-end (Closure or Flambda) onwards, since it's not too hard.

The Linkage_name change (I will probably rename this to Asm_symbol) aims to be more precise about names in the backend. The proposed roadmap is that Symbol.t be used in the middle-end (this is currently only used in Flambda) to represent functions or data lifted to toplevel without saying anything about the mangling conventions. Then in the backend Asm_symbol.t represents the OCaml-specific mangling of such names (which in particular should be able to be demangled by the debugger) and an associated type describes such names equipped with relocation (or other similar) information. Finally, in the emitters, we just use string for example as arguments in the x86-specific DSL---at this stage the symbol names are basically unstructured data.

@lpw25 will have in his forthcoming namespaces proposal an effectively analogous set of renamings in the front-end of the compiler (instead of Ident.t everywhere).

Merging some of these changes together would save a reasonable amount of work, which is why they were done together in the first place---in particular the change to use the new Asm_directives layer together with the refactoring of the emitters. However I think some parts (e.g. the integer and symbol changes, plus Target_system) should be split.

mshinwell · 2017-08-14T15:05:19Z

One further comment: the use of Linkage_name.t is kind of wrong in Flambda at the moment and the plan is to disentangle that, so it would only be used (renamed as Asm_symbol.t) in the backend. Then it could be moved to asmcomp/ where it belongs.

DemiMarie · 2017-10-26T17:15:20Z

I would definitely like to see this merged. Right now it is not possible to debug OCaml native code, which is annoying. Bytecode works but

is slow (at runtime)
ocamldebug has rather poor IDE integration (it only speaks to emacs)

bikallem · 2017-10-26T18:39:20Z

I want to echo similar sentiment as @DemiMarie. Debugging native code in ocaml is very painful right now. Having to use a crude mixture of printf/format is very time-consuming. A good debugger would also be very useful when you are trying to understand a large ocaml codebase. Would definitely like to see the debugging GPRs merged.

mshinwell · 2017-10-26T22:58:20Z

Work is underway to split this pull request into pieces that are easier to review. Watch this space.

mshinwell · 2018-09-26T15:10:01Z

This GPR is going to be superceded.

25188da flambda-backend: Missed comment from PR802 (ocaml#887) 9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802) d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874) 4bbde7a flambda-backend: Simpler symbols (ocaml#753) ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862) a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869) 045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868) 3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866) c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860) ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861) c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825) ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857) 39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854) c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850) 6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852) f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843) 9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841) 8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833) 65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831) 7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830) ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787) 3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820) 2f57378 flambda-backend: Static check noalloc (ocaml#778) aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812) 17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803) e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800) ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784) 14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790) 487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795) a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792) 96c9c60 flambda-backend: Merge ocaml-jst a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767) f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757) c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756) b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750) 8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749) git-subtree-dir: ocaml git-subtree-split: 25188da

Asm_directives

fecdc08

mshinwell added the debugging-prerequisite label Aug 11, 2016

nojb mentioned this pull request Aug 11, 2016

Target_system -> Targetint mshinwell/ocaml#7

Closed

mshinwell added 3 commits October 17, 2016 10:31

Asm_directives and Target_system

a820ab3

GPR#753

5bdabdd

Fixes

045e8c2

mshinwell added this to the 4.05-or-later milestone Oct 17, 2016

mshinwell added 3 commits October 17, 2016 11:36

work

4481135

Labels are labels and symbols are symbols

7fcb602

More fixes

2857c47

mshinwell added the work-in-progress label Oct 17, 2016

mshinwell added 2 commits October 17, 2016 12:22

Fixes

99b5592

Don't use 'sym'

9b73640

Move code

7d2f290

alainfrisch reviewed Nov 12, 2016

View reviewed changes

Merge with trunk

dfad362

damiendoligez modified the milestone: 4.05-or-later Feb 15, 2017

mshinwell mentioned this pull request Mar 7, 2017

Add iOS support #1084

Closed

mshinwell added 2 commits March 31, 2017 10:07

merge

7594262

changes

b2042a2

dra27 and others added 10 commits August 11, 2017 17:00

QUAD -> QWORD

bf2c24a

Slightly better diagnostics for MASM mis-codings

7d60be3

Declare symbols correctly for MASM

dbfbc7c

WIP: ignore Loc and File directives for MASM

4e2f0aa

Problematic msvc64 assembler output

ac5fd76

asm_directives

158a6c0

merge

2a677b2

Merge with trunk

d35452a

fix bug

9040dd0

fix

bbf658a

mshinwell added 4 commits August 14, 2017 12:15

comment

033163f

Merge branch 'asm_directives' of https://github.com/mshinwell/ocaml i…

84f25e3

…nto asm_directives

Merge remote-tracking branch 'ocaml/trunk' into asm_directives

f02ecf1

Merge branch 'asm_directives' of https://github.com/mshinwell/ocaml i…

adea7ea

…nto asm_directives

damiendoligez added this to the 4.07-or-later milestone Oct 11, 2017

damiendoligez added the high-priority label Jun 1, 2018

damiendoligez removed this from the consider-for-4.07 milestone Jun 1, 2018

mshinwell closed this Sep 26, 2018

stedolan added a commit to stedolan/ocaml that referenced this pull request Sep 21, 2022

Bugfix for Cmmgen.expr_size with locals (ocaml#756)

d5ea626

sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023

flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756)

c7a46bb

Conversation

mshinwell commented Aug 11, 2016

Uh oh!

mshinwell commented Oct 17, 2016

Uh oh!

mshinwell commented Oct 17, 2016

Uh oh!

mshinwell commented Oct 17, 2016

Uh oh!

alainfrisch commented Nov 12, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alainfrisch commented Nov 12, 2016

Uh oh!

mshinwell commented Nov 14, 2016

Uh oh!

xavierleroy commented Aug 14, 2017

Uh oh!

mshinwell commented Aug 14, 2017

Uh oh!

xavierleroy commented Aug 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mshinwell commented Aug 14, 2017

Uh oh!

mshinwell commented Aug 14, 2017

Uh oh!

DemiMarie commented Oct 26, 2017

Uh oh!

bikallem commented Oct 26, 2017

Uh oh!

mshinwell commented Oct 26, 2017

Uh oh!

mshinwell commented Sep 26, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

xavierleroy commented Aug 14, 2017 •

edited

Loading