Skip to content

Conversation

@Timmmm
Copy link
Contributor

@Timmmm Timmmm commented Sep 5, 2025

This adds support for generating C++ code. The model is wrapped in a struct, which means you can create multiple instances of it (e.g. to simulate multicore CPUs).

The structure of the C++ code is like this:

// Header
namespace model {
<types>
struct Model <: optional base classes> {
  <function declarations>
  <variable definitions>
};
}

// Impl
namespace model {
<function definitions>
}

This is the only way I found to do it without major changes. We can't put the types in Model because there's nothing like using namespace <SomeClass>::*; in C++.

@github-actions
Copy link

github-actions bot commented Sep 5, 2025

Test Results

   15 files     34 suites   0s ⏱️
  957 tests   954 ✅  3 💤 0 ❌
4 690 runs  4 648 ✅ 42 💤 0 ❌

Results for commit c3e95fe.

♻️ This comment has been updated with latest results.

@Alasdair
Copy link
Collaborator

Alasdair commented Sep 5, 2025

It includes a commit to remove the --c-generate-header option. It was just too confusing to think about every combination of C/C++/header/no-header, and I'm not really sure there's much use case for not generating a header. If you really want a no-header version I guess you can just concatenate the files. This does make it a breaking change.

It's entirely for backwards compatibility really. I don't think there is much of a use case either.

I haven't really thought about how it interacts with --c-static. Tbh I don't quite follow that option. There are some hard-coded "static"s still... it seems to only apply to some stuff.

Would be fine removing that option honestly, I also don't really see the point. Just need to make sure any uses of static are actually what we really want. It's less code paths for me to maintain the less options we have anyway! 😉

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 9, 2025

Ok I think this is basically done. Almost all the tests pass except a few which have custom C code and are therefore a bit awkward. I marked them as expected failures.

This PR includes earlier commits to remove the --generate-header and --static flags. I will turn them into separate PRs if you think that's ok.

The two things I'm not 100% sure about:

  1. The --no-rts flag. I don't really understand what this is for, and I'm pretty sure it doesn't work, at least not with --generate-header, because it makes extern static declarations which doesn't make sense. Also it isn't tested as far as I can see. Should we remove it?
  2. I didn't really test the coverage related stuff, and it's still global but I think that's probably fine.

@Timmmm Timmmm marked this pull request as ready for review September 9, 2025 14:07
@Alasdair
Copy link
Collaborator

Alasdair commented Sep 9, 2025

I think the intent behind that flag was to just avoid including the Sail runtime headers in case you want to provide your own. I don't think there would be anyone depending on that flag so also likely safe to remove. A better way to do that would just be to have an option that removes all default header includes, then anyone who wants something custom can just use --c-header as needed.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 9, 2025

Ah ok. Yeah I think just not emitting the includes is easier, though tbh we did custom headers just by copying & pasting all of the C files elsewhere and that worked fine. I'll add a --c-no-default-includes flag.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 9, 2025

Also is there a way to see the actual output for failed tests in CI?

@Alasdair
Copy link
Collaborator

Alasdair commented Sep 9, 2025

Right now you need to go digging through the logs. Not sure the best way to pass the error messages through the various GitHub CI jobs.

2025-09-09T15:31:13.4348375Z �[91mFailed�[0m:  c++  abstract_type.cpp '/home/runner/work/sail/sail'/lib/*.c -lgmp -I '/home/runner/work/sail/sail'/lib -o abstract_type.bin
2025-09-09T15:31:13.4352560Z �[94mstdout�[0m:
2025-09-09T15:31:13.4356495Z 
2025-09-09T15:31:13.4359944Z �[94mstderr�[0m:
2025-09-09T15:31:13.4364339Z abstract_type.cpp: In function ‘int main(int, char**)’:
2025-09-09T15:31:13.4365696Z abstract_type.cpp:163:3: error: ‘sail_set_abstract_xlen’ was not declared in this scope
2025-09-09T15:31:13.4368421Z   163 |   sail_set_abstract_xlen(32);
2025-09-09T15:31:13.4369007Z       |   ^~~~~~~~~~~~~~~~~~~~~~

and

2025-09-09T15:30:54.0570977Z �[91mFailed�[0m:  c++  xlen_val.cpp '/home/runner/work/sail/sail'/lib/*.c -lgmp -I '/home/runner/work/sail/sail'/lib -o xlen_val.bin
2025-09-09T15:30:54.0571931Z �[94mstdout�[0m:
2025-09-09T15:30:54.0572118Z 
2025-09-09T15:30:54.0572314Z �[94mstderr�[0m:
2025-09-09T15:30:54.0572929Z /usr/bin/ld: /tmp/cczwAazQ.o: warning: relocation against `zxlen_val' in read-only section `.text'
2025-09-09T15:30:54.0573481Z /usr/bin/ld: /tmp/cczwAazQ.o: in function `test':
2025-09-09T15:30:54.0573843Z xlen_val.cpp:(.text+0x44): undefined reference to `zxlen_val'
2025-09-09T15:30:54.0574196Z /usr/bin/ld: warning: creating DT_TEXTREL in a PIE
2025-09-09T15:30:54.0574493Z collect2: error: ld returned 1 exit status

@Alasdair
Copy link
Collaborator

Alasdair commented Sep 9, 2025

Also:

2025-09-09T15:31:29.6419333Z �[91mFailed�[0m:  c++  cabbrev.cpp '/home/runner/work/sail/sail'/lib/*.c -lgmp -I '/home/runner/work/sail/sail'/lib -o cabbrev.bin
2025-09-09T15:31:29.6423787Z �[94mstdout�[0m:
2025-09-09T15:31:29.6427940Z 
2025-09-09T15:31:29.6428566Z �[94mstderr�[0m:
2025-09-09T15:31:29.6429041Z In file included from cabbrev.cpp:5:
2025-09-09T15:31:29.6429888Z includes/pair_fn.h:12:1: error: ‘my_pair_in_c’ does not name a type
2025-09-09T15:31:29.6430593Z    12 | my_pair_in_c mk_pair(unit u)
2025-09-09T15:31:29.6431542Z       | ^~~~~~~~~~~~
2025-09-09T15:31:29.6441753Z cabbrev.cpp: In member function ‘unit model::Model::zmain(unit)’:
2025-09-09T15:31:29.6442686Z cabbrev.cpp:58:8: error: ‘mk_pair’ was not declared in this scope
2025-09-09T15:31:29.6443736Z    58 |   zx = mk_pair(UNIT);
2025-09-09T15:31:29.6444212Z       |        ^~~~~~~

which I think is one of the tests that is doing some custom linking with some C code.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 9, 2025

Oh weird, I thought I marked those tests as expected failures. Should it still print the error messages for expected failures?

@Alasdair
Copy link
Collaborator

Alasdair commented Sep 9, 2025

Ah, I think you are right. It's this test config_abstract_clash:

Failed:  ./config_abstract_clash.bin > config_abstract_clash.result 2> config_abstract_clash.err_result
stdout:

stderr:

stderr file:
Segmentation fault (core dumped)

The other failure is to do with the Rocq backend due to the change to test/c/struct.sail. I think the change to that test makes it overall a better test, so we should add that test to the list of skipped tests in test_coq in test/c/run_tests.py.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 10, 2025

Hmm so the segfault was due to the handling of abstract types. I fixed the segfault but it still leaks. The issue is that it used to work like this in C:

sail_int abstract_zxlen;

void sail_set_abstract_xlen(void) {
...
    COPY(sail_int)(&abstract_zxlen, z3zE3);
...
}

And in C++ it changed so abstract_zxlen is a struct member variable.

The segfault was caused because abstract_zxlen is uninitialised and the COPY requires it to be initialised.

In the C version it's fine because it's a global and those are zero-initialised, and a zero-initialised sail_int is apparently valid.

So I fixed that by generating sail_int abstract_zxlen = {}; instead, which mercifully works for C and C++.

However the next issue is that Valgrind now detects that it is never KILL()d. That's true for the C version but I guess it doesn't care about globals leaking since that can't really cause any issues.

The obvious solution is to KILL() the abstract variables in model_fini().... but it feels a bit weird to do that if they aren't CREATE()d in model_init(). But we can't CREATE() them there because they must have already been initialised by sail_set_abstract...() before model_init().

Any ideas?

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 10, 2025

Hold on, actually I can just add a constructor & destructor. I should probably delete the copy constructor too while I'm at it, and I'll change it to class { public: since I think that's more expected.

@Alasdair
Copy link
Collaborator

Looks like it's just the ocamlformat check in the CI now?

(** Global compilation options *)

(** If set generate a C++ struct for the model instead of global C functions/variables. *)
val opt_cpp : bool ref
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these could be an option in the CODEGEN_CONFIG module type? Would avoid adding any mutable variables, as I was trying to move away from these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Yeah that makes sense, I'll have a go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Arg.String (fun prefix -> C_backend.opt_prefix := prefix),
"prefix generated C functions"
);
(Flag.create ~prefix:["c"] "cpp", Arg.Set C_backend.opt_cpp, "generate C++ struct instead of global C functions");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-c --c-cpp seems a bit counterintuitive. We could set this up so it's just --cpp by creating a new target, and the --c-cpp-struct-name would just be --cpp-struct-name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree it is weird. Guess I was just being lazy. I'll have a go at adding a new target.

Copy link
Collaborator

@Alasdair Alasdair Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably easiest way would be to do something like:

let c_target is_cpp out_file { ast; effect_info; env; default_sail_dir; _ } =

and then change the last two lines of sail_plugin_c into:

  Target.register ~name:"c" ~options:c_options ~rewrites:c_rewrites ~supports_abstract_types:true
    ~supports_runtime_config:true (c_target false);
  Target.register ~name:"cpp" ~options:cpp_options ~rewrites:c_rewrites ~supports_abstract_types:true
    ~supports_runtime_config:true (c_target true)

The cpp_options list would just contain the new options for the C++ generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't totally obvious that the --c-foo options also apply to the cpp backend but I guess it will do. I changed some of the help messages from ... C ... to ... C/C++ ... which hopefully makes that clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want we could duplicate options as either --c-xyz or --cpp-xyz when they are shared. I left a comment further down.

Copy link
Contributor

@arichardson arichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 10, 2025

Looks like it's just the ocamlformat check in the CI now?

Huh weird, Valgrind was definitely complaining about the leaks (correctly) when I ran it locally. Adding the constructor/destructor wasn't too hard and it fixes the leaks so I think that's best.

I think a backend that fully embraced C++ would be a lot easier to write - all of the CREATE(), CONVERT_OF() etc. stuff could be handled uniformly by constructors/operator overloading. Though maintaining a whole extra backend would probably cancel all that out. :-/

Also I don't know how you remember to run dune fmt... :-D

@Alasdair
Copy link
Collaborator

I was thinking a while ago that the Codegen module could be parameterised by a second module L providing the functions to generate the various lifecycle functions for each type. There's also OCaml's object system which is rarely used but sometimes exactly what is needed. The abstraction tools exist to generate more idiomatic C++ while still sharing as much as possible it's just finding the time to cleanup and refactor things.

As for dune fmt it's mostly just force of habit, but also the CI reports that one fast whenever I do forget.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 11, 2025

Hmm moving the options seems to have totally changed the codegen in a breaking way. I can't figure it out... any ideas @Alasdair ?

@Alasdair
Copy link
Collaborator

Oh, changing the flag to a separate cpp target means there are a few things in the library that need to change. The $target_set directives here

$target_set emulator_or_isla isla c ocaml interpreter systemverilog

Anything that does $iftarget c ... $endif in a test will also be broken, because that would need to be $iftarget cpp. Probably that's annoying so we should either A) update $iftarget to understand $target_set and add a $target_set c_or_cpp c cpp somewhere in the library, or B) change $iftarget to support a disjunction of targets by default so you can do $iftarget c cpp (this is probably easier).

@Alasdair
Copy link
Collaborator

For the options, you could do something like

let shared_options prefix =
  [
    (Flag.create ~prefix "build", Arg.Set opt_build, "build the generated C output automatically");
  ...

then

let cpp_options = [ ... ] @ shared_options "cpp"

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 11, 2025

Ok I think I fixed $iftarget. There's another issue though I think - extern functions all use val foo : {"c": some_c_function} but now it's looking for "cpp". Any preference on how to fix that? I could just copy & paste them all I guess. I can do it with a regex pretty easily. There are 95 in the Sail repo.

let cpp_options = [ ... ] @ shared_options "cpp"

The slightly odd thing though is it will print --help like this:

Options for targets c, cpp
  -c                            invoke the Sail c target
  --c-build                     build the generated C/C++ output automatically
...
  --c-coverage <file>           Turn on coverage tracking and output information about all branches and functions to a file
  -O                            turn on optimizations for C/C++ compilation
  --Ofixed-int                  assume fixed size integers rather than GMP arbitrary precision integers
  --Ofixed-bits                 assume fixed size bitvectors rather than arbitrary precision bitvectors
  --cpp                         invoke the Sail cpp target
  --cpp-class-name <identifier> C++ class name (default 'Model')
  --cpp-namespace <identifier>  C++ namespace name (default 'model')
  --cpp-derive-from <list>      List of classes/structs to derive the model class from, e.g. 'public foo, private bar' 
  --cpp-build                     build the generated C/C++ output automatically
...
  --cpp-coverage <file>           Turn on coverage tracking and output information about all branches and functions to a file
  -O                            turn on optimizations for C/C++ compilation
  --Ofixed-int                  assume fixed size integers rather than GMP arbitrary precision integers
  --Ofixed-bits                 assume fixed size bitvectors rather than arbitrary precision bitvectors

I was kind of surprised it was smart enough to merge them, but it isn't really what we want in this case. Also the -O flags have their prefix hidden so they'll appear the same.

Also what happens if you do --c-coverage foo --cpp-coverage bar? It definitely won't work properly currently since they both write to the same global.

I think it's probably better just to have the --c- flags affect the --cpp backend for now.

@Alasdair
Copy link
Collaborator

Ok I think I fixed $iftarget. There's another issue though I think - extern functions all use val foo : {"c": some_c_function} but now it's looking for "cpp". Any preference on how to fix that? I could just copy & paste them all I guess. I can do it with a regex pretty easily. There are 95 in the Sail repo.

I think it should be possible to make it so that with --cpp set it falls back to the c externs whenever a separate cpp extern doesn't exist, by modifying the ctx_is_extern and ctx_get_extern functions. On the other hand it might be better to just fix up all the use cases. I think the extern definitions do understand the $target_set directive if you want to avoid duplication. I didn't fully consider these issues when I suggested making --cpp a separate target, but it probably is the right thing to do because it would let us have separate C++ builtins that use C++ features for clarity/performance etc.

@Alasdair
Copy link
Collaborator

Alasdair commented Sep 11, 2025

Also what happens if you do --c-coverage foo --cpp-coverage bar? It definitely won't work properly currently since they both write to the same global.

It would be the same as if you repeated the same flag - the semantics of that depends on how the flags work. I agree it's probably best to just keep it simple for now.

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 11, 2025

Hmm there's still something missing (aside from the failing tests). It used to generate this:


sbits zsub_virtaddr_xlenbits(sbits z3zE442, sbits zoffset)
{
  __label__ fundef_fail_714, fundef_body_715, end_function_716, end_block_exception_717;

  sbits z8zE129;
  sbits zaddr;
  zaddr = z3zE442;
  goto fundef_body_715;
fundef_fail_714: ;
  sail_match_failure("sub_virtaddr_xlenbits");
fundef_body_715: ;
  sbits z2zE128;
  z2zE128 = sub_sbits(zaddr, zoffset);
  z8zE129 = z2zE128;
end_function_716: ;
  return z8zE129;
end_block_exception_717: ;

  return undefined_sbits();
}

But now:


sbits Model::zsub_virtaddr_xlenbits(sbits z3zE442, sbits zoffset)
{
  __label__ fundef_fail_714, fundef_body_715, end_function_716, end_block_exception_717;

  sbits z6zE129;
  sbits zaddr;
  zaddr = z3zE442;
  goto fundef_body_715;
fundef_fail_714: ;
  sail_match_failure("sub_virtaddr_xlenbits");
fundef_body_715: ;
  sbits z2zE128;
  {
    RECREATE(lbits)(&zghz3168);
    CONVERT_OF(lbits, sbits)(&zghz3168, zaddr, true);
    RECREATE(lbits)(&zghz3169);
    CONVERT_OF(lbits, sbits)(&zghz3169, zoffset, true);
    RECREATE(lbits)(&zghz3170);
    sub_vec(&zghz3170, zghz3168, zghz3169);
    z2zE128 = CONVERT_OF(sbits, lbits)(zghz3170, true);
  }
  z6zE129 = z2zE128;
end_function_716: ;
  return z6zE129;
end_block_exception_717: ;

  return undefined_sbits();
}

And there's no definition or declaration of sub_vec(). Hmm.

@Alasdair
Copy link
Collaborator

Is that something that's happening in the RISC-V model? I don't see it in any of the CI failures?

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 11, 2025

Is that something that's happening in the RISC-V model? I don't see it in any of the CI failures?

Yeah exactly, this WIP branch: https://github.com/Timmmm/sail-riscv/tree/user/timh/cpp

@Alasdair
Copy link
Collaborator

Think it might just be that functions in model/prelude.sail need updating: https://github.com/Timmmm/sail-riscv/blob/1dc646dff209a9ed62ad807aa5a782a2b78a366a/model/prelude.sail#L37

I figure I should merge #1446 first then this can be rebased on that?

@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 11, 2025

Oh yeah of course, I forgot about that! That pesky _!

I figure I should merge #1446 first then this can be rebased on that?

Yeah that would be good. I'll make a PR for the commit that removes --static too (and maybe I should also keep that and print a warning like with --c-generate-header?).

@Timmmm Timmmm force-pushed the user/timh/cpp branch 2 times, most recently from 47b067c to ad29423 Compare September 15, 2025 12:55
@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 15, 2025

I've rebased and reorganised things so only the last commit is relevant for this PR. The others are (or will be) separate PRs. I wish Github had proper native support for stacked PRs. :-/

Also I had to make the = {}; initialisation C++-only because it turns out C only supports that syntax in C23.

This adds support for generating C++ code. The model is wrapped in a `struct`, which means you can create multiple instances of it (e.g. to simulate multicore CPUs).

The structure of the C++ code is like this:

```
// Header
namespace model {
<types>
struct Model {
  <function declarations>
  <variable definitions>
};
}

// Impl
namespace model {
<function definitions>
}
```

This is the only way I found to do it without major changes. We can't put the types in `Model` because there's nothing like `using namespace <SomeClass>::*;` in C++.
@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 17, 2025

Ok this is ready now I think!

@Alasdair Alasdair merged commit 6d1c985 into rems-project:sail2 Sep 22, 2025
9 checks passed
@Timmmm
Copy link
Contributor Author

Timmmm commented Sep 22, 2025

Thanks!

@Timmmm Timmmm deleted the user/timh/cpp branch September 23, 2025 09:19
@Timmmm Timmmm restored the user/timh/cpp branch September 23, 2025 10:31
Alasdair added a commit to Alasdair/opam-repository that referenced this pull request Oct 13, 2025
CHANGES:

In addition to bug-fixes and smaller improvements, the following
changes and improvements have been made to the language:

##### Rocq Semantics

The core stepwise semantics of Sail that are currently used by the
interactive REPL and during optimisation passes like constant-folding
are now implemented in Rocq and extracted to OCaml.

##### Improved C++ support

New C++ `--cpp` backend that is a variant of the Sail C backend. It
wraps the generated model in a struct, so when using C++ multiple
instances can be created.
PR: rems-project/sail#1437

As part of the above work, Sail always generates a separate header
file for C or C++, meaning the `--c-generate-header` option is
redundant and deprecated.

The `--c-static` option has been removed.

Small fixes to the parsing of `-` and foreach loops. Slightly breaking
as `from`, `to`, and `downto` are now treated as proper keywords and
cannot be used as identifiers.

##### Other changes

Arguments can now be passed via the `SAIL_FLAGS` environment variable,
or via `SAIL_ENCODED_FLAGS`.
PR: rems-project/sail#1374
@Timmmm Timmmm deleted the user/timh/cpp branch October 30, 2025 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants