Make marshalled Custom_tag objects store their length. by stedolan · Pull Request #1683 · ocaml/ocaml

stedolan · 2018-03-29T12:59:32Z

The format used by the Marshal module does not currently store the length of Custom_tag objects, making correct unmarshalling essentially impossible on the multicore branch. This patch adds the length to the marshal format.

The change is backwards- but not forwards-compatible: old marshalled files continue to parse, but new marshalled files containing custom objects will raise an exception when unmarshalled with old OCaml.

Background

In the struct custom_operations, custom objects provide the following two callbacks:

  void (*serialize)(value v,
                    /*out*/ uintnat * bsize_32 /*size in bytes*/,
                    /*out*/ uintnat * bsize_64 /*size in bytes*/);
  uintnat (*deserialize)(void * dst);

To marshal a custom object, serialize writes some bytes using e.g. caml_serialize_int_4 (and the other functions from intext.h), and returns two values in the bsize_32 and bsize_64 pointers: the number of bytes of memory that will be needed to store the unmarshalled value on 32- and 64-bit platforms respectively.

To unmarshal a custom object, deserialize takes a pointer to a piece of memory that is large enough to contain the unmarshalled object (that is, at least bsize_32 or bsize_64 bytes long, depending on word size), and fills it with the unmarshalled value using e.g. caml_deserialize_int_4 and friends. deserialize returns the number of bytes of dst that it has filled in, which should be equal to the value of bsize_32 or bsize_64 (again, depending on word size) that was returned by serialize.

Current marshal format

However, the marshal format does not store bsize_32 and bsize_64, so how does the unmarshaller know how large a buffer to pass to deserialize?

The marshal format's header stores the total number of bytes necessary to unmarshal the entire file. The unmarshaller allocates space for the entire data structure in a single contiguous block, and fills it in sequentially. So, the buffer passed to deserialize is the entire rest of this block, large enough not only for the custom object but for all subsequent values as well. The unmarshaller has no way of knowing in advance how many bytes of this block deserialize will consume, which is why deserialize must return this count so that the unmarshaller can advance its pointer.

This is heavily tied to the current heap layout and memory allocator: it relies on being able to store arbitrarily many objects in a single contiguous block. Currently, the multicore branch does not use this object layout, and needs to know how many bytes to allocate for each custom block encountered during unmarshalling, making this trick unworkable.

Change to the format

This patch adds a new code CODE_CUSTOM_LEN to the marshal format, which differs from the existing CODE_CUSTOM by including the 32-bit and 64-bit size fields. The marshaller (extern.c) now only produces CODE_CUSTOM_LEN, but the unmarshaller still understands CODE_CUSTOM. This means that old marshalled files continue to work, but new ones will not work with old programs.

This makes marshalled custom objects 12 bytes longer (4 byte bsize_32 + 8 byte bsize_64). More complicated and compact schemes are possible, but this is simplest.

This means that the unmarshaller now knows how many bytes of memory a custom deserialiser requires before calling the deserialiser, which is a requirement for the multicore branch. That branch will not support unmarshalling CODE_CUSTOM, so I'd like to get patch merged soon to make transition easier.

Currently, the only use that the unmarshaller makes of the new length fields is error-checking: with this patch, length inconsistencies between custom serialize and deserialize functions will be detected. There is now a test in tests/lib-marshal testing this.

stedolan · 2018-03-29T14:26:56Z

I think the build failure on MSVC64 is an old, minor bug in nativeint_deserialize being uncovered by the new error checking.

I've created #1684 to fix that bug, which I think should make this build pass. (I don't have a Windows machine handy to test on, so I'll just wait for the CI to run again).

stedolan · 2018-04-03T13:41:29Z

The build was indeed fixed by #1684, so I think this is ready to merge.

gasche · 2018-04-03T13:47:10Z

cc @xavierleroy: this looks like something you would be interested in looking at

xavierleroy · 2018-04-03T17:29:57Z

I haven't looked at the code yet. I understand the need but the increase in size (of marshaled data) makes me a bit nervous. Perhaps we could use a variable-length encoding (think UTF-8) for the two sizes.

Take bignums and bigarrays as examples: these are variable-length custom blocks, but the info needed to determine the size come first in the marshaled byte stream, followed by the actual data. So, this gets me thinking: what about a different demarshaling API with one function that determines the size (reading as much of the input data as needed to do so), so that the custom block can be allocated, then another function fills it (reading the rest of the input data).

Symmetrically, if the size of the custom block is always written, it could be passed to the demarshaler function, and may not need to be stored again in the encoding of the custom block. This could work for bignums and one-dimension bigarrays, but not multiple-dimension bigarrays.

xavierleroy · 2018-04-03T17:32:54Z

PS. Variable-length encodings need not be complicated. Here is an encoding that is often used in smart card protocols and is even simpler than UTF-8:

1xxxxxxx 1yyyyyyy ... 0zzzzzzz

i.e. 7 bits per byte, top bit 1 means "there is more to come", top bit 0 means "end of number".

stedolan · 2018-04-03T18:48:16Z

I did consider using variable-length sizes. Coincidentally, I was even planning to use the format you suggest! (which I know under the name "varint") The major annoyance is the serialise callback, which begins writing bytes and only tells the marshaller the length afterwards. This means that if a variable-length format is used, the serialised data must be shifted into the correct place afterwards, since the offset at which the data should be written is not known until the length is determined. I don't think this extra copy would be unworkably slow, but it was sufficiently annoying that I went for a fixed-size format instead.

On the other hand, I don't think the size change matters much. Here are some size numbers for boot/ocamlc, in the new and old marshal formats, with and without compression:

`boot/ocamlc`	raw	gzip	lzma
old	2231.2 K	475.9 K	310.1 K
new	2231.8 K	476.0 K	310.2 K

The most pathological case I could think of was a long int32 array, so I also tested with Array.init 1000 (fun x -> Int32.of_int x):

`int32 array`	raw	gzip	lzma
old	7.8 K	1.85 K	0.66 K
new	19.6 K	2.34 K	0.72 K

So, I don't think the size difference will usually be significant, and in any case using a standard compressor makes far more difference than fiddling with the marshal format.

I agree that the problems could be solved by changing the custom marshalling API instead of changing the marshal format, but I think changing the API would cause more pain than changing the format. Apart from the bootstrap itself, I haven't come across much long-lived data using the Marshal format, but there are quite a few custom serialisers / deserialisers hanging around.

stedolan · 2018-04-13T14:59:34Z

I'm afraid I've lost track of who's waiting for who. Is this ready-to-merge, or waiting for me to implement variable-length-sizes + shift data afterwards? (I'm happy with either option)

xavierleroy · 2018-04-13T16:16:03Z

I'm still pondering the options and I don't have a strong opinion. Other opinions most welcome. At any rate this won't make it int 4.07 so we have some more time to converge on a design.

The boot/ocamlc measurements are not really conclusive because few custom blocks (e.g. int32, int64, bigarrays) are in the static data of this application.

stedolan · 2018-04-13T16:47:36Z

OK. I also don't have a strong opinion. I picked one option for the sake of making a PR, but I'm not particularly attached to it. I don't think any of the options are very much work to implement.

I agree that boot/ocamlc is not a good example of using custom blocks. The point I was trying to make with the measurements is that (a) standard compressors are very good at compressing Marshal files, so anyone size-sensitive should be using them, and (b) after applying a standard compressor, this patch makes the pathological case of int32 array only 10% bigger (25% for gzip).

Adds a new marshal code CODE_CUSTOM_LEN, which stores the lengths (for both 32- and 64-bit) of each marshalled custom block. The marshaller only uses CODE_CUSTOM_LEN, but the unmarshaller still understands CODE_CUSTOM (So, old files still unmarshal, but new ones don't work on old versions). This allows the unmarshaller to know exactly how much memory is needed before calling a custom `deserialize` function.

stedolan · 2018-07-03T12:02:58Z

Following @damiendoligez's suggestion, I added CODE_CUSTOM_FIXED as an optimisation for fixed-length custom blocks. Now int32, int64, and nativeint consume the same number of bytes as before, but lengths are still available to the deserialiser.

I added some error checking in serialisation: I chose to trigger a fatal error during serialization if fixed lengths are specified but serialize produces a different length, and to raise an exception during deserialization if fixed lengths are expected but unavailable. (The idea is that bad input can be handled with an exception, but finding bad C code in the current process is not something that should be recovered from).

I also fixed a typo (wsize -> bsize for a size in bytes) in the manual while adding docs for the new field.

xavierleroy

Looks good to me overall. A few comments below, but nothing that requires a change.

xavierleroy · 2018-07-04T12:33:55Z

runtime/caml/custom.h

  uintnat (*deserialize)(void * dst);
  int (*compare_ext)(value v1, value v2);
+  const struct custom_fixed_length* fixed_length;
 };


I wasn't expecting a pointer to struct custom_fixed_length, just a flat sub-structure.

That was the original plan, but it got lost in a maze of incompatible compilers.

The issue is supporting custom_fixed_length_default, since the custom_foo_default macros are used both in initialiser lists and as rvalues in assignments. As a flat sub-structure, custom_fixed_length_default could be:

{0, 0} - an initialiser list, not valid as an rvalue

(struct custom_fixed_length){0, 0} - a structure literal, valid C99 but rejected by MSVC

empty_fixed_length (some predefined static const object) - when used as an initialiser, deemed "non-constant" and rejected by C99 compilers (although valid C++, so MSVC might be OK with this)

Making it be a pointer and defining custom_fixed_length_default as NULL seemed the easiest way out. If you can think of a portable alternative, I'm happy to change this!

You're right, there is no nice solution in C90. A not very nice solution:

# define custom_fixed_length_default {0, 0}

for use in initializers (the most common use). For assigning to a lvalue e, you could do

{ struct custom_fixed_length cfl = custom_fixed_length_default; e = cfl; }

Is it the case that caml_finalize is the only place where we need an assignment to an expression of type struct custom_fixed_length? If yes there might be other ways to write this function.

I think caml_finalize is indeed the only occurrence of custom_fixed_length_default in an assignment, and changing it isn't hard. I was more worried about external users of the API, who might be used to using custom_foo_default in both assignments and initialisers.

xavierleroy · 2018-07-04T12:35:31Z

runtime/extern.c

+        size_header = extern_ptr;
+        extern_ptr += 12;
+        Custom_ops_val(v)->serialize(v, &sz_32, &sz_64);
+        /* Store length before serialized block */


You could use the serialize variable rather than fetch from Custom_ops_val(v) again. CSE in the C compiler might not be clever enough to reuse it for you.

xavierleroy · 2018-07-04T12:35:46Z

runtime/extern.c

+        write(CODE_CUSTOM_FIXED);
+        writeblock(ident, strlen(ident) + 1);
+        Custom_ops_val(v)->serialize(v, &sz_32, &sz_64);
+        if (sz_32 != fixed_length->bsize_32 ||


Same comment re: reuse of serialize.

xavierleroy · 2018-07-04T12:36:33Z

runtime/extern.c

+        Custom_ops_val(v)->serialize(v, &sz_32, &sz_64);
+        if (sz_32 != fixed_length->bsize_32 ||
+            sz_64 != fixed_length->bsize_64)
+          caml_fatal_error("output_value: incorrect fixed sizes specified by %s",


I'm tempted to put the check in a CAMLassert, so that it is performed by the debug runtime but not by the normal runtime.

It uses caml_fatal_error so that it can specify the name of the offending serialiser (an assertion failure is harder to debug, and looks like a bug in the runtime rather than the user's custom serialiser)

xavierleroy · 2018-07-04T12:37:44Z

runtime/intern.c

+          if (size != expected_size) {
+            intern_cleanup();
+            caml_failwith("input_value: incorrect length of serialized custom block");
+          }


But here I agree an exception is better than an assertion.

damiendoligez

Looks good. I don't have a strong opinion on the "substructure vs pointer" debate.

damiendoligez · 2018-07-06T09:01:56Z

runtime/extern.c

-      write(CODE_CUSTOM);
-      writeblock(ident, strlen(ident) + 1);
-      Custom_ops_val(v)->serialize(v, &sz_32, &sz_64);
+      if (!fixed_length) {


Please write if (fixed_length == NULL), if only to be consistent with the previous code (2 lines up).

stedolan · 2018-07-10T11:52:09Z

Fixed line length to keep check-typo happy.

stedolan · 2018-08-21T14:43:41Z

I didn't notice that a bootstrap had made this unmergeable! Rebased now.

As a space optimisation, custom serializers that always consume the same number of bytes need not send the lengths explicitly.

stedolan · 2018-08-22T09:34:23Z

I accidentally committed a bad bootstrap in yesterday's rebase. Should be fixed now.

stedolan · 2018-09-13T11:44:26Z

I think this is ready to merge!

damiendoligez · 2018-09-14T08:58:00Z

Appveyor's failure is an unrelated timeout. Merging.

stedolan · 2018-09-14T09:03:51Z

Thanks!

Multiway match typechecking and translation for pattern guards (ocaml#2) * multiway typechecking and translation * update jane test output * self-review: format and style in translcore * more translcore/typedtree cleanup * expose value `is_guarded_rhs` * fix typedtree printer * make discussed changes to ocamldoc * format: remove unnecessary parens in pattern * improve parmatch variable naming * explain [exp_attributes] and [exp_extra] weirdness * improve translcore [event_function*] naming * inlined transl_body in transl_rhs * rename pats_exp... to use "rhs" naming * added test for guarded value/exception or-patterns * address ocamldoc CRs --------- Co-authored-by: Nick Roberts <[email protected]>

Co-authored-by: Etienne Millon <[email protected]> Co-authored-by: Christine Rose <[email protected]>

stedolan force-pushed the marshal-custom-length branch from f8521a9 to 4d25bec Compare March 29, 2018 13:04

damiendoligez added high-priority multicore-prerequisite labels Mar 29, 2018

stedolan mentioned this pull request Mar 29, 2018

Fix size computation in nativeint_deserialize on MSVC64. #1684

Merged

stedolan force-pushed the marshal-custom-length branch from 4d25bec to 9e2e626 Compare April 3, 2018 08:17

dra27 added this to the consider-for-4.07 milestone Apr 5, 2018

mshinwell removed this from the consider-for-4.07 milestone Apr 9, 2018

kayceesrk mentioned this pull request May 8, 2018

Marshalling and unmarshalling ocaml-multicore/ocaml-multicore#75

Closed

stedolan force-pushed the marshal-custom-length branch from 9e2e626 to 0d3c193 Compare July 3, 2018 11:58

hhugo mentioned this pull request Jul 3, 2018

Update the marshal runtime once ocaml#1683 is merged ocsigen/js_of_ocaml#680

Closed

damiendoligez requested review from damiendoligez and xavierleroy July 4, 2018 09:49

xavierleroy reviewed Jul 4, 2018

View reviewed changes

stedolan force-pushed the marshal-custom-length branch from 0d3c193 to 1f5c010 Compare July 4, 2018 13:41

damiendoligez approved these changes Jul 6, 2018

View reviewed changes

stedolan force-pushed the marshal-custom-length branch from 1f5c010 to 614bb14 Compare July 6, 2018 14:40

stedolan force-pushed the marshal-custom-length branch from 614bb14 to 0888b63 Compare July 10, 2018 11:52

damiendoligez self-assigned this Jul 11, 2018

stedolan force-pushed the marshal-custom-length branch from 0888b63 to ff0b18f Compare August 21, 2018 14:43

Add CODE_CUSTOM_FIXED for fixed-size custom serializers.

e9b04bb

As a space optimisation, custom serializers that always consume the same number of bytes need not send the lengths explicitly.

stedolan force-pushed the marshal-custom-length branch from ff0b18f to e9b04bb Compare August 22, 2018 09:33

damiendoligez merged commit 2fd0186 into ocaml:trunk Sep 14, 2018

stedolan mentioned this pull request Oct 3, 2018

Custom object intern fixes ocaml-multicore/ocaml-multicore#224

Merged

dhil mentioned this pull request Nov 6, 2018

Compatibility with OCaml trunk (4.08.0+dev0) ocaml/Zarith#41

Merged

Octachron mentioned this pull request Dec 9, 2019

Add "since 4.08.0" for custom_fixed_length in the manual #9172

Merged

stedolan mentioned this pull request Dec 10, 2019

Move a wayward Changes entry #9174

Merged

gasche mentioned this pull request Feb 22, 2020

Performance regression in OCaml 4.08 #9326

Closed

proux01 mentioned this pull request Mar 11, 2020

Fix coqchk for primitive integers on 32bit arch with OCaml >= 4.08 (#11624) rocq-prover/rocq#11805

Merged

gasche added a commit to gasche/ocaml that referenced this pull request Aug 11, 2020

Changes: the ocaml#1683 entry was misplaced

985b8ea

gasche mentioned this pull request Aug 30, 2022

Improve error message when unmarshalling using old custom block format #11223

Merged

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Add an announcement for Dune 3.11.1 (ocaml#1683)

ec95968

Co-authored-by: Etienne Millon <[email protected]> Co-authored-by: Christine Rose <[email protected]>

Conversation

stedolan commented Mar 29, 2018

Background

Current marshal format

Change to the format

Uh oh!

stedolan commented Mar 29, 2018

Uh oh!

stedolan commented Apr 3, 2018

Uh oh!

gasche commented Apr 3, 2018

Uh oh!

xavierleroy commented Apr 3, 2018

Uh oh!

xavierleroy commented Apr 3, 2018

Uh oh!

stedolan commented Apr 3, 2018

Uh oh!

stedolan commented Apr 13, 2018

Uh oh!

xavierleroy commented Apr 13, 2018

Uh oh!

stedolan commented Apr 13, 2018

Uh oh!

stedolan commented Jul 3, 2018

Uh oh!

xavierleroy left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stedolan Jul 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

damiendoligez left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stedolan commented Jul 10, 2018

Uh oh!

stedolan commented Aug 21, 2018

Uh oh!

stedolan commented Aug 22, 2018

Uh oh!

stedolan commented Sep 13, 2018

Uh oh!

damiendoligez commented Sep 14, 2018

Uh oh!

stedolan commented Sep 14, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

stedolan Jul 4, 2018 •

edited

Loading