Skip to content

Add a primitive to compute the "transitive" heap size of a value#560

Merged
alainfrisch merged 3 commits intoocaml:trunkfrom
alainfrisch:objsize
May 2, 2016
Merged

Add a primitive to compute the "transitive" heap size of a value#560
alainfrisch merged 3 commits intoocaml:trunkfrom
alainfrisch:objsize

Conversation

@alainfrisch
Copy link
Contributor

@alainfrisch alainfrisch commented Apr 25, 2016

It is sometimes useful to know how "big" values are. This can serve as a debugging tool (to track the cause of excessive memory usage), but also in production to adjust cache heuristics (e.g. eject "big-but-cheap-to-compute" values sooner). It can also be useful to help people get a good feeling of how to design compact data structures and how not to break sharing of inner nodes.

So I propose to add such a function to the Obj module (to make it a bit scary to use). It computes the total size (in words) of blocks accessible from a given value. Statically allocated blocks are excluded (this can also serve to implement non-regression tests that check that a value is statically allocated).

The implementation was inspired by the technique used in extern.c: when a block is first traversed, its color is turned to "blue". We maintain a queue of pointers which serves as a worklist (blocks yet to inspect) and also to restore original colors. One word of internal storage is required per visited block. (I had a first implementation reusing data structures from extern.c, but they use at least two words per block.)

If this proposal is well received, I'll write some unit tests. [EDIT: done]

@dbuenzli
Copy link
Contributor

MPR #6820

@alainfrisch
Copy link
Contributor Author

Any comment on the proposal? Is Obj the appropriate module to put this function? I'd also appreciate if another pair of eyes could review the code; it's easy to make a mistake...

@dbuenzli
Copy link
Contributor

Is Obj the appropriate module to put this function?

I think it's better than the Gc module proposed in MPR #6820.

@mshinwell
Copy link
Contributor

I only had a very quick look (and am jetlagged) but it looks like caml_stat_alloc should probably be used instead of malloc, no?

I'm a bit torn about Gc versus Obj. In some sense, if this is going to be a safe operation, there's no need for it to be in Obj. However it doesn't seem to quite fit into Gc either.

@alainfrisch
Copy link
Contributor Author

I only had a very quick look (and am jetlagged) but it looks like caml_stat_alloc should probably be used instead of malloc, no?

I got inspired by extern.c which uses malloc. A quick grep-fight in byterun gives an inconclusive:

$ grep malloc *.c  | wc -l
34

$ grep caml_stat_alloc *.c  | wc -l
32

Does anyone know how to choose?

In some sense, if this is going to be a safe operation,

Obj.size is also safe (I think). But such function are very implementation dependent. Typically, it's not clear than javascript backend will support reachable_words (with a close-enough semantics -- I'm thinking about their ability to look into closures). Perhaps Sys?

byterun/obj.c Outdated
The real color is stored in the last 2 bits of the pointer in the
queue. (Same technique as in extern.c.)

The queue is stored as a linked list of blocks. Initially,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is confusing because there are two kinds of blocks.

@damiendoligez
Copy link
Member

I think Obj is better than Sys.

@alainfrisch
Copy link
Contributor Author

@damiendoligez Can you comment about malloc vs caml_stat_alloc?

@alainfrisch alainfrisch force-pushed the objsize branch 2 times, most recently from 2dc899f to 8f6556b Compare May 2, 2016 15:52
@alainfrisch alainfrisch merged commit bd91a79 into ocaml:trunk May 2, 2016
@lefessan
Copy link
Contributor

lefessan commented May 2, 2016

I am really surprised by this merge. I was thinking that we were waiting for Damien's feedback, and that the technical discussion was not over, as I think the primitive you described is far from being as useful as you said, in its current state (at least, for a cache or whatever, I would expect the primitive to come with a threshold argument, that would stop the traversal if the size becomes greater than the threshold).
@alainfrisch @damiendoligez

@alainfrisch
Copy link
Contributor Author

Damien posted some (minor) comments, which I addressed. I guess that he would have said so if he was against the proposal.

The threshold is indeed a good idea for use in caches, but the current version is already very useful for debugging purposes.

@DemiMarie
Copy link
Contributor

@alainfrisch I think that one big difference is that if the runtime becomes unloadable, then caml_stat_alloc will use a memory pool to ensure all allocations are freed.

@DemiMarie
Copy link
Contributor

CI fails with flambda.

@damiendoligez
Copy link
Member

@drbo This means we should be using caml_stat_alloc everywhere instead of malloc, right?

@alainfrisch minor comments don't really mean that I approve of everything else, especially when I don't add the "approved" label...

@DemiMarie
Copy link
Contributor

@damiendoligez I would imagine so (not part of the core team).

@yakobowski
Copy link
Contributor

FWIW, we (the Frama-C team) have been using @backtracking's size library for a long time. This is more or less what is proposed here. Our experience is that more functions are needed very rapidly. In particular, we often need to compute "the size of this value except what is also reachable from that value".

@alainfrisch
Copy link
Contributor Author

In particular, we often need to compute "the size of this value except what is also reachable from that value".

This would be quite easy to implement, but costly if "that" value itself is big. A variant is to "cut" only one (or several) specific values (but not blocks accessible from them).

@xavierleroy
Copy link
Contributor

This means we should be using caml_stat_alloc everywhere instead of malloc, right?

Not always. Only if you're happy with caml_stat_alloc aborting the current function if it runs out of memory. However, this can lead to memory leaks. Consider two out-of-heap allocations in sequence:

  char * buf1 = caml_stat_alloc(size1);
  char * buf2 = caml_stat_alloc(size2);

If the second allocation runs out of memory (but not the first), an exception is raised and the space for buf1 will never be released. In such cases, it is better to write

  char * buf1 = malloc(size1);
  if (buf1 == NULL) caml_raise_out_of_memory();
  char * buf2 = malloc(size2);
  if (buf2 == NULL) { free(buf1); caml_raise_out_of_memory(); }

@damiendoligez
Copy link
Member

OK, so don't use caml_stat_alloc as a replacement for malloc. We'll have to review all uses of malloc to implement the unloadable runtime anyway.

stedolan added a commit to stedolan/ocaml that referenced this pull request May 24, 2022
64235a3 flambda-backend: Change Float.nan from sNaN to qNaN (ocaml#466)
14a8e27 flambda-backend: Track GC work for all managed bigarray allocations (upstream 11022) (ocaml#569)
c3cda96 flambda-backend: Add two new methods to targetint for dwarf (ocaml#560)
e6f1fed flambda-backend: Handle arithmetic overflow in select_addr (ocaml#570)
dab7209 flambda-backend: Add Target_system to ocaml/utils (ocaml#542)
82d5044 flambda-backend: Enhance numbers.ml with more primitive types (ocaml#544)
216be99 flambda-backend: Fix flambda_o3 and flambda_oclassic attributes (ocaml#536)
4b56e07 flambda-backend: Test naked pointer root handling (ocaml#550)
40d69ce flambda-backend: Stop local function optimisation from moving code into function bodies; opaque_identity fixes for class compilation (ocaml#537)
f08ae58 flambda-backend: Implemented inlining history and use it inside inlining reports (ocaml#365)
ac496bf flambda-backend: Disable the local keyword in typing (ocaml#540)
7d46712 flambda-backend: Bugfix for Typedtree generation of arrow types (ocaml#539)
61a7b47 flambda-backend: Insert missing page table check in roots_nat.c (ocaml#541)
323bd36 flambda-backend: Compiler error when -disable-all-extensions and -extension are used (ocaml#534)
d8956b0 flambda-backend: Persistent environment and reproducibility (ocaml#533)
4a0c89f flambda-backend: Revert "Revert bswap PRs (480 and 482)" (ocaml#506)
7803705 flambda-backend: Cause a C warning when CAMLreturn is missing in C stubs. (ocaml#376)
6199db5 flambda-backend: Improve unboxing during cmm for Flambda (ocaml#295)
96b9e1b flambda-backend: Print diagnostics at runtime for Invalid (ocaml#530)
42ab88e flambda-backend: Disable bytecode compilers in ocamltest (ocaml#504)
58c72d5 flambda-backend: Backport ocaml#10595 from upstream/trunk (ocaml#471)
1010539 flambda-backend: Use C++ name mangling convention (ocaml#483)
81881bb flambda-backend: Local allocation test no longer relies on lifting (ocaml#525)
f5c4719 flambda-backend: Fix an assertion in Closure that breaks probes (ocaml#505)
c2cf2b2 flambda-backend: Add some missing command line arguments to ocamlnat (ocaml#499)

git-subtree-dir: ocaml
git-subtree-split: 64235a3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants