Conversation
stdlib/ephemeron.mli
Outdated
There was a problem hiding this comment.
thx, the fix will be autosquashed
|
About the slowdown. In the patch of 2011, ephemerons used a specific tag and so codes that didn't use ephemerons where not impacted. Here weak pointers are implemented using ephemerons (one additional word) and none of them use a specific tag. I tried to reduce the slowdown but any attempt had the opposite effect. The only way I can think of is to separate in two different lists ephemerons and weak pointers. Currently you can go from one to the other, a weak pointer is just an ephemeron with an empty data. |
|
I had wished for inclusion of ephemerons for a long time, and it's great to see some progress in this direction! For users who don't care about the seed property, would it make sense to expose "Make" functors (with hash: t -> int), assuming this can be done without too much code bloat? I'm also slightly reluctant to keeping Ephemeron.Obj in the same unit as Ephemeron. What about moving it to a submode of Obj (hence: Obj.Ephemeron), or creating a different unit? The documentation of Ephemeron.Obj.create should explain the meaning of the argument (number of keys). |
stdlib/ephemeron.mli
Outdated
There was a problem hiding this comment.
minor typo: "This function as" -> "This function has"
|
Those are only documentation comments:
Other than that:
|
stdlib/hashtbl.mli
Outdated
There was a problem hiding this comment.
Shouldn't that rather be is_randomized ?
There was a problem hiding this comment.
Usually I use get_A, set_A, is_A. But perhaps here is_randomized is better, if someone else agree I will change that.
There was a problem hiding this comment.
is_randomized is a better name in this case.
There was a problem hiding this comment.
ok, fixed in bobot/ocaml@46d1105 and bobot/ocaml@cb6ff07
|
The last push took, I hope, into account the majority of the current remarks:
@dbuenzli doc:
@dbuenzli code:
|
|
I added a note about marshalling in bobot/ocaml@ec11031 |
|
Where does the restriction about marshaling come from? |
|
Weak pointers already have this restriction.I think the problems are:
But if you are guided by the type and you keep the sharing there is no problem, I think. Another point I think we can lift the restriction about integer in Weak and Ephemeron. We can say that an integer value is always alive. That could remove some complication in the case of |
|
|
|
About integers, the first version of Weak allowed integers as always-live values but then people complained that the integers they put in their weak hash sets would never disappear, so I removed them. I don't think it's a good idea to add them. |
|
I just added commit dc49f2e to remove code duplication between caml_ref_table and caml_ephe_ref_table. Needs to be reviewed. |
|
Since |
Move examples to the test suite
Reset realloc'ed stack to avoid useless scanning.
There was a problem hiding this comment.
In stats_alive: Why is the number of bindings in bucket b counted via the bucket_length function?
Is the function expected to compute the number of bindings that are still alive? In that case, should the call to bucket_length be replaced by a call of bucket_length_alive?
I expect this would produce the actual number of bindings that are still alive within the bucket?
There was a problem hiding this comment.
Well spotted. In fact the new warnings used during the compilation of the trunk also found it. Fixed.
a6883e0 to
bc77353
Compare
|
@damiendoligez The branch is rebased on trunk. There is no big differences compared to before, more comments, I hope better naming. Just c038923, which factorize the short-circuiting of Forward tag, have been heavily modified. We discussed it before, but I think we forget that there are different use of it. I used flags for selecting the different mode. What do you think of it? I need to review #297 before rebasing above it. Some general remarks:
|
|
I only spent two minutes looking at this patch, but I have two comments:
|
|
@mshinwell For point 1. I have not read the assembly code obtained to see if the refactoring is inlined. However in the performance test I've done (in the header of this MR) there is no regression for the first commits (0,1,2). If people prefer I can try to force inlining. For point 2. You are completely right, I do the substitution By mail I received some comments for the documentation of the ephemerons by William (Romait Troit) that I will take into account shortly. |
byterun/major_gc.c
Outdated
|
The MR #515 add the missing changes and modification of the documentation. (It target 4.03 but is tagged 4.04 because it contained other more invasive changes) |
|
Hmm. Whilst that change does break the current code, I cannot see how it would cause the specific test failure. Has anything else changed about when weak pointers are cleared? It seems like a single |
|
If there is a finalizer attached to it you need indeed two rounds. In the first the value is marked as to be finalized and alive ( since the finalizer will use the value). In the second it is marked dead and so can be removed from the weak pointer and swept. Previously you needed only one round for the value to be removed from the weak pointer, but two rounds for the value to be swept. Since for fixing the semantic of weak pointer the value is removed from weak pointer only when dead so swept, it is done in the second round. |
|
I have to say that this change to the ordering between finalisers and weak references is quite annoying. It is not now possible to get something to run after a weak has been emptied, which makes it hard to tidy up the data structure (e.g. hashtable) which contains the now empty weak reference. I considered trying to replace the weak reference with an ephemeron pointing to a fresh block, and then place the finaliser on that block instead of on the target of the weak reference -- but looking through the patch it is clear that the finaliser may still run before we have entered the clean phase which means the ephemeron will not yet be cleared. I appreciate the reasons for changing the order, but it would have been good to simultaneously add support for the other kind of finalisers (i.e. the ones which do not receive their target as an argument and so can safely be run after weak has been cleared) since it is now quite difficult to handle clean up of data structures containing weak pointers. |
|
As a side note, whilst looking around the patch, it seemed to me that line 479 of major_gc.c: ephes_to_check = ephes_checked_if_pure;(the one just after calling
Either way it looks like it will tend to scan the ephemeron list one more time than is necessary. Simply removing that line would fix it. |
I want to keep this problem on the radar. Could you open a Mantis PR? |
|
@damiendoligez Doesn't MPR 7210 is enough for that ? @lpw25 Do you tried to use
I agree. It will be too late for 4.03 but for 4.04 I'm willing to propose patch to add the support if it is seen as needed in MPR 7210.
It is even more sure than not, if I understand what you do correctly. The ephemeron is unset in the same gc round than the key is swept, so at least one round after the "finalization" of the key.
It is an interesting remark. I agree that it is unnecessary. I will do the GPR for 4.04. |
Yes it is, thanks. |
Always update budget_left in major_collection_slice
* Add intrinsics for ext_pointer and native_pointer
* Add two_args
* remove XCR
* Add intrinisics for storing and loading int64,int32,nativeint
Fix up names
* Remove "unsafe" from the name of one of the intrinsics
* Address review comments
Use Word_int instead of Word_val for {load,store}_immediate intrinsics
Co-authored-by: Mark Shinwell <[email protected]>
ce88833 Merge flambda-backend changes b7506bb Revert "Cherry-pick of ocaml/ocaml 1eeb0e7 (ocaml#12)" 183f688 Add config option to enable/disable stack allocation (ocaml#22) ee7c849 If both the type and mode of an ident are wrong, complain about the type. (ocaml#19) 44bade0 Allow submoding during module inclusion checks (ocaml#21) de3bec9 Add subtyping between arrows of related modes (ocaml#20) 93d8615 Enable the local keywords even when the local extension is off (ocaml#18) 81dd85e Documentation for local allocations b05519f Fix a GC bug in local stack scanning (ocaml#17) 9f879de Fix __FUNCTION__ (ocaml#15) a78975e Optimise "include struct ... end" in more cases (ocaml#11134) b819c66 Cherry-pick of ocaml/ocaml 1eeb0e7 (ocaml#12) bb363d4 Optimise the allocation of optional arguments (ocaml#11) git-subtree-dir: ocaml git-subtree-split: ce88833
* Add config option to enable/disable stack allocation The local extension can now be turned on independently of whether anything actually gets allocated on the stack.
* also start using unicode strings and workaround for next/link not working with react child components
Ephemerons
The current OCaml runtime proposes two kinds of pointers:
Apoints hardly toB: ifAis alive,Bis alive (usual pointers)Apoints weakly toB: when the GC decide thatBstops to be alive,Astops to point toBWeak pointers are very handy when you want to be able to access a data without creating memory leaks. For example hashconsing can be done using the weaksets defined in
Weak.Make. However you can't create general weaktable with the current runtime: associative table that keep a data until the key is not needed anymore. The problematic case is when the data points to the key(cf 10).Barry Hayes proposed in 1997 (cf 9) the notion of ephemerons which allow to code such table. In its simplest form an ephemeron
Awith keyKand dataDkeepsDalive whileAandKare alive. To sum up ephemerons add the possibility to express conjunctions where you normally have only disjunctions.Request for Comment and Merge-Request
This merge-request adds ephemerons to ocaml with particularly a modification of the runtime, 4, and a new module in the standard library, 6. An untyped API with an arbitrary number of keys is given in
Ephemerons.Objand two specialized API for one and two keys are provided inEphemerons.K1andEphemerons.K2. Each of these three APIs define a functor for constructing weaktables with the same signature thanHashtbl.Before an actual review of the C or OCaml implementation parts, comments about the OCaml interface are more than welcomed.
The modifications are splitted as follows:
The code have been evaluated on Why3, "don't use" correspond to why3 commit 11 and "use" to why3 commit 12 that replace the half-satisfying weak-table solution by weak-table provided by Ephemeron.
We can see that the slowdown for a code that heavily use weak pointers is of 1-2%, and the speedup when using weak-pointers (weaksets) and weaktable is of 8% without counting the fact that the weaktable is "perfect" and simple to use.
nb: the branch is not directly mergeable because of
boot/ocaml..., but at the end of the review process I will rebase the branch and doing the bootstrap.How to reproduce the bench
clone why3,
configure --enable-local,bin/why3config --detect. Add this section towhy3.conf:Bench with the command: