Make type-checking of "open" constant-time#834
Conversation
30d4a4d to
ef0d736
Compare
|
Timings for ocamlc.opt, compiling: module A = struct
let x1 = 1
...
let xn = n
end
let x = A.(x1)
...
let x = A.(xn)Results:
|
|
And a worst-case scenario, with many lookups that need to traverse a lot of "open" (here, 40): module A = struct end
let x = 42
open A (* 1 *)
....
open A (* 40 *)
let y1 = x
...
let yn = xTimings (seconds):
With only 10 opens:
|
|
Code typical of let r1 = Pervasives.(ref, succ)
....
let rn = Pervasives.(ref, succ)
|
|
An even worst-case scenario: let x = 42
open List (* 1 *)
....
open List (* n *)
let y1 = x
...
let y40000 = xHere one needs to look for
I've tried to memoize the lookup results in the environment tables (so that further lookups on the same name don't need to traverse all the chain), and this reduces the overhead by about 1/2. Considering how small the overhead is for typical number of nested opens, I don't believe this is worth the extra complexity. |
|
I don't understand the problem with the testsuite (tests/typing-poly/poly.ml). This is an "expect" test. The current source is: module M
: sig val f : (<m : 'b. 'b * ('b * <m:'c. 'c * 'bar> as 'bar)>) -> unit end
= struct let f (x : <m : 'a. 'a * ('a * 'foo)> as 'foo) = () end;;
module M
: sig type t = <m : 'b. 'b * ('b * <m:'c. 'c * 'bar> as 'bar)> end
= struct type t = <m : 'a. 'a * ('a * 'foo)> as 'foo end;;
[%%expect {|
Line _, characters 2-64:
Error: Signature mismatch:
Modules do not match:
sig val f : (< m : 'a. 'a * ('a * 'b) > as 'b) -> unit end
is not included in
sig
val f : < m : 'b. 'b * ('b * < m : 'c. 'c * 'a > as 'a) > -> unit
end
Values do not match:
val f : (< m : 'a. 'a * ('a * 'b) > as 'b) -> unit
is not included in
val f : < m : 'b. 'b * ('b * < m : 'c. 'c * 'a > as 'a) > -> unit
|}];;and the correct version is: module M
: sig val f : (<m : 'b. 'b * ('b * <m:'c. 'c * 'bar> as 'bar)>) -> unit end
= struct let f (x : <m : 'a. 'a * ('a * 'foo)> as 'foo) = () end;;
module M
: sig type t = <m : 'b. 'b * ('b * <m:'c. 'c * 'bar> as 'bar)> end
= struct type t = <m : 'a. 'a * ('a * 'foo)> as 'foo end;;
[%%expect {|
Line _, characters 2-64:
Error: Signature mismatch:
...
Values do not match:
val f : (< m : 'a. 'a * ('a * 'b) > as 'b) -> unit
is not included in
val f : < m : 'b. 'b * ('b * < m : 'c. 'c * 'a > as 'a) > -> unit
|}, Principal{|
Line _, characters 2-64:
Error: Signature mismatch:
Modules do not match:
sig val f : (< m : 'a. 'a * ('a * 'b) > as 'b) -> unit end
is not included in
sig
val f : < m : 'b. 'b * ('b * < m : 'c. 'c * 'a > as 'a) > -> unit
end
Values do not match:
val f : (< m : 'a. 'a * ('a * 'b) > as 'b) -> unit
is not included in
val f : < m : 'b. 'b * ('b * < m : 'c. 'c * 'a > as 'a) > -> unit
|}];;The trouble is that I cannot reproduce the new "bad" behavior (showing "..." instead of "Module do not match") manually outside the expect tool, even in the toplevel. @diml : do you see how to investigate that? |
|
Argh, the choice to print "..." in Includemod.report_error is based on the size of the marshaling of the internal representation of the object to be printed (here, a |
|
The size in the example is 501 bytes (with a limit at 500). This is probably related to a change of sharing (perhaps related to the fact that some Path.t, including simple Pident cases, are now recomputed on each access instead of being stored in the Env.t). I'm tempted to simply validate the test, since the current failure is more the sign of a weakness of the printer heuristics. |
|
For the test you should be able to raise the limit by writing this in poly.ml: Clflags.error_size := 1000;;(I just discovered this variable) |
|
I was actually considering setting it to 0 (no limit), since any value can make the test quite fragile. This seems to work fine for the current tests. |
|
So I've set error_size to 0 directly in expect_test.ml (and no test had to be adjusted). The rationale is that any arbitrary limit is likely to create irrelevant failures in the testsuite at some point (or be useless if it is set too high), considering how fragile the criterion is (which might be ok for interactive use, not for non-regression testing). |
|
Does anyone want to review this? @xavierleroy : you told me that you were concerned with degrading performances of lookups. How do the benchmarks above look to you? |
|
I started reviewing. Overall it seems fine. As for the performance profile, it seems to better reflect the actual use of open (a few of them in a given branch, sometime used for very short-time). (Except maybe a code generator that outputs a huge number opening of small modules?) |
|
Great, thanke @let-def .
I cannot see the comments. If you started a review, you need to "submit" to make comments visible. |
typing/env.ml
Outdated
| end | ||
|
|
||
|
|
||
| module EnvTbl2 = |
There was a problem hiding this comment.
Maybe use more descriptive names than EnvTbl & EnvTbl2?
There was a problem hiding this comment.
Do you have a suggestion?
There was a problem hiding this comment.
Good question... The distinction is between components that have physical representation and those which are "floating"?
Something like LabelTable & RootedTable?
I don't know of a general term for distinguishing those two categories in the compiler, though it would be relevant.
There was a problem hiding this comment.
One is for labels and constructors, for which several definitions can exist in the same module (and all need to be retrieved by name).
There was a problem hiding this comment.
Ok, proposing IdTbl for components with a physical representation (i.e. identified by an id) and TycompTbl for "components of types", i.e. labels and constructors. I'll happily rename if better names are proposed (this is purely internal to env.ml, so trivial to rename).
typing/env.ml
Outdated
| @@ -281,29 +492,35 @@ let is_in_signature env = env.flags land in_signature_flag <> 0 | |||
| let is_implicit_coercion env = env.flags land implicit_coercion_flag <> 0 | |||
|
|
|||
| let diff_keys is_local tbl1 tbl2 = | |||
There was a problem hiding this comment.
diff_keys(2) could be lifted inside the EnvTbl(2) module.
Also, the only purpose of EnvTbl.local_keys is to implement diff_keys.
So all could be replaced by just an export of EnvTbl.diff_keys & EnvTbl2.diff_keys.
typing/env.ml
Outdated
| env | ||
| with Not_found -> | ||
| env | ||
| (* update summary?? *) |
There was a problem hiding this comment.
I think summary should be updated too. The updated environment appears in a Typedtree, so if inspected from a CMT the reconstructed environment would be wrong.
(I cannot think of an existing codepath in Merlin that can fail because of that, but summary affects cmt processing and its implementation of short-paths)
There was a problem hiding this comment.
I've pushed a partial fix to that. It's not clear to me what to do when changing the value_description of an identifier imported from an open statement. In the old version, the id of these values was "hidden" (lookup can only be done by name); now there is no dummy id for these values so I don't see how to record the information in the summary (except by introducing a new Env_value_by_name of summary * string * value_description). But I'm not even sure that rewriting the type of such values is required... Do you know? @garrigue?
There was a problem hiding this comment.
Ok, I've pushed a more complete fix: now the "copy_types" operation is exposed by Env and represented as such in the "summary".
After discussing with @garrigue, it seems the current situation is not ideal anyway, since the type of all values should be copied in non-principal mode, including those accessed with a module qualifier (which is not the case now, yielding to a different behavior when a value is accessed through an "open" or not). Cleaning that would be nice but is quite independent of this PR, so I prefer to keep the current behavior untouched, and it seems cleaner to make the "copy" operation supported directly by Env.
|
@alainfrisch thanks, I never used that :) |
eabc2e6 to
afe0e41
Compare
|
I really think this is worth merging, as it enables a style with many small local opens (which are arguably less risky than global ones) without degrading much the style with global opens; moreover, it makes the support for open-related warnings arguably simpler. @let-def is happy with the current implementation. Does someone else want to review? |
|
(cc @garrigue who may be interested in reviewing and may or may not have been notified about this PR.) |
|
@alainfrisch you should remove the "work-in-progress" label if you think this is ready for merge. |
Yes, indeed. Thanks! |
|
@garrigue Are you indeed interested in reviewing this? |
|
Sorry. I read the diff, and answered @alainfrisch personally, but did not write here. I understand the goal, but I am not convinced by the implementation. |
I really believe that the implementation makes the treatment of open-related warnings much simpler to understand, and removes a weird notion of "signature substitution". It also opens the door to removing some of the current caching logic. Moreover, the "extra complexity" is tiny compared to other parts of the compiler (cf recent huge changes for flambda or spacetime); the new data structure is relatively simple and documented. This extra complexity seems well deserved to me considering the performance gains. @garrigue Do you see a simpler way to achieve the same effect? |
|
Not simpler. I was just thinking about how to avoid the overhead with lots of open's. One solution would be to amortize the cost by copying each definition to the current environment after accessing it. I.e. some kind of lazy copying. |
I'm a bit puzzled with the argument that the PR makes "open" faster, and could thus encourage people to use it too much, and so we should make "open" even faster. This kind of arguments could be used against most kinds of optimizations. Honestly, if we make open faster, I don't think that people will realistically start nesting hundreds of open; this would be terrible for code readibility anyway. What they could do is to use a lot of local opens instead of a few global ones, which is, in the eyes of some at least, an improvement.
In the previous note, you wrote "My main concern is that env.ml is already fairly complex, and this adds extra complexity." Are you now ok with the complexity of the suggested new implementation?
I've tried it. A naive implementation slows down the common case (few nested opens) quite a bit, since it complexifies the data structure and yield some extra mutations. Moreover, treatment of the open warnings is made more complex. More complex strategies could certainly be considered to get better worst-case complexity without degrading performances in most cases. Can we keep that for later, if the need arise? |
|
@alainfrisch My point is that we should be careful that using open does not slow down the subsequent typechecking. However your tests seems to show the contrary, so maybe it's fine. Also, my feeling is that at some point we will need a big clean up of env.ml, since we have been piling optimization over optimization for about 20 years... |
…hadowed identifiers.
…y related to this PR).
…tension constructors, the actual position is extracted from tje Ctr_extension tag).
…tation in summary), instead of a more generic 'update_value'.
a6803b3 to
f91f561
Compare
More changes addressing review comments on ocaml-10831
…l#834) * Add priority computation to the dataflow. * Don't use polymorphic [min] on ints
This PR is built on top of #828 and implements the idea in http://caml.inria.fr/mantis/view.php?id=6826, namely that type-checking an "open" statement should take constant time, independently of the size of the opened module.
This is done by keeping a layered representation for the environment: a set of local bindings, and possibly a symbolic representation of an open (just its path and its computed set of components, which is memoized) + the bindings before that open. Each lookup traverses this linked list of partial environments. This might be a bit slower, but of course, each map is smaller. For instance, previously, a lookup in an environment obtained by opening two modules with 1000 components would take O(log(2000)), while now it takes O(2*log(1000)).
Moreover, the new representation allows a lighter representation, since e.g. support for warnings on open don't pollute the representation for local bindings. In addition to that,
find_samelookups don't need to go through parts representing components imported by an open statement, so they might actually be faster than in the current implementation.All in all, I could not find a case where the new strategy slows down lookup enough to make the entire type-checking slower.