Fix float array specialisation in flambda#1169
Conversation
Can you elaborate on that part? Do you mean storing a float in boxed form in a regular array? How could this be done (without using unsafe tricks or the C API)? |
Yes
It can't, but in practice it is done using unsafe tricks and up until now that has been perfectly safe. |
If you have an array whose static type is "float array", it is unsound for it to be represented as a regular array (since accessing the array will not perform a dynamic check for its representation). I guess you are referring to code that encapsulate some local data structure and is safe for the "client code". Are you aware of such tricks used in publicly available libraries, or are you referring to some code internal at Jane Street? |
|
Here's one from core_kernel: There are others in core_kernel as well as in our internal code. |
|
Doesn't look like a bug to me. Looks like an unsafe trick that needs to be changed, since that's implicit in using unsafe tricks. Right? |
|
In general, it's not a bug if code using unsafe In this particular case, though, I agree with @lpw25 that the behaviour should be changed: the runtime is careful to only do the float array optimisation based on the tag of the array, never on the tag of the element, and the optimiser should behave the same way. |
|
Yeah, I think this was a mistake (partially on my part). Leo's patch appears to be ok. |
|
I'm not completely certain than we want to remove this. It mainly depend on the incentive we want on the float array hack. I'm not certain whether I was responsible for this, but I knew the behavior and willingly kept it that way. This is I think an optimization that triggers quite a lot. The idea is that there are seldom any information about the array kind because arrays are, in practice, always created with externals (this is the reason why the fatal error in the current code almost never trigger on incorrect cases). But in numerical code values stored in the array on the other hand can often be known as float. For instance after inlining The kind of code you are suggesting is highly unsafe and already needs some deep understanding of the compiler to write properly. Assuming that this can be updated does not strike me as a major problem. In fact I would rather argue that this code is missing a case to prevent the use of the dynamic version when the argument is known to be a block. |
Well the reason this unsafe code is needed in the first place is to avoid the cost of the float array hack by creating genuine uniform arrays. I'm not particularly keen on optimising generic code using float arrays at the cost of making it very difficult to work around them.
I disagree that a deep understanding is required, this is a fairly simple property that has always previously been guaranteed by OCaml.
I have my doubts about the ease of finding all these cases and updating them. It is certainly the case that most of Janestreet's libraries cannot safely be used with 4.04+flambda because of this change. I'll try and find all the cases in our code-base, possibly if I make the optimisation into an error I can reliably find them.
I think that would definitely be preferable to using |
|
@lpw25 Have you made progress on finding all cases in JS code? @chambart Just because something is not documented doesn't mean that we go around and break existing code without carefully considering the matter. So it comes down to the usefulness of this particular optimization vs. the inconvenience of breaking lots of code. I don't have a firm opinion yet, but it needs to be argued either way. |
|
@damiendoligez in that case, I think it is quite a useful one, especially if we consider propagating more type information to the backend, like #1029 Also potential breakage must be quite well contained. Abusing Obj for that needed to be done extremely carefully, even without potential additional float array specialization. For instance not being bitten by GC specialization problems requires careful type annotation on every use: There is no way that could have been written in some non clearly separated and abstracted module, hence quite easy to track. From what I have seen, there is probably only that place (the whole file) in Core that needs fixing. And I didn't encounter any other library doing anything similar (but of course I don't know every library). |
|
@lpw25 @mshinwell Do you have any info on how hard it would be to find and fix all occurrences in JS code? |
|
I've been working through them for the last couple of days. It's a pain but it's not too bad. If we were discussing adding this optimisation for 4.05 then I would suggest holding off, but since 4.04 already contains it we might as well leave it in 4.05. |
* Apply ocamlformat to the playground code * Add a Makefile target instead * ocamlformat.0.25.1 * update make deps * Update ocamlformat version on CI
In 4.04 flambda tries to optimise array set operations based on the kind of value being put into the array. This differs from how these operations actually behave -- which is always to look at the kind of the array itself to decide whether the set operation is for a
Double_arrayor not. I think this constitutes a bug as quite a bit of high performance code relies on it being safe to place a float into a normal array.This PR adjusts the optimisation to only consider the kind of the array and not the value being stored into the array. I've made it against the 4.05 branch since this is a bug.