Eliminate all uses of the page table from the GC (for no-naked-pointers mode)#203
Eliminate all uses of the page table from the GC (for no-naked-pointers mode)#203mshinwell wants to merge 153 commits intoocaml:trunkfrom
Conversation
|
You should probably base this work on trunk, because of the pervasive changes in emit.mlp for i386/amd64 (and the merger between emit.ml/emit_nt.mlp on those architectures). |
|
alainfrisch: Agreed, but multicore is based on 4.02 at the moment, so we'll need both patches. Incidentally, for multicore, it looks like we're going to have to add a new tag (Code_tag) so that code pointers can be distinguished in the absence of the page table. |
byterun/major_gc.c
Outdated
There was a problem hiding this comment.
Why not make atom headers black and get rid of this test? Then you can use Nnp_or_is_in_heap and remove the #ifdef entirely.
|
I have reviewd the GC changes and I believe they are correct. |
I don't think there are plans to create another release on the 4.02 branch. So I don't see the point of maintaining two patches. Unless people plan to push hard to multicore part of a (bug-fix -- haha) minor release... I can see the benefit of providing multicore and your current proposal for 4.02 so that people can test them more easily, but do you agree that there is no point merging the current 4.02-based version? |
This means decreasing the number of normal tags by 4, so the maximum number of constant constructors in a type will go from 246 to 242. This is probably not a big deal, but we'll have to check if it breaks some user programs. |
|
alainfrisch: I wasn't necessarily proposing merging this patch into 4.02; I can make another one for trunk. However, the decision has been taken that the multicore branch isn't going to be rebased to trunk for some significant period of time, so it made sense to develop against 4.02. |
|
Xavier: are you OK in principle with this patch (without the change to add Code_tag), for trunk? |
|
Just to spell out the obvious: if you're not adding Code_tag headers before code, you need a special case for scanning closures, right? Because I don't see this in the commits. I haven't said "no" yet to Code_tag headers. It's just that we need more (performance) information to choose between that and alternative closure scanning strategies. |
|
When I said "without the change to add Code_tag", I meant: doing what the patch does at the moment, namely writing headers into the text section, but with Abstract_tag. |
|
(And I don't think any special case is needed in the GC with the patch in its current state.) |
|
Ah, OK, I misunderstood. Still, I'm curious to see how much performance impact this has, because of the "bubbles" introduced in the code cache. |
|
(pinging in case Mark would be interested in having some benchs ready in time for the meeting) |
|
Yes, I would be interested too in cache-related measurements, to compare with specific code for scanning closures. These headers in the code might trigger invalidations of lines in the instruction cache when these lines are loaded in the data cache for header-checks, but is it frequent enough to have an impact on performances ? |
|
I would be interested in having some benchmarks available, but I doubt I will have time to do so ready for the meeting. However this could still be approved pending such benchmarking. |
|
We discussed this change at the meeting. There was agreement that keeping the pagetable check for function pointers would be a safer approach that preserves most of the general benefits of the patch. My notes are not completely clear on what we decided to do for 4.03 (it was proposed to be conservative and keep it opt-in for 4.03, but no strong consensus; we may need a decision before freeze time in about a month), but it was noted that better tools to help users transition to the no-naked-pointers work would be needed. |
|
I said I would try to write some kind of checking tool with a view to potentially making no-naked-pointers the default in a future release (not 4.03). |
|
I believe the tool already exists (almost): Then compile OCaml and your program. Any non-black naked pointer will trigger an assertion. |
|
I believe some architectures do not support the headers-in-code strategy, so this will need more work. @xavierleroy what is the latest word on this? |
|
@damiendoligez : Which architectures, do you know? |
|
IIRC it was some version of IBM's POWER. I don't remember the OS. |
|
Oh, I thought the trouble @xavierleroy had for ppc64 could have been solved by a similar approach to this (for a slightly different problem). I may be misremembering though. |
|
I've rebased this patch on top of GPR#756, which cleans things up somewhat. I believe I have a solution for POWER ELF64v1 ABI systems which involves using the optional third word in the official procedure descriptor ( I have set up benchmarks on x86-64 to see if there are any noticeable performance implications of this. Once that has been done I will also remove some more |
|
two and a half years later, my understanding is that the "headers in code" approach was found to degrade performance noticeably, and that we have decided to move towards a self-describing function closure representation instead. Based on this understanding, I'm closing this PR. Yell at me and reopen if I'm wrong. |
Continuation refactor
The first part of this patch adds GC block headers in the text segments immediately prior to function entry points. This enables the GC to safely stop scanning when code pointers are reached in no-naked-pointers mode, rather than requiring a page table test, or indeed any special handling for closures. (Another option would be to alter the closure representation to make it possible to identify which fields of a closure contain code pointers, but you then still need a special case in the GC.) I think I've convinced myself this is safe. In addition to being a useful optimization now, this work is a prerequisite for the multicore branch.
The second part of this patch eliminates all other references to the page table from the GC in no-naked-pointers mode. Damien, I'd appreciate it if you could check this part.