Skip to content

Eliminate all uses of the page table from the GC (for no-naked-pointers mode)#203

Closed
mshinwell wants to merge 153 commits intoocaml:trunkfrom
mshinwell:4.02-no-naked-pointers
Closed

Eliminate all uses of the page table from the GC (for no-naked-pointers mode)#203
mshinwell wants to merge 153 commits intoocaml:trunkfrom
mshinwell:4.02-no-naked-pointers

Conversation

@mshinwell
Copy link
Contributor

The first part of this patch adds GC block headers in the text segments immediately prior to function entry points. This enables the GC to safely stop scanning when code pointers are reached in no-naked-pointers mode, rather than requiring a page table test, or indeed any special handling for closures. (Another option would be to alter the closure representation to make it possible to identify which fields of a closure contain code pointers, but you then still need a special case in the GC.) I think I've convinced myself this is safe. In addition to being a useful optimization now, this work is a prerequisite for the multicore branch.

The second part of this patch eliminates all other references to the page table from the GC in no-naked-pointers mode. Damien, I'd appreciate it if you could check this part.

@alainfrisch
Copy link
Contributor

You should probably base this work on trunk, because of the pervasive changes in emit.mlp for i386/amd64 (and the merger between emit.ml/emit_nt.mlp on those architectures).

@mshinwell
Copy link
Contributor Author

alainfrisch: Agreed, but multicore is based on 4.02 at the moment, so we'll need both patches.

Incidentally, for multicore, it looks like we're going to have to add a new tag (Code_tag) so that code pointers can be distinguished in the absence of the page table.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make atom headers black and get rid of this test? Then you can use Nnp_or_is_in_heap and remove the #ifdef entirely.

mshinwell pushed a commit to ocaml-multicore/ocaml-multicore that referenced this pull request Jun 22, 2015
@damiendoligez
Copy link
Member

I have reviewd the GC changes and I believe they are correct.

@alainfrisch
Copy link
Contributor

Agreed, but multicore is based on 4.02 at the moment, so we'll need both patches.

I don't think there are plans to create another release on the 4.02 branch. So I don't see the point of maintaining two patches. Unless people plan to push hard to multicore part of a (bug-fix -- haha) minor release...

I can see the benefit of providing multicore and your current proposal for 4.02 so that people can test them more easily, but do you agree that there is no point merging the current 4.02-based version?

@damiendoligez
Copy link
Member

@mshinwell

Incidentally, for multicore, it looks like we're going to have to add a new tag (Code_tag) so that code pointers can be distinguished in the absence of the page table.

This means decreasing the number of normal tags by 4, so the maximum number of constant constructors in a type will go from 246 to 242. This is probably not a big deal, but we'll have to check if it breaks some user programs.

@mshinwell
Copy link
Contributor Author

alainfrisch: I wasn't necessarily proposing merging this patch into 4.02; I can make another one for trunk. However, the decision has been taken that the multicore branch isn't going to be rebased to trunk for some significant period of time, so it made sense to develop against 4.02.

@mshinwell
Copy link
Contributor Author

Xavier: are you OK in principle with this patch (without the change to add Code_tag), for trunk?
It only affects things when the compiler is configured to disallow naked pointers.

@xavierleroy
Copy link
Contributor

Just to spell out the obvious: if you're not adding Code_tag headers before code, you need a special case for scanning closures, right? Because I don't see this in the commits.

I haven't said "no" yet to Code_tag headers. It's just that we need more (performance) information to choose between that and alternative closure scanning strategies.

@mshinwell
Copy link
Contributor Author

When I said "without the change to add Code_tag", I meant: doing what the patch does at the moment, namely writing headers into the text section, but with Abstract_tag.

@mshinwell
Copy link
Contributor Author

(And I don't think any special case is needed in the GC with the patch in its current state.)

@xavierleroy
Copy link
Contributor

Ah, OK, I misunderstood. Still, I'm curious to see how much performance impact this has, because of the "bubbles" introduced in the code cache.

@gasche
Copy link
Member

gasche commented Nov 15, 2015

(pinging in case Mark would be interested in having some benchs ready in time for the meeting)

@lefessan
Copy link
Contributor

Yes, I would be interested too in cache-related measurements, to compare with specific code for scanning closures. These headers in the code might trigger invalidations of lines in the instruction cache when these lines are loaded in the data cache for header-checks, but is it frequent enough to have an impact on performances ?

@mshinwell
Copy link
Contributor Author

I would be interested in having some benchmarks available, but I doubt I will have time to do so ready for the meeting. However this could still be approved pending such benchmarking.

@gasche gasche mentioned this pull request Nov 19, 2015
@gasche
Copy link
Member

gasche commented Nov 19, 2015

We discussed this change at the meeting. There was agreement that keeping the pagetable check for function pointers would be a safer approach that preserves most of the general benefits of the patch. My notes are not completely clear on what we decided to do for 4.03 (it was proposed to be conservative and keep it opt-in for 4.03, but no strong consensus; we may need a decision before freeze time in about a month), but it was noted that better tools to help users transition to the no-naked-pointers work would be needed.

@mshinwell
Copy link
Contributor Author

I said I would try to write some kind of checking tool with a view to potentially making no-naked-pointers the default in a future release (not 4.03).

@damiendoligez
Copy link
Member

I believe the tool already exists (almost):
Configure with -no-naked-pointers -with-debug-runtime, then

setenv OCAMLPARAM=runtime-variant=d
setenv OCAMLRUNPARAM=v=0

Then compile OCaml and your program. Any non-black naked pointer will trigger an assertion.
This doesn't work well today because the GC is still verbose even with v=0, but the GC latency patch will change that.

@damiendoligez
Copy link
Member

I believe some architectures do not support the headers-in-code strategy, so this will need more work. @xavierleroy what is the latest word on this?

@damiendoligez damiendoligez added this to the 4.04-or-later milestone Jan 22, 2016
@mshinwell
Copy link
Contributor Author

@damiendoligez : Which architectures, do you know?

@damiendoligez
Copy link
Member

IIRC it was some version of IBM's POWER. I don't remember the OS.

@mshinwell
Copy link
Contributor Author

Oh, I thought the trouble @xavierleroy had for ppc64 could have been solved by a similar approach to this (for a slightly different problem). I may be misremembering though.

@DemiMarie DemiMarie mentioned this pull request Jun 15, 2016
@mshinwell mshinwell modified the milestones: 4.05-or-later, 4.04 Sep 7, 2016
@mshinwell
Copy link
Contributor Author

I've rebased this patch on top of GPR#756, which cleans things up somewhat.

I believe I have a solution for POWER ELF64v1 ABI systems which involves using the optional third word in the official procedure descriptor (.opd) array as a GC header. It is being tested at the moment.

I have set up benchmarks on x86-64 to see if there are any noticeable performance implications of this. Once that has been done I will also remove some more Is_in_value_area tests in favour of Nnp_or_is_in_value_area.

@xavierleroy
Copy link
Contributor

two and a half years later, my understanding is that the "headers in code" approach was found to degrade performance noticeably, and that we have decided to move towards a self-describing function closure representation instead.

Based on this understanding, I'm closing this PR. Yell at me and reopen if I'm wrong.

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Feb 20, 2020
chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants