|
1 | | -# Parallel Type-Checking in F# |
| 1 | +# Parallel type-checking in F# |
2 | 2 |
|
3 | | -This document describes the idea and implementation details for parallel type-checking of files in the F# compiler. |
| 3 | +This document describes the idea and implementation details for parallel type-checking of independent files in the F# compiler. |
4 | 4 |
|
5 | | -## Context and the current state of the compiler |
6 | | - |
7 | | -Performance of F# compilation and code analysis is not very high. |
| 5 | +Performance of F# compilation and code analysis is one of the concerns for big codebases. |
8 | 6 |
|
9 | | -Recent improvements not-withstanding, there are still low-hanging fruit that can be introduced. |
| 7 | +Despite several recent improvements in this area, there is still appetite and potential for change. |
10 | 8 |
|
11 | 9 | One idea for such an improvement was originally described in https://github.com/dotnet/fsharp/discussions/11634 by @kerams . |
12 | | -This is going to be the main topic of this page. |
| 10 | +It is going to be the main topic of this page. |
| 11 | + |
| 12 | +But before we dive into the details of the proposal, let's first discuss how the things work at the moment. |
| 13 | + |
| 14 | +## Context and the current state of the compiler |
13 | 15 |
|
14 | 16 | ### Current state of type-checking |
15 | 17 |
|
16 | | -One of the main phases of compilation is type-checking. Depending on the project in question, it can take up to and exceed 50% of the total compilation time. |
| 18 | +One of the main phases of compilation is type-checking. Depending on the project in question, it can take as much as 50% of the total compilation time. |
17 | 19 | Currently, by default all files in a project are type-checked in sequence, one-by-one, leading to increased compilation wall-clock time. |
18 | 20 |
|
| 21 | +The same is true about code analysis (used by the IDEs), but to an even higher degree - since code analysis skips some of the expensive compilation phases, type-checking represents a bigger fraction of the total wall-clock time, hence any improvements in this area can lead to more drastic total time reduction. |
| 22 | + |
| 23 | +### Maintaining type-checking state |
| 24 | + |
| 25 | +There is a lot of information associated with type-checking individual files and groups of them. |
| 26 | + |
| 27 | +Currently, due to the (mostly) sequential nature of the processing, it is sufficient to maintain a single instance of such information for the whole project. |
| 28 | +This instance is incrementally built on as more and more files have been processed. |
| 29 | + |
19 | 30 | ### Recent addition - "Parallel type checking for impl files with backing sig files" |
20 | 31 |
|
21 | | -A recent [change](https://github.com/dotnet/fsharp/pull/13737) introduced in the compiler allows for parallel type-checking of all `.fs` files backed by `.fsi` files. Such `.fs` files by definition cannot be depended upon by any other files w.r.t. type-checking, since all the type-checking information is exposed by the corresponding `.fsi` files. |
| 32 | +A recent [change](https://github.com/dotnet/fsharp/pull/13737) introduced in the compiler introduced a level of parallelism in type-checking. |
| 33 | +It allows for parallel type-checking of all `.fs` files backed by `.fsi` files. Such `.fs` files by definition cannot be depended upon by any other files w.r.t. type-checking, since all the necessary information is exposed by the corresponding `.fsi` files. |
22 | 34 |
|
23 | 35 | The new feature, when enabled, allows partial parallelisation of type-checking as follows: |
24 | 36 | 1. All `.fsi` files and `.fs` files without backing `.fsi` files are type-checked in sequence, as before. |
@@ -51,4 +63,50 @@ Below is an example showing the difference it can make for a parallel workflow. |
51 | 63 |
|
52 | 64 | For more details see https://github.com/dotnet/fsharp/pull/13521 |
53 | 65 |
|
54 | | -### |
| 66 | +## Parallel type-checking of independent files |
| 67 | + |
| 68 | +### Background |
| 69 | +Files in an F# project are ordered and processed from the top (first) and the bottom (last) file. |
| 70 | +The compiler ensures that no information, including type information, flows upwards. |
| 71 | + |
| 72 | +Consider the following list of files in a project: |
| 73 | +```fsharp |
| 74 | +A.fs |
| 75 | +B.fs |
| 76 | +C.fs |
| 77 | +D.fs |
| 78 | +``` |
| 79 | +By default, they are type-checked in the order of appearance: `[A.fs, B.fs, C.fs, D.fs]` |
| 80 | + |
| 81 | +Let's define `allowed dependency` as follows: |
| 82 | +> If the contents of 'B.fs' _can_, based on its position in the project hierarchy, influence the type-checking process of 'A.fs', then 'A.fs' -> 'B.fs' is an _allowed dependency_ |
| 83 | +
|
| 84 | +The graph of dependencies between the files that the compiler _allows_ looks as follows: |
| 85 | +``` |
| 86 | +A.fs -> [] |
| 87 | +B.fs -> [A.fs] |
| 88 | +C.fs -> [B.fs; A.fs] |
| 89 | +D.fs -> [C.fs; B.fs; A.fs] |
| 90 | +``` |
| 91 | + |
| 92 | +Type-checking files in the appearance order guarantees that when processing a given file, any files it _might_ depend on w.r.t. type-checking have already been type-checked and their type information is available. |
| 93 | + |
| 94 | +### Necessary dependencies |
| 95 | + |
| 96 | +Let's define a `necessary dependency` too: |
| 97 | +> File 'A.fs' _necessarily depends_ on file B for type-checking purposes, if the lack of type-checking information from 'B.fs' would influence the results of type-checking 'A.fs' |
| 98 | +
|
| 99 | +And finally a `dependency graph` as follows: |
| 100 | +> A _dependency graph_ is any graph that is a subset of the `allowed dependencies` graph and a superset of the `necessary dependencies` graph |
| 101 | +
|
| 102 | +A few slightly imprecise/vague statements about all the graphs: |
| 103 | +1. The _Necessary dependencies_ graph is a subgraph of the _allowed dependencies_ graph. |
| 104 | +2. If there is no path between 'B.fs' and 'C.fs' in the _necessary dependencies_ graph, they can be type-checked in parallel, as long as there is a way to maintain and merge more than one instance of type-checking information. |
| 105 | +3. Type-checking _must_ process files in an order that is compatible with the topological order in the _necessary dependencies_ graph. |
| 106 | +4. If using a dependency graph as an ordering mechanism for (parallel) type-checking, the closer it is to the _necessary dependencies_ graph, the higher parallelism is possible. |
| 107 | +5. Type-checking files in appearance order is equivalent to using the `allowed dependencies` graph for ordering. |
| 108 | +6. Removing an edge from a _dependency_ graph _can_ increase (but not decrease) the level of parallelism possible and improve wall-clock time. |
| 109 | + |
| 110 | +Let's look at point `5.` in detail. |
| 111 | + |
| 112 | +### The impact of reducing the dependency graph |
0 commit comments