forked from scala/scala
-
Notifications
You must be signed in to change notification settings - Fork 0
Additional test case for Lukas' fix to annotated originals. #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
lrytz
merged 1 commit into
lrytz:annotatedRetyping
from
retronym:topic/annotatedRetyping-test
Feb 18, 2013
Merged
Additional test case for Lukas' fix to annotated originals. #1
lrytz
merged 1 commit into
lrytz:annotatedRetyping
from
retronym:topic/annotatedRetyping-test
Feb 18, 2013
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was inspired by the regression that Mark encountered when upgrading SBT from 2.10.0 to 2.10.1-RC1.
lrytz
added a commit
that referenced
this pull request
Feb 18, 2013
Additional test case for Lukas' fix to annotated originals.
lrytz
pushed a commit
that referenced
this pull request
Mar 6, 2013
We actually need to call normalize here, otherwise we don't progress through #1 below. [infer implicit] scala.this.Predef.implicitly[Higher[Foo.Bar]] with pt=Higher[Foo.Bar] in object Foo 1. tp=Foo.Bar tp.normalize=[A <: <?>]Foo.Bar[A] tp.dealias=Foo.Bar 2. tp=Foo.Bar[A] tp.normalize=Box[A] tp.dealias=Box[A]
lrytz
pushed a commit
that referenced
this pull request
Jan 31, 2014
When an application of a blackbox macro expands into a tree `x`, the expansion is wrapped into a type ascription `(x: T)`, where `T` is the declared return type of the blackbox macro with type arguments and path dependencies applied in consistency with the particular macro application being expanded. This invalidates blackbox macros as an implementation vehicle of type providers.
lrytz
pushed a commit
that referenced
this pull request
Jan 31, 2014
While fixing the problem with the order of typechecks for whitebox expansions,
I realized that we’re doing redundant work when expanding blackbox macros.
Concretely, typechecking blackbox expansions looked as follows:
val expanded1 = atPos(enclosingMacroPosition.focus)(Typed(expanded0, TypeTree(innerPt)))
val expanded2 = typecheck("blackbox typecheck #1", expanded1, innerPt)
typecheck("blackbox typecheck #2", expanded1, outerPt)
Or, if we reformulate it using quasiquotes (temporarily not taking
positions into account, since they aren’t important here):
val expanded2 = typed(q”$expanded: $innerPt”, innerPt)
typed(expanded2, outerPt)
In this formulation, it becomes apparent that the first typecheck is
redundant. If something is ascribed with some type, then typechecking
the ascription against that type does nothing useful.
This is also highlights one of the reasons why it would be really nice
to have quasiquotes used in the compiler. With them, it’s easy to notice
things that would otherwise remain buried behind swaths of boilerplate.
lrytz
pushed a commit
that referenced
this pull request
Feb 25, 2014
[Parts of this patch and some of the commentary are from @paulp] This took me so long to figure out I can't even tell you. Partly because there were two different bugs, one which only arose for trait forwarders and one for mirror class forwarders, and every time I'd make one set of tests work another set would start failing. The runtime failures associated with these bugs were fairly well hidden because you usually have to go through java to encounter them: scala doesn't pay that much attention to generic signatures, so they can be wrong and scala might still generate correct code. But java is not so lucky. Bug #1) During mixin composition, classes which extend traits receive forwarders to the implementations. An attempt was made to give these the correct info (in method "cloneBeforeErasure") but it was prone to giving the wrong answer, because: the key attribute which the forwarder must capture is what the underlying method will erase to *where the implementation is*, not how it appears to the class which contains it. That means the signature of the forwarder must be no more precise than the signature of the inherited implementation unless additional measures will be taken. This subtle difference will put on an unsubtle show for you in test run/t3452.scala. trait C[T] trait Search[M] { def search(input: M): C[Int] = null } object StringSearch extends Search[String] { } StringSearch.search("test"); // java // java.lang.NoSuchMethodError: StringSearch.search(Ljava/lang/String;)LC; The principled thing to do here would be to create a pair of methods in the host class: a mixin forwarder with the erased signature `(String)C[Int]`, and a bridge method with the same erased signature as the trait interface facet. But, this turns out to be pretty hard to retrofit onto the current setup of Mixin and Erasure, mostly due to the fact that mixin happens after erasure which has already taken care of bridging. For a future, release, we should try to move all bridging after mixin, and pursue this approach. But for now, what can we do about `LinkageError`s for Java clients? This commit simply checks if the pre-erasure method signature that we generate for the trait forward erases identically to that of the interface method. If so, we can be precise. If not, we emit the erased signature as the generic signature. Bug #2) The same principle is at work, at a different location. During genjvm, objects without declared companion classes are given static forwarders in the corresponding class, e.g. object Foo { def bar = 5 } which creates these classes (taking minor liberties): class Foo$ { static val MODULE$ = new Foo$ ; def bar = 5 } class Foo { static def bar = Foo$.MODULE$.bar } In generating these, genjvm circumvented the usual process whereby one creates a symbol and gives it an info, preferring to target the bytecode directly. However generic signatures are calculated from symbol info (in this case reusing the info from the module class.) Lacking even the attempt which was being made in mixin to "clone before erasure", we would have runtime failures of this kind: abstract class Foo { type T def f(x: T): List[T] = List() } object Bar extends Foo { type T = String } Bar.f(""); // java // java.lang.NoSuchMethodError: Bar.f(Ljava/lang/String;)Lscala/collection/immutable/List; Before/after this commit: < signature f (Ljava/lang/String;)Lscala/collection/immutable/List<Ljava/lang/String;>; --- > signature f (Ljava/lang/Object;)Lscala/collection/immutable/List<Ljava/lang/Object;>; This takes the warning count for compiling collections under `-Ycheck:jvm` from 1521 to 26.
lrytz
pushed a commit
that referenced
this pull request
Feb 25, 2014
Swathes of important logic are duplicated between `findMember`
and `findMembers` after they separated on grounds of irreconcilable
differences about how fast they should run:
d905558 Variation scala#10 to optimze findMember
fcb0c01 Attempt scala#9 to opimize findMember.
71d2ceb Attempt scala#8 to opimize findMember.
77e5692 Attempty scala#7 to optimize findMember
275115e Fixing problem that caused fingerprints to fail in
e94252e Attemmpt scala#6 to optimize findMember
73e61b8 Attempt scala#5 to optimize findMember.
04f0b65 Attempt scala#4 to optimize findMember
0e3c70f Attempt #3 to optimize findMember
41f4497 Attempt #2 to optimize findMember
1a73aa0 Attempt #1 to optimize findMember
This didn't actually bear fruit, and the intervening years have
seen the implementations drift.
Now is the time to reunite them under the banner of `FindMemberBase`.
Each has a separate subclass to customise the behaviour. This is
primarily used by `findMember` to cache member types and to assemble
the resulting list of symbols in an low-allocation manner.
While there I have introduced some polymorphic calls, the call sites
are only bi-morphic, and our typical pattern of compilation involves
far more `findMember` calls, so I expect that JIT will keep the
virtual call cost to an absolute minimum.
Test results have been updated now that `findMembers` correctly
excludes constructors and doesn't inherit privates.
Coming up next: we can actually fix SI-7475!
lrytz
pushed a commit
that referenced
this pull request
Feb 25, 2014
As per discussion at https://groups.google.com/forum/#!topic/scala-internals/nf_ooEBn6-k, this commit introduces the new c.enclosingOwner API that is going to serve two purposes: 1) provide a better controlled alternative to c.enclosingTree, 2) enable low-level tinkering with owner chains without having to cast to compiler internals. This solution is not ideal, because: 1) symbols are much more than I would like to expose about enclosing lexical contexts (after the aforementioned discussion I’m no longer completely sure whether exposing nothing is the right thing to do, but exposing symbol completers is definitely something that should be avoided), 2) we shouldn’t have to do that low-level stuff in the first place. However, let’s face the facts. This change represents both an improvement over the state of the art wrt #1 and a long-awaited capability wrt #2. I think this pretty much warrants its place in trunk in the spirit of gradual, evolutionary development of reflection API.
lrytz
pushed a commit
that referenced
this pull request
May 2, 2014
SI-7937 In for, semi before guard never required
lrytz
pushed a commit
that referenced
this pull request
Apr 16, 2015
Under `-Ydelambdafy:method`, a public, static accessor method is
created to expose the private method containing the body of the
lambda.
Currently this accessor method has its parameters in the same order
structure as those of the lambda body method.
What is this order? There are three categories of parameters:
1. lambda parameters
2. captured parameters (added by lambdalift)
3. self parameters (added to lambda bodies that end up in trait
impl classes by mixin, and added unconditionally to the static
accessor method.)
These are currently emitted in order #3, #1, #2.
Here are examples of the current behaviour:
BEFORE (trait):
```
% cat sandbox/test.scala && scalac-hash v2.11.5 -Ydelambdafy:method sandbox/test.scala && javap -private -classpath . 'Test$class'
trait Member; class Capture; trait LambdaParam
trait Test {
def member: Member
def foo {
val local = new Capture
(arg: LambdaParam) => "" + arg + member + local
}
}
Compiled from "test.scala"
public abstract class Test$class {
public static void foo(Test);
private static final java.lang.String $anonfun$1(Test, LambdaParam, Capture);
public static void $init$(Test);
public static final java.lang.String accessor$1(Test, LambdaParam, Capture);
}
```
BEFORE (class):
```
% cat sandbox/test.scala && scalac-hash v2.11.5 -Ydelambdafy:method sandbox/test.scala && javap -private -classpath . Test
trait Member; class Capture; trait LambdaParam
abstract class Test {
def member: Member
def foo {
val local = new Capture
(arg: LambdaParam) => "" + arg + member + local
}
}
Compiled from "test.scala"
public abstract class Test {
public abstract Member member();
public void foo();
private final java.lang.String $anonfun$1(LambdaParam, Capture);
public Test();
public static final java.lang.String accessor$1(Test, LambdaParam, Capture);
}
```
Contrasting the class case with Java:
```
% cat sandbox/Test.java && javac -d . sandbox/Test.java && javap -private -classpath . Test
public abstract class Test {
public static class Member {};
public static class Capture {};
public static class LambaParam {};
public static interface I {
public abstract Object c(LambaParam arg);
}
public abstract Member member();
public void test() {
Capture local = new Capture();
I i1 = (LambaParam arg) -> "" + member() + local;
}
}
Compiled from "Test.java"
public abstract class Test {
public Test();
public abstract Test$Member member();
public void test();
private java.lang.Object lambda$test$0(Test$Capture, Test$LambaParam);
}
```
We can see that in Java 8 lambda parameters come after captures. If we
want to use Java's LambdaMetafactory to spin up our anoymous FunctionN
subclasses on the fly, our ordering must change.
I can see three options for change:
1. Adjust `LambdaLift` to always prepend captured parameters,
rather than appending them. I think we could leave `Mixin` as
it is, it already prepends the self parameter. This would result
a parameter ordering, in terms of the list above: #3, #2, #1.
2. More conservatively, do this just for methods known to hold
lambda bodies. This might avoid needlessly breaking code that
has come to depend on our binary encoding.
3. Adjust the parameters of the accessor method only. The body
of this method can permute params before calling the lambda
body method.
This commit implements option #2.
In also prototyped #1, and found it worked so long as I limited it to
non-constructors, to sidestep the need to make corresponding
changes elsewhere in the compiler to avoid the crasher shown
in the enclosed test case, which was minimized from a bootstrap
failure from an earlier a version of this patch.
We would need to defer option #1 to 2.12 in any case, as some of
these lifted methods are publicied by the optimizer, and we must
leave the signatures alone to comply with MiMa.
I've included a test that shows this in all in action. However, that
is currently disabled, as we don't have a partest category for tests
that require Java 8.
lrytz
pushed a commit
that referenced
this pull request
Aug 24, 2015
The log messages intented to chronicle implicit search were always being filtered out by virtue of the fact that the the tree passed to `printTyping` was already typed, (e.g. with an implicit MethodType.) This commit enabled printing in this case, although it still filters out trees that are deemed unfit for typer tracing, such as `()`. In the context of implicit search, this happens to filter out the noise of: ``` | | | [search #2] start `()`, searching for adaptation to pt=Unit => Foo[Int,Int] (silent: value <local Test> in Test) implicits disabled | | | [search #3] start `()`, searching for adaptation to pt=(=> Unit) => Foo[Int,Int] (silent: value <local Test> in Test) implicits disabled | | | \-> <error> ``` ... which I think is desirable. The motivation for this fix was to better display the interaction between implicit search and type inference. For instance: ``` class Foo[A, B] class Test { implicit val f: Foo[Int, String] = ??? def t[A, B](a: A)(implicit f: Foo[A, B]) = ??? t(1) } ``` ```` % scalac -Ytyper-debug sandbox/instantiate.scala ... | |-- t(1) BYVALmode-EXPRmode (site: value <local Test> in Test) | | |-- t BYVALmode-EXPRmode-FUNmode-POLYmode (silent: value <local Test> in Test) | | | [adapt] [A, B](a: A)(implicit f: Foo[A,B])Nothing adapted to [A, B](a: A)(implicit f: Foo[A,B])Nothing | | | \-> (a: A)(implicit f: Foo[A,B])Nothing | | |-- 1 BYVALmode-EXPRmode-POLYmode (site: value <local Test> in Test) | | | \-> Int(1) | | solving for (A: ?A, B: ?B) | | solving for (B: ?B) | | [search #1] start `[A, B](a: A)(implicit f: Foo[A,B])Nothing` inferring type B, searching for adaptation to pt=Foo[Int,B] (silent: value <local Test> in Test) implicits disabled | | [search #1] considering f | | [adapt] f adapted to => Foo[Int,String] based on pt Foo[Int,B] | | [search #1] solve tvars=?B, tvars.constr= >: String <: String | | solving for (B: ?B) | | [search #1] success inferred value of type Foo[Int,=?String] is SearchResult(Test.this.f, TreeTypeSubstituter(List(type B),List(String))) | | |-- [A, B](a: A)(implicit f: Foo[A,B])Nothing BYVALmode-EXPRmode (site: value <local Test> in Test) | | | \-> Nothing | | [adapt] [A, B](a: A)(implicit f: Foo[A,B])Nothing adapted to [A, B](a: A)(implicit f: Foo[A,B])Nothing | | \-> Nothing ```
lrytz
pushed a commit
that referenced
this pull request
Oct 21, 2016
Manually tested with:
```
% cat sandbox/test.scala
package p {
object X { def f(i: Int) = ??? ; def f(s: String) = ??? }
object Main {
val res = X.f(3.14)
}
}
% qscalac -Ytyper-debug sandbox/test.scala
|-- p EXPRmode-POLYmode-QUALmode (site: package <root>)
| \-> p.type
|-- object X BYVALmode-EXPRmode (site: package p)
| |-- super EXPRmode-POLYmode-QUALmode (silent: <init> in X)
| | |-- this EXPRmode (silent: <init> in X)
| | | \-> p.X.type
| | \-> p.X.type
| |-- def f BYVALmode-EXPRmode (site: object X)
| | |-- $qmark$qmark$qmark EXPRmode (site: method f in X)
| | | \-> Nothing
| | |-- Int TYPEmode (site: value i in X)
| | | \-> Int
| | |-- Int TYPEmode (site: value i in X)
| | | \-> Int
| | \-> [def f] (i: Int)Nothing
| |-- def f BYVALmode-EXPRmode (site: object X)
| | |-- $qmark$qmark$qmark EXPRmode (site: method f in X)
| | | \-> Nothing
| | |-- String TYPEmode (site: value s in X)
| | | [adapt] String is now a TypeTree(String)
| | | \-> String
| | |-- String TYPEmode (site: value s in X)
| | | [adapt] String is now a TypeTree(String)
| | | \-> String
| | \-> [def f] (s: String)Nothing
| \-> [object X] p.X.type
|-- object Main BYVALmode-EXPRmode (site: package p)
| |-- X.f(3.14) EXPRmode (site: value res in Main)
| | |-- X.f BYVALmode-EXPRmode-FUNmode-POLYmode (silent: value res in Main)
| | | |-- X EXPRmode-POLYmode-QUALmode (silent: value res in Main)
| | | | \-> p.X.type
| | | \-> (s: String)Nothing <and> (i: Int)Nothing
| | |-- 3.14 BYVALmode-EXPRmode (silent: value res in Main)
| | | \-> Double(3.14)
| | [search #1] start `<?>`, searching for adaptation to pt=Double => String (silent: value res in Main) implicits disabled
| | [search #2] start `<?>`, searching for adaptation to pt=(=> Double) => String (silent: value res in Main) implicits disabled
| | [search #3] start `<?>`, searching for adaptation to pt=Double => Int (silent: value res in Main) implicits disabled
| | 1 implicits in companion scope
| | [search scala#4] start `<?>`, searching for adaptation to pt=(=> Double) => Int (silent: value res in Main) implicits disabled
| | 1 implicits in companion scope
| | second try: <error> and 3.14
| | [search scala#5] start `p.X.type`, searching for adaptation to pt=p.X.type => ?{def f(x$1: ? >: Double(3.14)): ?} (silent: value res in Main) implicits disabled
| | [search scala#6] start `p.X.type`, searching for adaptation to pt=(=> p.X.type) => ?{def f(x$1: ? >: Double(3.14)): ?} (silent: value res in Main) implicits disabled
sandbox/test.scala:4: error: overloaded method value f with alternatives:
(s: String)Nothing <and>
(i: Int)Nothing
cannot be applied to (Double)
val res = X.f(3.14)
^
```
lrytz
pushed a commit
that referenced
this pull request
Mar 22, 2018
Moved `Ev` from a generic type parameter to a member type in CIPT.
lrytz
pushed a commit
that referenced
this pull request
May 31, 2018
Before: ``` implicitly[Foo[String]] BYVALmode-EXPRmode (site: value <local Test> in Test) |-- implicitly BYVALmode-EXPRmode-FUNmode-POLYmode-TAPPmode (site: value <local Test> in Test) | \-> [T](implicit e: T)T |-- Foo[String] TYPEmode (site: value <local Test> in Test) | |-- String TYPEmode (site: value <local Test> in Test) | | [adapt] String is now a TypeTree(String) | | \-> String | \-> Foo[String] [search #1] start `[T](implicit e: T)T`, searching for adaptation to pt=Foo[String] (silent: value <local Test> in Test) implicits disabled [search #1] considering Foo.javaEnumFoo solving for (T: ?T) [adapt] [T]=> Foo[T] adapted to [T]=> Foo[T] based on pt Foo[String] [search #1] success inferred value of type Foo[String] is SearchResult(Foo.javaEnumFoo[String], ) |-- [T](implicit e: T)T BYVALmode-EXPRmode (site: value <local Test> in Test) | \-> Foo[String] [adapt] [T](implicit e: T)T adapted to [T](implicit e: T)T ``` After: ``` implicitly[Foo[String]] BYVALmode-EXPRmode (site: value <local Test> in Test) |-- implicitly BYVALmode-EXPRmode-FUNmode-POLYmode-TAPPmode (site: value <local Test> in Test) | \-> [T](implicit e: T)T |-- Foo[String] TYPEmode (site: value <local Test> in Test) | |-- String TYPEmode (site: value <local Test> in Test) | | [adapt] String is now a TypeTree(String) | | \-> String | \-> Foo[String] [search #1] start `[T](implicit e: T)T`, searching for adaptation to pt=Foo[String] (silent: value <local Test> in Test) implicits disabled [search #1] considering Foo.stringFoo [search #1] success inferred value of type Foo[String] is SearchResult(Foo.stringFoo, ) |-- [T](implicit e: T)T BYVALmode-EXPRmode (site: value <local Test> in Test) | \-> Foo[String] [adapt] [T](implicit e: T)T adapted to [T](implicit e: T)T \-> Foo[String] ```
lrytz
pushed a commit
that referenced
this pull request
Sep 28, 2018
Before: ``` implicitly[Foo[String]] BYVALmode-EXPRmode (site: value <local Test> in Test) |-- implicitly BYVALmode-EXPRmode-FUNmode-POLYmode-TAPPmode (site: value <local Test> in Test) | \-> [T](implicit e: T)T |-- Foo[String] TYPEmode (site: value <local Test> in Test) | |-- String TYPEmode (site: value <local Test> in Test) | | [adapt] String is now a TypeTree(String) | | \-> String | \-> Foo[String] [search #1] start `[T](implicit e: T)T`, searching for adaptation to pt=Foo[String] (silent: value <local Test> in Test) implicits disabled [search #1] considering Foo.javaEnumFoo solving for (T: ?T) [adapt] [T]=> Foo[T] adapted to [T]=> Foo[T] based on pt Foo[String] [search #1] success inferred value of type Foo[String] is SearchResult(Foo.javaEnumFoo[String], ) |-- [T](implicit e: T)T BYVALmode-EXPRmode (site: value <local Test> in Test) | \-> Foo[String] [adapt] [T](implicit e: T)T adapted to [T](implicit e: T)T ``` After: ``` implicitly[Foo[String]] BYVALmode-EXPRmode (site: value <local Test> in Test) |-- implicitly BYVALmode-EXPRmode-FUNmode-POLYmode-TAPPmode (site: value <local Test> in Test) | \-> [T](implicit e: T)T |-- Foo[String] TYPEmode (site: value <local Test> in Test) | |-- String TYPEmode (site: value <local Test> in Test) | | [adapt] String is now a TypeTree(String) | | \-> String | \-> Foo[String] [search #1] start `[T](implicit e: T)T`, searching for adaptation to pt=Foo[String] (silent: value <local Test> in Test) implicits disabled [search #1] considering Foo.stringFoo [search #1] success inferred value of type Foo[String] is SearchResult(Foo.stringFoo, ) |-- [T](implicit e: T)T BYVALmode-EXPRmode (site: value <local Test> in Test) | \-> Foo[String] [adapt] [T](implicit e: T)T adapted to [T](implicit e: T)T \-> Foo[String] ``` (cherry picked from commit 68799b9)
lrytz
pushed a commit
that referenced
this pull request
Oct 17, 2019
This commit takes care of speeding up analysis of type dependencies as much as possible. In both `ExtractUsedNames` and `Dependency`, we have a cache function associated with a source symbol. This source symbol is the "key" of the cache in the sense that from it we detect how a dependency should be tracked. `Dependency`, for instance, adds a dependency from `X` to `Y`, where X is the origin symbol and `Y` is the destination symbol. However, only `X` determines how to a dependency should be added (and on which data structure). The same happens for `ExtractAPI`, but whose case is simpler because there is no destination symbol: only the origin symbol is the necessary to cache -- we have a set of names for a given symbol. Our previous type analysis had a type cache, but this type cache only lasted one type traversal. The algorihtm was very pessimistic -- we cleared the `visited` cache with `reinitializeVisited` after every traversal so that members would be correctly recognized if the origin symbol changed. However, the origin symbol usually stays the same, especially when traversing bodies of methods and variables, which contain a high proportion of types. Taking this into account, we arrive to the conclusion that we can keep type caches around as long as the `currentOwner` doesn't change, because dependencies are only registered for top-level classes in both cases (`ExtractAPI` and `Dependency`). The introduced solution allows every phase to implement their own `TypeTraverser` and override the function that takes care of adding a dependency. This is necessary because the functions to add dependencies depend on the context (origin symbols and more stuff), which ultimately varies in `ExtractAPI` and `Dependency`. The following benchmark has been obtained by the same formula as the commit mentioned before, and benchmarks the compilation of the Scala standard library. BEFORE ``` [info] Benchmark (_tempDir) Mode Cnt Score Error Units [info] HotScalacBenchmark.compile /tmp/sbt_b9131bfb sample 18 21228.771 ± 521.207 ms/op [info] HotScalacBenchmark.compile:compile·p0.00 /tmp/sbt_b9131bfb sample 20199.768 ms/op [info] HotScalacBenchmark.compile:compile·p0.50 /tmp/sbt_b9131bfb sample 21256.733 ms/op [info] HotScalacBenchmark.compile:compile·p0.90 /tmp/sbt_b9131bfb sample 21931.177 ms/op [info] HotScalacBenchmark.compile:compile·p0.95 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.99 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.999 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p1.00 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb sample 18 284.115 ± 6.036 MB/sec [info] HotScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb sample 18 6474818679.556 ± 42551265.360 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb sample 18 283.385 ± 23.147 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb sample 18 6455703779.556 ± 483463770.519 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_b9131bfb sample 18 12.857 ± 12.406 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_b9131bfb sample 18 297978002.222 ± 287556197.389 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb sample 18 6.901 ± 2.092 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb sample 18 158212212.444 ± 50375116.805 B/op [info] HotScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb sample 18 105.000 counts [info] HotScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb sample 18 21814.000 ms [info] WarmScalacBenchmark.compile /tmp/sbt_b9131bfb sample 3 55924.053 ± 16257.754 ms/op [info] WarmScalacBenchmark.compile:compile·p0.00 /tmp/sbt_b9131bfb sample 54895.051 ms/op [info] WarmScalacBenchmark.compile:compile·p0.50 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.90 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.95 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.99 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.999 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p1.00 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb sample 3 117.417 ± 27.439 MB/sec [info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb sample 3 6999695530.667 ± 608845574.720 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb sample 3 111.263 ± 90.263 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb sample 3 6633605792.000 ± 5698534573.516 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_b9131bfb sample 3 0.001 ± 0.040 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_b9131bfb sample 3 74741.333 ± 2361755.471 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb sample 3 2.478 ± 7.592 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb sample 3 147881869.333 ± 475964254.946 B/op [info] WarmScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb sample 3 73.000 counts [info] WarmScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb sample 3 9581.000 ms [info] ColdScalacBenchmark.compile /tmp/sbt_b9131bfb ss 10 45562.453 ± 836.977 ms/op [info] ColdScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb ss 10 147.126 ± 2.229 MB/sec [info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb ss 10 7163351651.200 ± 57993163.779 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb ss 10 137.407 ± 6.810 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb ss 10 6692512710.400 ± 429243418.572 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb ss 10 2.647 ± 0.168 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb ss 10 128840603.200 ± 7324571.862 B/op [info] ColdScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb ss 10 245.000 counts [info] ColdScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb ss 10 29462.000 ms [success] Total time: 1595 s, completed Feb 26, 2017 1:42:55 AM [success] Total time: 0 s, completed Feb 26, 2017 1:42:55 AM ``` AFTER ``` [info] Benchmark (_tempDir) Mode Cnt Score Error Units [info] HotScalacBenchmark.compile /tmp/sbt_c8a4806b sample 18 20757.144 ± 519.221 ms/op [info] HotScalacBenchmark.compile:compile·p0.00 /tmp/sbt_c8a4806b sample 19931.333 ms/op [info] HotScalacBenchmark.compile:compile·p0.50 /tmp/sbt_c8a4806b sample 20786.971 ms/op [info] HotScalacBenchmark.compile:compile·p0.90 /tmp/sbt_c8a4806b sample 21615.765 ms/op [info] HotScalacBenchmark.compile:compile·p0.95 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.99 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.999 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p1.00 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b sample 18 290.476 ± 7.069 MB/sec [info] HotScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b sample 18 6476081869.778 ± 18700713.424 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b sample 18 290.409 ± 20.336 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b sample 18 6478102528.000 ± 468310673.653 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_c8a4806b sample 18 13.261 ± 12.790 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_c8a4806b sample 18 301324965.333 ± 290518111.715 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b sample 18 6.735 ± 2.338 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b sample 18 150953349.778 ± 54074639.209 B/op [info] HotScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b sample 18 101.000 counts [info] HotScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b sample 18 21267.000 ms [info] WarmScalacBenchmark.compile /tmp/sbt_c8a4806b sample 3 54380.549 ± 24064.367 ms/op [info] WarmScalacBenchmark.compile:compile·p0.00 /tmp/sbt_c8a4806b sample 53552.873 ms/op [info] WarmScalacBenchmark.compile:compile·p0.50 /tmp/sbt_c8a4806b sample 53687.091 ms/op [info] WarmScalacBenchmark.compile:compile·p0.90 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.95 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.99 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.999 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p1.00 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b sample 3 120.159 ± 52.914 MB/sec [info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b sample 3 6963979373.333 ± 137408036.138 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b sample 3 113.755 ± 135.915 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b sample 3 6588595392.000 ± 5170161565.753 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_c8a4806b sample 3 0.002 ± 0.048 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_c8a4806b sample 3 90400.000 ± 2856554.534 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b sample 3 2.623 ± 7.378 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b sample 3 151896768.000 ± 399915676.894 B/op [info] WarmScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b sample 3 73.000 counts [info] WarmScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b sample 3 10070.000 ms [info] ColdScalacBenchmark.compile /tmp/sbt_c8a4806b ss 10 45613.670 ± 1724.291 ms/op [info] ColdScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b ss 10 147.106 ± 4.973 MB/sec [info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b ss 10 7165665000.000 ± 68500786.134 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b ss 10 138.633 ± 12.612 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b ss 10 6749057403.200 ± 438983252.418 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b ss 10 2.716 ± 0.298 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b ss 10 132216236.800 ± 11751803.094 B/op [info] ColdScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b ss 10 247.000 counts [info] ColdScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b ss 10 29965.000 ms [success] Total time: 1593 s, completed Feb 26, 2017 11:54:01 AM [success] Total time: 0 s, completed Feb 26, 2017 11:54:01 AM ``` Machine info: ``` jvican in /data/rw/code/scala/zinc [22:24:47] > $ uname -a [±as-seen-from ●▴▾] Linux tribox 4.9.11-1-ARCH #1 SMP PREEMPT Sun Feb 19 13:45:52 UTC 2017 x86_64 GNU/Linux jvican in /data/rw/code/scala/zinc [23:15:57] > $ cpupower frequency-info [±as-seen-from ●▴▾] analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: Cannot determine or is not supported. hardware limits: 400 MHz - 3.40 GHz available cpufreq governors: performance powersave current policy: frequency should be within 3.20 GHz and 3.20 GHz. The governor "performance" may decide which speed to use within this range. current CPU frequency: Unable to call hardware current CPU frequency: 3.32 GHz (asserted by call to kernel) boost state support: Supported: yes Active: yes jvican in /data/rw/code/scala/zinc [23:16:14] > $ cat /proc/meminfo [±as-seen-from ●▴▾] MemTotal: 20430508 kB MemFree: 9890712 kB MemAvailable: 13490908 kB Buffers: 3684 kB Cached: 4052520 kB SwapCached: 0 kB Active: 7831612 kB Inactive: 2337220 kB Active(anon): 6214680 kB Inactive(anon): 151436 kB Active(file): 1616932 kB Inactive(file): 2185784 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 12582908 kB SwapFree: 12582908 kB Dirty: 124 kB Writeback: 0 kB AnonPages: 6099876 kB Mapped: 183096 kB Shmem: 253488 kB Slab: 227436 kB SReclaimable: 152144 kB SUnreclaim: 75292 kB KernelStack: 5152 kB PageTables: 19636 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 22798160 kB Committed_AS: 7685996 kB VmallocTotal: 34359738367 kB VmallocUsed: 0 kB VmallocChunk: 0 kB HardwareCorrupted: 0 kB AnonHugePages: 5511168 kB ShmemHugePages: 0 kB ShmemPmdMapped: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 136620 kB DirectMap2M: 4970496 kB DirectMap1G: 15728640 kB jvican in /data/rw/code/scala/zinc [23:16:41] > $ cat /proc/cpuinfo [±as-seen-from ●▴▾] processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3297.827 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5618.00 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3296.459 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5620.22 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3399.853 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5621.16 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3210.327 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5620.33 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: ``` In comparison with df308723, the new changes improve the running time of Zinc by half a second in hot and warm benchmarks, and a decrease of 100ms for cold benchmarks, which seems to be product of the variation given the number of ms/op. It is a success taking into account that now we're traversing more types and symbols than before, so these changes allow us to do more work and still decrease the running time of Zinc. These changes are likely to have a bigger effect on huge industrial codebases in which the ratio of types is very high, and with a lot of rich types like poly types, method types, refinements and existential types that have lots of constraints.
lrytz
pushed a commit
that referenced
this pull request
Nov 5, 2019
This commit takes care of speeding up analysis of type dependencies as much as possible. In both `ExtractUsedNames` and `Dependency`, we have a cache function associated with a source symbol. This source symbol is the "key" of the cache in the sense that from it we detect how a dependency should be tracked. `Dependency`, for instance, adds a dependency from `X` to `Y`, where X is the origin symbol and `Y` is the destination symbol. However, only `X` determines how to a dependency should be added (and on which data structure). The same happens for `ExtractAPI`, but whose case is simpler because there is no destination symbol: only the origin symbol is the necessary to cache -- we have a set of names for a given symbol. Our previous type analysis had a type cache, but this type cache only lasted one type traversal. The algorihtm was very pessimistic -- we cleared the `visited` cache with `reinitializeVisited` after every traversal so that members would be correctly recognized if the origin symbol changed. However, the origin symbol usually stays the same, especially when traversing bodies of methods and variables, which contain a high proportion of types. Taking this into account, we arrive to the conclusion that we can keep type caches around as long as the `currentOwner` doesn't change, because dependencies are only registered for top-level classes in both cases (`ExtractAPI` and `Dependency`). The introduced solution allows every phase to implement their own `TypeTraverser` and override the function that takes care of adding a dependency. This is necessary because the functions to add dependencies depend on the context (origin symbols and more stuff), which ultimately varies in `ExtractAPI` and `Dependency`. The following benchmark has been obtained by the same formula as the commit mentioned before, and benchmarks the compilation of the Scala standard library. BEFORE ``` [info] Benchmark (_tempDir) Mode Cnt Score Error Units [info] HotScalacBenchmark.compile /tmp/sbt_b9131bfb sample 18 21228.771 ± 521.207 ms/op [info] HotScalacBenchmark.compile:compile·p0.00 /tmp/sbt_b9131bfb sample 20199.768 ms/op [info] HotScalacBenchmark.compile:compile·p0.50 /tmp/sbt_b9131bfb sample 21256.733 ms/op [info] HotScalacBenchmark.compile:compile·p0.90 /tmp/sbt_b9131bfb sample 21931.177 ms/op [info] HotScalacBenchmark.compile:compile·p0.95 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.99 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.999 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:compile·p1.00 /tmp/sbt_b9131bfb sample 22112.371 ms/op [info] HotScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb sample 18 284.115 ± 6.036 MB/sec [info] HotScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb sample 18 6474818679.556 ± 42551265.360 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb sample 18 283.385 ± 23.147 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb sample 18 6455703779.556 ± 483463770.519 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_b9131bfb sample 18 12.857 ± 12.406 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_b9131bfb sample 18 297978002.222 ± 287556197.389 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb sample 18 6.901 ± 2.092 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb sample 18 158212212.444 ± 50375116.805 B/op [info] HotScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb sample 18 105.000 counts [info] HotScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb sample 18 21814.000 ms [info] WarmScalacBenchmark.compile /tmp/sbt_b9131bfb sample 3 55924.053 ± 16257.754 ms/op [info] WarmScalacBenchmark.compile:compile·p0.00 /tmp/sbt_b9131bfb sample 54895.051 ms/op [info] WarmScalacBenchmark.compile:compile·p0.50 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.90 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.95 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.99 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.999 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:compile·p1.00 /tmp/sbt_b9131bfb sample 56438.555 ms/op [info] WarmScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb sample 3 117.417 ± 27.439 MB/sec [info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb sample 3 6999695530.667 ± 608845574.720 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb sample 3 111.263 ± 90.263 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb sample 3 6633605792.000 ± 5698534573.516 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_b9131bfb sample 3 0.001 ± 0.040 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_b9131bfb sample 3 74741.333 ± 2361755.471 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb sample 3 2.478 ± 7.592 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb sample 3 147881869.333 ± 475964254.946 B/op [info] WarmScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb sample 3 73.000 counts [info] WarmScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb sample 3 9581.000 ms [info] ColdScalacBenchmark.compile /tmp/sbt_b9131bfb ss 10 45562.453 ± 836.977 ms/op [info] ColdScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_b9131bfb ss 10 147.126 ± 2.229 MB/sec [info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_b9131bfb ss 10 7163351651.200 ± 57993163.779 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_b9131bfb ss 10 137.407 ± 6.810 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_b9131bfb ss 10 6692512710.400 ± 429243418.572 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_b9131bfb ss 10 2.647 ± 0.168 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_b9131bfb ss 10 128840603.200 ± 7324571.862 B/op [info] ColdScalacBenchmark.compile:·gc.count /tmp/sbt_b9131bfb ss 10 245.000 counts [info] ColdScalacBenchmark.compile:·gc.time /tmp/sbt_b9131bfb ss 10 29462.000 ms [success] Total time: 1595 s, completed Feb 26, 2017 1:42:55 AM [success] Total time: 0 s, completed Feb 26, 2017 1:42:55 AM ``` AFTER ``` [info] Benchmark (_tempDir) Mode Cnt Score Error Units [info] HotScalacBenchmark.compile /tmp/sbt_c8a4806b sample 18 20757.144 ± 519.221 ms/op [info] HotScalacBenchmark.compile:compile·p0.00 /tmp/sbt_c8a4806b sample 19931.333 ms/op [info] HotScalacBenchmark.compile:compile·p0.50 /tmp/sbt_c8a4806b sample 20786.971 ms/op [info] HotScalacBenchmark.compile:compile·p0.90 /tmp/sbt_c8a4806b sample 21615.765 ms/op [info] HotScalacBenchmark.compile:compile·p0.95 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.99 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.999 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:compile·p1.00 /tmp/sbt_c8a4806b sample 21676.163 ms/op [info] HotScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b sample 18 290.476 ± 7.069 MB/sec [info] HotScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b sample 18 6476081869.778 ± 18700713.424 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b sample 18 290.409 ± 20.336 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b sample 18 6478102528.000 ± 468310673.653 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_c8a4806b sample 18 13.261 ± 12.790 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_c8a4806b sample 18 301324965.333 ± 290518111.715 B/op [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b sample 18 6.735 ± 2.338 MB/sec [info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b sample 18 150953349.778 ± 54074639.209 B/op [info] HotScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b sample 18 101.000 counts [info] HotScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b sample 18 21267.000 ms [info] WarmScalacBenchmark.compile /tmp/sbt_c8a4806b sample 3 54380.549 ± 24064.367 ms/op [info] WarmScalacBenchmark.compile:compile·p0.00 /tmp/sbt_c8a4806b sample 53552.873 ms/op [info] WarmScalacBenchmark.compile:compile·p0.50 /tmp/sbt_c8a4806b sample 53687.091 ms/op [info] WarmScalacBenchmark.compile:compile·p0.90 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.95 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.99 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.999 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p0.9999 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:compile·p1.00 /tmp/sbt_c8a4806b sample 55901.684 ms/op [info] WarmScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b sample 3 120.159 ± 52.914 MB/sec [info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b sample 3 6963979373.333 ± 137408036.138 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b sample 3 113.755 ± 135.915 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b sample 3 6588595392.000 ± 5170161565.753 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen /tmp/sbt_c8a4806b sample 3 0.002 ± 0.048 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm /tmp/sbt_c8a4806b sample 3 90400.000 ± 2856554.534 B/op [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b sample 3 2.623 ± 7.378 MB/sec [info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b sample 3 151896768.000 ± 399915676.894 B/op [info] WarmScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b sample 3 73.000 counts [info] WarmScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b sample 3 10070.000 ms [info] ColdScalacBenchmark.compile /tmp/sbt_c8a4806b ss 10 45613.670 ± 1724.291 ms/op [info] ColdScalacBenchmark.compile:·gc.alloc.rate /tmp/sbt_c8a4806b ss 10 147.106 ± 4.973 MB/sec [info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm /tmp/sbt_c8a4806b ss 10 7165665000.000 ± 68500786.134 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space /tmp/sbt_c8a4806b ss 10 138.633 ± 12.612 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm /tmp/sbt_c8a4806b ss 10 6749057403.200 ± 438983252.418 B/op [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space /tmp/sbt_c8a4806b ss 10 2.716 ± 0.298 MB/sec [info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm /tmp/sbt_c8a4806b ss 10 132216236.800 ± 11751803.094 B/op [info] ColdScalacBenchmark.compile:·gc.count /tmp/sbt_c8a4806b ss 10 247.000 counts [info] ColdScalacBenchmark.compile:·gc.time /tmp/sbt_c8a4806b ss 10 29965.000 ms [success] Total time: 1593 s, completed Feb 26, 2017 11:54:01 AM [success] Total time: 0 s, completed Feb 26, 2017 11:54:01 AM ``` Machine info: ``` jvican in /data/rw/code/scala/zinc [22:24:47] > $ uname -a [±as-seen-from ●▴▾] Linux tribox 4.9.11-1-ARCH #1 SMP PREEMPT Sun Feb 19 13:45:52 UTC 2017 x86_64 GNU/Linux jvican in /data/rw/code/scala/zinc [23:15:57] > $ cpupower frequency-info [±as-seen-from ●▴▾] analyzing CPU 0: driver: intel_pstate CPUs which run at the same hardware frequency: 0 CPUs which need to have their frequency coordinated by software: 0 maximum transition latency: Cannot determine or is not supported. hardware limits: 400 MHz - 3.40 GHz available cpufreq governors: performance powersave current policy: frequency should be within 3.20 GHz and 3.20 GHz. The governor "performance" may decide which speed to use within this range. current CPU frequency: Unable to call hardware current CPU frequency: 3.32 GHz (asserted by call to kernel) boost state support: Supported: yes Active: yes jvican in /data/rw/code/scala/zinc [23:16:14] > $ cat /proc/meminfo [±as-seen-from ●▴▾] MemTotal: 20430508 kB MemFree: 9890712 kB MemAvailable: 13490908 kB Buffers: 3684 kB Cached: 4052520 kB SwapCached: 0 kB Active: 7831612 kB Inactive: 2337220 kB Active(anon): 6214680 kB Inactive(anon): 151436 kB Active(file): 1616932 kB Inactive(file): 2185784 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 12582908 kB SwapFree: 12582908 kB Dirty: 124 kB Writeback: 0 kB AnonPages: 6099876 kB Mapped: 183096 kB Shmem: 253488 kB Slab: 227436 kB SReclaimable: 152144 kB SUnreclaim: 75292 kB KernelStack: 5152 kB PageTables: 19636 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 22798160 kB Committed_AS: 7685996 kB VmallocTotal: 34359738367 kB VmallocUsed: 0 kB VmallocChunk: 0 kB HardwareCorrupted: 0 kB AnonHugePages: 5511168 kB ShmemHugePages: 0 kB ShmemPmdMapped: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 136620 kB DirectMap2M: 4970496 kB DirectMap1G: 15728640 kB jvican in /data/rw/code/scala/zinc [23:16:41] > $ cat /proc/cpuinfo [±as-seen-from ●▴▾] processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3297.827 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5618.00 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3296.459 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5620.22 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3399.853 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5621.16 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 78 model name : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz stepping : 3 microcode : 0x88 cpu MHz : 3210.327 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 22 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp bugs : bogomips : 5620.33 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: ``` In comparison with df308723, the new changes improve the running time of Zinc by half a second in hot and warm benchmarks, and a decrease of 100ms for cold benchmarks, which seems to be product of the variation given the number of ms/op. It is a success taking into account that now we're traversing more types and symbols than before, so these changes allow us to do more work and still decrease the running time of Zinc. These changes are likely to have a bigger effect on huge industrial codebases in which the ratio of types is very high, and with a lot of rich types like poly types, method types, refinements and existential types that have lots of constraints. Rewritten from sbt/zinc@1cb2382
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This was inspired by the regression that Mark encountered
when upgrading SBT from 2.10.0 to 2.10.1-RC1.