Cross platform weak bag implementation #2673
djspiewak merged 14 commits into typelevel:series/3.3.x from vasilmkd:weak-bag
Co-authored-by: Arman Bilge <[email protected]>
build.sbt
Outdated
("org.scala-js" %%% "scalajs-weakreferences" % JsWeakReferencesVersion)
  .cross(CrossVersion.for3Use2_13)
Some interesting reading about this: scala-js/scala-js-weakreferences#7 (comment)
"io.vasilev" %% "cats-effect" % "3.3-87-1d1f0ff"
Benchmark results:

Baseline (tracing off): benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=none --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=false ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 316.579 ± 3.402 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1417.315 ± 15.392 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 4929586.675 ± 579.390 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1429.037 ± 22.488 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 4970221.334 ± 47058.032 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.222 ± 0.022 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 771.322 ± 75.550 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 919.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1325.000 ms

Baseline (tracing on): benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 204.719 ± 19.250 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1044.091 ± 97.878 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5616091.869 ± 2648.642 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1052.138 ± 110.314 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5665606.945 ± 358765.601 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.814 ± 0.502 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 4325.508 ± 2692.273 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 92.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 17783.000 ms

This PR with tracing on: benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 278.251 ± 1.785 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1391.663 ± 9.091 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5507256.392 ± 575.319 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1407.085 ± 15.138 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5568406.214 ± 57756.867 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.246 ± 0.020 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 972.881 ± 79.171 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 812.000 counts
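ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1386.000 ms

This PR improves performance significantly with tracing on: throughput rises from 204.719 to 278.251 ops/s (about 36%), and GC time drops from 17783 ms to 1386 ms, close to the 1325 ms measured without tracing.
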
On the advice of @armanbilge, this PR shades https://github.com/scala-js/scala-js-weakreferences by copying its source code (~100 LOC, not including license headers). This is because https://github.com/scala-js/scala-js-fake-weakreferences also exists; it is implemented in terms of strong references and would wreak havoc if it were selected over the real weak reference implementation.

For reference, these results are from

With tracing: benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 -prof gc --jvmArgs -Dcats.effect.tracing.mode=cached --jvmArgs -Dcats.effect.tracing.exceptions.enhanced=true ParallelBenchmark.par
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 290.942 ± 2.351 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1521.536 ± 12.438 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5758396.046 ± 580.724 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1536.991 ± 20.210 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5816806.929 ± 53534.035 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.303 ± 0.024 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 1147.693 ± 89.317 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 887.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1316.000 ms

Without tracing:
Benchmark (cpuTokens) (size) Mode Cnt Score Error Units
ParallelBenchmark.parTraverse 10000 1000 thrpt 20 295.814 ± 2.993 ops/s
ParallelBenchmark.parTraverse:·gc.alloc.rate 10000 1000 thrpt 20 1559.392 ± 10.000 MB/sec
ParallelBenchmark.parTraverse:·gc.alloc.rate.norm 10000 1000 thrpt 20 5804947.416 ± 40176.019 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space 10000 1000 thrpt 20 1575.089 ± 15.339 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Eden_Space.norm 10000 1000 thrpt 20 5863488.584 ± 67068.118 B/op
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space 10000 1000 thrpt 20 0.318 ± 0.025 MB/sec
ParallelBenchmark.parTraverse:·gc.churn.G1_Survivor_Space.norm 10000 1000 thrpt 20 1181.857 ± 90.029 B/op
ParallelBenchmark.parTraverse:·gc.count 10000 1000 thrpt 20 909.000 counts
ParallelBenchmark.parTraverse:·gc.time 10000 1000 thrpt 20 1338.000 ms

Completely replaces the WeakHashMap mechanism on each WorkerThread, the fallback, and JS.

A possible remedy for #2634.
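
To illustrate the general idea of a weak bag, here is a minimal sketch for the JVM. It is not the implementation from this PR: the names SimpleWeakBag, Handle, insert, deregister and toSet are hypothetical, the sketch is single-threaded, and it compacts eagerly instead of using whatever strategy the real code uses. The point it shows is that the bag holds each element only through a WeakReference, so registering an element never keeps it alive, and a caller can still remove its entry explicitly through the returned handle.

```scala
import java.lang.ref.WeakReference
import scala.collection.mutable.ArrayBuffer

// Hypothetical names throughout; a simplified, single-threaded sketch,
// not the weak bag added by this PR.
object SimpleWeakBag {
  trait Handle {
    // Removes the element from the bag without waiting for the GC.
    def deregister(): Unit
  }
}

final class SimpleWeakBag[A <: AnyRef] {
  import SimpleWeakBag.Handle

  // `a` is only passed to the WeakReference constructor, so the entry
  // itself keeps no strong reference to the element.
  private final class Entry(a: A) extends WeakReference[A](a) with Handle {
    def deregister(): Unit = clear()
  }

  private var entries = ArrayBuffer.empty[Entry]
  private var threshold = 16

  def insert(a: A): Handle = {
    // Once the buffer reaches the threshold, drop entries whose referent
    // was collected or deregistered, then pick a new threshold.
    if (entries.length >= threshold) {
      entries = entries.filter(_.get() ne null)
      threshold = math.max(16, entries.length * 2)
    }
    val entry = new Entry(a)
    entries += entry
    entry
  }

  // Snapshot of the elements that are still reachable and registered.
  def toSet: Set[A] =
    entries.iterator.map(_.get()).filter(_ ne null).toSet
}
```

A caller would keep the Handle returned by insert and call deregister() when the element finishes normally; elements that are simply dropped disappear from toSet once the garbage collector clears their references.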