This repository was archived by the owner on Nov 1, 2020. It is now read-only.

Conversation

@MichalStrehovsky
Member

When we unified the implementations of these across all of our runtimes, we lost all size optimizations people have been doing on the Project N side over the past six years.

This restores a bit of the loss. For one sample app with lots of async usage, this removes 2.1 MB of generic instantiations.

There is more we can do, but I can't spend time on that right now. These two things jumped out at me when I was looking at it back in December, and they were an easy fix I'd wanted to make for a while.

@jkotas
Member

jkotas commented Jan 29, 2019

BTW: A few callbacks were converted from explicit methods to lambdas in dotnet/coreclr#22172. Hopefully this cleanup is fine, since it is in a non-generic type.

@jkotas
Member

jkotas commented Jan 29, 2019

@jaredpar Roslyn creates dummy singleton helper objects for "static" lambdas. I assume that it is a micro-optimization to avoid the overhead of shuffle thunks. Do you happen to know the PR or piece of code that introduced this optimization?

We have been wondering whether the Roslyn policy on when to produce these dummy static singletons should be revisited since it adds significant static overhead to frequently instantiated generics.
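For context, here is a rough sketch of the lowering being discussed. All names are hypothetical; the real compiler-generated type is the mangled `<>c` class. The lambda body becomes an *instance* method on a dummy singleton, so the cached delegate has a real target object and invoking it avoids the shuffle thunk a static-method-backed delegate would need:

```csharp
using System;

// Hypothetical simplification of Roslyn's lowering of a non-capturing lambda.
// The body becomes an instance method on a dummy singleton display class.
sealed class DisplayClass              // the real generated type is named "<>c"
{
    public static readonly DisplayClass Instance = new DisplayClass();
    public static Func<int, int> CachedAddOne;   // delegate allocated at most once

    internal int AddOneBody(int x) => x + 1;     // the lambda body
}

static class LambdaDemo
{
    // "x => x + 1" with no captures lowers to roughly this:
    public static Func<int, int> GetAddOne()
        => DisplayClass.CachedAddOne ??= DisplayClass.Instance.AddOneBody;
}
```

The static overhead mentioned above comes from the fact that inside a generic type this display class is itself generic, so every instantiation carries its own singleton, cache field, and method bodies.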

      /// <summary>Delegate used to invoke on an ExecutionContext when passed an instance of this box type.</summary>
-     private static readonly ContextCallback s_callback = s =>
+     private static readonly ContextCallback s_callback = ExecutionContextCallback;
Contributor

Have you tried moving the lambda into the body? This forces a static ctor, and it looks like this is not a common path here.

Member

This forces a static ctor, and it looks like this is not a common path here.

This callback is the common path. Unless you've explicitly suppressed ExecutionContext flow with ExecutionContext.SuppressFlow(), this callback is used every time the async state machine resumes after an await.
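A minimal sketch of what "suppressed flow" means here, using standard System.Threading APIs: when flow is suppressed, the ambient ExecutionContext (observable via AsyncLocal) does not follow work to another thread, which is the only case where this callback is skipped:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class FlowDemo
{
    static readonly AsyncLocal<int> s_local = new AsyncLocal<int>();

    static void Main()
    {
        s_local.Value = 42;

        // Default: ExecutionContext flows, so the AsyncLocal value is visible.
        Task.Run(() => Console.WriteLine(s_local.Value)).Wait();     // 42

        // Suppressed: the queued work runs on the default context.
        using (ExecutionContext.SuppressFlow())
        {
            Task.Run(() => Console.WriteLine(s_local.Value)).Wait(); // 0
        }
    }
}
```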

@jkotas jkotas Jan 29, 2019
Member

Also, before-field-init static ctors are executed lazily in CoreCLR, so they pose no problem there. IIRC, it is different in Mono: before-field-init static ctors run eagerly there, which makes using before-field-init as a startup and static-footprint optimization ineffective.
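For reference, a sketch of what controls the flag (illustrative names): a type whose only static initialization comes from field initializers is marked `beforefieldinit`, while an explicit static constructor removes the flag and pins initialization to precise, first-use semantics:

```csharp
using System;
using System.Reflection;

// Only a field initializer: the C# compiler marks the type 'beforefieldinit',
// which lets CoreCLR run the initializer lazily (and, per the note above,
// historically made Mono run it eagerly).
class LazyInit
{
    public static readonly int Value = Compute();
    static int Compute() => 42;
}

// An explicit static constructor removes 'beforefieldinit'.
class PreciseInit
{
    public static readonly int Value;
    static PreciseInit() { Value = 42; }
}

class Check
{
    static void Main()
    {
        Console.WriteLine((typeof(LazyInit).Attributes & TypeAttributes.BeforeFieldInit) != 0);    // True
        Console.WriteLine((typeof(PreciseInit).Attributes & TypeAttributes.BeforeFieldInit) != 0); // False
    }
}
```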

@jkotas jkotas merged commit 325f6b9 into dotnet:master Jan 29, 2019
@MichalStrehovsky MichalStrehovsky deleted the taskSize branch January 29, 2019 21:58
@jaredpar
Member

@jkotas

I assume that it is a micro-optimization to avoid overhead of shuffle thunks. Do you happen to know the PR or piece of code that introduced this optimization?

This change was done long before Roslyn went OSS. Hence you're unlikely to find it in our history anywhere.

We have been wondering whether the Roslyn policy on when to produce these dummy static singletons should be revisited since it adds significant static overhead to frequently instantiated generics.

@VSadov originally did this change, hence he may have more feedback here. But my understanding is that the savings were pretty significant for Roslyn + VS when the change was made.

@VSadov
Member

VSadov commented Jan 30, 2019

Yes, this is a very old change by now; it predates Roslyn v1.0.
As I remember, it achieved two goals:

  • It improves the throughput of lambdas at the cost of some space (metadata, instantiations).
    Basically a space-vs-speed tradeoff. As I remember, the speed gain was significant, especially on small lambdas, which are common in LINQ and similar frameworks.
  • It enables caching of generic lambdas. Prior to that, we would allocate a new delegate every time for those, which was expensive. People were doing unnatural tricks to avoid mixing generics and lambdas on hot paths.
    As I remember, @stephentoub simplified more than one place where this was happening in FX after the change went in.
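The generic-lambda caching in the second bullet can be sketched like this (hypothetical names; Roslyn's real generated types are mangled): because the helper class holding the lambda can itself be generic, there is a place to cache one delegate per instantiation instead of allocating a fresh one on every call:

```csharp
using System;

// Hypothetical simplification of Roslyn's caching for a non-capturing lambda
// used inside a generic method: the cache field lives on a generic helper
// class, so each instantiation allocates its delegate at most once.
static class Cache<T>
{
    internal static Func<T, T> s_identity;
}

static class CacheDemo
{
    // "x => x" inside a generic method lowers to roughly this:
    public static Func<T, T> GetIdentity<T>()
        => Cache<T>.s_identity ??= (x => x);
}
```

Before the change, each call of such a method allocated a new delegate per invocation, which is the cost described above.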

@stephentoub
Member

As I remember, @stephentoub simplified more than one place where this was happening in FX after the change went in.

I didn't remember that the ability to cache generic lambdas was tied to that. We do rely on generic lambda caching in a large number of places; those could be unwound, but it would mean writing code in a fairly ugly, pre-lambda manner.

@VSadov
Member

VSadov commented Jan 30, 2019

I can imagine using a different lowering strategy if we had an -Os switch (as in, for "code size").

There could be other places where codegen could be less generic and/or more compact (possibly at the cost of some throughput): anonymous types, async stack spilling....

@VSadov
Member

VSadov commented Jan 31, 2019

@stephentoub generic lambda caching came naturally from the change, since generic "static" lambdas are no longer backed by static delegates and there is a containing type in which to put the static cache field.
It basically just worked, but it was a welcome change. The primary goal was the throughput improvement.

It might be possible to untangle just the generic lambda caching from the rest of the change, if needed.



6 participants