NativeAOT size conscious authoring observations #83902

# NativeAOT authoring observations

As I continue to iterate on shaving overhead off the new serializer for Npgsql, and slowly get more familiar with the code patterns that work well with NativeAOT, I wanted to report on my findings so far. I will also use this issue to centralize future observations of a similar nature.

### Async

One class of issues I want to highlight in particular is the cost of everything related to `async` code. Having `async` methods in generic types is a major source of bloat: each one brings its own explosion of IL types and codegen, multiplied by the number of canonical instantiations.

It might be worthwhile for the C# and runtime teams to look at what can be done to reduce the footprint of `async`, as this interaction is supremely problematic for more casual authors. If no significant improvement can be made here, it could help to expose relevant tools that let authors reduce it on a case-by-case basis.

One such tool could be public methods on `ValueTask` and `ValueTask<TResult>` to move from the generic form to the non-generic one and back, in a way that still lets us external authors support pooled `IValueTaskSource<T>` implementations.
Tasks inherently support this through up/downcasting. For `ValueTask`, these APIs could be used the same way as up/downcasting Tasks, for the purpose of pushing the bulky async codegen into non-generic types. The internal method `ValueTask.DangerousCreateFromTypedValueTask` already exists, which validates that this is useful internally.
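
To make the Task up/downcasting idea concrete, here is a minimal sketch. All type and member names (`Converter`, `Converter<T>`, `Pipeline`) are hypothetical and only for illustration; this is not the actual Npgsql code, and whether it pays off for a given code base would need measuring.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical types for illustration; not Npgsql's actual API.
abstract class Converter
{
    // Non-generic view of the read: hands back the Task<T> upcast to Task.
    public abstract Task ReadAsTask();
}

sealed class Converter<T> : Converter
{
    private readonly T _value;
    public Converter(T value) => _value = value;

    // No 'async' keyword in the generic type, so no state machine per T.
    public Task<T> Read() => Task.FromResult(_value);

    public override Task ReadAsTask() => Read(); // upcast Task<T> -> Task
}

static class Pipeline
{
    // The only async method: it is non-generic, so its state machine is
    // generated once rather than once per instantiation of Converter<T>.
    public static async Task<Task> ReadAsync(Converter converter)
    {
        Task task = converter.ReadAsTask();
        await task.ConfigureAwait(false);
        return task; // already completed at this point
    }

    // Thin generic shim: just a downcast, no awaiting.
    public static T GetResult<T>(Task completed) => ((Task<T>)completed).Result;
}

static class Program
{
    static async Task Main()
    {
        Converter converter = new Converter<string>("hello");
        Task completed = await Pipeline.ReadAsync(converter);
        Console.WriteLine(Pipeline.GetResult<string>(completed)); // prints "hello"
    }
}
```

The downcast works because the runtime object is still a `Task<T>`. There is no equivalent public round trip for `ValueTask<T>` today, which is the gap described above.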

### Async Types

Next I started looking at the general cost of even having APIs that return `ValueTask` types, and it's not cheap. Not only do you pay for the `Task` types, but also for `ValueTask<T>`, `IValueTaskSource<T>`, `ValueTask<T>.ValueTaskSourceAsTask`, `ValueTaskAwaiter<T>` and others.

I have two links here: one report from before dropping about 20(?) mostly *reference type* instantiations of `ValueTask<T>`, and one from after doing so. The difference is a hefty 80 KB, with methods accounting for just 24 KB of it. The remainder is mostly types and type dictionary metadata.

Before: https://github.com/NinoFloris/Slon/actions/runs/4507705639
After: https://github.com/NinoFloris/Slon/actions/runs/4507808228

The mstats are attached if you want to explore the System.Private.CoreLib types in more detail.

### Reference Types

On to the next papercut: reference type instantiations. Their methods are shared via `__Canon`, but all of their concrete types are still required to exist in full. Take a look:

<img width="1101" alt="Screenshot 2023-03-24 at 19 44 15" src="https://user-images.githubusercontent.com/4218809/227612831-19faeccc-e683-490d-bce6-8b29b8915c3c.png">

These all add to the binary size, but I'm not sure what exactly each of them uniquely adds. Could these EETypes/method tables (which is the correct name?) be shared in some way via `__Canon`, keeping only a concrete generic context around per reference type instantiation? I understand the latter would always be needed for correct type testing etc. inside methods.

Improving this situation seems like it may also reduce the previously discussed `ValueTask<T>` type bloat?
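
As a small illustration of why some per-instantiation identity has to survive even when method bodies are shared, the sketch below shows that two reference type instantiations of `List<T>` remain distinct types that type tests must tell apart. It only shows observable behavior and says nothing about how the runtime lays out its type data.

```csharp
using System;
using System.Collections.Generic;

static class Program
{
    static void Main()
    {
        object strings = new List<string>();

        // Both instantiations come from the same generic definition...
        Console.WriteLine(typeof(List<string>).GetGenericTypeDefinition()
            == typeof(List<object>).GetGenericTypeDefinition()); // prints "True"

        // ...but they are distinct types, and type tests must tell them apart.
        Console.WriteLine(typeof(List<string>) == typeof(List<object>)); // prints "False"
        Console.WriteLine(strings is List<object>);                      // prints "False"
        Console.WriteLine(strings is List<string>);                      // prints "True"
    }
}
```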

### Value Type Code Sharing

Finally, I would urge runtime devs to consider what it would take to share code across value types of the same size/layout, à la GC shapes in Go. For a type like `int`, that could allow eligible code for, say, `List<T>` to be reused across `int`, `uint`, enums, `ValueTuple<int>`, `DateOnly` and other types wrapping a single `int`. The same would go for other primitives like `long` that have a lot of representationally isomorphic types to share code with.

I understand this cuts into the ability to do runtime intrinsic optimizations (like relying on a `uint` never being negative). I also see how this may complicate codegen, as sharing is only possible when the different instantiations don't actually produce different bodies (e.g. `int.Equals` will produce different code than `DateOnly.Equals`). However, there will be many methods that don't depend on the generic type's methods at all, just on its data representation. For an initial *good enough* experiment it might be sufficient to add a stage that aliases method instantiations of the same generic type whose bodies are byte-for-byte identical.

Such a stage may also help with sharing code across all instantiations when it doesn't make use of the generic context at all.
IIRC there is already some global deduplication mode (which impacts stack traces) but only sharing per generic type seems to be more suitable to be enabled by default?
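
To show what "same representation" means in practice, here is a hand-written approximation of the idea, under the assumption that the types involved are all 4 bytes wide. `IntShaped` and `Color` are hypothetical names; `MemoryMarshal.Cast` is used to funnel differently typed spans into a single `int`-based worker. This is a manual workaround sketch, not what a runtime-level sharing stage would look like, and the small cast helpers are still instantiated per type.

```csharp
using System;
using System.Runtime.InteropServices;

enum Color { Red = 1, Green = 2, Blue = 3 } // int-backed by default

static class IntShaped
{
    // Single worker operating on the raw int representation.
    private static void ReverseCore(Span<int> values) => values.Reverse();

    // Reinterpret same-size element types as int and reuse the worker.
    public static void Reverse(Span<Color> values) =>
        ReverseCore(MemoryMarshal.Cast<Color, int>(values));

    public static void Reverse(Span<uint> values) =>
        ReverseCore(MemoryMarshal.Cast<uint, int>(values));
}

static class Program
{
    static void Main()
    {
        Color[] colors = { Color.Red, Color.Green, Color.Blue };
        IntShaped.Reverse(colors);
        Console.WriteLine(string.Join(", ", colors)); // prints "Blue, Green, Red"
    }
}
```

Reversal only moves elements around, so it depends purely on their size, which is the kind of method that could be shared. Anything calling `Equals`, `CompareTo` or similar on the element type would not qualify.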

I can see the theoretical version of these things working, but I'm obviously not sure what this would mean more concretely (nor about the practical problems flowing from it, which I'm surely glossing over here).

### Conclusion

If we're really serious about NativeAOT being 'effortlessly' competitive (so no crazy authoring), these issues must be explored, if only to understand the problematic elements better.

All in all it's been *challenging* to keep size down to acceptable levels in this particular area of generics, async and serializer-like code.

@DamianEdwards Is there a world in which we drive https://github.com/dotnet/aspnetcore/issues/45910 stage 2 efforts across internal and external collaborators more effectively than just github issues? Is that something you're open to?
