This repository was archived by the owner on Jan 23, 2023. It is now read-only.
Optimize StackTrace.ToString() generation#20448
Merged
jkotas merged 1 commit intodotnet:masterfrom Oct 17, 2018
Merged
Conversation
When looking for compiler generated async state machine types and methods, we need not look at parents in the reflection path. This introduces additional overhead in both .IsDefined() and .GetCustomAttributes<T>() in every path. See the short circuit path in: https://github.com/dotnet/coreclr/blob/57f8358221a3c4ad7f1608f625bc3c5936618505/src/System.Private.CoreLib/src/System/Reflection/CustomAttribute.cs#L1315-L1320 We can avoid the list allocation, the array from it, and the resulting combined array by not looking for inherited attributed members that cannot exist for our purposes of creating a stack trace.
Member
|
cc: @noahfalk |
stephentoub
approved these changes
Oct 16, 2018
benaadams
approved these changes
Oct 16, 2018
jkotas
approved these changes
Oct 17, 2018
NickCraver
added a commit
to NickCraver/coreclr
that referenced
this pull request
Nov 3, 2018
This is a follow-up to many allocations recognized in dotnet#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes.
jkotas
pushed a commit
that referenced
this pull request
Nov 4, 2018
…'s inherited path (#20779) * Optimization: avoid 2 array allocations in inherited attribute misses This is a follow-up to many allocations recognized in #20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes. * Move RuntimeType empty array cache generation to Attribute.CreateAttributeArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets. * CustomAttributes: remove needless array copy in the inherited hit case This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity. * GetCusomAttributes: use ListBuilder<object> for inheritance crawls This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up. * Attribute caching: use Array.CreateInstance() directly on RuntimeType This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed. * Ref passing for RuntimeType.ListBuilder<object> & CustomAttribute simplification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath.
A-And
pushed a commit
to A-And/coreclr
that referenced
this pull request
Nov 20, 2018
When looking for compiler generated async state machine types and methods, we need not look at parents in the reflection path. This introduces additional overhead in both .IsDefined() and .GetCustomAttributes<T>() in every path. See the short circuit path in: https://github.com/dotnet/coreclr/blob/57f8358221a3c4ad7f1608f625bc3c5936618505/src/System.Private.CoreLib/src/System/Reflection/CustomAttribute.cs#L1315-L1320 We can avoid the list allocation, the array from it, and the resulting combined array by not looking for inherited attributed members that cannot exist for our purposes of creating a stack trace.
A-And
pushed a commit
to A-And/coreclr
that referenced
this pull request
Nov 20, 2018
…'s inherited path (dotnet#20779) * Optimization: avoid 2 array allocations in inherited attribute misses This is a follow-up to many allocations recognized in dotnet#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes. * Move RuntimeType empty array cache generation to Attribute.CreateAttributeArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets. * CustomAttributes: remove needless array copy in the inherited hit case This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity. * GetCusomAttributes: use ListBuilder<object> for inheritance crawls This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up. * Attribute caching: use Array.CreateInstance() directly on RuntimeType This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed. * Ref passing for RuntimeType.ListBuilder<object> & CustomAttribute simplification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath.
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
…'s inherited path (dotnet/coreclr#20779) * Optimization: avoid 2 array allocations in inherited attribute misses This is a follow-up to many allocations recognized in dotnet/coreclr#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes. * Move RuntimeType empty array cache generation to Attribute.CreateAttributeArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets. * CustomAttributes: remove needless array copy in the inherited hit case This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity. * GetCusomAttributes: use ListBuilder<object> for inheritance crawls This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up. * Attribute caching: use Array.CreateInstance() directly on RuntimeType This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed. * Ref passing for RuntimeType.ListBuilder<object> & CustomAttribute simplification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath. Commit migrated from dotnet/coreclr@6d9fc60
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is optimizing
StackTrace.ToString()in general. The newer version ofStackTracemakes async stack traces far more usable, but also introduces overhead due to System.Reflection being...less than awesome.The code in question is trying to take what is an async stack frame and find the "user" method that goes with it. This happens by checking if the type we're on is decorated with
[CompilerGenerated](via.IsDefined()) and if so, looping through all methods in the parent class decorated with[AsyncStateMachine], looping at their Type property, and seeing which one we belong with.With the above info, we know we need not look at parents in the reflection path. This introduces additional overhead in both
.IsDefined()and.GetCustomAttributes<T>()in pretty much every call to them. Since this resolution runs for every stack frame in every exception, so the multiplicative impact is quite large.Details
For
.GetCustomAttributes<T>():This eventually calls down to
CustomAttribute.GetCustomAttributes(). By looking at the short circuit path in:coreclr/src/System.Private.CoreLib/src/System/Reflection/CustomAttribute.cs
Lines 1315 to 1320 in 57f8358
We can avoid the list allocation, the array from
.ToArray(), and the resulting combined array by not looking for inherited attributed members that cannot exist for our purposes of creating a stack trace.For
.IsDefined():This eventually calls down to
CustomAttribute.IsDefined(). We can also short circuit overhead there.Impact
In a test deploy of some .NET Core code for websockets at Stack Overflow, we were throwing a few thousand exceptions per minute (not that many) and racked up millions of arrays from these code paths. Here's a quick summary of that memory dump:

Note that Gen 0 is not being cleared correctly (that's a separate issue), but we needn't be allocating these arrays in the first place.
I'll open another PR tomorrow (thanks to @benaadams for digging with me!) where we have a plan to utilize
Array.Empty<T>()in the reflection miss (no attributes found) path that's another optimization on top of this with quick wins for all reflection consumers.cc @benaadams @mgravell @deanward81