Skip to content

Conversation

@y-yamshchikov
Copy link
Contributor

This PR fixes part of #44948 and fixes #46160.

This code simply traverses through assemblies in the application
domain. For each assembly (module) it realizes is it Ready To Run and is
it in the same bubble with (does it deliberately bubbling the) module
from which generic function originates. If so, it makes request for code
is Ready To Run and hopes there is some in the module. If the request
succeeds it proceeds with found pointer to the bare native code.

Generic instantiations like System.Collections.Concurrent.ConcurrentDictionary`2[System.Int32,System.__Canon] contained within same version bubble are either a bug in crossgen2 or should be taken care of by having PGO data.

Now such methods use their Ready To Run code.

We have got significant performance gain on startup: 7% average on our representative set on Tizen.

This PR worked out notices in PR below:
#47269
about linear search through the set of RangeSections. In this new PR we propose storing of RangeSections in sorted array (with number of optimizations inspired by prior linked list based solution).

We have extensively tested this PR on armel/Tizen platform so in this case we are confident in reliability and profitability of the solution.

Dear colleagues @jkotas @alpencolt @gbalykov @t-mustafin, please take a look.

@ghost ghost added area-crossgen2-coreclr community-contribution Indicates that the PR has been added by a community member labels Aug 12, 2021
@y-yamshchikov y-yamshchikov changed the title Partial fix 44948 Search native code in all R2R ni.dll in version bubble Aug 12, 2021
@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 7 times, most recently from c2d2f5d to 91e753a Compare August 13, 2021 17:50
@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 3 times, most recently from da2f1cd to 8dc0aa9 Compare September 3, 2021 12:50
@alpencolt
Copy link

@jkotas had you time to check PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs comment on what this returns. It does not return a simple index...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can it be in the private section (before public:)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In debug/checked builds, we should make the initial size small and make the array grow on element at a time to ensure that the array resize algorithm and the potential race conditions are exercised.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be volatile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that this needs to store HighAddress. It can just store the LowAddress and the method that is binary searching inside the array can just return the candidate RS index. We can then validate that the candidate RS index actually fits by comparing the address with RangeSection::HighAddress.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-effect of this change is going to be that the size of this struct will be power of two that will make the binary search faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BEGIN_PRESERVE_LAST_ERROR should be before the newly added code.

Copy link
Member

@jkotas jkotas Sep 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not look right to create a dummy ExternalMethodFrame just to get the return address.

The fixups from R2R modules should come via ExternalMethodFixupWorker. Can we just use the module that is computed there?

@jkotas
Copy link
Member

jkotas commented Sep 3, 2021

RangeSections now stored in sorted array

BTW: I do like this change - it should be general performance improvement for any scenario with large number of modules.

@y-yamshchikov y-yamshchikov force-pushed the partial-fix-44948 branch 5 times, most recently from 3d7cad1 to fe70768 Compare September 16, 2021 11:24
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important for this path to be lock-free. Adding a lock here is very likely to regress GC stackwalking performance on machines with a lot of cores.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is so important, we will investigate another way to isolate readers from writers. We think we could find one with two interchanging arrays.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what this is trying to achieve, but it does not look right.

Copy link
Contributor Author

@y-yamshchikov y-yamshchikov Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are obliged to ask if the lock has been aquired after NoHostCalls constructor of ReaderLockHolder. NoHostCalls is necessary because of contract requirement from very above on call stack. We have encountered contract violation if we use default constructor (and the source of violation is not GetRangeSection, it is much higher).

@mangod9
Copy link
Member

mangod9 commented Nov 15, 2021

hi @y-yamshchikov, assume you are still working on this PR feedback?

@mangod9
Copy link
Member

mangod9 commented Jan 10, 2022

hi @y-yamshchikov, checking if this PR still needs to be kept open since it hasnt been updated in a couple of months. Thx

@y-yamshchikov
Copy link
Contributor Author

y-yamshchikov commented Jan 13, 2022

hi @y-yamshchikov, checking if this PR still needs to be kept open since it hasnt been updated in a couple of months. Thx

Dear @mangod9, we are working on new approach granting lockfree reading of RangeSections and still benefiting from binary search capabilities of sorted array, fighting all the contradictions dictating by the multiprocessor environment. We end up with new version of synchronization mechanics and for now working on comprehensive testing environment which grants extensive coverage of all possible synchronization issues. To be presented soon, both the code and the testing model.

@mangod9
Copy link
Member

mangod9 commented Jan 13, 2022

thanks for the update, should we close this PR for now and open a new one once ready? Thx!

@y-yamshchikov
Copy link
Contributor Author

thanks for the update, should we close this PR for now and open a new one once ready? Thx!

There is lot of useful feedback conversation still not outdated, I think the change is made in a successive way to prolong current PR.

@mangod9
Copy link
Member

mangod9 commented Feb 7, 2022

I would suggest we close the PR and reopen when more changes are added. This helps to keep the "Open PRs" list manageable.

@mangod9
Copy link
Member

mangod9 commented Mar 14, 2022

Closing for now so it doesnt get flagged in stale PRs. Please reopen when ready to merge again. Thx.

@mangod9 mangod9 closed this Mar 14, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Apr 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-crossgen2-coreclr community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Version Bubble implementation flaw: some methods are forced to rejit

5 participants