Skip to content

Conversation

@VSadov
Copy link
Member

@VSadov VSadov commented Sep 12, 2022

Fixes:#75107

  • eager finalization is an important perf and reliability improvement for code that uses a lot of weak references.
  • now that we have eager finalization, simplified the WeakReference<T> and WeakReference implementations a bit.
  • extra method table flags can now be stored in place of ComponentSize, which is rarely used (only arrays and strings need that). A similar pattern as used in CoreClr to increase data density in method tables. The trick allows 16 more bits for type traits, as long as the traits do not apply to arrays.

@VSadov
Copy link
Member Author

VSadov commented Sep 12, 2022

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@VSadov VSadov requested a review from jkotas September 12, 2022 14:51
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EagerFinalizer and CriticalFinalizer are not generic type system concepts. They do not need to have a flag here.

We only need to check these when producing EETypes. We can do the check only when building EEType.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the first time I need to change something in the type system, so Iwas just following existing patterns - like HasFinalizer.
What I hear is that instead of setting this flags we could just check for the same info that we capture in flags, but later.
Do we have a more appropriate example?

We only need to check these when producing EETypes. We can do the check only when building EEType.

Where is that done? At the locations where these flags are currently consumed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should only need to compute these flags here: https://github.com/dotnet/runtime/blob/e3cd737cea578c629a194474fd67d660fc7a9903/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/EETypeNode.cs#L657

Something like:

if (type is MetadataType mdType &&
                mdType.Module == context.SystemModule &&
                mdType.Name == "WeakReference" pr "WeakReference`1"
                mdType.Namespace == "System")
{
    flags |= TypeFlags.HasEagerFinalizer;
}

You should not need the one in EETypeBuilderHelpers. Copying from the template type should be enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need KeepAlive in a finalizer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory a finalizer never runs concurrently with itself, thus neither Interlocked.Exchange nor KeepAlive are necessary, since noone could be recycling the handle while we are freeing it.

However, since nongeneric WeakReference is not sealed, it is hard to guarantee anything.

The original code looked like it tried to handle concurrent finalization. It does look like an attempt to handle just a particular kind of misuse, so I kept such assumption.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can remove both Interlocked.Exchange and KeepAlive. If things are broken, they can be broken in many ways.

I hope deriving from WeakReference or at least overriding the finalizer and doing weird things in it is not a common practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to match what CoreCLR does (work towards sharing the code eventually).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CoreCLR has a bunch of code that handles COM and I am not very familiar with requirements. It looks a lot more complicated, perhaps because it is in native code and maybe because of the COM stuff.

Ignoring COM, there is nothing AOT specific here. WeakReferences are just thin facades to GC and handles, which are the same for CoreCLR.
CoreCLR could be trivially switched to use managed implementation (which seems simpler), just would need to handle the COM stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The COM stuff is for http://github.com/microsoft/cswinRT/ support. If once we want to make nativeaot work well for cswinrt, we will need to implement the COM stuff too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will need to implement the COM stuff too

I'd hope there is a way to do that in managed code.
There is some code in NativeAOT under #if ENABLE_WINRT, but maybe it is a different kind of WINRT.

Copy link
Member Author

@VSadov VSadov Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the native implementation a bit more. The COM is really about detecting RCWs and storing them via a different kind of handle.

The native implementation seems to be adding a lot of complexity just for being native. There is gc protection, for the reference itself and for the referenced object.

There is also a spinlock in the setter. Since the kind of the handle can be changing, the assignment is not a single operation.

There is also a comment saying that the lock synchronizes with finalization done by GC, but GC does not run concurrently with FCalls, (not the blocking mark phases when we know which finalizables are not reachable), so I am not sure the spinlock helps with that. Besides, the getter does not take the spinlock anyways. There is comment saying it is ok, but it is not very convincing. If handle recycling is possible, it would be possible to fetch an object of a wrong type and that is still a GC hole.

I think with the eager finalization the handle recycling is not possible in WeakReference<T>, because eager finalization does not run concurrently with managed code. The same should hold for the nongeneric WeakReference as well, unless it is overriden, but then there are endless opportunities for breakage.
Same should hold for the FCalls as well. I see some GCX_PREEMP(); in the code, but gc.pThis protection should work as KeepAlive.

Maybe it is worth to actually make the native implementation closer to managed, or even switch to managed, if COM stuff can be handled via FCall callouts. Hard to tell without actually trying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think uintptr_t would be more appropriate type to use for the handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should match the tricks that we do for this in MethodTable. It is 0x80000000 so that the JIT can optimize this as sign check. Also, the flags are fetched as dword so that it can use the smaller instructions (on x64).

Copy link
Member Author

@VSadov VSadov Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like - merge the Flags and FlagsEx into one uint32 and move the HasComponentSize to the sign bit?

or just move the bit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the codegen for go_through_object for CoreCLR and make sure that it is as good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we want mT->HasComponentSize() in the following just be a sign check.
(now it probably loads a short into a 32bit register and does a bit test)

inline size_t my_get_size (Object* ob)
{
    MethodTable* mT = header(ob)->GetMethodTable();

    return (mT->GetBaseSize() +
            (mT->HasComponentSize() ?
             ((size_t)((CObjectHeader*)ob)->GetNumComponents() * mT->RawGetComponentSize()) : 0));
}

The perf here is typically gated by the cache misses when fetching the mT pointer - we are looking at headers of semi-random objects, and that is often a miss.

It would not hurt if codegen for HasComponentSize is tighter though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is subtle change. Target property is virtual and it call be overridden to do whatever in the inherited type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not even think the Target could be virtual. GetGeneration makes even less sense if Target is overridden.

I guess I will revert to the original implementation, except the KeepAlive(wo);. We do not need to protect the handle once we fetched the obj. The handle can be gone by the time we return to the caller anyways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code also does not do RuntimeImports.RhHandleGet(h) ?? TryGetComTarget(). Another difference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry. I confused IsAlive and GC.GetGeneration. Both have subtle changes if Target is overridden.
I'll revert both to the old implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @noahfalk This is revving the debugger contract.

Copy link
Member Author

@VSadov VSadov Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more changes when native side is updated, but the most observable part is that a few MT flags have moved around

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - type arg count for typedefs,
// - type arg count for generic type definitions MethodTables,

Doesn't hurt to spell it out. Generic type definition MethodTables can be weird because they never show up as allocated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed the comment, on the native side as well.

Comment on lines 137 to 138
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We made this a single int on the managed side and made emission also happen as a single int. Should we collapse to a single integer here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The native changes are coming shortly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a lot, but we can optimize important things like computing object size in GC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update the large comment at the top of the file? We no longer emit/access this as two ushorts, but as a single int.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Thanks!

@VSadov
Copy link
Member Author

VSadov commented Sep 14, 2022

With the native changes, we have

=== object size computation:

                s = size (oo);
00007FF6202E23FA  mov         rcx,qword ptr [r10]  
00007FF6202E23FD  and         rcx,0FFFFFFFFFFFFFFF8h  
00007FF6202E2401  mov         eax,dword ptr [rcx]  
00007FF6202E2403  test        eax,eax  
HasComponentSize     => 00007FF6202E2405  jns         SVR::gc_heap::mark_object_simple1+4DAh (07FF6202E282Ah)  
num components => 00007FF6202E240B  mov         edx,dword ptr [r10+8]  
comp size      => 00007FF6202E240F  movzx       eax,ax  
arr body size  => 00007FF6202E2412  imul        rdx,rax  
00007FF6202E2416  jmp         SVR::gc_heap::mark_object_simple1+4DCh (07FF6202E282Ch) 

=== figuring whether an instanse is eagerly finalizable:

bool GCToEEInterface::EagerFinalized(Object* obj)
{
00007FF7C9279C07  mov         r8,rcx  
    if (!obj->GetGCSafeMethodTable()->HasEagerFinalizer())
00007FF7C9279C0A  and         rax,0FFFFFFFFFFFFFFF8h  
00007FF7C9279C0E  mov         edx,dword ptr [rax]  
HasEagerFinalizer => 00007FF7C9279C10  test        dl,1  
00007FF7C9279C13  je          GCToEEInterface::EagerFinalized+42h (07FF7C9279C42h)  
HasComponentSize  => 00007FF7C9279C15  test        edx,edx  
00007FF7C9279C17  js          GCToEEInterface::EagerFinalized+42h (07FF7C9279C42h)  

@VSadov
Copy link
Member Author

VSadov commented Sep 14, 2022

Thanks!!

@VSadov VSadov merged commit ad8debf into dotnet:main Sep 14, 2022
@VSadov VSadov deleted the wr branch September 14, 2022 08:04
@ghost ghost locked as resolved and limited conversation to collaborators Oct 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants