-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Add a cache for Native offset to IL Offset mapping in the stack walking logic #117218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ng logic - This cache reduces an issue where the locking around the DebuggerJitInfo is extremely inefficient to access in a multithreaded system, or if MANY different methods are on the stack - Since we do not have extensive experience with this cache, its size is configurable - This allows us to discover scenarios in production where the cache is insufficiently big
|
Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a hash-based cache to speed up native-to-IL offset lookups during stack walking, reducing contention on DebuggerJitInfo under heavy load. It also makes the cache size configurable and integrates cache checks into the existing stack-trace initialization.
- Implemented
CheckNativeToILCacheCore,CheckNativeToILCache, andInsertIntoNativeToILCachefunctions indebugdebugger.cpp - Added cache lookup in
DebugStackTrace::Element::InitPass2before calling the debug interface - Exposed
NativeToILOffsetCacheSizevia CLR configuration with a default of 1024
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/coreclr/vm/debugdebugger.cpp | Added caching logic for native-to-IL offset mapping and integrated lookup into stack trace init |
| src/coreclr/inc/clrconfigvalues.h | Introduced UNSUPPORTED_NativeToILOffsetCacheSize configuration entry |
Comments suppressed due to low confidence (1)
src/coreclr/vm/debugdebugger.cpp:1042
- The newly added cache behavior, including version checks and linear probing, has complex edge cases; add unit tests to cover cache hits, misses, resizing, and concurrent updates.
bool CheckNativeToILCacheCore(void* ip, bool fAdjustOffset, uint32_t* pILOffset)
Co-authored-by: Copilot <[email protected]>
|
Ack, something is busted, so I'm closing till I fix it. |
Co-authored-by: Copilot <[email protected]>
noahfalk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| // Cache is not initialized | ||
| return false; | ||
| } | ||
| DWORD cacheSize = VolatileLoadWithoutBarrier(&s_stackWalkCacheSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we tested cacheSize != 0 it looks like we could eliminate all the VolatileStore/Load for s_stackWalkCache and s_stackWalkCacheSize. Regardless of read/write ordering the variables only have two possible values, 0 or initialized. I am mostly just trying to minimize the amount of memory ordering that someone has to reason about when trying to understand the multi-threaded behavior of this cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, if we checked the size of cacheSize we could convert the VolatileLoad for the s_stackWalkCache to a VolatileLoadWithoutBarrier. I find it easier to analyze the VolatileStore/VolatileLoad matchup, but shrug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its fine if you leave it as-is, but it sounds like you may have misunderstood me. I wasn't proposing eliminating only the barrier. I was proposing you can write and read those two fields with standard C++ assignments with no VolatileLoad or VolatileStore at all:
// init the fields on one thread
s_stackWalkCache = blablabla;
s_stackWalkCacheSize = foofoofoo;
// then read them on another
cache = s_stackWalkCache;
cacheSize = s_stackWalkCacheSize;
if(cache == NULL || cacheSize == 0)
{
return false; // not initialized yet
}
DebuggerJitInfois extremely inefficient to access in a multithreaded system, or if MANY different methods are on the stack