
Conversation

@davidwrighton (Member)
  • This cache mitigates an issue where acquiring the lock around the DebuggerJitInfo is extremely inefficient in a multithreaded system, or when MANY different methods are on the stack
  • Since we do not have extensive experience with this cache, its size is configurable
    • This allows us to discover scenarios in production where the cache is insufficiently large, and address problems as needed
    • The initial size is 1024 entries, which is expected to be a reasonable default, at a cost of 16KB of space on 64-bit systems.
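The cache described above can be sketched as a fixed-size, hash-indexed table with lock-free publication. This is an illustrative sketch only: the names (`HashIp`, `LookupNativeToIL`, `InsertNativeToIL`), the probe depth, and the layout are assumptions, not the PR's actual implementation, which lives in CheckNativeToILCacheCore/InsertIntoNativeToILCache in debugdebugger.cpp.

```cpp
#include <atomic>
#include <cstdint>

// Illustrative sketch: a fixed-size ip -> IL-offset cache with a short
// linear probe. The real code also carries a code-version check, which is
// omitted here; without it, a concurrent overwrite of a slot could pair a
// matching ip with a stale offset.
struct CacheEntry
{
    std::atomic<void*>    ip{nullptr};
    std::atomic<uint32_t> ilOffset{0};
};

constexpr uint32_t kCacheSize = 1024; // the PR's default; ~16KB on 64-bit
static CacheEntry g_cache[kCacheSize];

static uint32_t HashIp(void* ip)
{
    // Drop low bits (instruction alignment) before reducing into the table.
    return (uint32_t)(((uintptr_t)ip >> 4) % kCacheSize);
}

bool LookupNativeToIL(void* ip, uint32_t* pILOffset)
{
    uint32_t idx = HashIp(ip);
    for (uint32_t probe = 0; probe < 4; probe++) // bounded linear probing
    {
        CacheEntry& e = g_cache[(idx + probe) % kCacheSize];
        if (e.ip.load(std::memory_order_acquire) == ip)
        {
            *pILOffset = e.ilOffset.load(std::memory_order_relaxed);
            return true;
        }
    }
    return false; // miss: fall back to the locked DebuggerJitInfo path
}

void InsertNativeToIL(void* ip, uint32_t ilOffset)
{
    CacheEntry& e = g_cache[HashIp(ip)];
    e.ilOffset.store(ilOffset, std::memory_order_relaxed);
    e.ip.store(ip, std::memory_order_release); // publish the key last
}
```

On a hit, stack-walking code can skip the lock entirely; only misses pay the contended path.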

@davidwrighton davidwrighton requested review from Copilot and hoyosjs July 1, 2025 20:54
@github-actions github-actions bot added the needs-area-label label Jul 1, 2025
@davidwrighton davidwrighton added area-Diagnostics-coreclr and removed needs-area-label labels Jul 1, 2025
@dotnet-policy-service (Contributor)

Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Copilot AI (Contributor) left a comment
Pull Request Overview

This PR introduces a hash-based cache to speed up native-to-IL offset lookups during stack walking, reducing contention on DebuggerJitInfo under heavy load. It also makes the cache size configurable and integrates cache checks into the existing stack-trace initialization.

  • Implemented CheckNativeToILCacheCore, CheckNativeToILCache, and InsertIntoNativeToILCache functions in debugdebugger.cpp
  • Added cache lookup in DebugStackTrace::Element::InitPass2 before calling the debug interface
  • Exposed NativeToILOffsetCacheSize via CLR configuration with a default of 1024
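As a hedged illustration of how an UNSUPPORTED_* knob like this is typically consumed: CLR configuration values can generally be supplied through the environment with a DOTNET_ (or legacy COMPlus_) prefix, and numeric values are conventionally parsed as hex. The helper below is a standalone sketch, not the actual CLRConfig plumbing; the variable name is assumed from the config entry name.

```cpp
#include <cstdlib>
#include <cstdint>

// Standalone sketch of reading a cache-size knob from the environment.
// The real runtime goes through CLRConfig; numeric CLR config values are
// conventionally interpreted as hex.
uint32_t GetNativeToILOffsetCacheSize()
{
    const char* v = std::getenv("DOTNET_NativeToILOffsetCacheSize");
    if (v == nullptr || *v == '\0')
        return 1024; // the PR's default

    unsigned long parsed = std::strtoul(v, nullptr, 16); // hex, per CLR convention
    return parsed == 0 ? 1024 : (uint32_t)parsed;
}
```

Treating zero (or an unparseable value) as "use the default" keeps a bad setting from disabling the cache outright.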

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File                               Description
src/coreclr/vm/debugdebugger.cpp   Added caching logic for native-to-IL offset mapping and integrated lookup into stack trace init
src/coreclr/inc/clrconfigvalues.h  Introduced UNSUPPORTED_NativeToILOffsetCacheSize configuration entry
Comments suppressed due to low confidence (1)

src/coreclr/vm/debugdebugger.cpp:1042

  • The newly added cache behavior, including version checks and linear probing, has complex edge cases; add unit tests to cover cache hits, misses, resizing, and concurrent updates.
bool CheckNativeToILCacheCore(void* ip, bool fAdjustOffset, uint32_t* pILOffset)

@davidwrighton (Member Author)

Ack, something is busted, so I'm closing till I fix it.

@noahfalk (Member) left a comment

LGTM

// Cache is not initialized
return false;
}
DWORD cacheSize = VolatileLoadWithoutBarrier(&s_stackWalkCacheSize);
@noahfalk (Member) commented on this snippet:
If we tested cacheSize != 0 it looks like we could eliminate all the VolatileStore/Load for s_stackWalkCache and s_stackWalkCacheSize. Regardless of read/write ordering the variables only have two possible values, 0 or initialized. I am mostly just trying to minimize the amount of memory ordering that someone has to reason about when trying to understand the multi-threaded behavior of this cache.

@davidwrighton (Member Author) replied:
True, if we checked the size of cacheSize we could convert the VolatileLoad for the s_stackWalkCache to a VolatileLoadWithoutBarrier. I find it easier to analyze the VolatileStore/VolatileLoad matchup, but shrug

@noahfalk (Member) Jul 3, 2025:
It's fine if you leave it as-is, but it sounds like you may have misunderstood me. I wasn't proposing eliminating only the barrier. I was proposing that you can write and read those two fields with standard C++ assignments, with no VolatileLoad or VolatileStore at all:

// init the fields on one thread
s_stackWalkCache = blablabla;
s_stackWalkCacheSize = foofoofoo;

// then read them on another
cache = s_stackWalkCache;
cacheSize = s_stackWalkCacheSize;
if(cache == NULL || cacheSize == 0)
{
    return false; // not initialized yet
}
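The pattern @noahfalk describes can be shown as a small standalone sketch (the globals are stand-ins, and the reader function is hypothetical): because each field transitions exactly once from zero to its final value, a reader that sees both fields as non-zero can proceed without reasoning about which store it observed first. Note this covers only these two fields; safely publishing the cache contents themselves is a separate question.

```cpp
#include <cstdint>
#include <cstddef>

// Stand-in globals; each is written once, moving from 0/NULL to a final value.
// (Strict ISO C++ would want std::atomic with relaxed ordering for concurrent
// access; VM code typically assumes aligned pointer/DWORD loads are atomic.)
static uint32_t* s_stackWalkCache = nullptr;
static uint32_t  s_stackWalkCacheSize = 0;

// Reader: plain loads, no VolatileLoad/VolatileStore. Whatever order the two
// reads land in, a zero in either field simply means "not initialized yet"
// and we take the slow path; no torn or invalid combination is possible.
bool TryGetCache(uint32_t** ppCache, uint32_t* pSize)
{
    uint32_t* cache = s_stackWalkCache;
    uint32_t  size  = s_stackWalkCacheSize;
    if (cache == nullptr || size == 0)
        return false;
    *ppCache = cache;
    *pSize   = size;
    return true;
}
```

The appeal of the checked-pair form is that a reader needs no memory-ordering argument at all to see why it is safe.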

@davidwrighton davidwrighton merged commit c12e876 into dotnet:main Jul 3, 2025
97 of 99 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Aug 3, 2025