-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Change profilers to use thread local evacuation counters #59741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @tommcdon Issue DetailsUsing one global evacuation counter per profiler was causing lots of contention in hot calls to CORProfilerPresent, this change switches to using one per thread. It also switches hot CORProfiler* methods to FORCEINLINE, and changes CORProfilerPresent to be a much cheaper check, especially when a notification profiler is present.
|
|
Looks like there's also a CRST ordering issue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that the profiler detach can crash if it happens to interleave with a callback issued where GetThread is null?
How many of the callbacks are affected? Did we have this problem before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of desktop and before my change in coreclr the evacuation counters were on the Thread * and it would only increment/decrement if GetThread returned non-null. So this behavior should be identical to 5.0 and previous runtimes.
I have been thinking since last night about if there was a latent bug. Mostly I don't think so, but there is a couple places like EventPipe callbacks that may not have a managed thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I am pretty suspicious there is a latent bug here when it comes to EventPipe notifications - they can be triggered on threads where GetThread() == NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@noahfalk thinking about it though, with the profiler they are synchronous on the thread that it happens. We may not actually have a bug, since I would expect any place EventPipe events happen to have an EE thread. Since we have the managed bridge from native events to EventPipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue also affects server GC threads. Server GC threads do not have Thread*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect any place EventPipe events happen to have an EE thread
I'm ~98% sure this isn't the case : ) Some of the EventPipe events are emitted where GetThread()==NULL (such as the events from Server GC threads @jkotas mentioned). Originally the code in EventPipe's WriteEvent() code refered to GetThread() for the thread-local write buffers but that was broken. We had to invent the new EventPipeThread object to work around the lack of Thread* for some events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I agree there is a latent issue here, but it also has been present ever since the EventPipe stuff was merged in 3.1. Given that and that there's also an easy workaround - stop the session before detaching - I propose I file a bug and don't try to solve this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't pushing to have you fix it now given that it was pre-existing, just confirming that it does exist.
da6b788 to
080d379
Compare
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you!
2257ec5 to
e8e39fd
Compare
| } | ||
|
|
||
| inline HRESULT ProfControlBlock::JITInlining(FunctionID callerId, FunctionID calleeId, BOOL *pfShouldInline) | ||
| { | ||
| LIMITED_METHOD_CONTRACT; | ||
|
|
||
| *pfShouldInline = TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will initialize *pfShouldInline deterministically for the first profiler only.
Should this line rather be in JITInliningHelper instead? I think you can even delete pfShouldInline argument for the JITInliningHelper and have BOOL fShouldInline local in JITInliningHelper instead.
|
I have not seen the last commit for some reason. |
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
|
/backport to release/6.0 |
|
Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1314583486 |
|
@jkotas backporting to release/6.0 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: Change profilers to use thread local evacuation counters
.git/rebase-apply/patch:191: trailing whitespace.
return (&g_profControlBlock)->mainProfilerInfo.pProfInterface.Load() != NULL
.git/rebase-apply/patch:1056: trailing whitespace.
DWORD newValue =
.git/rebase-apply/patch:1067: trailing whitespace.
DWORD newValue =
warning: 3 lines add whitespace errors.
Using index info to reconstruct a base tree...
M src/coreclr/inc/profilepriv.inl
M src/coreclr/vm/profilinghelper.cpp
M src/coreclr/vm/proftoeeinterfaceimpl.cpp
M src/coreclr/vm/threads.cpp
M src/coreclr/vm/threads.h
Falling back to patching base and 3-way merge...
Auto-merging src/coreclr/vm/threads.h
Auto-merging src/coreclr/vm/threads.cpp
Auto-merging src/coreclr/vm/proftoeeinterfaceimpl.cpp
Auto-merging src/coreclr/vm/profilinghelper.cpp
Auto-merging src/coreclr/inc/profilepriv.inl
Applying: get rid of lambdas
.git/rebase-apply/patch:153: trailing whitespace.
.git/rebase-apply/patch:166: trailing whitespace.
return pProfilerInfo->pProfInterface->IsCallback3Supported();
.git/rebase-apply/patch:194: trailing whitespace.
return AnyProfilerPassesCondition(&RequiresGenericsContextForEnterLeaveHelper);
warning: 3 lines add whitespace errors.
Using index info to reconstruct a base tree...
M src/coreclr/inc/profilepriv.inl
Falling back to patching base and 3-way merge...
Auto-merging src/coreclr/inc/profilepriv.inl
CONFLICT (content): Merge conflict in src/coreclr/inc/profilepriv.inl
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 get rid of lambdas
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128Please backport manually! |
* Change profilers to use thread local evacuation counters Change to prefix increment * get rid of lambdas * Fix jit inlining, fix R2R too * Remove VolatilePtr<> from helpers * Get rid of additionalData argument
Using one global evacuation counter per profiler was causing lots of contention in hot calls to CORProfilerPresent, this change switches to using one per thread.
It also switches hot CORProfiler* methods to FORCEINLINE, and changes CORProfilerPresent to be a much cheaper check, especially when a notification profiler is present.