Fix a deadlock in NonGC + Profiler API #90847
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
64b9fd0 to cec6257
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
Co-authored-by: Jan Kotas <[email protected]>
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
Co-authored-by: Jan Kotas <[email protected]>
_ASSERT((uint8_t*)obj >= m_pStart + sizeof(ObjHeader) && (uint8_t*)obj < m_pCurrent);
_ASSERT((uint8_t*)obj >= m_pStart + sizeof(ObjHeader) && (uint8_t*)obj < m_pCurrentRegistered);

// FOH doesn't support objects with non-DATA_ALIGNMENT alignment yet.
Do we need to set m_NumComponents for arrays as part of TryAllocateObject?
We are setting it too late, so we can end up enumerating arrays without m_NumComponents set, and that is not going to end well.
Good point! It also allowed me to simplify the PublishObject logic a bit. The final API could be simplified further with a C++ template that allows capturing lambdas, but that would need a few more changes.
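To illustrate the ordering concern raised in this thread, here is a minimal sketch (not the code from this PR) of a bump allocator that fully initializes an object, including its component count, before making it visible to enumerators. `ToySegment`, `ArrayHeader`, `TryAllocate` and `PublishedBytes` are hypothetical names, and the sketch assumes a single allocating thread. It uses a template parameter so that a capturing lambda can serve as the initializer, which is the simplification mentioned above.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for an array object header; not a runtime type.
struct ArrayHeader { uint32_t numComponents; };

// Hypothetical single-writer bump allocator over a fixed buffer.
class ToySegment {
public:
    // Runs initFunc on the new object BEFORE advancing the published
    // boundary, so an enumerator never sees a half-initialized array.
    template <typename Init>
    ArrayHeader* TryAllocate(size_t size, Init initFunc) {
        size_t start = m_current;
        if (size < sizeof(ArrayHeader) || size % alignof(ArrayHeader) != 0 ||
            size > sizeof(m_storage) - start) {
            return nullptr; // does not fit or is misaligned
        }
        ArrayHeader* obj = new (m_storage + start) ArrayHeader{0};
        initFunc(obj);                       // e.g. set numComponents
        m_current = start + size;
        // Publish last: enumerators only walk up to this boundary.
        m_published.store(m_current, std::memory_order_release);
        return obj;
    }

    // Boundary that an enumerator is allowed to walk up to.
    size_t PublishedBytes() const {
        return m_published.load(std::memory_order_acquire);
    }

private:
    alignas(8) uint8_t m_storage[256] = {};
    size_t m_current = 0;                 // allocator-private cursor
    std::atomic<size_t> m_published{0};   // visible-to-enumerators cursor
};
```

The two cursors loosely mirror the m_pCurrent / m_pCurrentRegistered split visible in the assert above, but the actual runtime logic may differ from this sketch.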
jkotas
left a comment
Looks good to me otherwise. Thank you!
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
Thanks for the help! I wish I could easily spot all possible race conditions/corner cases just like you 🙂
Co-authored-by: Jan Kotas <[email protected]>
/backport to release/8.0
Started backporting to release/8.0: https://github.com/dotnet/runtime/actions/runs/5979164788
Fixes #90830
A quick explanation of how the deadlock happens:
Thread1:
Someone (typically, the JIT) tries to allocate an object on the NonGC heap.
FrozenObjectHeapManager (FOHM) acquires its lock and calls the GC's API RegisterNewSegment. That API can internally hit a case where a GC is in progress, so it has to wait for the GC to complete.
Thread2 (GC's):
The GC is executing a callback (e.g. GarbageCollectionFinished or *Started), and the Profiler uses that callback to enumerate objects on the NonGC heap via ICorProfilerInfo14::GetNonGCHeapBounds, so it also tries to acquire FOHM's lock (to be able to safely enumerate the objects).
Thus, the GC's thread (Thread2) is waiting for FOHM's lock to be released (it is held by Thread1) while Thread1 is waiting for the GC to finish.
The fix is #90830 (comment)
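The cycle described above can be reproduced in miniature. The following is a sketch only: `DeadlockDemo`, `AllocatorWaitsForGc`, `ProfilerCallbackGetsLock` and `RunDeadlockDemo` are hypothetical names, not runtime APIs, and both waits are bounded by timeouts so the program reports the cycle instead of hanging.

```cpp
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical model of the two waits; none of this is runtime code.
struct DeadlockDemo {
    std::timed_mutex fohmLock;     // plays the role of FOHM's lock
    std::mutex gcMutex;
    std::condition_variable gcCv;
    bool gcInProgress = true;      // never cleared: the GC thread is stuck

    // Thread1: take the FOHM lock, then wait (bounded) for the GC to finish.
    bool AllocatorWaitsForGc(std::chrono::milliseconds timeout,
                             std::promise<void>& lockTaken) {
        std::lock_guard<std::timed_mutex> hold(fohmLock);
        lockTaken.set_value();     // signal that the lock is now held
        std::unique_lock<std::mutex> gc(gcMutex);
        return gcCv.wait_for(gc, timeout, [this] { return !gcInProgress; });
    }

    // Thread2: inside a GC callback, the profiler tries to take the FOHM lock.
    bool ProfilerCallbackGetsLock(std::chrono::milliseconds timeout) {
        std::unique_lock<std::timed_mutex> l(fohmLock, timeout);
        return l.owns_lock();
    }
};

// Returns true when both sides timed out, i.e. each was waiting on the other.
bool RunDeadlockDemo() {
    using namespace std::chrono_literals;
    DeadlockDemo d;
    std::promise<void> lockTaken;
    std::future<void> taken = lockTaken.get_future();
    bool allocatorSawGcFinish = true;

    std::thread t1([&] {
        allocatorSawGcFinish = d.AllocatorWaitsForGc(200ms, lockTaken);
    });
    taken.wait();                  // Thread1 now holds the FOHM lock

    // The "GC thread" cannot finish the GC until its callback returns,
    // and the callback cannot return without the FOHM lock.
    bool profilerGotLock = d.ProfilerCallbackGetsLock(50ms);

    t1.join();
    return !allocatorSawGcFinish && !profilerGotLock;
}
```

Without the timeouts, both threads would block forever. Avoiding the cycle requires that the lock the profiler needs is not held while waiting for the GC; the exact approach taken in this PR is in the linked comment.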