-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Fix behavior of inlined NDirect methods with MulticoreJit #55185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What is different about NDirect methods being loaded on the MCJ thread as opposed to being loaded on other threads? I thought that the only difference should be in timing of when things are loaded. Are the custom search paths for native libraries being set after the profile is started? Would it be possible to set them before starting the profile? |
The user code can install handlers to resolve the native libraries using |
|
I see, I guess that would normally be done before the native libraries are loaded but MCJ could attempt that out of order. |
|
Actually, I've previously faced only the second part of the problem (when ni.dll exist). Recently, during mcj testing in debug mode I've faced this problem without ni.dll. On x64/armel With this PR segfault is fixed. |
src/coreclr/jit/importer.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| (!opts.compNoPInvokeInline) // profiler is preventing inline pinvoke | |
| (!opts.compNoPInvokeInline) // JIT flag is preventing inline pinvoke |
src/coreclr/debug/daccess/nidump.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like the new parameters are used, are they necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new parameters are indeed not used in NativeImageDumper::HandleFixupForHistogram, NativeImageDumper::HandleFixupForMethodDump. They are required, because Module::FixupDelayListAux should pass this flag to Module::FixupNativeEntry, and Module::FixupDelayListAux is used for all three to traverse over fixups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, it would be good to give it the same name so that it's clear what that parameter is a placeholder for, even if unused
8d9e881 to
370d1fc
Compare
kouvel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to have a config option that allows disabling preloading the p/invokes on the mcj thread. By default and when custom paths are not used for native libraries, it may be beneficial to do the previous thing. Could add an mcj config option, preload it in EEConfig, and use it via g_pConfig where the decisions are made such that the changes are opt-in.
|
How are people going to tell whether PInvokes in all libraries that they use are compatible with preloading? Note that multicore JIT was skipping PInvokes up until recently. This is fixing a functional regression introduced by adding PInvoke handling to multicore JIT. I do not think we need an configuration option to control this behavior. |
370d1fc to
a0a98d3
Compare
|
@kouvel @jkotas I'm not 100% sure, but it seems that this is not directly related to ndirect handling in mcj, because ndirect handling was disabled back in #48326. This ndirect inlining issue (when there are no ni.dll) seems to be present in mcj for a long time. I also agree, that this change should at least be enabled by default. |
src/coreclr/vm/jitinterface.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think you need to suppress this one. Resolving ENCODE_INDIRECT_PINVOKE_TARGET should not be loading any unmanaged .dlls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The suppression for ENCODE_PINVOKE_TARGET should stay.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
|
Sounds good, we can keep this on by default |
Was thinking for self-contained publish, would it be reasonable for a user to expect that all the native library dependencies would be in the default paths? |
As far as I understand, no. For example, crossgen2 is published as self-contained and it does similar trick, when "clrjitilc" is replaced with actually used lib in runtime using |
a0a98d3 to
2482af1
Compare
|
I'm not familiar with the crossgen2 particulars. Would it be a common issue in customer scenarios with self-contained publish (or even with shared framework cases that I expect would resolve the native dependencies through default paths)? I would understand if it's not worthy of a config option, I'm just wondering. Don't want to block on this though. |
|
@gbalykov could you please add new changes to new commits in the future? It would make it much easier to review the changes. |
|
@kouvel sure, sorry for inconvenience |
I do not actually understand how this is possible for regular PInvokes. Is it only a problem for PInvokes with SuppressGCTransitionAttribute? |
Is doesn't look so. |
|
What is the callstack originating in |
|
@jkotas This is the call stack for After that segfault with call stack mentioned in #55185 (comment) happens. |
This callstack is for SuppressGCTransitionAttribute method ( So disabling the PInvoke inlining for multicore JIT solves this problem, but I think it is too conservative. It really only needs to be disabled for the methods with SuppressGCTransitionAttribute. |
Can't this lead to problems in future? Approach with disallowed inlining only for ndirect methods with SuppressGCTransitionAttribute requires sync between code in |
|
There is a ton of code out there that depends on targets for regular PInvokes to be resolved right as the PInvoke is called for the first time, and no earlier. Changing the JIT to resolve targets of regular PInvokes eagerly would be a major breaking change. |
|
@jkotas ok, thanks for clarification. I'll update first part of this PR by reverting |
|
@jkotas I've updated PR, |
src/coreclr/jit/importer.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is less than idea to be hardcoding policy like this in the JIT. I think it would be better to implement this policy on the VM side in the pInvokeMarshalingRequired method instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about passing explicit flag that tells jit to skip inlining for pinvokes with SuppressGCTransitionAttribute in mcj thread like in gbalykov@124bcaf?
pInvokeMarshalingRequired approach will require change with additional arg passing, because on vm side we don't know whether this is mcj thread compilation or not:
bool CEEInfo::pInvokeMarshalingRequired(CORINFO_METHOD_HANDLE method, CORINFO_SIG_INFO* callSiteSig, bool isMCJ)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's useful it is possible to get the PrepareCodeConfig for the in-progress compilation in callbacks to the VM like this:
runtime/src/coreclr/vm/jitinterface.cpp
Lines 6910 to 6912 in 4bab779
| PrepareCodeConfig *config = GetThread()->GetCurrentPrepareCodeConfig(); | |
| if (config != nullptr) | |
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kouvel thanks a lot, that's a piece of logic I was looking for! I've updated PR.
013c32f to
3fe3a3a
Compare
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
NDirect methods can be inlined in other methods during jitting (see `impCheckForPInvokeCall`). If this happens in mcj thread, it results in NDirect methods being loaded. This should not happen in mcj thread, because user can set custom native library search callbacks. In order to fix this, explicit flag JIT_FLAG_NO_PINVOKE_INLINE is set in mcj.
… they have NDirect methods inlined in them NDirect methods can be inlined in other methods during aot compilation. If such method is loaded in mcj thread, it results in NDirect method being loaded. Additionally, there's no way to control inlining during aot from runtime side, because r2r ni.dll might already exist. In order to fix this, runtime aborts loading if method has NDirect methods inlined in it and this all happens in mcj thread.
This reverts commit 59102a8.
… in MulticoreJit thread
3fe3a3a to
ecc436f
Compare
|
I've rebased this on newest main, there were no conflicts and all tests have passed. Can this be merged? |
|
Thank you! |
|
Thanks! |
NDirect methods can be inlined in other methods during jitting (see
impCheckForPInvokeCall). If this happens in mcj thread, it results in NDirect methods being loaded. This should not happen in mcj thread, because user can set custom native library search callbacks. In order to fix this, explicit flag JIT_FLAG_NO_PINVOKE_INLINE is set in mcj.NDirect methods can be inlined in other methods during aot compilation. If such method is loaded in mcj thread, it results in NDirect method being loaded. Additionally, there's no way to control inlining during aot from runtime side, because r2r ni.dll might already exist. In order to fix this, runtime aborts loading if method has NDirect methods inlined in it and this all happens in mcj thread.
Without this change console helloworld segfaults in debug mode at least on x64.
cc @alpencolt @jkotas