Fix for Issue 44895 #45284
|
@dotnet/jit-contrib PTAL There are no Asm diffs in System.Private.Corelib |
src/coreclr/src/jit/gentree.cpp
Outdated
I could back this change out as it is not necessary for this fix
Does this change cause the diffs that you see with arm64 multi-reg?
Updated comment:
Yes, it does cause the diffs that I saw. I am backing it out as it is unnecessary
|
Nice catch! Does the test take a long time to run, should it be Pri1 or longRunningGC test? I have tried to run it locally and it failed with a Jit assert: for this tree: looks like we catch this problem with our asserts but it makes it hard to understand what exactly goes wrong in release and why it leads to bad execution. I think your test uncovers a bigger issue with that can be hit not only for I have added another condition to your test so now it fails with the same assert but for I think there are 2 main questions:
|
|
The test fails with an assert when using a Checked build. With a retail build I think that there is bad codegen and/or a GC hole. I will try running it with GCSTRESS. |
|
We will want a conservative fix for 5.0, so that is what I will pursue as the initial fix in master as well. We wouldn't want to allow gtNewTempAssign to accept a mismatched TYP_REF type assignment. |
89445d5 to 3c990ff
|
Joy, Joy: This branch has conflicts that must be resolved |
|
PR #44973 removed the second src folder. |
|
@JulieLeeMSFT PR #44973 removed the second src folder. Yes, I know, and this caused all existing PRs to have a merge conflict that has to be resolved by hand. :-( |
|
@dotnet/jit-contrib @sandreenko Please take a look. I expanded this change to disallow any mismatched type for a GT_RETURN of a struct, |
|
@briansull do you know on which tree we had a GC hole, was it I was surprised to see that in I agree with your latest change that the type doesn't matter but I am not sure that |
|
@sandreenko |
|
@AndyAyersMS @sandreenko |
|
I still don't know what the bug is, so it's hard to be confident this is a complete fix. Also I wonder if there's some overlap here with #45557? |
|
@briansull are you confident that the issue can't be repro when the parent is not a return? |
Do you mean "when the parent is not a GT_RETURN"? |
Yes, thanks, edited. |
|
@briansull what does the bad IL (and codegen) look like in the test case, and what does it look like with your change? |
|
I am continuing to investigate. This is not an area I know a lot about. @sandreenko may know more about the invariants in this area than I do. If I disallow all other parents there is a huge regression, but it looks like most of that is where the parent is a GT_ADDR node, which should always be a safe parent node. |
|
@BruceForstall The IL isn't bad. Instead, we hit a whole pile of asserts with a checked build that don't exist in a release build, so a release build generates what I assume is bad code, because the asserts are there for a good reason. |
|
Only five methods have diffs in |
|
Here is an example of what the transformation does: The tree node being modified here is: In this case the parent node is a GT_ASG, so this is one of the five diffs that I got in System.Private.Corelib |
|
I've simplified the test so now it does not need any additional plugins or iterations, the test always passes in Debug and fails in Release (compile with VS with 5.0 target). I think it is easier to analyze, so later on we CSE the LHS and write to a CSE temp instead of the actual value, when return still references and the actual return value is never written. Could somebody generate 3.1 JitDump? I do not have it on my current machine. I have checked that it works there but don't know what exactly exposed the issue, I suspect it was caused by mine It is still an open question if it can be reproduced without into correct but morph is not called in runtime/src/coreclr/jit/morph.cpp Lines 16935 to 16942 in 76a443d
again, I think we are just getting lucky and the other phases are not really prepared for this |
it could be a safe solution, but this method looks important: what happens there? |
|
/azp run runtime-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
src/coreclr/jit/gentree.cpp
Outdated
Since this is under #ifdef DEBUG it's not going to do anything useful.
I will add this new check and noway_assert:
// Added this noway_assert for runtime\issue 44895, to protect against silent bad codegen
//
if ((dstTyp == TYP_STRUCT) && (valTyp == TYP_REF))
{
    noway_assert(!"Incompatible types for gtNewTempAssign");
}
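To illustrate the point made above about debug-only checks, here is a minimal, self-contained sketch. It is not JIT code: `VarType`, `debugOnlyCheck`, `alwaysCheck`, `checkTempAssign` and `rejects` are hypothetical names invented for this example, and throwing an exception merely stands in for whatever the real `noway_assert` does on failure. The only thing it aims to show is that a check compiled under `#ifdef DEBUG` disappears from a release build, while an unconditional check still rejects a TYP_REF value assigned to a TYP_STRUCT destination.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the JIT's type enum (assumption, not the real definition).
enum VarType { TYP_INT, TYP_REF, TYP_STRUCT };

// Debug-only check: does nothing unless DEBUG is defined at compile time.
inline void debugOnlyCheck(bool cond, const char* msg)
{
#ifdef DEBUG
    if (!cond)
    {
        throw std::logic_error(msg);
    }
#else
    (void)cond;
    (void)msg;
#endif
}

// Always-on check: plays the role of noway_assert in this sketch.
// Modeling a failure as an exception is an assumption made for the example.
inline void alwaysCheck(bool cond, const char* msg)
{
    if (!cond)
    {
        throw std::runtime_error(msg);
    }
}

// Toy version of the guard quoted above: reject struct <- ref temp assignments.
inline void checkTempAssign(VarType dstTyp, VarType valTyp)
{
    if ((dstTyp == TYP_STRUCT) && (valTyp == TYP_REF))
    {
        alwaysCheck(false, "Incompatible types for gtNewTempAssign");
    }
}

// Helper for callers and tests: true when the always-on check rejected the pair.
inline bool rejects(VarType dstTyp, VarType valTyp)
{
    try
    {
        checkTempAssign(dstTyp, valTyp);
        return false;
    }
    catch (const std::runtime_error&)
    {
        return true;
    }
}
```

With this sketch, `rejects(TYP_STRUCT, TYP_REF)` is true regardless of whether `DEBUG` is defined, whereas a guard routed through `debugOnlyCheck` would silently pass in a build without `DEBUG`.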
src/coreclr/jit/importer.cpp
Outdated
I think for backport sake we should leave it out.
@sandreenko Both tests fail when run with COMPlus_TieredCompilation=0 |
They're more or less equivalent so I would just keep the simpler one. |
- Fix: Don't allow an unwrapped promoted field of TYP_REF to be returned when we are expecting a TYP_STRUCT
- Backout change in gtGetStructHandleIfPresent for GT_RETURN as it isn't needed for this fix
- Deoptimize all GT_RETURN's with mismatched types for promoted struct fields.
|
@AndyAyersMS @sandreenko |
|
/azp run runtime-coreclr jitstress |
|
Azure Pipelines successfully started running 1 pipeline(s). |
sandreenko
left a comment
LGTM
|
runtime (Libraries Test Run release coreclr windows x64 Debug) Failing after 42m |
|
I think this is a known issue that pops up from time to time: #29683. |
|
@AndyAyersMS - Thanks. I will go ahead and merge, as I'm sure the failure isn't caused by my changes. |
|
Fixes #44895 |
* Fix for Issue 44895
  - Fix: Don't allow an unwrapped promoted field of TYP_REF to be returned when we are expecting a TYP_STRUCT
  - Backout change in gtGetStructHandleIfPresent for GT_RETURN as it isn't needed for this fix
  - Deoptimize all GT_RETURN's with mismatched types for promoted struct fields.
* Allow both GT_ADDR and GT_ASG as a parent node
* Add second test case Repro2_44895.cs
* Change assert about Incompatible types to be a noway_assert in gtNewTempAssign
* Only use the smaller repro case for Runtime_44895.cs
* Added noway_assert in release build for an assignment of a TYP_REF to a TYP_STRUCT
* rerun jit-format
Don't allow an unwrapped promoted field to be returned when we are expecting a TYP_STRUCT
(Expanded to include any mismatched type)
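The rule in the description can be read as a small predicate. The sketch below is only an illustration of that reading, not the code in this PR: `ToyOper`, `ToyType` and `canUnwrapPromotedField` are hypothetical names, and the handling of parents other than a return is an assumption of this sketch based on the commit notes (address-of and assignment parents left alone).

```cpp
#include <cassert>

// Toy stand-ins for JIT node operators and types (hypothetical, for illustration only).
enum ToyOper { OPER_RETURN, OPER_ADDR, OPER_ASG };
enum ToyType { TOY_INT, TOY_REF, TOY_STRUCT };

// Decide whether a struct local may be replaced by its single promoted field.
//   parent       : operator of the node that consumes the local
//   expectedType : type the parent expects (for a return, the return type)
//   fieldType    : type of the promoted field that would be substituted
inline bool canUnwrapPromotedField(ToyOper parent, ToyType expectedType, ToyType fieldType)
{
    if (parent == OPER_RETURN)
    {
        // Under a return, any type mismatch blocks the substitution,
        // e.g. a TOY_REF field where a TOY_STRUCT is expected.
        return expectedType == fieldType;
    }

    // Assumption of this sketch: other parents are not affected by the rule.
    return true;
}
```

Under these assumptions, a return expecting a struct keeps the struct local when the promoted field is a reference, while a return whose expected type already matches the field type can still use the field directly.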