-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Fix LSRA handling of arm32 double registers and spilling #94947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
d381c85
6d8545d
445f131
ef25ee6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -7946,8 +7946,9 @@ regNumber LinearScan::getTempRegForResolution(BasicBlock* fromBlock, | |||||||||||||||
| #ifdef TARGET_ARM | ||||||||||||||||
| if (type == TYP_DOUBLE) | ||||||||||||||||
| { | ||||||||||||||||
| // Exclude any doubles for which the odd half isn't in freeRegs. | ||||||||||||||||
| freeRegs = freeRegs & ((freeRegs << 1) & RBM_ALLDOUBLE); | ||||||||||||||||
| // Exclude any doubles for which the odd half isn't in freeRegs, | ||||||||||||||||
| // and restrict down to just the even part of the even/odd pair. | ||||||||||||||||
| freeRegs &= (freeRegs & RBM_ALLDOUBLE_HIGH) >> 1; | ||||||||||||||||
| } | ||||||||||||||||
| #endif | ||||||||||||||||
|
|
||||||||||||||||
|
|
@@ -12457,6 +12458,18 @@ regMaskTP LinearScan::RegisterSelection::select(Interval* currentInterval, | |||||||||||||||
| regMaskTP busyRegs = linearScan->regsBusyUntilKill | linearScan->regsInUseThisLocation; | ||||||||||||||||
| candidates &= ~busyRegs; | ||||||||||||||||
|
|
||||||||||||||||
| #ifdef TARGET_ARM | ||||||||||||||||
| // For TYP_DOUBLE on ARM, we can only use an even floating-point register for which the odd half | ||||||||||||||||
| // is also available. Thus, for every busyReg that is an odd floating-point register, we need to | ||||||||||||||||
| // remove from candidates the corresponding even floating-point register. For example, if busyRegs | ||||||||||||||||
| // contains `f3`, we need to remove `f2` from the candidates for a double register interval. The | ||||||||||||||||
| // clause below creates a mask to do this. | ||||||||||||||||
| if (currentInterval->registerType == TYP_DOUBLE) | ||||||||||||||||
|
||||||||||||||||
| #ifdef TARGET_ARM | |
| if (regType == TYP_DOUBLE) | |
| { | |
| assert(genIsValidDoubleReg(reg)); | |
| regMask |= (regMask << 1); | |
| } | |
| #endif // TARGET_ARM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The model used for arm32 doubles is possibly inconsistent: it seems like only the low register of the pair is set in some (most) cases, whereas in getRegMask both are set (IMO, that function has poorly named given genRegMask exists). Also, it's not clear when LSRA should use genRegMask versus getRegMask.
However, here, busyRegs is correct: only one of two of a double pair are busy. But because we're asking for a double pair, we need to avoid it if the pair is not fully available.
The example is that when we're here allocating a double, the candidates are all the even registers (the odd registers are not candidates, although when we allocate an even register, we implicitly also allocate the odd register). So, say f2 is a candidate, and f3 is a busy register. we need to remove f2 from the candidate set due to f3 being busy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
genRegMaskversusgetRegMask
I think we should use getRegMask(reg, var_type) version and delete genRegMask(reg, var_type). They are doing same thing.
So, say f2 is a candidate, and f3 is a busy register. we need to remove f2 from the candidate set due to f3 being busy.
Correct, thats what I expect, but it should happen at place that added f3 as busy register. Do you know which of regsBusyUntilKill or regsInUseThisLocation does it gets marked as busy? The change looks correct, but I want to make sure we do not accidently mark a register (probably in jitstressregs) not part of current even/odd pair as busy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case was regsBusyUntilKill.
but it should happen at place that added f3 as busy register
If f3 is marked as busy because it is a float (single-precision) type, then you wouldn't want to mark f2 busy at that point. It's only when we're looking for candidates for a double that we need to mask off both f2 and f3 if f3 is already marked as busy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind sharing the repro or the JitDump file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sent a link to JitDump offline.
You can also repro locally using:
py src\coreclr\scripts\superpmi.py replay -f coreclr_tests -arch x86 -target_arch arm -target_os linux -jitoption JitOptRepeat#* -jitoption JitOptRepeatCount#2 -jitoption JitStressRegs#3
And looking for:
Assertion failed '!isRegBusy(physRegRecord->regNum, current->registerType)'
(for me, MC#459399)
(I’ve fixed the other asserts in #94250)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to repro and see how f3 ends up in regsBusyUntilKill. I am ok with this fix, but can we add assert((busyRegs & 0x555555550000) == RBM_NONE) which will confirm at least that we never have f2 in this mask and hence we don't accidently remove f1 from the mask?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better understanding this concept, should we create a function or macro something like:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... that seems very confusing to me.
RBM_ALLDOUBLEalready is the even or "low" registers of the pairs.This change fixes a bug I noticed just by inspection, and leads to some diffs. I'm still investigating, but I also found some pre-existing corruption in double register output in double register disasm, so I'm continuing down the rabbit hole...