-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Fix CRC32 encoding #60329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CRC32 encoding #60329
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsOn x64, when the crc32 instruction 2nd operand is a memory address Fixes #59714
|
|
@tannergooding @dotnet/jit-contrib PTAL |
src/coreclr/jit/emitxarch.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure this is the right fix....
I would have expected EncodedBySSE38OrSSE3A to be returning true here, because CRC32 is F2 0F 38 F0 or F2 0F 38 F1.
However, looking at https://github.com/dotnet/runtime/blob/745fa1c7b30f1b9107084f0baecc325527fa9561/src/coreclr/jit/emitxarch.cpp#L1583-L1611, it might be failing because the "prefix" is 0xF2 and not 0x66.
0x66, 0xF2, and 0xF3 are all encounterable "prefix" bytes here and this is probably something that was missed when instructions covering the other two were added. In particular:
0x66is the most prominent and covers every instruction we support today, exceptCRC320xF2- Only used forSSE42.CRC320xF3- no instructions exposed here yetADX.ADOXAESKLE.AESDEC128KLAESKLE.AESDEC256KLAESKLEWIDE_KL.AESDECWIDE128KLAESKLEWIDE_KL.AESDECWIDE256KLAESKLE.AESENC128KLAESKLE.AESENC256KLAESKLEWIDE_KL.AESENCWIDE128KLAESKLEWIDE_KL.AESENCWIDE256KLAESKLE.ENCODEKEY128AESKLE.ENCODEKEY256KL.LOADIWKEY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure this is the right fix
Do you mean you don't know if this fixes the issue? Or you think there might be a more general fix? (I did of course verify that it fixes the issue. It's also nice that it's very simple and contained, as I want to port this to .NET 6.)
More details:
EncodedBySSE38OrSSE3A returns false for crc32 because crc32 is not in the SSE or AVX instruction range, so IsSSEOrAVXInstruction returns false. (Not sure why it's defined that way; tzcnt/lzcnt/popcnt also aren't in that set.)
Looking through the code base, there are 17 places that use the condition (EncodedBySSE38orSSE3A(ins) || (ins == INS_crc32)) and only 8 that use the condition (EncodedBySSE38orSSE3A(ins)). In particular, the places without the || (ins == INS_crc32) logic handle emission of:
IF_RRW_ARD_CNS
IF_RWR_ARD_CNS
IF_RWR_RRD_ARD_CNS
IF_RWR_RRD_ARD_RRD
IF_RRW_SRD_CNS
IF_RWR_SRD_CNS
IF_RWR_RRD_SRD
IF_RWR_RRD_SRD_CNS
IF_RWR_RRD_SRD_RRD
IF_RRW_MRD_CNS
IF_RWR_MRD_CNS
IF_RWR_RRD_MRD
IF_RWR_RRD_MRD_CNS
IF_RWR_RRD_MRD_RRD
Since crc32 is a binary operator, none of these apply, so we could let EncodedBySSE38OrSSE3A handle crc32 as well (and rename it to match).
The only question then is whether Is4ByteSSEInstruction should also include it (that's the only other use of EncodedBySSE38OrSSE3A). That's a little trickier to unravel; there a case in emitOutputRR I'm not sure about; a case in emitOutputInstr dealing with IF_RRW_RRW_CNS that doesn't matter; and a case in emitGetAdjustedSize to fix. Presumably we could define it as:
bool emitter::Is4ByteSSEInstruction(instruction ins)
{
return (!UseVEXEncoding() && EncodedBySSE38orSSE3A(ins)) || (ins == INS_crc32);
}
as crc32 is not affected by VEX encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean you don't know if this fixes the issue
I meant this as, there might be additional locations that need equivalent fixups; however, adjusting the existing method that handles SSE38 and SSE3A encodings may be a better overall fix.
EncodedBySSE38OrSSE3A returns false for crc32 because crc32 is not in the SSE or AVX instruction range
Ah I see, that would impact this as well: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/instrsxarch.h#L615
For what its worth, it is actually an SSE4.2 instruction (and SSE38 based encoding), even if its not a vector instruction and should likely be in the list (much as BMI1 and BMI2 are in the overall AVX range). I'm guessing that crc32, tzcnt, lzcnt, and popcnt fell into some special category that forced them out of the SSE/AVX ranges, possibly due to more complicated fixups that would've been required. In particular:
crc32is explicitly SSE4.2tzcntis explicitly BMI1lzcntis implicitly BMI1 on Intel; its separate because its a separate bit for compat with AMD which exposed it as part of ABM ~2007 (so 6 years earlier)popcntis implicitly SSE4.2 on Intel; its separate because for the same reason aslzcnt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on your comments above; it sounds like this fix is fine for .NET 6. It's probably worth seeing if we should adjust the overall handling for .NET 7 to properly account for the SSE38 prefix difference (as that is likely required for future instruction expansion regardless).
On x64, when the crc32 instruction 2nd operand is a memory address (such as for a static field), and that address is containable (which normally doesn't happen, because the address will be above the 4GB lower address space), then the instruction was being improperly encoded.
745fa1c to
2e286ba
Compare
|
@tannergooding Do you have any further change you think are required for this, or are you willing to approve this PR? |
tannergooding
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good for a minimal fix.
I think, in terms of potential future instructions and the like we may need a more complex fix properly recognizing CRC32 as EncodedBySSE38
|
/backport to release/6.0 |
|
Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1338896860 |
On x64, when the crc32 instruction 2nd operand is a memory address
(such as for a static field), and that address is containable
(which normally doesn't happen, because the address will be above
the 4GB lower address space), then the instruction was being
improperly encoded.
Fixes #59714