Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 3, 2025

Description

Adds infrastructure for emitting integer and floating-point constants in the WebAssembly RyuJIT backend. This enables the backend to generate constant loading instructions (i32.const, i64.const, f32.const, f64.const) when constant nodes appear in the IR.

Changes Made

  • Instruction definitions: Enabled i32_const, i64_const, f32_const, f64_const instructions in instrswasm.h with corresponding formats (IF_LEB128, IF_F32, IF_F64) in emitfmtswasm.h

  • Emitter methods: Added emitIns_R_F() for floating-point constants and emitEncodeLEB64() for LEB128 size calculation in emitwasm.h/cpp (both NYI stubs)

  • Code generation: Implemented genSetRegToConst() in codegenwasm.cpp that:

    • Handles GT_CNS_INT by calling emitIns_R_I() with i32_const/i64_const
    • Handles GT_CNS_DBL by calling emitIns_R_F() with f32_const/f64_const
    • Measures LEB128-encoded size for integer constants
    • Uses 4/8-byte sizes for float/double constants
  • Tree node handling: Added GT_CNS_INT and GT_CNS_DBL cases to genCodeForTreeNode() that invoke genSetRegToConst() and produce the result register

Customer Impact

None. This is foundational work for the experimental WebAssembly RyuJIT backend. The actual instruction emission logic is NYI and will be implemented in follow-up work.

Regression

No. The WebAssembly backend is not yet functional.

Testing

Built CoreCLR successfully with the changes. The WASM backend files (codegenwasm.cpp, emitwasm.cpp) compile without errors or warnings.

Risk

Minimal. Changes are isolated to the WebAssembly-specific codegen files which are not used in production. The NYI stubs ensure no silent failures occur if these code paths are reached unexpectedly.

Original prompt

This section details on the original issue you should resolve

<issue_title>[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp</issue_title>
<issue_description>The WebAssembly RyuJIT backend in codegenwasm.cpp and emitwasm.cpp needs to be updated as follows:

  1. Add a definition for CodeGen::genSetRegToConst which switches on tree->gtOper() and has cases for GT_CNS_INT and GT_CNS_DBL. The case for GT_CNS_INT should use AsIntCon()->IconValue() to store the integer constant into a ssize_t constant. The case for GT_CNS_DBL should use AsDblCon()->DconValue() to store the double constant into a double constant.
  2. Declare a new member function in emitwasm.h called emitEncodeLEB64 with the signature size_t emitter:emitEncodeLEB64(uint8_t *destination, const void *source, bool valueIsSigned).
  3. Define emitEncodeLeb64 in emitwasm.cpp and make its body a NYI_WASM.
  4. Declare a new member function in emitwasm.h called emitIns_R_F with the signature void emitter::emitIns_R_F(instruction ins, emitAttr attr, regNumber reg, double immDbl).
  5. Define emitIns_R_F in emitwasm.cpp and make its body a NYI_WASM.
  6. Update genSetRegToConst's GT_CNS_INT case to use emitEncodeLEB64 to measure the size of the integer constant like so: emitAttr encodedSize = (emitAttr)GetEmitter()->emitEncodeLEB64(nullptr, &constant, true);
  7. Update genSetRegToConst's GT_CNS_DBL case to determine the size of the floating-point constant based on targetType, like so: emitAttr encodedSize = (targetType == TYP_FLOAT) ? EA_4BYTE : EA_8BYTE;
  8. Update genSetRegToConst's GT_CNS_INT case to call emitIns_R_I with appropriate arguments, like so: GetEmitter()->emitIns_R_I((targetType == TYP_INT) ? INS_i32_const : INS_i64_const, encodedSize, targetReg, constant);.
  9. Update genSetRegToConst's GT_CNS_DBL case to call emitIns_R_F with appropriate arguments, like so: GetEmitter()->emitIns_R_F((targetType == TYP_FLOAT) ? INS_f32_const : INS_f64_const, encodedSize, targetReg, constant);.
  10. Update CodeGen::genCodeForTreeNode to add cases for GT_CNS_INT and GT_CNS_DBL that call the new genSetRegToConst member function. The targetReg is treeNode->GetRegNum() and the targetType is treeNode->TypeGet(). After calling genSetRegToConst, call genProduceReg(treeNode).</issue_description>

Comments on the Issue (you are @copilot in this section)


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI assigned Copilot and kg Dec 3, 2025
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 3, 2025
@kg kg added arch-wasm WebAssembly architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 3, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

Copilot AI changed the title [WIP] Add partial constant emitting support to codegenwasm.cpp [Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp Dec 4, 2025
Copilot AI requested a review from kg December 4, 2025 00:08
Copilot finished work on behalf of kg December 4, 2025 00:08
@kg
Copy link
Member

kg commented Dec 4, 2025

cc @dotnet/jit-contrib

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be extended to handle the missing cases in gtSetEvalOrder

{
regNumber targetReg = treeNode->GetRegNum();
var_types targetType = treeNode->TypeGet();
genSetRegToConst(targetReg, targetType, treeNode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just push the constant on the wasm stack in most cases?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it mirror how all the other backends work for now. I agree that we should push to the stack if we can, but I have no idea what that would look like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would just emit the i32.const instruction.

We don't have great analogs for some aspects of Wasm codegen in other targets. I am not sure what 'reg' values stack-only values will carry during codegen, perhaps some kind of temp reg set thing, or maybe just REG_NA.

That is, on most targets a node must either materialize its value in a register or in a (fixed) stack slot (or be contained and have its codegen handled elsewhere), with the exception of some argument pushes. But on Wasm the common thing will be to materialize the value on the stack.

@SingleAccretion may have already thought about this some.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current model is that registers correspond to WASM locals, and I expect that will be sufficient for our needs (where this wouldn't be sufficient is if we wanted to emit debug info with "current stack depth" locations, but I don't think we need that).

Nodes that don't produce WASM locals are therefore all REG_NA-annotated and just push/pop from the stack implicitly. The analogy here is the flags registers and GTF_SET_FLAGS.

Copy link
Contributor

@adamperlin adamperlin Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding some kind of flag or keeping our own internal stack depth counter for JIT debug assertion purposes? I.e., for compiling a binary operation, it might be nice to be able to assert as an invariant that we have indeed left the stack in a state where there are >= 2 entries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding some kind of flag or keeping our own internal stack depth counter for JIT debug assertion purposes? I.e., for compiling a binary operation, it might be nice to be able to assert that we have indeed left the stack in a state where there are >= 2 entries?

My thinking on this has been that there would be a "mini validator" that operates on instrDesc* level in debug code. This way you can catch not just depth mismatches but also type mismatches. That said, we can't do that just yet since we don't have the ordering phase yet.

@kg
Copy link
Member

kg commented Dec 4, 2025

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

@AndyAyersMS
Copy link
Member

AndyAyersMS commented Dec 4, 2025

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

This NYI_WASM will fail compilation the before the JIT can get into codegen with constants.

https://github.com/dotnet/runtime/blob/171a9433a1ef0d8e6316fba75a898367f6870ca4/src/coreclr/jit/gentree.cpp#L5176-L5182

I think it makes sense to call from here into something to get size estimates for the encoded constants. For the "ex" costs it should be cheap so perhaps just cost at 1?

@kg
Copy link
Member

kg commented Dec 4, 2025

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

This NYI_WASM will fail compilation the before the JIT can get into codegen with constants.

https://github.com/dotnet/runtime/blob/171a9433a1ef0d8e6316fba75a898367f6870ca4/src/coreclr/jit/gentree.cpp#L5176-L5182

I think it makes sense to call from here into something to get size estimates for the encoded constants. For the "ex" costs it should be cheap so perhaps just cost at 1?

I wasn't expecting this PR to actually get us to the point where we can emit constants, I was planning to come in and fix everything up by hand. I'm not sure I can convince Copilot to fix something as complex as that site in gentree.

Ex cost of 0 or 1 seems about right, and for sz we can probably call the leb encoder to measure it.

@AndyAyersMS
Copy link
Member

I wasn't expecting this PR to actually get us to the point where we can emit constants, I was planning to come in and fix everything up by hand. I'm not sure I can convince Copilot to fix something as complex as that site in gentree.

Fair enough, just expect we might need to rework things some (like the set reg stuff) once we can get the JIT to go that far.

@kg
Copy link
Member

kg commented Dec 4, 2025

Grabbing this PR to manually rebase it and redo the codegenwasm.cpp stuff from scratch to hopefully clean it up.

@kg kg force-pushed the copilot/add-partial-constant-emitting branch from 4cc70c5 to 3368ede Compare December 4, 2025 19:17
else
b |= 0x80;

if (originalDestination)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (originalDestination)
if (originalDestination != nullptr)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also braces even around single-statement blocks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jitformat didn't point this out, I'll make the update, thanks

if (value != 0)
b |= 0x80;

if (originalDestination)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (originalDestination)
if (originalDestination != nullptr)

return false;
}

size_t emitter::emitEncodeLEB64(uint8_t* destination, const void* source, bool valueIsSigned)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Header comment? In particular if destination can be nullptr.

There is already a ULEB128 size estimator, and we need one for LEB128. Would you suggest calling this with nullptr to get the size, then calling it again later to fill in the contents?

Or should we create a separate size estimator for LEB128 and just have this assert it has a non-null destination?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to use this for the size estimation, that way everything flows through one implementation. I grabbed this from the jiterpreter so we've proved that it works in production, at least for 64-bit values.

case IF_F64:
{
dst += emitOutputByte(dst, opcode);
/* FIXME: How do I get the constant from inside the emitter? */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codegen for float/dbl constants should create an instrDesc (of suitable type, we may need a new kind) with room for the appropriate value, and then you can add an accessor to call here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with single on discord a little bit. I'm thinking either we add a new instrDesc for 64-bit integral constants (and pack doubles into that) or we repurpose VEC constants since we know those are big enough. Which do you prefer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try and keep instrDesc sizes minimal though that may be less of an issue if we're always prejitting. I'd be ok either way.

@kg
Copy link
Member

kg commented Dec 4, 2025

This needs lots of cleanup but disasmo and crossgen2 both complete for constants without hitting any JIT assertions now.
image
The design probably still needs a from-scratch rethink and some of the changes in here probably should be reverted. I'm going to apply the jit-format patch once it runs.

IF_DEF(BLOCK, IS_NONE, NONE) // <opcode> <0x40>
IF_DEF(LABEL, IS_NONE, NONE) // <ULEB128 immediate>
IF_DEF(ULEB128, IS_NONE, NONE) // <opcode> <ULEB128 immediate>
IF_DEF(LEB128, IS_NONE, NONE) // <opcode> <LEB128 immediate (signed)>
Copy link
Contributor

@adamperlin adamperlin Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do SLEB for signed, and in other places where applicable?

Suggested change
IF_DEF(LEB128, IS_NONE, NONE) // <opcode> <LEB128 immediate (signed)>
IF_DEF(SLEB128, IS_NONE, NONE) // <opcode> <SLEB128 immediate>

Comment on lines -129 to +229
static unsigned SizeOfULEB128(uint64_t value)
size_t SizeOfULEB128(uint64_t value)
{
// bits_to_encode = (data != 0) ? 64 - CLZ(x) : 1 = 64 - CLZ(data | 1)
// bytes = ceil(bits_to_encode / 7.0); = (6 + bits_to_encode) / 7
unsigned x = 6 + 64 - (unsigned)BitOperations::LeadingZeroCount(value | 1UL);
// Division by 7 is done by (x * 37) >> 8 where 37 = ceil(256 / 7).
// This works for 0 <= x < 256 / (7 * 37 - 256), i.e. 0 <= x <= 85.
return (x * 37) >> 8;
// Reuses the encoder for consistency and simplicity.
// TODO: If this becomes a bottleneck, do a more direct size estimation.
return EncodeLEB64(nullptr, &value, false);
}
Copy link
Contributor

@PaulusParssinen PaulusParssinen Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we switching away from branchless (on almost every platform) exact size (U)LEB128 size calculation?

Here is the branchless ULEB128 variant if needed https://android.googlesource.com/platform/art/+/ba257bc/runtime/leb128.h#111

Both of these are used in ILC for DWARF (U)LEB128s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp

6 participants