[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp #122162

Copilot · 2025-12-03T23:33:33Z

Description

Adds infrastructure for emitting integer and floating-point constants in the WebAssembly RyuJIT backend. This enables the backend to generate constant loading instructions (i32.const, i64.const, f32.const, f64.const) when constant nodes appear in the IR.

Changes Made

Instruction definitions: Enabled i32_const, i64_const, f32_const, f64_const instructions in instrswasm.h with corresponding formats (IF_LEB128, IF_F32, IF_F64) in emitfmtswasm.h
Emitter methods: Added emitIns_R_F() for floating-point constants and emitEncodeLEB64() for LEB128 size calculation in emitwasm.h/cpp (both NYI stubs)
Code generation: Implemented genSetRegToConst() in codegenwasm.cpp that:
- Handles GT_CNS_INT by calling emitIns_R_I() with i32_const/i64_const
- Handles GT_CNS_DBL by calling emitIns_R_F() with f32_const/f64_const
- Measures LEB128-encoded size for integer constants
- Uses 4/8-byte sizes for float/double constants
Tree node handling: Added GT_CNS_INT and GT_CNS_DBL cases to genCodeForTreeNode() that invoke genSetRegToConst() and produce the result register

Customer Impact

None. This is foundational work for the experimental WebAssembly RyuJIT backend. The actual instruction emission logic is NYI and will be implemented in follow-up work.

Regression

No. The WebAssembly backend is not yet functional.

Testing

Built CoreCLR successfully with the changes. The WASM backend files (codegenwasm.cpp, emitwasm.cpp) compile without errors or warnings.

Risk

Minimal. Changes are isolated to the WebAssembly-specific codegen files which are not used in production. The NYI stubs ensure no silent failures occur if these code paths are reached unexpectedly.

Original prompt

This section details on the original issue you should resolve

<issue_title>[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp</issue_title>
<issue_description>The WebAssembly RyuJIT backend in codegenwasm.cpp and emitwasm.cpp needs to be updated as follows:

Add a definition for CodeGen::genSetRegToConst which switches on tree->gtOper() and has cases for GT_CNS_INT and GT_CNS_DBL. The case for GT_CNS_INT should use AsIntCon()->IconValue() to store the integer constant into a ssize_t constant. The case for GT_CNS_DBL should use AsDblCon()->DconValue() to store the double constant into a double constant.

Declare a new member function in emitwasm.h called emitEncodeLEB64 with the signature size_t emitter:emitEncodeLEB64(uint8_t *destination, const void *source, bool valueIsSigned).

Define emitEncodeLeb64 in emitwasm.cpp and make its body a NYI_WASM.

Declare a new member function in emitwasm.h called emitIns_R_F with the signature void emitter::emitIns_R_F(instruction ins, emitAttr attr, regNumber reg, double immDbl).

Define emitIns_R_F in emitwasm.cpp and make its body a NYI_WASM.

Update genSetRegToConst's GT_CNS_INT case to use emitEncodeLEB64 to measure the size of the integer constant like so: emitAttr encodedSize = (emitAttr)GetEmitter()->emitEncodeLEB64(nullptr, &constant, true);

Update genSetRegToConst's GT_CNS_DBL case to determine the size of the floating-point constant based on targetType, like so: emitAttr encodedSize = (targetType == TYP_FLOAT) ? EA_4BYTE : EA_8BYTE;

Update genSetRegToConst's GT_CNS_INT case to call emitIns_R_I with appropriate arguments, like so: GetEmitter()->emitIns_R_I((targetType == TYP_INT) ? INS_i32_const : INS_i64_const, encodedSize, targetReg, constant);.

Update genSetRegToConst's GT_CNS_DBL case to call emitIns_R_F with appropriate arguments, like so: GetEmitter()->emitIns_R_F((targetType == TYP_FLOAT) ? INS_f32_const : INS_f64_const, encodedSize, targetReg, constant);.

Update CodeGen::genCodeForTreeNode to add cases for GT_CNS_INT and GT_CNS_DBL that call the new genSetRegToConst member function. The targetReg is treeNode->GetRegNum() and the targetType is treeNode->TypeGet(). After calling genSetRegToConst, call genProduceReg(treeNode).</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes [Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp #122161

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

dotnet-policy-service · 2025-12-03T23:54:54Z

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

kg · 2025-12-04T00:38:21Z

cc @dotnet/jit-contrib

AndyAyersMS

This should also be extended to handle the missing cases in gtSetEvalOrder

AndyAyersMS · 2025-12-04T01:33:05Z

src/coreclr/jit/codegenwasm.cpp

+        {
+            regNumber targetReg  = treeNode->GetRegNum();
+            var_types targetType = treeNode->TypeGet();
+            genSetRegToConst(targetReg, targetType, treeNode);


Shouldn't this just push the constant on the wasm stack in most cases?

I had it mirror how all the other backends work for now. I agree that we should push to the stack if we can, but I have no idea what that would look like.

It would just emit the i32.const instruction.

We don't have great analogs for some aspects of Wasm codegen in other targets. I am not sure what 'reg' values stack-only values will carry during codegen, perhaps some kind of temp reg set thing, or maybe just REG_NA.

That is, on most targets a node must either materialize its value in a register or in a (fixed) stack slot (or be contained and have its codegen handled elsewhere), with the exception of some argument pushes. But on Wasm the common thing will be to materialize the value on the stack.

@SingleAccretion may have already thought about this some.

The current model is that registers correspond to WASM locals, and I expect that will be sufficient for our needs (where this wouldn't be sufficient is if we wanted to emit debug info with "current stack depth" locations, but I don't think we need that).

Nodes that don't produce WASM locals are therefore all REG_NA-annotated and just push/pop from the stack implicitly. The analogy here is the flags registers and GTF_SET_FLAGS.

Is it worth adding some kind of flag or keeping our own internal stack depth counter for JIT debug assertion purposes? I.e., for compiling a binary operation, it might be nice to be able to assert as an invariant that we have indeed left the stack in a state where there are >= 2 entries?

Is it worth adding some kind of flag or keeping our own internal stack depth counter for JIT debug assertion purposes? I.e., for compiling a binary operation, it might be nice to be able to assert that we have indeed left the stack in a state where there are >= 2 entries?

My thinking on this has been that there would be a "mini validator" that operates on instrDesc* level in debug code. This way you can catch not just depth mismatches but also type mismatches. That said, we can't do that just yet since we don't have the ordering phase yet.

kg · 2025-12-04T01:37:20Z

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

AndyAyersMS · 2025-12-04T01:40:25Z

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

This NYI_WASM will fail compilation the before the JIT can get into codegen with constants.

https://github.com/dotnet/runtime/blob/171a9433a1ef0d8e6316fba75a898367f6870ca4/src/coreclr/jit/gentree.cpp#L5176-L5182

I think it makes sense to call from here into something to get size estimates for the encoded constants. For the "ex" costs it should be cheap so perhaps just cost at 1?

kg · 2025-12-04T01:56:42Z

This should also be extended to handle the missing cases in gtSetEvalOrder

Looking now, I see GT_CNS_LNG. Did I miss another one?

This NYI_WASM will fail compilation the before the JIT can get into codegen with constants.

https://github.com/dotnet/runtime/blob/171a9433a1ef0d8e6316fba75a898367f6870ca4/src/coreclr/jit/gentree.cpp#L5176-L5182

I think it makes sense to call from here into something to get size estimates for the encoded constants. For the "ex" costs it should be cheap so perhaps just cost at 1?

I wasn't expecting this PR to actually get us to the point where we can emit constants, I was planning to come in and fix everything up by hand. I'm not sure I can convince Copilot to fix something as complex as that site in gentree.

Ex cost of 0 or 1 seems about right, and for sz we can probably call the leb encoder to measure it.

AndyAyersMS · 2025-12-04T02:07:55Z

I wasn't expecting this PR to actually get us to the point where we can emit constants, I was planning to come in and fix everything up by hand. I'm not sure I can convince Copilot to fix something as complex as that site in gentree.

Fair enough, just expect we might need to rework things some (like the set reg stuff) once we can get the JIT to go that far.

Co-authored-by: kg <[email protected]>

kg · 2025-12-04T18:33:27Z

Grabbing this PR to manually rebase it and redo the codegenwasm.cpp stuff from scratch to hopefully clean it up.

AndyAyersMS · 2025-12-04T19:46:16Z

src/coreclr/jit/emitwasm.cpp

+            else
+                b |= 0x80;
+
+            if (originalDestination)


Suggested change

if (originalDestination)

if (originalDestination != nullptr)

Also braces even around single-statement blocks

jitformat didn't point this out, I'll make the update, thanks

AndyAyersMS · 2025-12-04T19:46:49Z

src/coreclr/jit/emitwasm.cpp

+            if (value != 0)
+                b |= 0x80;
+
+            if (originalDestination)


Suggested change

if (originalDestination)

if (originalDestination != nullptr)

AndyAyersMS · 2025-12-04T19:48:41Z

src/coreclr/jit/emitwasm.cpp

    return false;
 }

+size_t emitter::emitEncodeLEB64(uint8_t* destination, const void* source, bool valueIsSigned)


Header comment? In particular if destination can be nullptr.

There is already a ULEB128 size estimator, and we need one for LEB128. Would you suggest calling this with nullptr to get the size, then calling it again later to fill in the contents?

Or should we create a separate size estimator for LEB128 and just have this assert it has a non-null destination?

I would prefer to use this for the size estimation, that way everything flows through one implementation. I grabbed this from the jiterpreter so we've proved that it works in production, at least for 64-bit values.

AndyAyersMS · 2025-12-04T19:52:20Z

src/coreclr/jit/emitwasm.cpp

+        case IF_F64:
+        {
+            dst += emitOutputByte(dst, opcode);
+            /* FIXME: How do I get the constant from inside the emitter? */


Codegen for float/dbl constants should create an instrDesc (of suitable type, we may need a new kind) with room for the appropriate value, and then you can add an accessor to call here.

I discussed this with single on discord a little bit. I'm thinking either we add a new instrDesc for 64-bit integral constants (and pack doubles into that) or we repurpose VEC constants since we know those are big enough. Which do you prefer?

We try and keep instrDesc sizes minimal though that may be less of an issue if we're always prejitting. I'd be ok either way.

kg · 2025-12-04T23:05:52Z

This needs lots of cleanup but disasmo and crossgen2 both complete for constants without hitting any JIT assertions now.

The design probably still needs a from-scratch rethink and some of the changes in here probably should be reverted. I'm going to apply the jit-format patch once it runs.

adamperlin · 2025-12-04T23:31:43Z

src/coreclr/jit/emitfmtswasm.h

 IF_DEF(BLOCK,   IS_NONE, NONE) // <opcode> <0x40>
 IF_DEF(LABEL,   IS_NONE, NONE) // <ULEB128 immediate>
 IF_DEF(ULEB128, IS_NONE, NONE) // <opcode> <ULEB128 immediate>
+IF_DEF(LEB128,  IS_NONE, NONE) // <opcode> <LEB128 immediate (signed)>


Can we do SLEB for signed, and in other places where applicable?

Suggested change

IF_DEF(LEB128, IS_NONE, NONE) // <opcode> <LEB128 immediate (signed)>

IF_DEF(SLEB128, IS_NONE, NONE) // <opcode> <SLEB128 immediate>

PaulusParssinen · 2025-12-05T16:05:02Z

src/coreclr/jit/emitwasm.cpp

-static unsigned SizeOfULEB128(uint64_t value)
+size_t SizeOfULEB128(uint64_t value)
 {
-    // bits_to_encode = (data != 0) ? 64 - CLZ(x) : 1 = 64 - CLZ(data | 1)
-    // bytes = ceil(bits_to_encode / 7.0);            = (6 + bits_to_encode) / 7
-    unsigned x = 6 + 64 - (unsigned)BitOperations::LeadingZeroCount(value | 1UL);
-    // Division by 7 is done by (x * 37) >> 8 where 37 = ceil(256 / 7).
-    // This works for 0 <= x < 256 / (7 * 37 - 256), i.e. 0 <= x <= 85.
-    return (x * 37) >> 8;
+    // Reuses the encoder for consistency and simplicity.
+    // TODO: If this becomes a bottleneck, do a more direct size estimation.
+    return EncodeLEB64(nullptr, &value, false);
+}


Why are we switching away from branchless _{(on almost every platform)} exact size (U)LEB128 size calculation?

Here is the branchless ULEB128 variant if needed https://android.googlesource.com/platform/art/+/ba257bc/runtime/leb128.h#111

Both of these are used in ILC for DWARF (U)LEB128s

Copilot AI assigned Copilot and kg Dec 3, 2025

Copilot started work on behalf of kg December 3, 2025 23:33 View session

github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 3, 2025

kg added arch-wasm WebAssembly architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 3, 2025

Copilot AI changed the title ~~[WIP] Add partial constant emitting support to codegenwasm.cpp~~ [Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp Dec 4, 2025

Copilot AI requested a review from kg December 4, 2025 00:08

Copilot finished work on behalf of kg December 4, 2025 00:08

kg approved these changes Dec 4, 2025

View reviewed changes

AndyAyersMS reviewed Dec 4, 2025

View reviewed changes

Copilot AI and others added 3 commits December 4, 2025 10:27

Initial plan

e64dd5d

Add partial constant emitting support to WebAssembly RyuJIT backend

116ab85

Co-authored-by: kg <[email protected]>

Back out some changes.

e6d4abc

Fill out more stuff

3368ede

kg force-pushed the copilot/add-partial-constant-emitting branch from 4cc70c5 to 3368ede Compare December 4, 2025 19:17

Cleanup formatting

9425af2

AndyAyersMS reviewed Dec 4, 2025

View reviewed changes

kg added 4 commits December 4, 2025 12:17

Cleanups

c84d652

Checkpoint

dfd67ee

Checkpoint

7aae050

Restore obliterated changes

c187ed4

kg added 17 commits December 4, 2025 12:49

Checkpoint

71b7ca4

Checkpoint

0ed9418

Checkpoint

9601838

Checkpoint

f20b75b

Checkpoint

31817a4

Checkpoint

7215eec

Checkpoint

248c9ad

Checkpoint

f12c3cc

Checkpoint

fee66ff

Checkpoint

3c24bf3

Checkpoint

a7217ac

Checkpoint

540d2a4

Checkpoint

1ad2af5

Checkpoint

dcb37a5

Checkpoint

43fad5e

Checkpoint

3192443

Checkpoint

7cec96a

Apply JIT format patch

5109126

adamperlin reviewed Dec 4, 2025

View reviewed changes

PaulusParssinen reviewed Dec 5, 2025

View reviewed changes

	IF_DEF(LEB128, IS_NONE, NONE) // <opcode> <LEB128 immediate (signed)>
	IF_DEF(SLEB128, IS_NONE, NONE) // <opcode> <SLEB128 immediate>

[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp #122162

Are you sure you want to change the base?

[Wasm RyuJit] Add partial constant emitting support to codegenwasm.cpp #122162

Conversation

Copilot AI commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes Made

Customer Impact

Regression

Testing

Risk

Comments on the Issue (you are @copilot in this section)

Uh oh!

dotnet-policy-service bot commented Dec 3, 2025

Uh oh!

kg commented Dec 4, 2025

Uh oh!

AndyAyersMS left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adamperlin Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kg commented Dec 4, 2025

Uh oh!

AndyAyersMS commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kg commented Dec 4, 2025

Uh oh!

AndyAyersMS commented Dec 4, 2025

Uh oh!

kg commented Dec 4, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kg commented Dec 4, 2025

Uh oh!

adamperlin Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

PaulusParssinen Dec 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Copilot AI commented Dec 3, 2025 •

edited

Loading

adamperlin Dec 4, 2025 •

edited

Loading

AndyAyersMS commented Dec 4, 2025 •

edited

Loading

adamperlin Dec 4, 2025 •

edited

Loading

PaulusParssinen Dec 5, 2025 •

edited

Loading