[release/8.0-staging] Fix constant folding for arm64 MultiplyByScalar operations involving Vector2.One #98150

jakobbotsch · 2024-02-08T10:18:32Z

Backport of #94413 to release/8.0-staging

Customer Impact

Customer reported
Found internally

Expressions involving multiplications of VectorX.One with a scalar are constant folded to incorrect values. For example, Vector2.One * a, which should be folded to <a, a> is folded to <a, 0>. Customer reported in #98137; this backports a .NET 9 fix #94413 that noticed and fixed the issue.

Regression

Yes
No

Testing

Regression test included.

Risk

Low. The fix has been in main for several months already.

ghost · 2024-02-08T10:18:43Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #94413 to release/8.0-staging

Customer Impact

Customer reported
Found internally

Expressions involving multiplications of VectorX.One with a scalar are constant folded to incorrect values. For example, Vector2.One * a, which should be folded to <a, a> is folded to <a, 0>. Customer reported in #98137; this backports a .NET 9 fix #94413 that noticed and fixed the issue.

Regression

Yes
No

Testing

Regression test included.

Risk

Low. The fix has been in main for several months already.

Author:	jakobbotsch
Assignees:	-
Labels:	`area-CodeGen-coreclr`
Milestone:	-

carlossanlop · 2024-02-08T22:44:08Z

Friendly reminder that Monday February 12th is the Code Complete deadline for the March Release. Please make sure to add the servicing-consider label when ready and send the email to Tactics requesting approval.

theofficialgman · 2024-02-09T06:14:35Z

@jakobbotsch in the meantime, while this bug exists in release 8.0.x and likely will for at least a few weeks/months until this is merged and a new release is cut, can the problematic NI_AdvSimd_MultiplyByScalar and NI_AdvSimd_Arm64_MultiplyByScalar be forced to use the interpreter and never be jitted via an environment variable?

jakobbotsch · 2024-02-09T11:53:31Z

@jakobbotsch in the meantime, while this bug exists in release 8.0.x and likely will for at least a few weeks/months until this is merged and a new release is cut, can the problematic NI_AdvSimd_MultiplyByScalar and NI_AdvSimd_Arm64_MultiplyByScalar be forced to use the interpreter and never be jitted via an environment variable?

Not easily I think, @tannergooding do you know if we have any fine grained controls that would allow this?

A possible workaround is to extract the multiplication to your own uninlined function, something like

[MethodImpl(MethodImplOptions.NoInlining)]
static Vector2 MulByScalar(Vector2 v, float a) => v * a;

jeffschwMSFT

approved. we will take for consideration in 8.0.x

tannergooding · 2024-02-09T20:03:29Z

do you know if we have any fine grained controls that would allow this?

We do not. You either get all intrinsics in an ISA, or none of them. There is no per method configuration (which would be a nightmare to maintain, given we have over 2000 Arm64 intrinsic APIs).

We likewise have no support for emulating the behavior of Arm64 hardware intrinsics. If the hardware doesn't support them, you don't get access to them and they throw.

theofficialgman · 2024-02-11T19:22:03Z

do you know if we have any fine grained controls that would allow this?

We do not. You either get all intrinsics in an ISA, or none of them. There is no per method configuration (which would be a nightmare to maintain, given we have over 2000 Arm64 intrinsic APIs).

@tannergooding understood. so disabling all arm64 hardware intrinsics is an option then in the meantime DOTNET_EnableArm64AdvSimd=0 (and is much more performant that disabling the whole jit)

Is there any way to set that environment variable as default at application compile time? It seems like it needs to be set before the dotnet runtime even starts inorder for the jit to respect it, so setting it in the main Program.cs doesn't seem to work as that is too late.

tannergooding · 2024-02-11T19:56:22Z

so disabling all arm64 hardware intrinsics is an option then in the meantime DOTNET_EnableArm64AdvSimd=0 (and is much more performant that disabling the whole jit)

A lot of very core components in the BCL use hardware intrinsics for acceleration but very few of them rely on multiplication by a scalar and even fewer rely or expose any potential to hit multiplication of a scalar by a constant. As such, I expect you would see very significant performance regressions from globally disabling the feature and it would likely be significantly easier to just workaround the behavior in your own application instead.

For example, if you're needing to construct a Vector2 with every element being the same value, it's much better to simply do new Vector2(number). Multiplication by 1 is effectively a no-op, and so the optimization here in morph is really just a "nice to have" to help minimally improve codegen if you happen to hit this path after inlining. It's not expected to be some typical thing that a user would frequently hit in their applicaiton (much as you wouldn't expect to see x * 0 or x * 1 for scalar code in a library or application).

It seems like it needs to be set before the dotnet runtime even starts inorder for the jit to respect it

Yes

Is there any way to set that environment variable as default at application compile time

Not that I'm aware of. These configuration knobs are primarily meant for testing purposes, so dev's can trivially test things like their software fallback on a platform with SIMD support. They aren't really meant to be actually disabled in production scenarios, so they only support environment variables and don't have a corresponding RuntimeConfig.json switch or anything like that.

jakobbotsch · 2024-02-12T16:57:38Z

Approved by tactics over email.

theofficialgman · 2024-02-12T17:01:46Z

A lot of very core components in the BCL use hardware intrinsics for acceleration but very few of them rely on multiplication by a scalar and even fewer rely or expose any potential to hit multiplication of a scalar by a constant. As such, I expect you would see very significant performance regressions from globally disabling the feature and it would likely be significantly easier to just workaround the behavior in your own application instead.

Sure, performance is poorer but the results are correct (which is what is important). Given the time/complexity/difficulty in working around the temporary problem (see below), I proposed the temporary solution. This issue existing at all even when a fix was in main for months I think highlights an issue in your development process for .NET. If there is a functional bug that can be trivially backported where there is a discrepancy between the interpreter and jit, why not mark that for backporting in every case?

For example, if you're needing to construct a Vector2 with every element being the same value, it's much better to simply do new Vector2(number). Multiplication by 1 is effectively a no-op, and so the optimization here in morph is really just a "nice to have" to help minimally improve codegen if you happen to hit this path after inlining. It's not expected to be some typical thing that a user would frequently hit in their applicaiton (much as you wouldn't expect to see x * 0 or x * 1 for scalar code in a library or application).

Sure, easy to say in theory, not feasible in practice where vector math is used across many files in the application and its framework (just two examples here: https://github.com/FosterFramework/Foster/blob/main/Framework/Utility/Calc.cs https://github.com/ExOK/Celeste64/blob/main/Source/Helpers/Utils.cs). Looking at it further, I assume that any use of the following constants multiplied by a vector would cause the bug https://github.com/FosterFramework/Foster/blob/dc065e0bfe1ed73e7285701c7128f5c6d1443b4b/Framework/Utility/Calc.cs#L18-L49 but I have not checked every file for user of vectors multiplied by constants... it would be great ordeal to do so and fix each one. edit: or maybe not since those are floats... in any case, this just highlights the ordeal that it would be to go through the application and all recursive dependencies to check for vector times scalar operations.

Its not my application either, I just was one of the ones that encountered the issue after building and testing the application on ARM64 linux and reported it to the application developer. Incase it wasn't clear, I am not a developer, just a hardware hacker/packager/porter/OS maintainer.

edit: this comment does not necessitate a response. it is just something to think about for the future.

theofficialgman · 2024-03-06T17:13:17Z

@jakobbotsch I'd like to confirm that this PR will get merged into the upcoming v8.0.3 tag by @dotnet-bot

It seems arbitrary to me what gets pulled from 8.0-staging into the release tag. Something like only commits that have an age >= 1 month seem to have gotten pulled in from 8.0-staging into the v8.0.2 release tag which is why this was missed last cycle.

jakobbotsch · 2024-03-06T17:38:03Z

Yes, this will be in 8.0.3.

It seems arbitrary to me what gets pulled from 8.0-staging into the release tag. Something like only commits that have an age >= 1 month seem to have gotten pulled in from 8.0-staging into the v8.0.2 release tag which is why this was missed last cycle.

The servicing releases are finalized and validated several weeks before they are released. I've personally already manually validated that the reported issues around this bug are fixed with 8.0.3. It was never expected that this fix would make 8.0.2.

jakobbotsch added 2 commits February 8, 2024 11:08

Add test

af4287f

Add bugfix

9a06a7a

ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 8, 2024

ghost assigned jakobbotsch Feb 8, 2024

EgorBo approved these changes Feb 8, 2024

View reviewed changes

build-analysis bot mentioned this pull request Feb 8, 2024

Tracking issue for CI build timeouts #76454

Closed

jakobbotsch added the Servicing-consider Issue for next servicing release review label Feb 9, 2024

jeffschwMSFT approved these changes Feb 9, 2024

View reviewed changes

tannergooding approved these changes Feb 10, 2024

View reviewed changes

jakobbotsch added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Feb 12, 2024

jakobbotsch merged commit dedae5e into dotnet:release/8.0-staging Feb 12, 2024

jakobbotsch deleted the backport-94413 branch February 12, 2024 16:59

github-actions bot locked and limited conversation to collaborators Apr 7, 2024

[release/8.0-staging] Fix constant folding for arm64 MultiplyByScalar operations involving Vector2.One #98150

[release/8.0-staging] Fix constant folding for arm64 MultiplyByScalar operations involving Vector2.One #98150

Uh oh!

Conversation

jakobbotsch commented Feb 8, 2024

Customer Impact

Regression

Testing

Risk

Uh oh!

ghost commented Feb 8, 2024

Customer Impact

Regression

Testing

Risk

Uh oh!

carlossanlop commented Feb 8, 2024

Uh oh!

theofficialgman commented Feb 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jakobbotsch commented Feb 9, 2024

Uh oh!

jeffschwMSFT left a comment

Choose a reason for hiding this comment

Uh oh!

tannergooding commented Feb 9, 2024

Uh oh!

theofficialgman commented Feb 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tannergooding commented Feb 11, 2024

Uh oh!

jakobbotsch commented Feb 12, 2024

Uh oh!

theofficialgman commented Feb 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theofficialgman commented Mar 6, 2024

Uh oh!

jakobbotsch commented Mar 6, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

theofficialgman commented Feb 9, 2024 •

edited

Loading

theofficialgman commented Feb 11, 2024 •

edited

Loading

theofficialgman commented Feb 12, 2024 •

edited

Loading