-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Make sorting of modules more stable in the compiler #96006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way. Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient. I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching.
|
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas Issue DetailsThe compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way. Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient. I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching. Cc @dotnet/ilc-contrib
|
| string thisName = _metadataReader.GetString(_metadataReader.GetModuleDefinition().Name); | ||
| string otherName = other._metadataReader.GetString(other._metadataReader.GetModuleDefinition().Name); | ||
| Debug.Assert(StringComparer.Ordinal.Compare(thisName, otherName) != 0); | ||
| return StringComparer.Ordinal.Compare(thisName, otherName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the Debug.Assert get removed for release builds of the compiler? Above you capture the comparison result and use it in the Assert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the entire thing in the assert gets removed in non-debug builds. It didn't feel worth it to add the extra line here.
|
/backport to release/8.0-staging |
|
Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/7260177639 |
|
@MichalStrehovsky @jkotas - we are seeing a pretty nice size reduction (~25%) in the symbols size of a Linux (both x64 and arm) Native AOT app. The diff seems to have happened in 486142a...d2cbd1c, which both this and #96007 are in. Could either (or both) of those changes have caused this? See |
I don't see how this would affect the symbol size. December 15th is awfully close to when time to first response also got suddenly worse. Is this when we swapped the machine doing the benchmarking? A different linker (the platform linker, not IL "linker") could explain differences. I have a blurry memory of something happening around that day. @sebastienros? |
|
Yep - that must be it. I ran the benchmarks again on an old build (from around November 26) and the latest build, and I got roughly the same symbols size: old latest So this must have been caused by moving machines, since it no longer repros between these 2 builds on the same machine. |

The compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way.
Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient.
I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching.
Cc @dotnet/ilc-contrib