Skip to content

Conversation

@MichalStrehovsky
Copy link
Member

The compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way.

Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient.

I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching.

Cc @dotnet/ilc-contrib

The compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way.

Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient.

I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching.
@ghost
Copy link

ghost commented Dec 14, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

The compiler is multithreaded and to achieve stable outputs, it needs to sort everything before writing it to disk. The way we sort modules currently is by looking at the module MVID. It's a very simple, reliable, and efficient way.

Except this causes disproportionally large differences in the compiler outputs when a small part of the program was changed - changing something small in an assembly could result in large blocks of methods being shifted in the output executable. This makes delta patching very inefficient.

I'm changing the sorting to use assembly names instead. It's less efficient, but doesn't cause problems for delta patching.

Cc @dotnet/ilc-contrib

Author: MichalStrehovsky
Assignees: MichalStrehovsky
Labels:

area-NativeAOT-coreclr

Milestone: -

string thisName = _metadataReader.GetString(_metadataReader.GetModuleDefinition().Name);
string otherName = other._metadataReader.GetString(other._metadataReader.GetModuleDefinition().Name);
Debug.Assert(StringComparer.Ordinal.Compare(thisName, otherName) != 0);
return StringComparer.Ordinal.Compare(thisName, otherName);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Debug.Assert get removed for release builds of the compiler? Above you capture the comparison result and use it in the Assert.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the entire thing in the assert gets removed in non-debug builds. It didn't feel worth it to add the extra line here.

@MichalStrehovsky MichalStrehovsky merged commit 476c973 into dotnet:main Dec 14, 2023
@MichalStrehovsky MichalStrehovsky deleted the sortModules branch December 14, 2023 21:51
@MichalStrehovsky
Copy link
Member Author

/backport to release/8.0-staging

@github-actions
Copy link
Contributor

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/7260177639

@eerhardt
Copy link
Member

eerhardt commented Jan 2, 2024

@MichalStrehovsky @jkotas - we are seeing a pretty nice size reduction (~25%) in the symbols size of a Linux (both x64 and arm) Native AOT app. The diff seems to have happened in 486142a...d2cbd1c, which both this and #96007 are in. Could either (or both) of those changes have caused this?

See

@MichalStrehovsky
Copy link
Member Author

@MichalStrehovsky @jkotas - we are seeing a pretty nice size reduction (~25%) in the symbols size of a Linux (both x64 and arm) Native AOT app. The diff seems to have happened in 486142a...d2cbd1c, which both this and #96007 are in. Could either (or both) of those changes have caused this?

I don't see how this would affect the symbol size.

December 15th is awfully close to when time to first response also got suddenly worse. Is this when we swapped the machine doing the benchmarking? A different linker (the platform linker, not IL "linker") could explain differences.

I have a blurry memory of something happening around that day. @sebastienros?

image

@eerhardt
Copy link
Member

eerhardt commented Jan 3, 2024

Yep - that must be it. I ran the benchmarks again on an old build (from around November 26) and the latest build, and I got roughly the same symbols size:

old
| Published Size (KB) | 30,963 |
| Symbols Size (KB) | 20,792 |

latest
| Published Size (KB) | 30,652 |
| Symbols Size (KB) | 20,559 |

So this must have been caused by moving machines, since it no longer repros between these 2 builds on the same machine.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants