Conversation

@sharwell sharwell commented Dec 2, 2017

Customer scenario

Running analyzers during a build is slower than it should be, with the analyzer driver contributing substantial overhead even when the analyzers themselves are lightweight.

Bugs this fixes

Fixes #23463

Workarounds, if any

None needed

Risk

The dictionary algorithm is now optimistically concurrent, with a small risk of a performance regression in highly contended scenarios. In addition, the dictionary lookup went from O(1) to O(log n). This should not be a problem, as it favors the frequently used small dictionaries without making the large edge cases unreasonably expensive.
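
For illustration, here is a minimal sketch of the optimistic pattern, assuming the cache lives in an ImmutableDictionary field; the type and member names are placeholders, not the actual compiler code:

```csharp
using System.Collections.Immutable;
using System.Threading;

internal sealed class OptimisticCache<TKey, TValue>
    where TKey : notnull
{
    private ImmutableDictionary<TKey, TValue> _map = ImmutableDictionary<TKey, TValue>.Empty;

    public TValue GetOrAdd(TKey key, TValue value)
    {
        while (true)
        {
            ImmutableDictionary<TKey, TValue> current = _map;
            if (current.TryGetValue(key, out TValue existing))
            {
                // Another thread already published a result for this key.
                return existing;
            }

            // Publish optimistically: succeed only if no other thread swapped in
            // a new map since we read 'current'; otherwise loop and retry.
            if (Interlocked.CompareExchange(ref _map, current.Add(key, value), current) == current)
            {
                return value;
            }
        }
    }
}
```

System.Collections.Immutable packages the same loop as ImmutableInterlocked.GetOrAdd, which is the idiomatic way to write it.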

Performance impact

AnalyzerRunner indicates a reduction in allocations of 2.56GiB (2.9%).

Is this a regression from a previous update?

No.

Root cause analysis

AnalyzerRunner is a new tool for helping us test analyzer performance in isolation.

How was the bug found?

AnalyzerRunner.

Test documentation updated?

No.

@sharwell sharwell changed the title Reduce allocations in UnboundLambda [WIP] Reduce allocations in UnboundLambda Dec 2, 2017
@sharwell sharwell changed the title [WIP] Reduce allocations in UnboundLambda Reduce allocations in UnboundLambda Dec 4, 2017
@sharwell sharwell force-pushed the optimize-unboundlambdastate branch from be91b2e to 22b4d28 Compare December 5, 2017 13:01
@sharwell sharwell requested a review from a team as a code owner December 5, 2017 13:01
sharwell commented Dec 5, 2017

@dotnet/roslyn-compiler for review

@sharwell sharwell requested a review from a team December 5, 2017 16:01
@gafter gafter added the Area-Compilers and Tenet-Performance (Regression in measured performance of the product from goals.) labels Dec 5, 2017
// when binding for real (not for return inference), there is still
// a good chance that we could reuse a body of a lambda previously bound for
// return type inference.
var cacheKey = ReturnInferenceCacheKey.Create(delegateType, IsAsync);
Member commented on cacheKey:

This cache key is not used as a cache key. In fact, it really isn't used, except to get things that are available on the delegate type.

Member replied:

My point is, please avoid creating the cache key, and instead directly compute the parameter type array and parameter ref kind array.

Contributor (Author) replied:

➡️ Reverted the key change; it can be evaluated separately.

}

return result;
var cacheKey = ReturnInferenceCacheKey.Create(delegateType, IsAsync);
Member commented on cacheKey:

Ditto

-BoundLambda result;
-if (!_bindingCache.TryGetValue(delegateType, out result))
+if (_bindingCache.TryGetValue(delegateType, out var result))
{
Member commented:

Concurrency on these caches will be quite rare, and encountered solely based on concurrent calls into the compiler APIs. I think we are likely to be better served using a simple Dictionary with an explicit lock around its uses.
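
For comparison, a sketch of the suggested alternative: a plain Dictionary with every access guarded by a single lock. The GetOrAdd helper here is hypothetical, not the compiler's code:

```csharp
using System;
using System.Collections.Generic;

internal sealed class LockedCache<TKey, TValue>
    where TKey : notnull
{
    private readonly object _gate = new object();
    private readonly Dictionary<TKey, TValue> _map = new Dictionary<TKey, TValue>();

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> factory)
    {
        lock (_gate)
        {
            if (!_map.TryGetValue(key, out TValue value))
            {
                // The factory runs under the lock, so each value is computed at
                // most once; simple, and fine when contention is rare.
                value = factory(key);
                _map.Add(key, value);
            }

            return value;
        }
    }
}
```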

@gafter gafter requested a review from a team December 5, 2017 16:49
@gafter gafter self-assigned this Dec 5, 2017
-private readonly ConcurrentDictionary<ReturnInferenceCacheKey, BoundLambda> _returnInferenceCache = new ConcurrentDictionary<ReturnInferenceCacheKey, BoundLambda>();
+[PerformanceSensitive(
+    "https://github.com/dotnet/roslyn/issues/23582",
+    Constraint = "Avoid " + nameof(ConcurrentDictionary<NamedTypeSymbol, BoundLambda>) + " which has a large default size, but this cache is normally small.")]
@AlekseyTs (Contributor) commented Dec 6, 2017:

which has a large default size, but this cache is normally small.

Could we provide an explicit initial capacity instead, using a reasonably small value? #Closed
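
For reference, ConcurrentDictionary does expose a constructor taking both a concurrency level and an initial capacity, so the suggestion might look like the following sketch (the values are illustrative placeholders, not a vetted choice):

```csharp
using System.Collections.Concurrent;

// Hypothetical: keep ConcurrentDictionary but bound its eager allocations.
// The key/value types reuse the names from the diff above.
ConcurrentDictionary<NamedTypeSymbol, BoundLambda> cache =
    new ConcurrentDictionary<NamedTypeSymbol, BoundLambda>(
        concurrencyLevel: 2, // size of the internal lock array (default: processor count)
        capacity: 1);        // initial bucket count (default: 31); grows on demand
```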

{
-BoundLambda result;
-if (!_bindingCache.TryGetValue(delegateType, out result))
+if (_bindingCache.TryGetValue(delegateType, out var result))
@AlekseyTs (Contributor) commented Dec 6, 2017 on var:

Please spell out the type, or better, revert the inlining of the declaration. #Closed

Contributor (Author) replied:

📝 According to the .editorconfig for this section of the code, both forms are acceptable. We need to update the configuration to match the practices followed during the code review process.

Contributor (Author) replied:

➡️ Reverted style changes to the extent possible.

@AlekseyTs (Contributor) replied Dec 8, 2017:

We need to update the configuration to match the practices followed during the code review process.

Note the exact rule in the compilers code base: var is not allowed unless the type is explicitly named in the initializer. #Closed

[PerformanceSensitive(
"https://github.com/dotnet/roslyn/issues/23582",
Constraint = "Avoid " + nameof(ConcurrentDictionary<NamedTypeSymbol, BoundLambda>) + " which has a large default size, but this cache is normally small.")]
private ImmutableDictionary<NamedTypeSymbol, BoundLambda> _returnInferenceCache = ImmutableDictionary<NamedTypeSymbol, BoundLambda>.Empty;
@AlekseyTs (Contributor) commented Dec 6, 2017 on NamedTypeSymbol:

The change in the key type looks wrong. #Closed

Member replied:

I wouldn't call it so much wrong as replacing one apparently unimportant optimization (sharing the cache for semantically identical delegate types) with another apparently more important optimization. I believe the transformation preserves the correctness of the compiler. Do you, @AlekseyTs, believe the change is semantically incorrect?

Contributor (Author) replied:

This one is going to be important to understand. According to the previous implementation of BindForReturnTypeInference, returning a new result from ReallyInferReturnType for the same delegate type is acceptable in the case of a cache miss (note the use of TryAdd instead of GetOrAdd). I verified that the conversion from NamedTypeSymbol to ReturnInferenceCacheKey is a deterministic function, but incorrectly treated it as one-to-one.
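
To make the TryAdd-vs-GetOrAdd distinction concrete, here is a standalone sketch with placeholder types (not the compiler's code). TryAdd is only safe here because any two results computed for the same key are interchangeable:

```csharp
using System.Collections.Concurrent;

internal static class TryAddSketch
{
    private static readonly ConcurrentDictionary<string, object> s_cache =
        new ConcurrentDictionary<string, object>();

    public static object Bind(string delegateType)
    {
        if (s_cache.TryGetValue(delegateType, out object cached))
        {
            return cached;
        }

        object fresh = ComputeResult(delegateType); // stands in for ReallyInferReturnType

        // TryAdd: if another thread won the race, its value stays in the cache,
        // but this caller still receives the locally computed 'fresh' instance.
        // GetOrAdd, by contrast, would hand every caller the single stored winner.
        s_cache.TryAdd(delegateType, fresh);
        return fresh;
    }

    private static object ComputeResult(string delegateType) => new object();
}
```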

@AlekseyTs (Contributor) replied Dec 6, 2017:

Neither the description of the change nor the comments mentioned the type change for the key. The motivation behind the change was not explained, and no numbers supporting the change were provided. I am talking exclusively about the type change for the key. If you have numbers showing that the type change on its own leads to a noticeable improvement, and you are confident that the change is not going to regress other scenarios, then we should get rid of the ReturnInferenceCacheKey type altogether. #Closed


-BoundLambda result;
-if (!_returnInferenceCache.TryGetValue(cacheKey, out result))
+if (_returnInferenceCache.TryGetValue(delegateType, out var result))
@AlekseyTs (Contributor) commented Dec 6, 2017 on var:

Please spell out the type, or better, revert the inlining of the declaration. In general, var is not welcome in the compilers code base. #Closed

@AlekseyTs (Contributor) left a review comment:

Please revert changes to the key type of the dictionaries and all refactorings that were triggered by the key type change.

@sharwell sharwell changed the title Reduce allocations in UnboundLambda [WIP] Reduce allocations in UnboundLambda Dec 7, 2017
sharwell commented Dec 7, 2017

@dotnet/roslyn-compiler This is ready for updated review. Changes are restricted to the type of dictionary used. I will update the original post with allocation savings under the new approach when available.

@gafter @AlekseyTs You both mentioned using different dictionary types. Do you have examples of where these dictionaries get large? My belief is that these are typically single-element dictionaries, in which case the current tree-based dictionary would be as efficient as anything else. It seems like you would need a very large number of very large sets to give anything else an advantage. Unfortunately, my test suite is limited with respect to this change.
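
For a sense of the trade-off, compare the per-instance costs; the exact numbers are runtime implementation details, so this is a sketch rather than a benchmark:

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;

// ConcurrentDictionary eagerly allocates its lock array and bucket table up
// front, even when the dictionary will only ever hold a single entry.
ConcurrentDictionary<string, int> concurrent = new ConcurrentDictionary<string, int>();

// ImmutableDictionary<,>.Empty is a shared singleton, so an untouched cache
// allocates nothing, and a one-element cache is a single AVL-tree node.
// Lookup is O(log n), which is indistinguishable from O(1) at n == 1.
ImmutableDictionary<string, int> immutable =
    ImmutableDictionary<string, int>.Empty.Add("System.Func`1", 1);
```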

@sharwell sharwell changed the title [WIP] Reduce allocations in UnboundLambda Reduce allocations in UnboundLambda Dec 7, 2017
@jasonmalinowski jasonmalinowski removed the request for review from a team December 7, 2017 20:15
default:
// Prefer candidates with fewer diagnostics.
-IEnumerable<BoundLambda> minDiagnosticsGroup = candidates.GroupBy(lambda => lambda.Diagnostics.Length).OrderBy(group => group.Key).First();
+IEnumerable<KeyValuePair<T, BoundLambda>> minDiagnosticsGroup = candidates.GroupBy(lambda => lambda.Value.Diagnostics.Length).OrderBy(group => group.Key).First();
Contributor commented on lambda:

Consider renaming parameter to pair or something similar.

return minDiagnosticsGroup
-    .OrderBy(lambda => GetLambdaSortString(lambda.Symbol))
+    .OrderBy(lambda => GetLambdaSortString(lambda.Value.Symbol))
    .FirstOrDefault();
Contributor commented on lambda:

Same suggestion here.

-    .OrderBy(lambda => GetLambdaSortString(lambda.Symbol))
-    .FirstOrDefault();
+    .OrderBy(lambda => GetLambdaSortString(lambda.Value.Symbol))
+    .FirstOrDefault()
@AlekseyTs (Contributor) commented Dec 8, 2017 on FirstOrDefault:

This probably should be First(); it looks like we don't expect the sequence to be empty.
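
The difference in isolation, as a tiny sketch (the element type is a stand-in):

```csharp
using System.Collections.Generic;
using System.Linq;

IEnumerable<string> candidates = Enumerable.Empty<string>();

string a = candidates.FirstOrDefault(); // null: silently masks an empty sequence
string b = candidates.First();          // throws InvalidOperationException, surfacing the bug
```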

@AlekseyTs AlekseyTs dismissed their stale review December 8, 2017 00:46

Obsolete

@AlekseyTs (Contributor) left a review comment:

LGTM (iteration 4)

sharwell commented Dec 8, 2017

📝 The allocation savings do not appear to have changed significantly from the original report.

@AlekseyTs (Contributor) commented:

The allocation savings do not appear to have changed significantly from the original report.

@sharwell Thanks for following up on this. I am curious what exactly we are measuring; "Running analyzers during a build is slower than it should be" isn't quite clear in this respect. Does the analyzer do something specific? Could you please provide more details? These questions are not blocking this PR.

sharwell commented Dec 8, 2017

I am curious what exactly we are measuring; "Running analyzers during a build is slower than it should be" isn't quite clear in this respect. Does the analyzer do something specific? Could you please provide more details? These questions are not blocking this PR.

This pull request is the result of an open-ended performance investigation. A test scenario was constructed (#23582), and code changes were applied according to the prominent sources of allocations revealed by the test. Each change was described in terms of gains made against the baseline results for the constructed scenario.

@AlekseyTs (Contributor) commented:

Could you please share some details about the test scenario?

sharwell commented Dec 8, 2017

@AlekseyTs I think I edited my comment while you were reading it. I added a link to the test scenario I used for this (it's the same link that appears in the [PerformanceSensitive] attribute).

sharwell commented Dec 8, 2017

@AlekseyTs If you know of a sample program which forces these maps to get large, let me know. I can run profiling passes on it to verify that the change to a tree-based map will not cause regressions on code that differs from the input I used so far.

@AlekseyTs (Contributor) commented:

@sharwell Are we certain that the analyzers executed as part of the scenario are themselves doing things in an efficient manner, i.e. not causing the compiler to redo work unnecessarily, etc.? For example, the code in https://github.com/dotnet/roslyn/blob/master/src/Features/Core/Portable/UseThrowExpression/AbstractUseThrowExpressionDiagnosticAnalyzer.cs requests a separate SemanticModel and causes the compiler to rebind the same code again and build another IOperation tree for the same code.
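
A sketch of the anti-pattern being described, written as a hypothetical syntax-node analyzer (this is not the linked analyzer's actual code):

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
internal sealed class RebindingSketchAnalyzer : DiagnosticAnalyzer
{
    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray<DiagnosticDescriptor>.Empty;

    public override void Initialize(AnalysisContext context)
    {
        context.RegisterSyntaxNodeAction(ctx =>
        {
            // Cheap: reuses the semantic model the analyzer driver already built.
            SemanticModel shared = ctx.SemanticModel;

            // Expensive: creates a fresh SemanticModel whose binding and
            // IOperation caches are empty, forcing the compiler to redo work.
            SemanticModel separate =
                ctx.SemanticModel.Compilation.GetSemanticModel(ctx.Node.SyntaxTree);
        }, SyntaxKind.ThrowStatement);
    }
}
```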

sharwell commented Dec 8, 2017

@AlekseyTs To the contrary: by all metrics I have encountered, analyzers are frequently extremely inefficient, and not in obvious ways. However, with analyzers running as an integral part of the build, it's increasingly hard to separate the two.

@gafter gafter removed their assignment Dec 26, 2017
@sharwell sharwell changed the base branch from master to dev15.6.x January 9, 2018 16:06
@sharwell sharwell added this to the 15.6 milestone Jan 9, 2018
@sharwell (Contributor, Author) commented:

@Pilchie or @MeiChin-Tsai for ask mode

Pilchie commented Jan 17, 2018

I'm in favor of the change for the scenario, but I'll leave it to @MeiChin-Tsai to approve in case she has any concerns about risk.

@jinujoseph jinujoseph modified the milestones: 15.6, 15.7 Jan 31, 2018
gafter commented Feb 1, 2018

Can you please rebase this to 15.7.x (which is not in ask mode)?

@gafter (Member) left a review comment:

:shipit:

@gafter gafter self-assigned this Feb 1, 2018
@MeiChin-Tsai commented:

I agree with @gafter

@sharwell sharwell changed the base branch from dev15.6.x to dev15.7.x February 2, 2018 00:01
sharwell commented Feb 2, 2018

@MeiChin-Tsai This is now retargeted to 15.7

@jinujoseph (Contributor) commented:

To the best of my knowledge, the compiler side is not in ask mode for 15.7 yet, so you should be good to merge this. @jaredpar, correct me if that's not the case.

jaredpar commented Feb 2, 2018

@jinujoseph that is correct.

@sharwell sharwell merged commit ce5a3f1 into dotnet:dev15.7.x Feb 2, 2018
@sharwell sharwell deleted the optimize-unboundlambdastate branch February 2, 2018 21:11
gafter commented Feb 2, 2018

[image]
