Description
In .NET 10, the AsyncEnumerable LINQ operators ship in the BCL instead of the standalone System.Linq.Async package. Most things work as expected, but chaining Append calls and then calling SumAsync shows an unexpected performance regression. This is one of the scenarios we have noted so far in our internal testing.
Configuration
Code snippet of benchmark:
```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run(BenchmarkConverter.TypeToBenchmarks(typeof(AppendTest)));

[MemoryDiagnoser]
[MinColumn, MaxColumn, Q1Column, Q3Column, AllStatisticsColumn]
[JsonExporterAttribute.Full]
public class AppendTest
{
    [Benchmark]
    public async Task<int> AppendInts()
    {
        var enumerable = AsyncEnumerable.Empty<int>();
        for (int i = 0; i < 10000; i++)
            enumerable = enumerable.Append(i);
        return await enumerable.SumAsync();
    }
}
```

Benchmark result with .NET 9 & System.Linq.Async:
// * Summary *
BenchmarkDotNet v0.15.6, macOS 26.1 (25B78) [Darwin 25.1.0]
Apple M2 Pro, 1 CPU, 12 logical and 12 physical cores
.NET SDK 9.0.306
[Host] : .NET 9.0.10 (9.0.10, 9.0.1025.47515), Arm64 RyuJIT armv8.0-a
DefaultJob : .NET 9.0.10 (9.0.10, 9.0.1025.47515), Arm64 RyuJIT armv8.0-a
| Method | Mean | Error | StdDev | StdErr | Median | Min | Max | Q1 | Q3 | Op/s | Gen0 | Gen1 | Allocated |
|----------- |---------:|--------:|---------:|--------:|---------:|---------:|---------:|---------:|---------:|--------:|---------:|-------:|----------:|
| AppendInts | 364.3 us | 7.14 us | 13.75 us | 2.03 us | 358.5 us | 342.9 us | 405.2 us | 356.0 us | 368.0 us | 2,745.0 | 166.5039 | 1.9531 | 1.34 MB |
Benchmark result with .NET 10 and the built-in AsyncEnumerable:
// * Summary *
BenchmarkDotNet v0.15.6, macOS 26.1 (25B78) [Darwin 25.1.0]
Apple M2 Pro, 1 CPU, 12 logical and 12 physical cores
.NET SDK 10.0.100
[Host] : .NET 10.0.0 (10.0.0, 10.0.25.52411), Arm64 RyuJIT armv8.0-a
DefaultJob : .NET 10.0.0 (10.0.0, 10.0.25.52411), Arm64 RyuJIT armv8.0-a
| Method | Mean | Error | StdDev | StdErr | Min | Max | Q1 | Q3 | Median | Op/s | Allocated |
|----------- |---------:|---------:|---------:|--------:|---------:|---------:|---------:|---------:|---------:|------:|----------:|
| AppendInts | 904.1 ms | 12.22 ms | 11.43 ms | 2.95 ms | 891.6 ms | 927.0 ms | 896.3 ms | 908.7 ms | 900.6 ms | 1.106 | 1.83 MB |
Regression?
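Yes. This is a regression from .NET 9 with the System.Linq.Async implementation to .NET 10 with the built-in System.Linq.AsyncEnumerable implementation. Mean time goes from 364.3 us to 904.1 ms (roughly 2,500x slower), and allocations go from 1.34 MB to 1.83 MB.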
Data
Included under Configuration above.
Analysis
Copying the Append implementation from the v6 branch of System.Linq.Async into the .NET 10 benchmark removes the regression and even improves on the .NET 9 result:
// * Summary *
BenchmarkDotNet v0.15.6, macOS 26.1 (25B78) [Darwin 25.1.0]
Apple M2 Pro, 1 CPU, 12 logical and 12 physical cores
.NET SDK 10.0.100
[Host] : .NET 10.0.0 (10.0.0, 10.0.25.52411), Arm64 RyuJIT armv8.0-a
DefaultJob : .NET 10.0.0 (10.0.0, 10.0.25.52411), Arm64 RyuJIT armv8.0-a
| Method | Mean | Error | StdDev | StdErr | Min | Max | Q1 | Q3 | Median | Op/s | Gen0 | Gen1 | Allocated |
|----------- |---------:|--------:|--------:|--------:|---------:|---------:|---------:|---------:|---------:|--------:|---------:|-------:|----------:|
| AppendInts | 282.0 us | 3.06 us | 2.56 us | 0.71 us | 279.8 us | 288.1 us | 280.4 us | 281.7 us | 281.3 us | 3,546.4 | 166.5039 | 1.9531 | 1.34 MB |
Gist of copied implementation used for the benchmark test above: https://gist.github.com/iXyles/2c7bec4e1439222bdbbd970c6c36ded5
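
One possible explanation, which this report has not confirmed against either code base, is that an Append that simply wraps its source in a new async iterator makes every item pass through every later wrapper, so total work grows quadratically with the number of Append calls, while an implementation that collects appended items into one shared chain enumerates each item once. The sketch below only illustrates that difference. `WrapAppend`, `AppendChain`, `Forwarded`, and `RunAsync` are hypothetical names made up for this example and are not the BCL or System.Linq.Async code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AppendSketch
{
    // Illustrative counter: how many items wrapper layers had to re-yield (not thread-safe).
    public static long Forwarded;

#pragma warning disable CS1998 // async iterator without await is intentional here
    public static async IAsyncEnumerable<int> Empty()
    {
        yield break;
    }
#pragma warning restore CS1998

    // Hypothetical "wrapping" append: each call adds one more iterator layer
    // that re-yields everything produced by the layers below it.
    public static async IAsyncEnumerable<int> WrapAppend(IAsyncEnumerable<int> source, int element)
    {
        await foreach (int item in source)
        {
            Forwarded++;
            yield return item;
        }
        yield return element;
    }

    // Builds both shapes with n appends and sums them.
    public static async Task<(int WrapSum, long ForwardedCount, int ChainSum)> RunAsync(int n)
    {
        Forwarded = 0;
        IAsyncEnumerable<int> wrapped = Empty();
        var chain = new AppendChain(Empty());
        for (int i = 0; i < n; i++)
        {
            wrapped = WrapAppend(wrapped, i);
            chain = chain.Append(i);
        }

        int wrapSum = 0, chainSum = 0;
        await foreach (int v in wrapped) wrapSum += v;
        await foreach (int v in chain) chainSum += v;
        return (wrapSum, Forwarded, chainSum);
    }
}

// Hypothetical "flattened" append: appended items live in one immutable linked list,
// so enumeration walks the source once and then the appended items once.
public sealed class AppendChain : IAsyncEnumerable<int>
{
    private sealed record Node(int Value, Node? Previous);

    private readonly IAsyncEnumerable<int> _source;
    private readonly Node? _tail; // newest appended item first
    private readonly int _count;

    public AppendChain(IAsyncEnumerable<int> source) : this(source, null, 0) { }

    private AppendChain(IAsyncEnumerable<int> source, Node? tail, int count)
    {
        _source = source;
        _tail = tail;
        _count = count;
    }

    // O(1): shares the existing chain instead of wrapping it.
    public AppendChain Append(int element) => new(_source, new Node(element, _tail), _count + 1);

    public async IAsyncEnumerator<int> GetAsyncEnumerator(CancellationToken cancellationToken = default)
    {
        await foreach (int item in _source.WithCancellation(cancellationToken))
            yield return item;

        // The list is stored newest-first, so copy it into an array to restore append order.
        var buffer = new int[_count];
        int i = _count;
        for (Node? n = _tail; n != null; n = n.Previous)
            buffer[--i] = n.Value;

        foreach (int v in buffer)
            yield return v;
    }
}
```

Calling `await AppendSketch.RunAsync(n)` for a modest `n` should give the same sum for both shapes, while `ForwardedCount` grows as n(n-1)/2 for the wrapping version. Keep `n` well below the benchmark's 10000 when experimenting, because the wrapping version also nests one enumerator call per layer.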