128 changes: 128 additions & 0 deletions docs/AssemblyLoadingStrategy.md
@@ -0,0 +1,128 @@
# Assembly Loading Strategy

When running in co-hosting mode it is essential that the types used by the source generator and the rest of the tooling unify; Roslyn and Razor tooling must 'share' the same loaded copy of the source generator. This requires that Roslyn and Razor co-ordinate as to who is responsible for loading the source generator, and the other party must use the already loaded copy. Unfortunately, due to asynchronous loading it is non-deterministic as to which party will first attempt to load the generator.
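To see why the copies must unify, recall that the runtime treats the same assembly loaded into two different load contexts as two unrelated assemblies with unrelated types. The following is a minimal sketch of that behaviour, not code from the tooling; `TypeIdentityDemo` and `Marker` are hypothetical names, and it assumes the containing assembly has an on-disk location (i.e. it is not a single-file deployment).

```csharp
#nullable enable
using System;
using System.Runtime.Loader;

// Hypothetical demo type; not part of Roslyn or Razor.
public static class TypeIdentityDemo
{
    public sealed class Marker { }

    // Loads a second copy of this assembly into a fresh ALC and compares
    // the Marker type from that copy with the one already in use.
    public static bool SameTypeAcrossContexts()
    {
        var path = typeof(Marker).Assembly.Location;
        var isolated = new AssemblyLoadContext("Isolated", isCollectible: true);
        var copy = isolated.LoadFromAssemblyPath(path);
        var typeInCopy = copy.GetType(typeof(Marker).FullName!);

        // Same name, same file on disk, but a different Type object.
        return typeInCopy == typeof(Marker);
    }
}
```

The method is expected to return `false`: an instance created from one copy cannot be cast to the type from the other, which is the failure mode a duplicate source generator load would produce.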

If Razor loads first, the generator will be automatically loaded as a dependency. Because Razor directly references Roslyn, it has the ability to set a filter on the Roslyn loading code that will intercept the load and use the version already loaded by Razor. However, if Roslyn tries to load the generator *before* Razor tooling has been loaded, then the filter is unset and the source generator will be loaded into the default shadow copy Assembly Load Context (ALC) in the same way as other generators. Roslyn has no reference to Razor, so it has no ability to inform Razor that it should use the already loaded instance in the shadow copy ALC.

It is possible to enumerate the ALC instances and search for a loaded assembly, but it's possible that Roslyn had already started loading the assembly at the point at which Razor checks. In that case Razor doesn't find it, so it installs the filter and loads it; meanwhile the Roslyn code resumes loading and loads a duplicate copy into the shadow copy ALC.

Thus it becomes clear that this problem requires a strongly synchronized approach so that we can deterministically load a single copy of the source generator.

## Approach

While we stated that Roslyn has no references to Razor, it *does* have an 'External Access' (EA) assembly available to Razor. These are generally used as ways the Roslyn team can give internal access to small areas of code to authorized third parties in a way that minimizes breakages. If we create a filter in the Razor EA assembly and have it always load, we can maintain a persistent hook that can be used to synchronize between the two parties.

The hook simply records the list of assemblies that have been loaded by Roslyn. In the majority of cases, when Razor and Roslyn aren't co-hosted, this is all it does and nothing else. However it also exposes an atomic 'check-and-set' style filter installation routine. This takes a 'canary' assembly to be checked to see if it has already been loaded. If the canary has already been loaded the filter installation fails. When the canary hasn't yet been seen the filter is installed. The assembly resolution and filter installation are synchronized to ensure that it is an atomic operation.
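As a rough illustration of the 'check-and-set' idea, the sketch below records every requested assembly name and only allows a filter to be installed if the canary has not been seen yet. It is a simplified stand-in, not the EA assembly's implementation: `LoadHookSketch`, `OnLoadRequested` and `TryInstallFilter` are hypothetical names, and the real hook may differ in how it tracks assemblies and invokes the filter.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for the hook; names and shape are assumptions.
public static class LoadHookSketch
{
    private static readonly object s_gate = new();
    private static readonly HashSet<string> s_seen = new(StringComparer.OrdinalIgnoreCase);
    private static Func<AssemblyName, Assembly?>? s_filter;

    // Roslyn side: record the request, then let an installed filter supply the assembly.
    // Returning null means "no filter result, load it the normal way".
    public static Assembly? OnLoadRequested(AssemblyName name)
    {
        Func<AssemblyName, Assembly?>? filter;
        lock (s_gate)
        {
            if (name.Name is { } simpleName)
            {
                s_seen.Add(simpleName);
            }

            filter = s_filter;
        }

        // Invoke outside the gate so the filter can take its own locks.
        return filter?.Invoke(name);
    }

    // Razor side: the check (canary not seen) and the set (install filter)
    // happen under the same lock as the recording above, so they are atomic.
    public static bool TryInstallFilter(Func<AssemblyName, Assembly?> filter, AssemblyName canary)
    {
        lock (s_gate)
        {
            if (s_filter is not null || (canary.Name is { } n && s_seen.Contains(n)))
            {
                return false;
            }

            s_filter = filter;
            return true;
        }
    }
}
```

Because recording a request and installing the filter share one lock, a caller can never observe "canary not seen" and then install a filter after the other side has already passed the check.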

When the filter installation succeeds, Razor can continue loading its copy of the source generator, with the knowledge that any requests by Roslyn to load it will be redirected to it. As long as Razor also synchronizes its loading code with the filter requests, it is possible to deterministically ensure that Roslyn will always use Razor's copy in this case. If filter installation fails, then Roslyn has already loaded (or begun loading) the source generator, and so Razor must retrieve the loaded copy rather than loading its own. In the unlikely event that Roslyn has begun loading the assembly but not yet finished, Razor is required to spin-wait for the assembly to become available. It's technically possible that Roslyn will fail to load the assembly, meaning it will never become available, so Razor tooling must use a suitable timeout before erroring.
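One way to express the bounded spin-wait described above is sketched here. It is an illustration under assumptions rather than the shipped code: `LoadedAssemblyFinder` and `WaitForAssembly` are hypothetical names, matching is done on the simple assembly name only, and the timeout value is left to the caller because a timeout cannot distinguish a slow load from a failed one.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;
using System.Threading;

// Hypothetical helper; not part of Roslyn or Razor.
public static class LoadedAssemblyFinder
{
    // Repeatedly searches every live AssemblyLoadContext for an assembly with the
    // given simple name. Returns the assembly, or null if the timeout elapses first.
    public static Assembly? WaitForAssembly(string simpleName, TimeSpan timeout)
    {
        Assembly? found = null;

        SpinWait.SpinUntil(() =>
        {
            found = AssemblyLoadContext.All
                .SelectMany(alc => alc.Assemblies)
                .FirstOrDefault(a => string.Equals(
                    a.GetName().Name, simpleName, StringComparison.OrdinalIgnoreCase));
            return found is not null;
        }, timeout);

        return found;
    }
}
```

A caller would treat a `null` result as the error case, for example by throwing with a message that names the assembly and the timeout used.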

### Examples

The following are possible sequences of events for the load order. Note that locks aren't shown unless they impact the order of operations.

Razor loads first:

```mermaid
sequenceDiagram
razor->>filter: InstallFilter
filter-->>razor: Success
razor->>razor: Load SG
roslyn->>filter: Load SG
filter->>razor: Execute Filter
razor-->>filter: SG
filter-->>roslyn: SG
```

Roslyn loads first:

```mermaid
sequenceDiagram
participant razor
participant filter
participant roslyn
roslyn->>filter: Load SG
filter-->>roslyn: No filter
roslyn->>roslyn: Load SG
razor->>filter: InstallFilter
filter-->>razor: Failure
razor->>roslyn: Search ALCs
roslyn-->>razor: SG
```

Razor loads first, Roslyn tries to load during loading:

```mermaid
sequenceDiagram
razor->>+filter: InstallFilter
note right of filter: Begin Lock
roslyn->>filter: (2) Load SG
filter-->>-razor: Success
filter->>razor: (2) Execute Filter
razor->>razor: Load SG
razor->>filter: SG
filter->>roslyn: SG
```

Roslyn loads first, Razor tries to load during loading:

```mermaid
sequenceDiagram
participant razor
participant filter
participant roslyn
roslyn->>filter: Load SG
filter-->>roslyn: No filter
roslyn->>roslyn: Load SG
razor->>filter: InstallFilter
filter-->>razor: Failure
loop Spin Lock
razor->>roslyn: Search ALCs
alt Not loaded
roslyn-->>razor: No result
else is well
roslyn-->>razor: SG
end
end
```

## Intercepting the ALC load for Razor tooling

In order to 'choose' which source generator assembly is used by the tooling, it needs some method to intercept the loading of the assembly and return the preferred copy. Razor tooling is hosted in ServiceHub, which has its own assembly loading mechanisms based on AssemblyLoadContext (ALC). Unfortunately there is no way to override the loading logic of the provided ALC that can be hooked into to achieve this. Instead, Razor provides its own ALC ([RazorAssemblyLoadContext.cs](..\src\Razor\src\Microsoft.CodeAnalysis.Remote.Razor\RazorAssemblyLoadContext.cs)) that has the logic required to interact with the Roslyn EA assembly.

ServiceHub doesn't provide a way to specify a particular ALC implementation to use when loading a service, and due to the nature of ServiceHub, by the time the Razor tooling code is running it has already been loaded into the ServiceHub ALC. Thus Razor tooling needs a way of bootstrapping itself into the Razor-specific ALC before any code runs.

This is handled in [RazorBrokeredServiceBase.FactoryBase\`1.cs](..\src\Razor\src\Microsoft.CodeAnalysis.Remote.Razor\RazorBrokeredServiceBase.FactoryBase`1.cs). When ServiceHub requests that the factory create an instance of the service, the factory instead loads a copy of itself into a shared instance of the `RazorAssemblyLoadContext` and, via reflection, thunks the create request to the factory there. The instance created in the Razor ALC is returned to ServiceHub. This means that any code in the returned service that causes an assembly load will be handled by the Razor ALC, allowing for interception in the case of the source generator.

### Example

```mermaid
sequenceDiagram
box ServiceHub ALC
participant serviceHub as Service Hub
participant factory(1) as Factory
end
box Razor ALC
participant razorAlc as RazorAssemblyLoadContext
participant factory(2) as Factory
participant serviceInstance as Service Instance
end
serviceHub->>factory(1): Create Service
factory(1)->>razorAlc: Load self
#create participant factory(2) as Factory
#(see https://github.com/mermaid-js/mermaid/issues/5023)
factory(1)->>factory(2): Create New Factory
factory(2)-->>factory(1):
factory(1)->>factory(2): Create Service Internal
#create participant serviceInstance as Service Instance
factory(2)->>serviceInstance: Create Service instance
serviceInstance-->>serviceHub: Return instance
serviceHub->>serviceHub: Wait for request
serviceHub->>serviceInstance: Handle Request
serviceInstance-->>razorAlc: Implicit load request
razorAlc->>razorAlc: Load source generator
razorAlc-->>serviceInstance:
serviceInstance->>serviceInstance: Handle Request
serviceInstance-->>serviceHub: Result
```
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<TargetFrameworks>$(DefaultNetCoreTargetFramework);netstandard2.0</TargetFrameworks>
<Description>Razor is a markup syntax for adding server-side logic to web pages. This package contains the Razor design-time infrastructure.</Description>
<EnableApiCheck>false</EnableApiCheck>
<IsShippingPackage>false</IsShippingPackage>
@@ -0,0 +1,95 @@
// Copyright (c) .NET Foundation. All rights reserved.
// Licensed under the MIT license. See License.txt in the project root for license information.

#if NET
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;
using System.Threading;
using Microsoft.CodeAnalysis.ExternalAccess.Razor;

namespace Microsoft.CodeAnalysis.Remote.Razor;

internal sealed class RazorAssemblyLoadContext : AssemblyLoadContext
{
private readonly AssemblyLoadContext? _parent;
private readonly string _baseDirectory;

private Assembly? _razorCompilerAssembly;

private readonly object _loaderLock = new();

public static readonly RazorAssemblyLoadContext Instance = new();

public RazorAssemblyLoadContext()
: base(isCollectible: true)
{
var thisAssembly = GetType().Assembly;
_parent = GetLoadContext(thisAssembly);
_baseDirectory = Path.GetDirectoryName(thisAssembly.Location) ?? "";
}

protected override Assembly? Load(AssemblyName assemblyName)
{
var fileName = Path.Combine(_baseDirectory, assemblyName.Name + ".dll");
if (File.Exists(fileName))
{
// when we are asked to load razor.compiler, we first have to see if Roslyn beat us to it.
if (IsRazorCompiler(assemblyName))
{
// Take the loader lock before we even try and install the resolver.
// This ensures that if we successfully install the resolver we can't resolve the assembly until it's actually loaded
lock (_loaderLock)
{
if (RazorAnalyzerAssemblyResolver.TrySetAssemblyResolver(ResolveAssembly, assemblyName))
{
// We were able to install the resolver. Load the assembly and keep a reference to it.
_razorCompilerAssembly = LoadFromAssemblyPath(fileName);
return _razorCompilerAssembly;
}
else
{
// Roslyn won the race, we need to find the compiler assembly it loaded.
while (true)
Member

You mentioned in the doc that we'd need to error eventually in case Roslyn errored. When does that happen?

Member Author

That's a good point. When writing the doc, I figured we'd need to do that, but at implementation I realized it's impossible to do so deterministically as-is, as it's essentially a halting problem: we don't know whether it's not yet loaded or failed to load, so another Yield might actually succeed. We could add some arbitrary number of retries, but, while unlikely, we could still prematurely assume failure if we hit the retry limit because the other thread just didn't get scheduled. We would need to add some other synchronization primitive to Roslyn that allows you to query the status of a given assembly load.

The only way we could get into this state where it really has failed to load is if the assembly is missing or corrupted. Roslyn and Razor both load the same assembly from disk, so even if we could detect that the load failed in Roslyn, it's just going to fail again in Razor anyway.

Given that it's an error case either way, I'm inclined to just update the doc to note this and leave it as-is. If we think that's not acceptable, let's file a bug and not block this work on it, as it's not trivial to do.

{
foreach (var alc in AssemblyLoadContext.All)
{
var roslynRazorCompiler = alc.Assemblies.SingleOrDefault(a => IsRazorCompiler(a.GetName()));
if (roslynRazorCompiler is not null)
{
return roslynRazorCompiler;
}
}
// we didn't find it, so it's possible that the Roslyn loader is still in the process of loading it. Yield and try again.
Thread.Yield();
}
}
}
}

return LoadFromAssemblyPath(fileName);
}

return _parent?.LoadFromAssemblyName(assemblyName);
}

private Assembly? ResolveAssembly(AssemblyName assemblyName)
{
if (IsRazorCompiler(assemblyName))
{
lock (_loaderLock)
{
Debug.Assert(_razorCompilerAssembly is not null);
return _razorCompilerAssembly;
}
}

return null;
}

private bool IsRazorCompiler(AssemblyName assemblyName) => assemblyName.Name?.Contains("Microsoft.CodeAnalysis.Razor.Compiler", StringComparison.OrdinalIgnoreCase) == true;
}
#endif
@@ -26,7 +26,7 @@ internal abstract class FactoryBase<TService> : IServiceHubServiceFactory
{
protected abstract TService CreateService(in ServiceArgs args);

public async Task<object> CreateAsync(
public Task<object> CreateAsync(
Stream stream,
IServiceProvider hostProvidedServices,
ServiceActivationOptions serviceActivationOptions,
@@ -36,6 +36,26 @@ public async Task<object> CreateAsync(
// Dispose the AuthorizationServiceClient since we won't be using it
authorizationServiceClient?.Dispose();

#if NET
// So that we can control assembly loading, we re-load ourselves in the shared Razor ALC and perform the creation there.
// That ensures that the service type we return is in the Razor ALC and any dependencies it needs will be handled by the
// Razor ALC dependency loading rather than the default ServiceHub ALC that we're in right now.
var assemblyInRazorAlc = RazorAssemblyLoadContext.Instance.LoadFromAssemblyName(GetType().Assembly.GetName());
var thisInRazorAlc = assemblyInRazorAlc.CreateInstance(GetType().FullName!)!;

var createInternalAsyncFunc = thisInRazorAlc.GetType().GetMethod("CreateInternalAsync", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic)!;
var result = (Task<object>)createInternalAsyncFunc.Invoke(thisInRazorAlc, [stream, hostProvidedServices, serviceBroker])!;
return result;
#else
return CreateInternalAsync(stream, hostProvidedServices, serviceBroker);
#endif
}

protected async Task<object> CreateInternalAsync(
Stream stream,
IServiceProvider hostProvidedServices,
IServiceBroker serviceBroker)
{
var traceSource = (TraceSource?)hostProvidedServices.GetService(typeof(TraceSource));

// RazorBrokeredServiceData is a hook that can be provided for different host scenarios, such as testing.