The .NET Framework offers a rich variety of APIs for loading and executing code dynamically. I'd like to focus on Assembly.Load and Assembly.LoadFrom, the .NET Reflection APIs for dynamically loading an assembly for inspection and execution. These are especially useful for plug-in architectures, where the design is usually similar to the following:
- A main host manager is written, which provides basic hosting services and means for discovering the plug-ins.
- A shared interface is defined for all plug-ins to implement. This interface serves as the contract between the host manager and the plug-in assemblies.
- Plug-in assemblies are developed, with types that implement the interface or interfaces that are shared with the host manager.
- The plug-in assemblies are placed in a well-known location (e.g. a "Plugins" directory), or the host manager's configuration is updated so that it can discover the plug-ins.
- The host manager then enumerates the available plug-ins, and loads them as configured, normally using Assembly.LoadFrom because only the assembly location is available (and not its display name).
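Here is a minimal sketch of the steps above. The IPlugin interface, the PluginHost class and the "Plugins" directory name are hypothetical, chosen only for illustration; in a real system the interface would live in its own shared assembly, and the host would need more error handling than is shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical contract between the host and its plug-ins.
// In a real system this lives in a separate, shared assembly.
public interface IPlugin
{
    string Name { get; }
    void Run();
}

public static class PluginHost
{
    // Loads every *.dll in pluginDirectory with Assembly.LoadFrom and
    // instantiates each public, concrete type that implements IPlugin.
    // Assumes plug-in types expose a public parameterless constructor.
    public static List<IPlugin> LoadPlugins(string pluginDirectory)
    {
        var plugins = new List<IPlugin>();
        if (!Directory.Exists(pluginDirectory))
            return plugins;

        foreach (string path in Directory.GetFiles(pluginDirectory, "*.dll"))
        {
            Assembly assembly;
            try
            {
                // Only the file location is known, hence LoadFrom.
                assembly = Assembly.LoadFrom(path);
            }
            catch (BadImageFormatException)
            {
                continue; // not a managed assembly; skip it
            }

            foreach (Type type in assembly.GetExportedTypes())
            {
                if (type.IsClass && !type.IsAbstract &&
                    typeof(IPlugin).IsAssignableFrom(type))
                {
                    plugins.Add((IPlugin)Activator.CreateInstance(type));
                }
            }
        }
        return plugins;
    }
}

public static class Program
{
    public static void Main()
    {
        string directory = Path.Combine(
            AppDomain.CurrentDomain.BaseDirectory, "Plugins");

        foreach (IPlugin plugin in PluginHost.LoadPlugins(directory))
        {
            Console.WriteLine("Running " + plugin.Name);
            plugin.Run();
        }
    }
}
```

Note that the IsAssignableFrom check quietly skips any type whose IPlugin comes from a different copy of the shared assembly.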
This model is trivial enough to master, although some considerations plague its simplicity: the plug-ins might not originate from a trusted source, the plug-ins might require quotas to be established, the plug-ins might require fault isolation from the manager's code or from each other, etc. These considerations are normally addressed by loading individual plug-ins into separate application domains, or even abstracting them further away into separate processes. More advanced requirements can be addressed by writing a CLR host that facilitates absolute control over the plug-in execution environment.
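As a rough illustration of the application domain approach, the sketch below loads an assembly inside a separate domain and unloads that domain afterwards. It assumes the .NET Framework (application domains of this kind are not available on newer runtimes). The PluginLoader class and the "PluginDomain" name are hypothetical, and no security policy or quota is applied here; it shows only the isolation boundary.

```csharp
using System;
using System.Reflection;

// Hypothetical helper. Deriving from MarshalByRefObject means the host
// talks to it through a proxy instead of copying it across the boundary.
public class PluginLoader : MarshalByRefObject
{
    // Runs inside the secondary domain, so the plug-in assembly is
    // loaded there and never into the host's domain.
    public string Describe(string path)
    {
        Assembly assembly = Assembly.LoadFrom(path);
        return assembly.FullName + " loaded in " +
               AppDomain.CurrentDomain.FriendlyName;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of an assembly to load.");
            return;
        }

        var setup = new AppDomainSetup
        {
            ApplicationBase = AppDomain.CurrentDomain.BaseDirectory
        };
        AppDomain pluginDomain =
            AppDomain.CreateDomain("PluginDomain", null, setup);
        try
        {
            var loader = (PluginLoader)pluginDomain.CreateInstanceAndUnwrap(
                typeof(PluginLoader).Assembly.FullName,
                typeof(PluginLoader).FullName);
            Console.WriteLine(loader.Describe(args[0]));
        }
        finally
        {
            // Unloading the domain also unloads the assemblies loaded in it.
            AppDomain.Unload(pluginDomain);
        }
    }
}
```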
The next version of Visual Studio (code name "Orcas") includes the next version of the .NET Framework, version 3.5. In the new .NET Framework we find an add-in framework (in the System.AddIn assembly and namespace) that makes developing applications with plug-in support even easier.
Unfortunately, the assembly loading APIs are full of subtleties, and the resulting pitfalls are very hard to debug. For example, have you ever encountered an InvalidCastException along the following lines?
System.InvalidCastException: Unable to cast object of type 'Plugin.MyPlugin' to type 'Plugin.MyPlugin'.
Suffice it to say, for now, that it doesn't take much to encounter one of these issues. It's enough to provide a framework for loading dependencies, and you'll almost certainly have one coming. The sample code attached to this article reproduces the above exception, but the following is a realistic example. If you don't understand all this "load context" nonsense, don't worry: explaining it is the purpose of this post.
- It is a requirement for the host manager to preload all the plug-in assemblies and their references for the sake of analyzing their metadata, inspecting their security evidence, or for any other purpose.
- Therefore, when the host infrastructure loads, it preloads all the plug-in assemblies and recursively preloads their references by using the Assembly.GetReferencedAssemblies method.
- Since both the host manager and the plug-in assemblies have the shared interfaces assembly as a reference, the shared assembly is loaded into the "load" context (because it is statically referenced by the host manager) and into the "load from" context (because the host manager recursively loads the plug-in assembly's references).
- Since the plug-in assemblies themselves are loaded into the "load from" context, they use the shared interface assembly from the "load from" context.
- Since the host manager assembly is loaded into the "load" context, it uses the shared interface assembly from the "load" context.
- Eventually, a cast will be required from an implementation type defined in one of the plug-in assemblies (which lives in the "load from" context and implements the shared interface from the "load from" context), to the shared interface type defined in the shared interface assembly from the "load" context.
- This results in an exception because the types are incompatible – they come from different assemblies, loaded in different load contexts.
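When a cast like this fails, one way to confirm the diagnosis is to compare the interface the plug-in type actually implements with the interface the host expects. The helper below is a sketch with hypothetical names (ContractDiagnostics, ExplainMismatch); it looks for an implemented interface that has the same full name as the expected contract but is a different Type object, and reports where each copy came from.

```csharp
using System;
using System.Reflection;

public static class ContractDiagnostics
{
    // Returns a description of a "same name, different assembly" mismatch,
    // or null if the implementation is compatible with the contract or
    // does not implement anything with the contract's name at all.
    public static string ExplainMismatch(Type implementation, Type contract)
    {
        if (contract.IsAssignableFrom(implementation))
            return null; // the cast would succeed

        foreach (Type candidate in implementation.GetInterfaces())
        {
            if (candidate.FullName == contract.FullName && candidate != contract)
            {
                return "'" + contract.FullName + "' was loaded twice:\n" +
                       "  expected by host: " + contract.Assembly.Location + "\n" +
                       "  used by plug-in:  " + candidate.Assembly.Location;
            }
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        // With ordinary framework types there is nothing to report.
        string report = ContractDiagnostics.ExplainMismatch(
            typeof(string), typeof(IComparable));
        Console.WriteLine(report ?? "No mismatch detected.");
    }
}
```

In the scenario above, the host would call ExplainMismatch with the plug-in's implementation type and its own typeof of the shared interface; two different locations in the report are the signature of this problem.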
These surely aren't nice to debug, and my reaction when I first saw one of those was: (a) re-read the exception text a couple of times; (b) run the program again; (c) give the screen one of those blank stares (it surely kept staring back). Later I discovered that the Fusion Log Viewer (fuslogvw.exe) can provide some insight into the gory details of assembly loading and binding. Without the Fusion Log, there is no way to obtain (e.g. using the Reflection API) any information regarding the context into which the assembly was loaded.
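Reflection won't tell you which context an assembly lives in, but it can at least tell you that the same identity has been loaded more than once. The following sketch (the LoadedAssemblyReport class is a hypothetical name) groups the assemblies in the current application domain by display name and prints any duplicates along with their locations.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class LoadedAssemblyReport
{
    // Prints every display name that is loaded more than once in the
    // current application domain, together with each copy's location.
    public static void PrintDuplicates()
    {
        var duplicates = AppDomain.CurrentDomain.GetAssemblies()
            .GroupBy(assembly => assembly.FullName)
            .Where(group => group.Count() > 1);

        foreach (var group in duplicates)
        {
            Console.WriteLine(group.Key);
            foreach (Assembly assembly in group)
                Console.WriteLine("  " + DescribeLocation(assembly));
        }
    }

    private static string DescribeLocation(Assembly assembly)
    {
        try
        {
            // Location is empty for assemblies loaded from a byte array.
            string location = assembly.Location;
            return location.Length == 0 ? "(no location)" : location;
        }
        catch (NotSupportedException)
        {
            // Dynamic (Reflection.Emit) assemblies have no location.
            return "(dynamic assembly)";
        }
    }
}

public static class Program
{
    public static void Main()
    {
        LoadedAssemblyReport.PrintDuplicates();
    }
}
```

Calling PrintDuplicates right after the host finishes preloading is a cheap way to notice the problem before the first cast fails.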
No doubt, this behavior is "well"-documented. The Assembly.LoadFrom MSDN documentation does mention this pitfall, in the following (somewhat minimalistic) fashion:
If an assembly is loaded with LoadFrom, and the probing path includes an assembly with the same identity but a different location, an InvalidCastException, MissingMethodException, or other unexpected behavior can occur.
So, for the sake of never blankly staring at the screen again, I figured it might be useful to offer a good explanation of this behavior. Unfortunately, the Google search for "'load from context' assembly" brought up only four results. Luckily, the fourth was Alex Buckley's Ph.D. thesis, which analyzes the subject inside out and reaches very definitive conclusions. Its 239 pages are somewhat hard to grasp, though, so here's a short summary of whatever is relevant to the above.
The Assembly.Load method takes an assembly display name and returns an Assembly instance representing the assembly; if the assembly cannot be found, it throws a FileNotFoundException rather than returning null. If the display name provided is a strong name, Fusion (the CLR component responsible for assembly loading and binding) checks the GAC first, and then proceeds to check the application directory (a.k.a. "application base") and additional probing paths as specified in the application configuration. However, if the display name provided is a weak name, Fusion probes the application base first (only assemblies with strong names can be installed in the GAC anyway). If Fusion finds an assembly with a weak name, it is returned immediately; but if the assembly found has a strong name, the binding process restarts as if the Assembly.Load call had been issued with the strong name obtained in the first bind. This second bind might find another assembly to load (e.g., in the GAC), in which case it will be returned; if no other assembly is found, the assembly found in the first bind is returned.
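To make the display-name lookup concrete, here is a small sketch. Assembly.Load signals a failed bind by throwing FileNotFoundException, so the hypothetical TryLoad helper below turns that into a null result. The strong name is taken from an assembly that is already loaded, to avoid hard-coding a version number, and the weak name "NoSuchAssembly.Hypothetical" is made up and assumed not to exist in the application base.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LoadByNameDemo
{
    // Returns the assembly bound to displayName, or null if the bind
    // fails because no matching assembly was found. Other failures
    // (e.g. FileLoadException, BadImageFormatException) still propagate.
    public static Assembly TryLoad(string displayName)
    {
        try
        {
            return Assembly.Load(displayName);
        }
        catch (FileNotFoundException)
        {
            return null;
        }
    }

    public static void Main()
    {
        // A full (strong) display name: name, version, culture, public key token.
        string strongName = typeof(Uri).Assembly.FullName;
        Assembly found = TryLoad(strongName);
        Console.WriteLine(found != null
            ? "Bound: " + found.FullName
            : "Not found: " + strongName);

        // A weak name: simple name only, probed in the application base.
        string weakName = "NoSuchAssembly.Hypothetical";
        Assembly missing = TryLoad(weakName);
        Console.WriteLine(missing != null
            ? "Bound: " + missing.FullName
            : "Not found: " + weakName);
    }
}
```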
The Assembly.LoadFrom method is more complex than Assembly.Load. It might be the case that different assemblies will be loaded for the same display name, only because they are loaded from different locations. For example, if an assembly is first loaded statically because the JIT-compiler resolves a type reference, and is then loaded dynamically using Assembly.LoadFrom from a different location, then essentially two assemblies have been loaded, even though they have the same display name and the same binary content. Because assemblies are resolved in memory by display name, this requires partitioning this "memory" into two separate contexts: the so-called "load context" and the so-called "load from context" (Alex Buckley uses the names "L" and "LF" to disambiguate these contexts from the Assembly class methods).
Before we proceed into what Assembly.LoadFrom does, it's important to note that a single assembly file on disk can be loaded into precisely one context. Thus, there is no way to cause "two" assemblies that have the same location on disk to be loaded; "they" are regarded as the same assembly and only loaded once.
Assembly.LoadFrom takes a filename and loads the assembly at the specified location. If an assembly is found, the bind is restarted using the assembly's display name (regardless of whether it is a strong or weak name, unlike Assembly.Load's second bind). If the second bind happens to bind the assembly's display name to the assembly at the originally specified location, then the assembly is loaded into the load context. However, if the second bind finds an assembly at a different location, it is ignored and the assembly at the originally specified location is loaded into the load from context. This behavior ensures that the assemblies loaded into the load context are only those assemblies that are available to the JIT-compiler through the "standard" and predictable binding process. Other assemblies are loaded into the load from context and are invisible as far as the assemblies in the load context are concerned. By the way, the reverse of this statement doesn't hold: assemblies in the load from context are allowed to see and reference assemblies in the load context.
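The second bind described above suggests a rough heuristic for guessing where a LoadFrom call ended up: load the file by path, then ask Assembly.Load for the same display name and see whether you get the very same Assembly object back. This is only a sketch, under the assumption that the .NET Framework binder behaves as described here. The LoadFromProbe class is a hypothetical name, and the check has a side effect: if the display name resolves to a different file, that file gets loaded as well.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LoadFromProbe
{
    // Loads the assembly at 'path' and guesses which context it landed in
    // by rebinding its display name through Assembly.Load.
    public static string Describe(string path)
    {
        Assembly fromPath = Assembly.LoadFrom(path);

        Assembly byName;
        try
        {
            // Side effect: may load a second assembly from another location.
            byName = Assembly.Load(fromPath.FullName);
        }
        catch (FileNotFoundException)
        {
            return "Display name does not resolve through normal probing; " +
                   "probably in the load from context.";
        }

        if (ReferenceEquals(fromPath, byName))
            return "Display name resolves to the same assembly; " +
                   "probably in the load context.";

        return "Display name resolves to a different copy (" + byName.Location +
               "); the file at " + fromPath.Location +
               " is probably in the load from context.";
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of an assembly to probe.");
            return;
        }
        Console.WriteLine(Describe(args[0]));
    }
}
```

Because of that side effect, this belongs in a diagnostic tool rather than in the host itself.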
It is crucial to note that assembly references loaded by assemblies in the load from context will be loaded into the load from context, regardless of whether they are statically referenced or loaded dynamically using Assembly.Load or Assembly.LoadFrom (unless an assembly with the same display name and location is already loaded in the load context). It's also important to note that those references (of assemblies loaded into the load from context) will be probed in the referencing assembly's directory, i.e. in the directory associated with the assembly in the load from context, after they are probed in the application base directory.
Finally, it's important to note that assemblies loaded using the Assembly.Load overloads (and their equivalents) that take a byte array parameter to supply the assembly's binary content, or assemblies that are dynamically constructed using Reflection.Emit, CodeDom or other code-generation APIs, are not loaded into either of the above contexts. By the way, their references are not loaded automatically, either.
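Since the references of such assemblies are not resolved for you, the usual remedy is to handle the AppDomain.AssemblyResolve event yourself. The sketch below loads an assembly from its raw bytes and resolves failed binds from the same directory. The ByteArrayLoader class is a hypothetical name, and looking for "<simple name>.dll" next to the original file is an assumption made for illustration, not a rule of the runtime.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public sealed class ByteArrayLoader
{
    private readonly string directory;
    private readonly Dictionary<string, Assembly> cache =
        new Dictionary<string, Assembly>();

    public ByteArrayLoader(string directory)
    {
        this.directory = directory;
        // Raised when a bind fails, including binds for the references
        // of assemblies that were loaded from a byte array.
        AppDomain.CurrentDomain.AssemblyResolve += OnAssemblyResolve;
    }

    // Loads the assembly from its bytes; it joins neither load context.
    public Assembly Load(string fileName)
    {
        return Assembly.Load(File.ReadAllBytes(Path.Combine(directory, fileName)));
    }

    private Assembly OnAssemblyResolve(object sender, ResolveEventArgs e)
    {
        string simpleName = new AssemblyName(e.Name).Name;

        Assembly cached;
        if (cache.TryGetValue(simpleName, out cached))
            return cached; // avoid loading a second copy of the same bytes

        string candidate = Path.Combine(directory, simpleName + ".dll");
        if (!File.Exists(candidate))
            return null; // let the bind fail normally

        Assembly loaded = Assembly.Load(File.ReadAllBytes(candidate));
        cache[simpleName] = loaded;
        return loaded;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of an assembly to load.");
            return;
        }

        string fullPath = Path.GetFullPath(args[0]);
        var loader = new ByteArrayLoader(Path.GetDirectoryName(fullPath));
        Assembly assembly = loader.Load(Path.GetFileName(fullPath));

        Console.WriteLine(assembly.FullName);
        // Assemblies loaded from a byte array report an empty Location.
        Console.WriteLine("Location is empty: " + (assembly.Location.Length == 0));
    }
}
```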