Monday, January 09, 2012

COM Interop Part 9: Custom Activations

So far, we've seen how to activate a single instance of a COM object, both locally and remotely, using CoCreateInstance or CoCreateInstanceEx. But there are many other ways to get a COM object activated, some of which allow us to do pretty interesting things. Today, to finish out this introductory series on COM interop, we're going to take a look at those other methods: CoGetClassObject, DllGetClassObject, and custom SDK functions.

The COM Class Factory

One of the requirements for building a COM server is that it must implement a separate class factory for each CoClass that it exposes to COM. This class factory implements a single interface, IClassFactory (or, optionally, the derived IClassFactory2), which in turn creates instances of the appropriate CoClass.

When we call a method like CoCreateInstance, COM is actually hiding the details of the class factory from us. Internally, the COM subsystem is launching the server, obtaining an appropriate class factory, using that to get a component instance, then releasing the class factory.

As you can tell, if we create multiple instances of the same component, this adds a lot of needless overhead to the process. To eliminate this overhead, we can just do what COM is doing already, and grab our own instance of IClassFactory.

To start with, we'll need the interop definition for IClassFactory itself. The definition is found in unknwn.idl, and contains our first example of a remote-call interface. Here's the IDL:
[
    object,
    uuid(00000001-0000-0000-C000-000000000046),
    pointer_default(unique)
]
interface IClassFactory : IUnknown
{
    typedef [unique] IClassFactory * LPCLASSFACTORY;

    [local]
    HRESULT CreateInstance(
        [in, unique] IUnknown * pUnkOuter,
        [in] REFIID riid,
        [out, iid_is(riid)] void **ppvObject);

    [call_as(CreateInstance)]
    HRESULT RemoteCreateInstance(
        [in] REFIID riid,
        [out, iid_is(riid)] IUnknown ** ppvObject);

    [local]
    HRESULT LockServer(
        [in] BOOL fLock);

    [call_as(LockServer)]
    HRESULT __stdcall RemoteLockServer(
        [in] BOOL fLock);
}
A few points of interest here. First, there's no [local] attribute on the interface itself, as there was with IUnknown, because this is not a local-only interface. Second, it looks like we have two sets of methods: one with the [local] attribute and a second with the [call_as] attribute, referring back to the first. When you see an interface like this, you only need to define one set of methods in C#. The [call_as] methods don't actually appear in the interface's method table (its vtable), so we don't want to include them in our managed definition. They are an implementation detail, used to auto-generate stubs for the remote proxy classes. (In this case, a local client calling LockServer will call the LockServer method; a remote client calling LockServer over an RPC channel will actually end up calling RemoteLockServer.) The end result is that we basically ignore any methods tagged with [call_as].

With that out of the way, the translation is nothing we can't handle ourselves:
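[ComImport]
[Guid("00000001-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IClassFactory
{
    // The [out] ppvObject parameter is rewritten as the return value,
    // so the method must return object rather than void.
    [return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 1)]
    object CreateInstance(
        [MarshalAs(UnmanagedType.IUnknown)] object pUnkOuter,
        [In] ref Guid riid);

    void LockServer(
        [MarshalAs(UnmanagedType.Bool)] bool fLock);
}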
I'm using object for the aggregation outer IUnknown here, and not IntPtr. This is an interface I will actually use in production code, so I don't want to limit myself for no reason; since it's declared as an object, I can just pass in a null in the typical case, and a real value if I actually need it. Also, the bool type in C# is another of those types (like strings or arrays) that has more than one possible unmanaged meaning. In this case, the MarshalAs attribute is redundant, as the 4-byte BOOL type is the default mapping, but as before, we're always going to be explicit.

Getting The Class Object

To obtain a class factory, we ask our COM server to create what's called the class object. This is an object whose entire purpose is to implement IClassFactory and instantiate CoClasses for us. Not surprisingly, we obtain this class object using the CoGetClassObject function, which is defined like so:
HRESULT CoGetClassObject(
  __in      REFCLSID rclsid,
  __in      DWORD dwClsContext,
  __in_opt  COSERVERINFO *pServerInfo,
  __in      REFIID riid,
  __out     LPVOID *ppv
);
This looks like a hybrid of our two CoCreateInstance functions, and that's basically true. However, in this case we only have one function, which we use for both local and remote activation, and that causes a bit of a problem.

The catch is that COSERVERINFO parameter. In our remote activation case, we pass in a structure to that parameter with the remote server information. In our local activation case, we should pass NULL. This is problematic because you can't pass null to a C# function that expects a ref parameter. We could change ComServerInfo from a struct to a class, then use a local variable that's set to null. But there's another option, made possible by the DllImport attribute. We're going to provide two different managed methods that call the same entry point, each with different (but compatible) parameter lists. To do this, we give one of them a different name, then set the EntryPoint field of the DllImport attribute to the original name, like so:
[return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 3)]
[DllImport("ole32.dll", PreserveSig = false)]
public static extern object CoGetClassObject(
    [In, MarshalAs(UnmanagedType.LPStruct)] Guid rclsid,
    RegistrationClassContext dwClassContext,
    IntPtr pServerInfo,
    [In, MarshalAs(UnmanagedType.LPStruct)] Guid riid);
[return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 3)]
[DllImport("ole32.dll", PreserveSig = false, EntryPoint = "CoGetClassObject")]
public static extern object CoGetClassObjectRemote(
    [In, MarshalAs(UnmanagedType.LPStruct)] Guid rclsid,
    RegistrationClassContext dwClassContext,
    ref ComServerInfo pServerInfo,
    [In, MarshalAs(UnmanagedType.LPStruct)] Guid riid);
If we need to create a local class factory, we call the first version, and pass IntPtr.Zero to the parameter. If we need to create a remote class factory, we call the second version, and pass a real ComServerInfo structure.

Now, let's see how we might use this in code. To request local activation, we just supply a null value for the server info, and ask for the IClassFactory IID. (We could also ask for IClassFactory2, which supports ActiveX licensing, if we needed it.) From there, we use IClassFactory::CreateInstance just like we would use CoCreateInstance:

var factory = Ole32NativeMethods.CoGetClassObject(new Guid(CustomClsId), 
    RegistrationClassContext.InProcessServer, IntPtr.Zero, typeof(IClassFactory).GUID)
    as IClassFactory;
var iid = typeof(ICustom).GUID;
var c1 = factory.CreateInstance(null, ref iid) as ICustom;
var c2 = factory.CreateInstance(null, ref iid) as ICustom;
var c3 = factory.CreateInstance(null, ref iid) as ICustom;
var c4 = factory.CreateInstance(null, ref iid) as ICustom;
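
The repeated CreateInstance calls above can be folded into a small helper. This is only a sketch: the ClassFactoryExtensions class and the CreateMany method are names invented here for illustration, and the IClassFactory definition is repeated from earlier in the post so the block stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

[ComImport]
[Guid("00000001-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IClassFactory
{
    [return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 1)]
    object CreateInstance(
        [MarshalAs(UnmanagedType.IUnknown)] object pUnkOuter,
        [In] ref Guid riid);

    void LockServer(
        [MarshalAs(UnmanagedType.Bool)] bool fLock);
}

// Hypothetical helper, not part of any SDK.
public static class ClassFactoryExtensions
{
    // Creates 'count' non-aggregated instances from a single class factory,
    // asking each one for the interface T (T must carry a [Guid] attribute).
    public static List<T> CreateMany<T>(this IClassFactory factory, int count)
        where T : class
    {
        if (factory == null)
            throw new ArgumentNullException("factory");

        var iid = typeof(T).GUID;
        var instances = new List<T>(count);
        for (int i = 0; i < count; i++)
        {
            // null outer unknown: we are not aggregating.
            var instance = factory.CreateInstance(null, ref iid) as T;
            if (instance == null)
                throw new InvalidCastException(
                    "The created object could not be cast to " + typeof(T).Name);
            instances.Add(instance);
        }
        return instances;
    }
}
```

With this in place, the four calls above would collapse to something like factory.CreateMany<ICustom>(4), and the class factory is still obtained only once.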

Activating Unregistered COM Servers

If you're paying close enough attention, you may have noticed there's one thing we still haven't talked about in this activation process: how does COM know where to get the class factory for a given server? The answer depends on which kind of server is involved, in-process or out-of-process. The details can be found in this article on the responsibilities of a COM server.

In both cases, the server is assumed to have an implementation of IClassFactory for every CoClass it provides, and the server is responsible for storing the correct information in the registry to allow CoGetClassObject to locate that class factory. From there, things diverge a bit.

When CoGetClassObject needs to activate an out-of-process component that it has never seen before, the following (highly simplified) process occurs:

  1. COM launches the server binary with the special command-line parameter -Embedding (servers conventionally accept /Embedding as well).
  2. The server initializes COM, then creates an instance of its class factory.
  3. The server passes the CLSID and the IUnknown pointer for the class factory to CoRegisterClassObject.
  4. COM stores the CLSID and class factory in an in-memory registered classes table for later use.
Once these steps have been done, CoGetClassObject can simply extract the class factory from the registered classes table and QueryInterface for IClassFactory as needed. (As you can probably guess, COM actually checks the in-memory cache first, and only launches a binary on a miss.) In practice, most COM developers use a framework, such as ATL, that does all of this setup work for them. The COM Interop subsystem within the CLR takes care of this for managed COM servers. Since there's nothing really useful for us to do with this, I won't go into much more detail, except to show you what CoRegisterClassObject would look like in C#. The return value is a token that can be used later by CoRevokeClassObject to remove the class factory from the cache:
[DllImport("ole32.dll", PreserveSig = false, CharSet = CharSet.Unicode)]
public static extern int CoRegisterClassObject(
    [MarshalAs(UnmanagedType.LPStruct)] Guid rclsid,
    // The class object itself (native name: pUnk), not an aggregation outer.
    [MarshalAs(UnmanagedType.IUnknown)] object pUnk,
    RegistrationClassContext dwClassContext,
    uint flags);

For an in-process server, the concept is similar, but uses the existing DLL function export mechanics. All in-process servers are implemented as Windows DLLs, and are required to export a number of pre-defined functions for use by COM. When a new in-process component needs to be activated, this is (roughly) what happens:
  1. COM calls LoadLibrary to load the DLL into the calling process.
  2. COM calls GetProcAddress to get the address of the exported DllGetClassObject function.
  3. COM calls the DllGetClassObject function with the CLSID and IID it wants.
  4. The server creates the appropriate instance and returns it to the caller.
The key here is that none of these steps actually involve the COM subsystem directly; we could just as easily do them in our own C# program. This means we could, for example, load a COM library that was not actually registered on the local machine. First, we need to translate the two native kernel functions, both of them found in winbase.h:
HMODULE WINAPI LoadLibrary(
  __in  LPCTSTR lpFileName
);

FARPROC WINAPI GetProcAddress(
  __in  HMODULE hModule,
  __in  LPCSTR lpProcName
);
The trickiest part is figuring out what types our return values should be. LoadLibrary returns an HMODULE, which the documentation says is "a handle to" the newly loaded library. GetProcAddress's first parameter is, in turn, the HMODULE of the library that contains our function. What this means is that we don't actually care what the return value of LoadLibrary is; we just need to pass it right into GetProcAddress unchanged. This is exactly the kind of situation that IntPtr was made for, so that's what we'll use. Second, that FARPROC type there might look scary, but before we panic, let's see what the documentation says. The return value is "the address of the exported function"; address, in C terms, means "pointer", so we're getting a pointer to a function. Again, this is just the type of thing IntPtr was tailor-made to handle.
One little pitfall hiding in there, for the inattentive, is that second parameter to GetProcAddress. Notice that it is an "LPCSTR", a type we haven't run into before. The C in that name, and in the parameter to LoadLibrary, means const. We've encountered constant pointers before, with the MULTI_QI structure's const IID * parameter type, and this is no different. The const qualifier here is on the data type being pointed to: this is a pointer to a constant string, meaning the string itself cannot be changed through this pointer.
As for the unmanaged type of this parameter, as we saw before, the const qualifier on a pointer has no equivalent in C#, because it doesn't have pointers. (Yes, for the nitpickers, I should be qualifying that kind of statement by saying "it doesn't have pointers outside of unsafe code". Mentally add that where appropriate. The whole point of interop is to keep the unsafe code out of C# and in the native code where it belongs.) So, as we did before, pretend the const is not there, and we have LPSTR. Notice that this is not the same as the parameter to LoadLibrary, which is an LPTSTR! GetProcAddress always takes an ANSI string, so we have to make sure to apply our marshalling attributes appropriately. The end result is the following two method signatures:
[DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
public static extern IntPtr LoadLibrary(
    [MarshalAs(UnmanagedType.LPWStr)] string lpFileName);
[DllImport("kernel32.dll", CharSet = CharSet.Ansi)]
public static extern IntPtr GetProcAddress(
    IntPtr hModule,
    [MarshalAs(UnmanagedType.LPStr)] string lpProcName);
That's two methods down, one to go. Sort-of.

Unmanaged Callback Functions

The last step on our list is to call the DllGetClassObject function, as implemented by the library. However, we have a bit of a problem here. We can't use a DllImport attribute to define this function, because we won't know until runtime what the library name is. That's the whole reason we are using LoadLibrary and GetProcAddress to begin with. What we need is some way to call the function by way of that IntPtr we get back from GetProcAddress.
Fortunately, P/Invoke has the answer. The Marshal class has a function, GetDelegateForFunctionPointer, that converts an unmanaged function pointer, in the form of an IntPtr, into a managed instance of a delegate. And, by sheer coincidence, we are getting an IntPtr back from GetProcAddress that has such a function pointer!
So, first things first: we need to translate our DllGetClassObject function into a delegate type. This is done in almost the same way as a normal P/Invoke translation, except we don't use DllImport. Instead, we apply the UnmanagedFunctionPointer attribute, and specify the calling convention of the method in the unmanaged code:
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate uint DllGetClassObjectDelegate(
    [MarshalAs(UnmanagedType.LPStruct)] Guid rclsid,
    [MarshalAs(UnmanagedType.LPStruct)] Guid riid,
    [MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 1)] out object pUnknown);
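
To try the LoadLibrary, GetProcAddress, and GetDelegateForFunctionPointer chain on something simpler than a COM server, here is a minimal, Windows-only sketch that calls kernel32's GetCurrentProcessId through a function pointer. The DynamicExportDemo class and its method name are invented for this illustration, and the error checks via SetLastError are an addition that the declarations above do not have.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical demo class; Windows only.
public static class DynamicExportDemo
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(
        [MarshalAs(UnmanagedType.LPWStr)] string lpFileName);

    // GetProcAddress only exists in an ANSI flavor.
    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, SetLastError = true)]
    private static extern IntPtr GetProcAddress(
        IntPtr hModule,
        [MarshalAs(UnmanagedType.LPStr)] string lpProcName);

    // Managed shape of: DWORD WINAPI GetCurrentProcessId(void);
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    private delegate uint GetCurrentProcessIdDelegate();

    public static uint GetProcessIdViaFunctionPointer()
    {
        // kernel32.dll is already loaded in every process, so this just
        // bumps its reference count; we deliberately never free it.
        IntPtr module = LoadLibrary("kernel32.dll");
        if (module == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        IntPtr proc = GetProcAddress(module, "GetCurrentProcessId");
        if (proc == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        // Wrap the raw function pointer in a callable delegate.
        var getPid = (GetCurrentProcessIdDelegate)Marshal.GetDelegateForFunctionPointer(
            proc, typeof(GetCurrentProcessIdDelegate));
        return getPid();
    }
}
```

The value returned should match Process.GetCurrentProcess().Id, which makes it easy to confirm that the delegate really did reach the native function.
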
With all this in place, the only thing left is to put it all together. This time, I'll demonstrate with an actual example that I have used in production code. The IFilter interface we use here is used by a number of Windows applications to extract full text from documents; this code manually loads the built-in IFilter for text files that ships with Windows. (Lots more on IFilter in later posts; this is just to demonstrate that this technique actually does work.) First, we get our DllGetClassObject delegate from the native library:
var module = Win32NativeMethods.LoadLibrary(@"C:\Windows\system32\query.dll");
var proc = Win32NativeMethods.GetProcAddress(module, "DllGetClassObject");
var gco = Marshal.GetDelegateForFunctionPointer(proc,
    typeof(Win32NativeMethods.DllGetClassObjectDelegate))
    as Win32NativeMethods.DllGetClassObjectDelegate;
Now, we call our delegate to obtain the IClassFactory, then use it to create an instance of our real object:
var clsid = new Guid("{c1243ca0-bf96-11cd-b579-08002b30bfeb}");
object unknown;
gco(clsid, typeof(IClassFactory).GUID, out unknown);
var factory = unknown as IClassFactory;
var iid = typeof(IFilter).GUID;
var filter = factory.CreateInstance(null, ref iid) as IFilter;
One small problem with this code is that there's no easy way to turn on the return-value rewriting behavior for an unmanaged function loaded this way. It is actually possible: you can generate a P/Invoke extern method entirely at runtime using reflection, which eliminates the need for LoadLibrary and GetProcAddress but adds a ton of reflection code. The open-hardware-monitor project does this, for example, to access device information. This is not for the faint of heart; I'll suffer an out parameter instead.
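
Because the delegate hands back the raw HRESULT, failures pass silently unless we check for them. One low-effort option, sketched here, is to feed that value to Marshal.ThrowExceptionForHR, which does nothing for success codes and throws a mapped exception for failure codes. The HResultHelper class and ThrowIfFailed method are names made up for this example.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the framework.
public static class HResultHelper
{
    // Throws the exception that corresponds to a failed HRESULT;
    // success codes such as S_OK (0) and S_FALSE (1) pass through silently.
    public static void ThrowIfFailed(uint hr)
    {
        // ThrowExceptionForHR takes a signed int, so reinterpret the bits.
        Marshal.ThrowExceptionForHR(unchecked((int)hr));
    }
}
```

The delegate call shown earlier could then be wrapped as HResultHelper.ThrowIfFailed(gco(clsid, typeof(IClassFactory).GUID, out unknown)), which gets close to what PreserveSig = false does for a regular DllImport, minus the return-value rewriting.
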

Custom COM Entry Points

At this point, we have a COM object ready to go. But, I chose IFilter for a reason, because it allows me to demonstrate one final option for obtaining a COM interface. It's very rare that you will need to call any of these COM activation functions directly, especially when doing Windows Desktop programming, because the Windows SDK provides specialized entry points into various components. The full text filter is one such area: there is an unmanaged function, LoadIFilter, that wraps up all of the work we just did into a nice, tidy package for us; it even handles the work of finding the appropriate CLSID for us, based on the file type we're trying to filter. Other, similar methods include StgOpenStorage for opening a structured storage file, and SHCreateItemFromParsingName, to obtain a shell object for a given file or folder.

LoadIFilter, declared in ntquery.h, looks like this:
HRESULT __stdcall LoadIFilter(
  __in   PCWSTR pwcsPath,
  __in   IUnknown *pUnkOuter,
  __out  void **ppIUnk
);
This is pretty similar to functions and methods we've already seen before, but this time we have an extra piece of information at our disposal. The MSDN page tells us that the final parameter is:
ppIUnk [out]
A pointer to a variable that receives the IFilter interface pointer.
which means it will always be an IFilter. We can use this to our advantage, and strongly type that final parameter. Combine this with our return-value rewriting, and we end up with this:
[DllImport("query.dll", CharSet = CharSet.Unicode, PreserveSig = false)]
public static extern IFilter LoadIFilter(
    [MarshalAs(UnmanagedType.LPWStr)] string pwcsPath,
    [MarshalAs(UnmanagedType.IUnknown)] object pUnkOuter);
This might look scary, but using it is almost too easy:
var filter = FilterNativeMethods.LoadIFilter(@"C:\Test.txt", null);
Well, that pretty much wraps up our introduction to COM interop (finally). This was a lot longer than I thought it would be when I started. Next time, I'm going to include a short summary of the key points we've covered so far, as a quick reference guide.

In future posts, I'll be relying on COM Interop for a lot of interesting features of the Windows API, so stay tuned.