Wednesday, January 04, 2012

COM Interop Part 7: COM Activation

So far in this series, we have focused on how to translate the methods of a COM interface. To fully exploit the power of interop, however, we need to understand a few more kinds of native code elements: structured types, enumerations, and functions. (For clarity, from here on out, when I say "function" I mean a native non-COM function, as opposed to a "method", which is part of a COM interface.) Windows-provided COM components make frequent use of all of these elements, so it's important that we understand the rules for them as well as we understand the COM interfaces themselves.

This time, we're going to focus on native function calls. These are accessed via an interop mechanism called "platform invoke", or P/Invoke, which is subtly different from COM Interop. For the most part, P/Invoke calls are used to deal with parts of the Windows API that are not based on COM, particularly the kernel itself, the GDI engine, and the windowing system. However, the COM-based SDK components also expose a small, but critical, number of P/Invoke calls that we need to use. To start off, we're going to take a look at the P/Invoke methods that do something we've kind of glossed over up until now: give us an initial reference to a new COM instance.

CoClasses: The Starting Point For COM

As we saw last time, once we have a reference to a COM interface, we can use typecasts or the as operator (translated into QueryInterface calls) to get other interfaces on that same object. But that initial interface has to come from somewhere. Among other things, we need some way to find and load the COM server (the general name for the DLL, OCX, or EXE that provides a COM component), instantiate a new copy of the component, obtain a reference to an interface on that object, and return it to the caller. How do we accomplish all of that, when our server may be a different process or even run on a different computer?

The process of creating a new component instance and retrieving the first pointer to one of its interfaces is called "activation". In native COM code, there are four ways to obtain a newly activated instance of a component:
  1. Via the CoCreateInstance function
  2. Via the IClassFactory interface, obtained by CoGetClassObject
  3. Via the IClassFactory interface, obtained by DllGetClassObject
  4. Via a custom function call (which may or may not do one of the three previous things)
Three of these methods require that you know the CLSID of the component that you're creating. We haven't talked much about CLSIDs thus far, but they serve much the same purpose as an IID. Each class that implements IUnknown is called a CoClass, and it has a GUID, the CLSID, that uniquely identifies it. This CLSID is what gets stored in the registry when you register a COM component, along with other useful information. The most important piece of additional information is the InProcServer32 or LocalServer32 entry, which specifies the path to the in-process or out-of-process server binary, respectively. Normally, in COM, we don't deal with CoClasses directly; we deal with interfaces. This is the one time when we need to know about the class itself, because somehow we need to tell COM which concrete type to instantiate. When exporting your own C# classes to COM, you apply the Guid attribute to each class to give it a CLSID.
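
As a minimal sketch of that last point, the following shows a class tagged with the Guid attribute and how the declared value can be read back through reflection. The class name, the helper name, and the GUID value are all made up for illustration; a real component would use its own freshly generated GUID.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class. The GUID below is a placeholder, not a registered CLSID.
[ComVisible(true)]
[Guid("0F1E2D3C-4B5A-6978-8796-A5B4C3D2E1F0")]
public class SampleComponent
{
}

public static class ClsidDemo
{
    // Returns the GUID declared by the Guid attribute on the given type.
    public static Guid GetDeclaredGuid(Type type)
    {
        return type.GUID;
    }

    public static void Main()
    {
        // Prints the GUID that was applied to SampleComponent above.
        Console.WriteLine(GetDeclaredGuid(typeof(SampleComponent)));
    }
}
```

The same Type.GUID property is what gives us the IID of an interface that carries a Guid attribute.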

In managed code, the first activation option is by far the easiest, as it's directly supported by the Framework:
// CustomClsid is the CLSID in string form; GetTypeFromCLSID expects a Guid.
var t = Type.GetTypeFromCLSID(new Guid(CustomClsid));
var instance = Activator.CreateInstance(t) as ICustom;
This performs a two-step unmanaged activation of the COM object: first, CoCreateInstance will be called to obtain an IUnknown pointer to the newly activated component. Then, the as operator will be translated into a call to QueryInterface, to obtain an ICustom pointer.

We can actually consolidate this into a single step, if we are willing to go directly to the COM functions and bypass the Framework. It's probably not worth the effort to do this in a production program, but it's good practice for our interop skills, so let's do it anyway. The function we need is in the ole32 library, so we'll create an Ole32NativeMethods class and add the translation there. The documentation on MSDN tells us that this function is declared in objbase.h, and there we find the following declaration:
__checkReturn WINOLEAPI CoCreateInstance(__in     REFCLSID rclsid, 
                           __in_opt LPUNKNOWN pUnkOuter,
                           __in     DWORD dwClsContext, 
                           __in     REFIID riid, 
                           __deref_out LPVOID FAR* ppv);
This is why I prefer the IDL to the header files: the attributes seen here (anything starting with a __) are Visual C++ specific attributes, and the return type of WINOLEAPI is just a long chain of typedefs that, eventually, gets us back to HRESULT. For cases where there is no MIDL, we're probably better off just starting with the MSDN version, which is closer to what the MIDL would have looked like:
HRESULT CoCreateInstance(
  __in   REFCLSID rclsid,
  __in   LPUNKNOWN pUnkOuter,
  __in   DWORD dwClsContext,
  __in   REFIID riid,
  __out  LPVOID *ppv
);
Now this, we can translate. There's really only one bit here we haven't seen before, that LPUNKNOWN type. That is the Windows SDK name for an "IUnknown *", a pointer to the base COM interface (the name means "long pointer to IUnknown", not "pointer to something unknown"). We could translate that as an object, but I want to take this chance to point out a very useful data type provided by the interop subsystem: the IntPtr type.

IntPtr: Doing Everything By Hand

As its name implies, IntPtr (and its sibling UIntPtr) is a managed data type that represents an unmanaged pointer as a pointer-sized integer. Its purpose is to provide a way for languages that deal heavily in pointers to talk safely to languages that don't deal with pointers. Many of the advanced interop features, exposed through the System.Runtime.InteropServices.Marshal class, accept IntPtr types as parameters. Essentially, we can use this type to receive pointers from unmanaged code, ask the Marshal class to do unmanaged stuff to those pointers for us, and get back something safe and managed.

As we get into more complex COM features of the Windows SDK, we will start running into cases where we need to do some of the grunt work ourselves using IntPtrs. For example, many COM methods have optional output parameters, where the parameter passed in might be NULL, but otherwise points to a value type like an int. We can't use an out parameter for this because NULL isn't a valid int value, and trying to write an integer to a NULL pointer will crash our program. Instead, these parameters become IntPtrs and we marshal the data by hand. Other times, we'll be required to allocate and free memory from the COM memory heap ourselves, which the Marshal class can do for us by giving us an IntPtr pointing to that block of unmanaged memory.
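
To make the "marshal the data by hand" idea a little more concrete, here is a rough sketch. FakeNativeCall is a hypothetical managed stand-in for a native function with an optional "int *" output parameter (a real import would be an extern method taking an IntPtr), and the value 42 is arbitrary. The Marshal calls themselves are the real Framework ones.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ManualMarshalDemo
{
    // Hypothetical stand-in for a native function with an optional output parameter.
    // Like well-behaved native code, it only writes when the pointer is not NULL.
    private static void FakeNativeCall(IntPtr optionalResult)
    {
        if (optionalResult != IntPtr.Zero)
        {
            Marshal.WriteInt32(optionalResult, 42);
        }
    }

    // Asks for the optional output: allocate, call, read back, free.
    public static int CallWithResult()
    {
        IntPtr buffer = Marshal.AllocCoTaskMem(sizeof(int));
        try
        {
            Marshal.WriteInt32(buffer, 0);
            FakeNativeCall(buffer);
            return Marshal.ReadInt32(buffer);
        }
        finally
        {
            // Memory from the COM task allocator is not garbage collected.
            Marshal.FreeCoTaskMem(buffer);
        }
    }

    // Declines the optional output by passing NULL.
    public static void CallWithoutResult()
    {
        FakeNativeCall(IntPtr.Zero);
    }

    public static void Main()
    {
        CallWithoutResult();
        Console.WriteLine(CallWithResult());
    }
}
```

The try/finally is the important design choice here: once we take over marshalling, we also take over the cleanup.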

In this case, however, we're mostly interested in a handy feature of the structure, the static field IntPtr.Zero. This is a value that we can provide to any unmanaged function that means NULL, in cases where we can't actually supply a null object. To do so, we have to define the parameter in question as an IntPtr, which means we lose all type safety. For this reason, we only want to use IntPtr in cases where type safety has already long since vanished.

For CoCreateInstance, that LPUNKNOWN parameter is used for a very specific (and not very common) operation called COM aggregation, which is a special type of delegation where the container/containee structure is transparent. I've yet to run into a case where aggregation is used, and have so far been able to ignore it. That means that pUnkOuter will usually (for our purposes, always) be NULL. This is one of the cases where IntPtr really comes in handy. (If I were planning to use this interface in production code, I would probably make the parameter an object anyway, even if I never planned to use it; this is for illustrative purposes only.)

Creating COM Instances

With this in mind, here's our translation of CoCreateInstance:
[return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 3)]
[DllImport("ole32.dll", PreserveSig = false)]
public static extern object CoCreateInstance(
    [In] ref Guid rclsid,
    IntPtr pUnkOuter,
    RegistrationClassContext dwClassContext,
    [In] ref Guid riid);
The first thing to note is that this method has a DllImport attribute, instead of a ComImport attribute, and that we have to specify the name of the DLL that it's imported from. (To import from a library not in the system path, you can specify a full path, but we'll see an alternate way to do this in a future post.) The second thing to note is that P/Invoke functions have the PreserveSig behavior enabled by default, so in this case we need to explicitly turn it off to get the return-value rewriting to work. (If you don't, you will get a very nasty exception that amounts to the managed-code version of a GPF.) For some reason, we don't use the PreserveSig attribute directly for P/Invoke functions; we use a field in the DllImport attribute instead. The effect is the same.
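
For comparison, here is a sketch of what the declaration might look like if we left PreserveSig at its default and handled the HRESULT ourselves. This is an alternative shape for illustration, not the translation used in this series: the class name, the ThrowIfFailed helper, and the Create wrapper are hypothetical, and the context parameter is typed as a plain uint to keep the example self-contained. Actually calling Create requires Windows and a registered component.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PreserveSigDemo
{
    // PreserveSig defaults to true for DllImport, so the HRESULT is the return
    // value and the new interface pointer comes back through an out parameter.
    [DllImport("ole32.dll")]
    public static extern int CoCreateInstance(
        [In] ref Guid rclsid,
        IntPtr pUnkOuter,
        uint dwClsContext,
        [In] ref Guid riid,
        [MarshalAs(UnmanagedType.IUnknown)] out object ppv);

    // Converts a failure HRESULT into a managed exception; success codes do nothing.
    public static void ThrowIfFailed(int hr)
    {
        Marshal.ThrowExceptionForHR(hr);
    }

    // Hypothetical wrapper that does by hand what PreserveSig = false does for us.
    public static object Create(Guid clsid, Guid iid)
    {
        const uint InProcessServer = 1; // CLSCTX_INPROC_SERVER
        object instance;
        int hr = CoCreateInstance(ref clsid, IntPtr.Zero, InProcessServer, ref iid, out instance);
        ThrowIfFailed(hr);
        return instance;
    }
}
```

Seeing the two shapes side by side makes it clearer what the rewriting buys us: the out parameter becomes the return value, and the error check disappears into the marshaller.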

We also got a bit lucky with this translation. One of the parameters to this method is an enumeration, although it claims to be a DWORD (an integer). C and C++ do have enumerated types, but whether the SDK uses a true enum or just a set of macros is somewhat hit-or-miss. In either case, the function signature itself rarely uses the enumerated type name, and instead specifies just the underlying primitive type. We know it's an enumeration because of the MSDN documentation:

dwClsContext [in]
Context in which the code that manages the newly created object will run. The values are taken from the enumeration CLSCTX.
When you see a parameter like this, you have a couple of options. Some people prefer to leave these typed exactly as defined, in this case a uint, and define const fields with the appropriate values. This matches up closely with how the unmanaged C/C++ code probably works, but in my opinion, isn't the best way to go. Instead, I would translate that CLSCTX type into a managed enumeration.

In this case, in fact, Microsoft already did that for us. This enumeration is also used when registering a COM server, so it was needed by the Framework, and we get to reap the rewards. Note that Microsoft renamed the enumeration from its SDK name, CLSCTX, to the much friendlier RegistrationClassContext. I follow this practice myself when translating custom structured types or enumerations from the SDK, which makes the code feel more "natural" when translated into C#. There's nothing special about the way this enumeration is defined (apart from explicitly specifying all of the values); the definition from the Framework looks like this:

public enum RegistrationClassContext
{
    InProcessServer = 1,
    InProcessHandler = 2,
    LocalServer = 4,
    // Rest of the values omitted for brevity
}

One change I probably would have made here would be to base that enumeration on uint, since DWORD is an unsigned integer and I want the ranges to match up. (See this table of data types to match up the SDK types with their C# equivalents.) In this case, the possible values for the enumeration aren't anywhere near Int32.MaxValue, so in practice, things just work out. It's certainly not a big enough deal to motivate me to write my own version of this enumeration. In any case, as long as the actual numeric values being passed across the managed/unmanaged boundary are correct, the enumeration will just work seamlessly.
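
If we did want a uint-based version, a sketch might look something like the following. The ClassContext name and the ToDword helper are my own inventions for illustration; the first three values match the Framework definition above, and RemoteServer corresponds to the SDK's CLSCTX_REMOTE_SERVER.

```csharp
using System;

// Hypothetical uint-based alternative to RegistrationClassContext,
// so the underlying type matches the unsigned DWORD in the native signature.
[Flags]
public enum ClassContext : uint
{
    InProcessServer = 0x1,
    InProcessHandler = 0x2,
    LocalServer = 0x4,
    RemoteServer = 0x10
}

public static class ClassContextDemo
{
    // Returns the raw DWORD value that would cross the managed/unmanaged boundary.
    public static uint ToDword(ClassContext context)
    {
        return (uint)context;
    }

    public static void Main()
    {
        // The values are bit flags, so they can be combined with |.
        Console.WriteLine(ToDword(ClassContext.InProcessServer | ClassContext.LocalServer));
    }
}
```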

With this new method, our earlier sample that used Activator.CreateInstance would instead look like this:
var clsid = new Guid(CustomClsid);
var iid = typeof(ICustom).GUID;
var instance = Ole32NativeMethods.CoCreateInstance(ref clsid, IntPtr.Zero, 
    RegistrationClassContext.InProcessServer, ref iid) as ICustom;
Ugh. That looks ugly. Having to pass those GUIDs as reference parameters is really making this more complex than it ought to be. Fortunately, there's a better way.

LPStruct and Special GUID Handling

We mentioned last time that these REFIID parameters are GUIDs that are being passed by reference (that is, they are pointers to GUIDs), which we indicate by making them ref parameters. However, that means we are limited in what we can pass to these methods to those things that are valid ref parameters. We also mentioned last time that this pattern, of passing a GUID pointer as an [in] parameter, happens all over COM. So much so that Microsoft actually made an UnmanagedType option specifically for this situation. I emphasized that for a reason: this particular unmanaged type has a very misleading name, UnmanagedType.LPStruct, which would imply that it works for any kind of structure. However, the value is only intended to be used in the very specific case we have here. In particular, all of the following must be true:
  1. You are defining a P/Invoke method, not a COM interface (thus, why we did not use this for QueryInterface)
  2. The parameter is a pointer to GUID or one of its typedef aliases (REFIID, REFCLSID)
  3. The parameter is documented as an [in] parameter
Now, my own brief testing shows that the marshaller appears to work in the more general case (in particular, where neither #1 nor #2 is true), but interop code is one place where things often work "by accident", only to break at the least opportune moment. For now, to be safe, we'll stick to the advice from the above-linked blog post, and only use this trick when all of those are true. Fortunately, this accounts for most of the cases where we pass a GUID by reference; QueryInterface is the obvious exception, but we never call that directly, so it won't bother us a lot.

Armed with this new information, CoCreateInstance becomes:
[return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 3)]
[DllImport("ole32.dll", PreserveSig = false)]
public static extern object CoCreateInstance(
    [MarshalAs(UnmanagedType.LPStruct)] Guid rclsid,
    IntPtr pUnkOuter,
    RegistrationClassContext dwClassContext,
    [MarshalAs(UnmanagedType.LPStruct)] Guid riid);
and, more importantly, our calling code now looks like this:
var instance = Ole32NativeMethods.CoCreateInstance(new Guid(CustomClsid), IntPtr.Zero, 
    RegistrationClassContext.InProcessServer, typeof(ICustom).GUID) as ICustom;
There's only one drawback to this function: it only works if your COM object is properly registered on the local machine. Next time, we'll look at how we would activate a remote COM server, and dig into the remaining types of unmanaged data: arrays and structures.