Monday, January 02, 2012

COM Interop Part 6: IUnknown and RetVal Rewriting

Today, we're going to finish out the definition of IUnknown that we started last time, by adding the three methods to the interface. While it's a very simple interface, compared to some of the others we'll get to later, it does highlight a number of interesting and important behaviors that the runtime marshaller exhibits when calling a COM interface.

[Edited 1/3/2012 to include information about how the runtime translates is/as into QueryInterface].

To start with, this is where we left off last time with our interface:
namespace KutuluWare.Framework.ComInterop
{
  /// <summary>
  /// Translation of the IUnknown interface from the Windows SDK v7.1, Unknwn.idl
  /// </summary>
  [ComImport]
  [Guid("00000000-0000-0000-C000-000000000046")]
  [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
  public interface IUnknown
  {
  }
}
Now let's add some methods. As we can see from the IDL, IUnknown has three methods: QueryInterface, AddRef, and Release. Within our C# definition, we must list the methods in the same order as they are in the IDL -- this is one of the hard and fast rules of COM programming. However, that first one is actually the most complex of the three, so I'm going to skip it for now, and come back later. Instead, let's look at AddRef and Release.

COM Reference Counting

If you recall from last time, we learned the three basic purposes behind the IUnknown interface, the second of which was lifetime management. In managed code, object lifetimes are handled for us by the garbage collector and the language rules for variable scope and visibility. In short, objects are born when we create the first reference to one and die when we lose the last reference to one. In COM, the component developers are expected to manage the lifetime of their objects manually; this is important because COM objects may live in another process, or even on another machine, where things like scope and reference trees aren't quite as meaningful.

This lifetime management is handled via the IUnknown::AddRef and IUnknown::Release methods. When code obtains an interface pointer from a new instance of a COM object, the object internally sets its reference count to 1. Thereafter, any time a client makes a copy of that interface pointer, it is expected to manually call IUnknown::AddRef on the interface. When receiving this call, the component should increment the reference count for that instance by one. When a client is finished with an interface pointer, and it's about to go out of scope, the client should call IUnknown::Release. The component should decrease its reference count by one, and if the reference count has reached zero, the component can free itself and release any resources it has. (Components can have more complex implementations of AddRef and Release, if needed, but the client-visible behavior must be essentially the same: the component must survive so long as the number of AddRef calls is greater than the number of Release calls.) For example, if a COM method has an [in, out] parameter of an interface type, and it wants to return a new value, it must call Release on the interface value passed in, as well as calling AddRef on the interface value about to be returned.
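
To make the counting rules concrete, here is a minimal sketch in plain C#. It is not a real COM object: RefCountedComponent and its IsFreed flag are hypothetical names invented for this illustration, and the class only simulates the bookkeeping a component is expected to do.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a COM component; it only simulates the counting rules.
public sealed class RefCountedComponent
{
    private int _refCount = 1;   // a new instance starts with one reference

    public bool IsFreed { get; private set; }

    // Mirrors IUnknown::AddRef: increment and return the new count.
    public uint AddRef()
    {
        return (uint)Interlocked.Increment(ref _refCount);
    }

    // Mirrors IUnknown::Release: decrement, and free resources when the count hits zero.
    public uint Release()
    {
        int remaining = Interlocked.Decrement(ref _refCount);
        if (remaining == 0)
        {
            IsFreed = true;      // a real component would release its resources here
        }
        return (uint)remaining;
    }
}

public static class RefCountDemo
{
    public static void Main()
    {
        var component = new RefCountedComponent();  // count is 1
        component.AddRef();                         // client copies the pointer: count is 2
        component.Release();                        // first copy is done: count is 1
        Console.WriteLine(component.IsFreed);       // prints False
        component.Release();                        // last reference is gone: count is 0
        Console.WriteLine(component.IsFreed);       // prints True
    }
}
```

The component survives exactly as long as the AddRef calls (plus the initial reference) outnumber the Release calls.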

The RCW and CCW wrapper objects created for us by the interop runtime are responsible for handling these details for us. Whenever we perform assignments with COM interface types, the wrapper classes internally call the AddRef and/or Release methods for us. Similarly, the runtime calls Release on COM objects when they are garbage collected. For extremely rare cases, the static Marshal class has AddRef and Release methods that can be used to call these methods manually, but they are only really useful for testing or debugging purposes.

The AddRef and Release methods are documented as returning an unsigned integer value, a ULONG. This value represents the new reference count for the object, after performing the increment or decrement operation. However, these values are explicitly defined as non-deterministic: they can be useful for debugging memory leaks, but should never be relied on to have any particular or meaningful value beyond "zero" or "not zero".

Armed with this information, we can add two of our three methods to our interface:
[ComImport]
[Guid("00000000-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IUnknown
{
  uint AddRef();
  uint Release();
}
Looks easy enough, right? Except, there's a problem, and it's kind of a tricky one. If we define our interface like this, we're going to have an issue because of a trick the marshalling code performs by default on any ComImport method. It's going to want to rewrite our method signatures so they return void instead of int. This is usually a good thing, but in this case it's potentially a big problem for us at runtime.

Automatic Exception Throwing

Why would the runtime change our method signature without telling us? When would that ever be a good thing? As it turns out, pretty much all the time.

As you'll soon begin to notice, practically every COM interface method returns a result code, called an HRESULT, which is just a fancy name for a 32-bit integer. The HRESULT is actually a bitwise combination of several pieces of information (which you can see at the top of wtypes.h, if you're interested) related to any errors you might encounter from a COM component. The standard behavior for a COM method is to return an HRESULT of S_OK (which is 0) on success, or an error code on failure. To make things easy, the most significant bit of the HRESULT acts as an error flag: if it's set to 1 (meaning the value is negative), that's an error code; otherwise it's a success code. When calling COM methods in code, you are expected to check the return value of every call to see if the error bit is set; the SDK even provides C/C++ developers SUCCEEDED() and FAILED() macros for just that purpose.
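
For illustration, the same sign-bit test can be sketched in C#. The HResult class and its Succeeded and Failed helpers below are hypothetical names that mimic the SDK macros; they are not part of the .NET framework. The constant values are the standard ones from the Windows SDK.

```csharp
using System;

// Hypothetical helpers that mimic the SDK's SUCCEEDED() and FAILED() macros.
public static class HResult
{
    public const int S_OK = 0;
    public const int S_FALSE = 1;
    public const int E_NOINTERFACE = unchecked((int)0x80004002);

    // The high bit is the error flag, so any non-negative value is a success code.
    public static bool Succeeded(int hr) { return hr >= 0; }

    // A set high bit makes the signed value negative, which marks a failure.
    public static bool Failed(int hr) { return hr < 0; }
}

public static class HResultDemo
{
    public static void Main()
    {
        Console.WriteLine(HResult.Succeeded(HResult.S_OK));        // prints True
        Console.WriteLine(HResult.Succeeded(HResult.S_FALSE));     // prints True
        Console.WriteLine(HResult.Failed(HResult.E_NOINTERFACE));  // prints True
    }
}
```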

Because this is such a common idiom in COM development, the marshaller tries to hide it from you, and make things a bit more managed-code friendly. When it calls a COM method that it expects to return an HRESULT (which, unless we say otherwise, is every method on a ComImport interface), the runtime checks whether that value is a success or error code. If the result represents an error, then the runtime will throw whichever managed exception most closely matches the meaning of the error, or a COMException, if it can't come up with anything better. (There is also some sleight of hand with the return value going on, but we'll see more on that when we get to QueryInterface.)
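
You can peek at this mapping yourself with Marshal.GetExceptionForHR, which returns the exception the runtime associates with a given HRESULT, or null for a success code. The sketch below is only an illustration: DescribeHResult is a hypothetical helper, and the exact exception type printed for each code depends on the runtime's own table, so no particular output is promised here.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ExceptionMappingDemo
{
    // Hypothetical helper: names the exception type the runtime picks for an HRESULT.
    public static string DescribeHResult(int hr)
    {
        var ex = Marshal.GetExceptionForHR(hr);
        return ex == null ? "no exception (success code)" : ex.GetType().Name;
    }

    public static void Main()
    {
        int[] codes =
        {
            0,                              // S_OK
            unchecked((int)0x80004002),     // E_NOINTERFACE
            unchecked((int)0x80040154)      // REGDB_E_CLASSNOTREG
        };

        foreach (int hr in codes)
        {
            Console.WriteLine("0x{0:X8} -> {1}", hr, DescribeHResult(hr));
        }
    }
}
```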

Normally, this would be a great thing, and we'll take advantage of this feature as often as possible for our own interfaces. But there are two cases when this rewriting is a problem, and this is one of them. The few COM methods that actually return a plain integer, not an HRESULT, could potentially return anything. We certainly don't want the runtime to throw an exception just because a method returns a perfectly legal value that just happens to be < 0.

The PreserveSig Attribute

The documentation for AddRef and Release tells us that these methods don't return an HRESULT, they return a reference count. While this count can't actually be negative, it can be larger than the biggest signed integer; if the value ever went over 0x7fffffff, it would look to the marshalling code like an error, and we'd get an exception. Fortunately, we can tell the runtime not to do any of this trickery, that we really meant it when we said uint, by applying the PreserveSigAttribute to the methods as appropriate:
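
A quick sketch of why large unsigned values are a problem: reinterpreting the same 32 bits as a signed integer flips the sign once the high bit is set. SignBitDemo and LooksLikeFailure are hypothetical names used only for this illustration.

```csharp
using System;

public static class SignBitDemo
{
    // Reinterprets an unsigned count as the signed 32-bit value an HRESULT check would see.
    public static bool LooksLikeFailure(uint value)
    {
        int asSigned = unchecked((int)value);
        return asSigned < 0;
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeFailure(0x7FFFFFFF));  // prints False
        Console.WriteLine(LooksLikeFailure(0x80000000));  // prints True
    }
}
```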
public interface IUnknown
{
  [PreserveSig]
  uint AddRef();

  [PreserveSig]
  uint Release();
}
This attribute serves a dual purpose in the COM interop world. If you are defining your own COM methods, the Type Library Exporter will perform the return value rewriting logic on your methods in reverse: taking methods that return normal values and translating them into HRESULT return values with appropriate output parameters. Using the PreserveSig attribute prevents the exporter from changing your methods, and is the only way to export a managed method into a type library that doesn't return an HRESULT.

On the flip side of the coin, the PreserveSig attribute is sometimes applied by the Type Library Importer when importing methods that don't return an HRESULT, to inform the marshaller not to try and translate the return value into an exception. This is done by default, for example, when using dispinterface style interfaces, unless you specify the /transform command-line parameter during import. Here, we're mimicking the importer's behavior, and tacking the attribute onto our method signatures ourselves, to turn off the exception throwing features.

QueryInterface and Managed Typecasts


Finally, let's get back to the first method in our interface, QueryInterface. This method is responsible for handling the third of IUnknown's three basic functions: interface discovery. Calls into this method are the way that COM components provide different interfaces on the same underlying instance of a COM component.

As we mentioned last time, when discussing object identity, all COM interfaces derive from IUnknown, and specifically, every interface on a given instance resolves to the same underlying implementation of IUnknown. This means a client can use any pointer to any interface to call QueryInterface, and is guaranteed to get the same behavior. This is, in fact, another one of the concrete rules of COM. A client calls QueryInterface on an existing interface pointer and provides the IID of another interface it wants, on the same instance, and provides an output parameter of the appropriate type. If QueryInterface returns S_OK, the output parameter now holds a pointer to the new interface; otherwise it will usually be NULL. Calls to QueryInterface also perform an implicit AddRef, so it's still necessary to Release the starting interface if we're done with it.

In managed code, we have three operations that directly affect the type of an interface: typecasting (including the implicit typecasts done during assignments), is, and as. When we use any of these operations against an RCW, they are translated into calls to QueryInterface. The RCW even handles the implicit assignment-typecast with the correct combination of QueryInterface and Release calls, all under the hood. The upshot of this is, once we obtain the first one, we can treat COM interfaces in our code just like managed interfaces.

Directing the Marshaller


So, now that we understand what QueryInterface is for, let's translate it. As a reminder, this is what it looks like in IDL:
HRESULT QueryInterface(
  [in] REFIID riid,
  [out, iid_is(riid), annotation("__RPC__deref_out")] void **ppvObject);
You can see here that this method does, in fact, return an HRESULT. Additionally, the method is documented as only having a single successful return value, S_OK. All other return values are errors. This means we're going to want the automatic exception throwing behavior we described earlier.

However, we're going to go a step further here, and allow the interop system to actually rewrite this method signature entirely. The marshalling code is actually pretty robust, and will often give us more than one option when translating a given method. The trick is to understand what options are available and how to get them.

In this case, we have three good options for translating this method. Here are all three, side-by-side, so you can compare the similarities and differences:
int QueryInterface(
  [In] ref Guid riid,
  [Out, MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 0)] out object ppv);
 
void QueryInterface(
  [In] ref Guid riid,
  [Out, MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 0)] out object ppv);
 
[return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 0)]
object QueryInterface(
  [In] ref Guid riid);
Whew, there's a lot going on here, so let's start with the parts that are the same.

In and Out Parameter Attributes

First, you'll notice that we have a seeming contradiction with our first parameter: it's a ref parameter but we tagged it with an InAttribute. This is somewhat of a micro-optimization, but you will see this pattern used a lot, and I prefer to use the attribute just to make the intent of the code more clear.

What's happening here is that the first parameter to QueryInterface is a REFIID; any time we see a Windows SDK type that starts with REF, it usually means "pointer", and this is no different. It takes some chasing down to find it, but eventually we find this pair of type definitions:
wtypes.h:  typedef IID *REFIID;
guiddef.h: typedef GUID IID;
If you're unfamiliar with the C language, the * in that first definition indicates a pointer type. In other words, a REFIID is just a pointer to a GUID. This is a very common type in COM, along with other styles of GUID pointer like REFCLSID and REFFMTID. Because GUIDs are a complex (structure) type, they are almost always passed by reference, as REF* types.

In C, like in C#, all parameters are passed by value by default. However, while C# includes the out and ref keywords to change this behavior (and C++ has its own technique), C has no such option. All parameters are passed by value. In order to allow a C method to change its parameter, we instead use pointers. Pointers are one of those "easy to learn, impossible to master" concepts that managed code works very hard to hide from us (for good reason). I'm not even going to attempt to explain pointers here; for our purposes, all we need to know is this: the runtime marshaller will automatically translate any "out" or "ref" parameters in our managed method signatures into the appropriate pointer type.

So, by simply declaring our parameter as a "ref Guid", it will be properly translated into the REFIID parameter type.

So what's with the attribute? The COM interop attributes InAttribute and OutAttribute are used for the same purpose as the [in] and [out] MIDL attributes: to help identify which directions we expect data to flow. Recall this diagram from earlier:

[Diagram: Data Marshalling Flow]
When the marshaller knows that data is going in the "in" direction only, it won't bother to check the data stored in that parameter when the method returns, and won't do any marshalling work (for example, converting a REFIID pointer into a managed Guid structure); similarly, when it knows that data is going in the "out" direction, it will ignore whatever value is in the parameter when the call is made initially, and only bother with that parameter on return. The default behavior of the marshaller is to match the expected behavior of the C# keywords: out parameters are [Out], ref parameters are [In, Out], and other parameters are [In]. It's perfectly legal to use these attributes all the time, and some people do that (since it makes the C# almost exactly match the MIDL). When the marshaller's default behavior is already correct, I prefer not to clutter the code with redundant attributes.

There is one common case when the marshaller ends up doing needless work: when parameters are passed as "constant references". These are values that are passed as pointers (and thus, in C#, as ref parameters), but are never actually changed by the method. They are passed by reference simply to conserve stack space, even on a structure as small as a GUID. Unfortunately, there's no way to express this parameter type in C#, so the runtime ends up marshalling data back out of the parameter when the call returns, though it will never have changed.

This is one of those cases, which we know from reading the IDL and noting the [in] attribute. By applying the In attribute, we let the marshaller know not to bother copying the parameter's value back to the caller. As I said, a micro-optimization, but one that doesn't cost us anything to use.

Manual Data Marshalling

Next, we see our first example of a very powerful attribute: the MarshalAsAttribute, which we will see quite a lot more in the future. The purpose of this attribute is to direct the marshaller how to convert and/or copy a given parameter across the managed/unmanaged boundary. In most cases, when we use strongly-typed parameters, the marshalling code can intelligently translate them on its own. However, this isn't always possible, since COM has a number of common patterns that simply can't be strongly typed.

The second parameter to QueryInterface is one of these patterns. This is an output parameter, which is declared in the IDL as being a "void **". This is really something of a non-type, even in C: a "void *" is a special type that means "a pointer to something, but I don't know what". The meaning of "void **" here follows easily from that: this is an out parameter that's going to receive a pointer to something when the method returns. By itself, that type information doesn't give us a whole lot to work with.

Fortunately, we have the documentation for IUnknown at hand, and we know what this parameter is for:
Upon successful return, *ppvObject contains the requested interface pointer to the object. If the object does not support the interface, *ppvObject is set to NULL.
When trying to interpret this kind of behavior, it's often helpful to start by ignoring the language-specific elements and interop details, and just figure out how you would implement this requirement in C#. Often this will bring you very close to the right answer. Here, we have a parameter that, when the method returns, needs to hold an interface on an object, but we don't know ahead of time what interface; it also can be returned as null. There's only one possible way to represent that kind of parameter in C#: an out object.

But even though C# can't do anything better than System.Object for this parameter, we can. We know that the type coming back in this parameter isn't just any old object: it's a COM interface. This is where the MarshalAs attribute comes into play. The attribute's constructor takes a single parameter: a member of the UnmanagedType enumeration, which includes all of the standard parameter types used in unmanaged code (COM or P/Invoke). In this case, we want UnmanagedType.IUnknown, to tell the runtime that we expect a COM interface that derives from IUnknown to be returned to us in this parameter. (As before, it's perfectly legal to apply redundant MarshalAs attributes that specify the default unmanaged types, but in my opinion there's no benefit to doing so.)

And there's more: many of the MIDL attributes that apply to parameters specify additional metadata about the parameters relative to each other, and many of those have corresponding properties on the MarshalAs attribute as well. In this case, the iid_is MIDL annotation corresponds to the IidParameterIndex property; we are explicitly telling the runtime that, on a successful return, the second parameter will hold a pointer to an interface that matches the interface ID from the first parameter.

Return Value Rewriting

Which brings us to the last, and most interesting, part of this exercise: the return value. This is the biggest difference between the three translations we derived for QueryInterface -- what type of value they return. The first two options are pretty simple. First we ask for the actual HRESULT value, as an int. We would call this method something like this:

object o;
int x = unk.QueryInterface(typeof(IInitializeWithStream).GUID, out o);
IInitializeWithStream iis = o as IInitializeWithStream;
The main reason we would want to capture the HRESULT of a COM method is because S_OK is not the only valid "success" value. S_FALSE (which equals 1) is defined by the SDK, and is sometimes used to represent "partial success" from a call. Additionally, the HRESULT structure specifically permits component developers to invent their own success HRESULT values to mean whatever they want. If we needed to know the difference between these successful results, we would need the original return value.

We've already seen, though, that QueryInterface only has one possible successful return value. Any return value other than S_OK will throw an exception. That makes the return value mostly meaningless, so our second option just ignores it altogether:

object o;
unk.QueryInterface(typeof(IInitializeWithStream).GUID, out o);
IInitializeWithStream iis = o as IInitializeWithStream;
That's still three lines of code to call a single method, which is one reason we try to avoid output parameters to begin with. Using the output parameter with a void method looks particularly strange. The solution, our last option, fixes this problem quite cleanly. It relies on the fact that QueryInterface uses an output parameter to return the requested interface, since the return value was already taken up by the HRESULT. But we don't care about the HRESULT, obviously, since we're throwing it away as it is. Instead, we ask the interop marshaller to use the output parameter in place of the original return value, and now our call looks like this:

var x = unk.QueryInterface(typeof(IInitializeWithStream).GUID) as IInitializeWithStream;

The details on why this works involve understanding the gory details of the compiled code and the calling convention used to pass parameters and return values between the caller and callee. That's a large, and very technical, topic all its own, so I'll leave it up to you to research if you're interested. For our purposes, we don't need to know how or why it works, only when.

This return value extraction works when the last parameter of the method is an out parameter, and the return value of the method is an HRESULT. In this case, we can simply remove the output parameter from the parameter list, and specify its type as the return value. The runtime code will handle the details and make sure the correct value is provided back to us.

Note that this return value rewriting goes hand in hand with the automated exception throwing we talked about with AddRef, and the PreserveSig attribute controls both behaviors. Without the attribute, any error result code causes an exception to be thrown, and the runtime treats the managed return type as the method's final [out, retval] parameter; that is exactly what makes the third option work. With the attribute, the runtime leaves the signature alone: the raw HRESULT comes back as the return value, no exception is thrown for a result like E_NOINTERFACE, and checking it is up to us. So if we had applied PreserveSig to QueryInterface, we could only have used the first option, and the first option only hands us the real HRESULT when the attribute is applied.

Wrapping IUnknown Up


Although the return value rewriting is the most complex of the three options, and differs the most from the original definition, in my opinion it comes the closest to how a similarly-implemented object in C# would behave. Thus, I tend to use that option whenever possible. Given that, our final IUnknown interface would look like this:

[ComImport]
[Guid("00000000-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IUnknown
{
    [return: MarshalAs(UnmanagedType.IUnknown, IidParameterIndex = 0)]
    object QueryInterface(
        [In] ref Guid riid);
 
    [PreserveSig]
    uint AddRef();
 
    [PreserveSig]
    uint Release();
}

Next time, we'll look at a slightly more useful interface, and some of the other things we can do to direct the marshaller's behavior, particularly with string data.
