Monday, January 02, 2012

COM Interop Part 5: Translating IUnknown For Fun And Profit

Well, probably not much profit, but hopefully a little fun.

As promised, this time we're going to write some actual C# interop code. A major downside to learning COM Interop is that there isn't an obvious place to "start small". You basically have to get all of the details right all at once, or things can go catastrophically wrong. Unfortunately, that means the first few posts will have to include some things that might not make sense right away. Bear with me on this; we'll cover it all eventually.

When starting out to use COM Interop, I typically follow the same basic steps to get from idea to working code.
  1. Identify a COM interface I want to use, and whatever native functions I need to get it.
  2. Locate the IDL or header file in the SDK with the appropriate definitions.
  3. Locate the documentation for the interface or function.
  4. Start translating the interface from IDL.
  5. As I encounter additional interfaces, types, etc., repeat steps 2 through 4.
The first step is sometimes the hardest, because Windows has a lot of interfaces available that do some pretty cool things. As you can tell, MSDN is a treasure trove of information here, and is usually where I start. The MSDN page for a function or interface will also tell you what IDL and header files to look in, and describe the intended behavior (the "contract") for the interface. With those two things, we have everything we need to start translating.

So, let's do one. I'm going to start with a relatively simple interface, just to show you how things go. Much of this is boilerplate stuff, so every COM interface you translate will start off the same. The interface we'll be looking at is the same one from last time, IUnknown.

Wherefore IUnknown?
"Wherefore" is an archaic English word that means "why" or "for what reason". Juliet's lament "Wherefore art thou Romeo?" does not mean "Where are you, Romeo?", it means "Why are you Romeo?", as in "Why did you have to be this guy whose family my parents hate?"

I picked IUnknown to start with for a couple of reasons. It is basically the heart and soul of COM programming -- every other COM interface derives from IUnknown. It's the System.Object of the COM world. This makes using IUnknown crucial for COM developers, and so its behavior is well understood. Beyond that, it's a simple interface -- only three methods with low to moderate complexity. Lastly, I'm translating IUnknown because there's no way we'll ever use this definition in real code, so it can safely stay "broken" for a little while.

Wait, what? If IUnknown is the core of all COM everywhere, how can we possibly not use it? The reason is that the CLR uses it for you. The rules of when and how to use IUnknown are very specific, and mandated by COM. Failure to follow them causes lots of problems: memory leaks, objects disappearing while in use, and various undefined behaviors. To prevent those kinds of problems from plaguing the CLR internally, much less our C# applications, the runtime's marshalling system automatically inserts the correct calls where they belong, so you don't have to. (Of course, you can still call those methods yourself, and there are cases where you'd want to, but they are rare exceptions, and there are managed mechanisms for doing so.)
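
For the curious, here is a minimal sketch of what those "managed mechanisms" can look like, using the System.Runtime.InteropServices.Marshal class. The ManualRefCounting class and its method name are my own invention for illustration. The sketch assumes Windows, and the counts it returns are diagnostic values only, not something to build logic on.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ManualRefCounting
{
    // Hypothetical helper: bumps and then drops the COM reference count by hand.
    // Returns the two counts reported by the runtime so they can be inspected.
    public static (int AfterAddRef, int AfterRelease) AddRefThenRelease(object target)
    {
        // Hands back an IUnknown pointer for the object. The pointer arrives
        // already AddRef'd, so this method owns one reference to give back.
        IntPtr unknown = Marshal.GetIUnknownForObject(target);
        try
        {
            int afterAddRef = Marshal.AddRef(unknown);    // IUnknown::AddRef
            int afterRelease = Marshal.Release(unknown);  // IUnknown::Release
            return (afterAddRef, afterRelease);
        }
        finally
        {
            // Balance the reference taken by GetIUnknownForObject.
            Marshal.Release(unknown);
        }
    }
}
```

Every AddRef here is paired with exactly one Release, which is precisely the bookkeeping the marshaller normally does on our behalf.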

So, while the IUnknown interface we're about to build will be a correct interop translation, you would never actually want to use it in your own code, because it would just confuse the runtime. Consider this exercise for entertainment purposes only.

Understanding IUnknown

Normally, by the time you've figured out what interface you want, you already know what it does. But since we picked IUnknown arbitrarily, we first need to understand what it's used for and what it does. This also gives us an excellent opportunity to see some of the work that's done for you by the runtime wrappers and marshaller.

IUnknown performs three basic functions in the COM world: object identity, object lifetime, and interface discovery. Those last two are handled explicitly through the interface's methods, so we'll talk about those methods when we start translating them. But the first one requires a bit of explanation.

In COM, all interaction with components is done via interfaces. A COM client never has a reference to an actual object (there might not even be an object, if the component is written in C); only references to interfaces on the object. Since any object can implement as many interfaces as it wants, it's quite possible at runtime to have multiple interfaces that span multiple instances of a given object. It might be important, at some point, to know which interfaces point to the same underlying object, and thus share their state information. Since they have different data types, how can we do this?

The solution is IUnknown. COM provides a way (through IUnknown, in fact), for a client to use any interface reference to ask a component for any other interface reference it might need. All COM interfaces inherit from IUnknown, and all COM objects must implement that interface, by itself, and provide it when asked. In fact, the rules of COM go further than that: any time an object is asked for its IUnknown interface, no matter what interface you start with, it must return the same physical IUnknown reference to the caller. That means you can compare any two interfaces for object equivalence by asking for their IUnknown pointers and comparing them.

When you expose your own objects to COM, the runtime creates a single CCW per instance, regardless of how many COM clients ask for it, and ensures that your objects follow this particular COM rule.
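
The identity rule above can be sketched in C# along these lines. This is only an illustration under a few assumptions: the ComIdentity class and AreSameComObject method are hypothetical names, the code assumes Windows, and it leans on Marshal.GetIUnknownForObject to hand back the object's IUnknown pointer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComIdentity
{
    // Hypothetical helper: true when both references resolve to the same
    // IUnknown pointer, which COM defines as "the same object".
    public static bool AreSameComObject(object first, object second)
    {
        IntPtr firstUnknown = IntPtr.Zero;
        IntPtr secondUnknown = IntPtr.Zero;
        try
        {
            // Each call returns an AddRef'd IUnknown pointer for the object.
            firstUnknown = Marshal.GetIUnknownForObject(first);
            secondUnknown = Marshal.GetIUnknownForObject(second);
            return firstUnknown == secondUnknown;
        }
        finally
        {
            // Give back the references we were handed, even on failure.
            if (firstUnknown != IntPtr.Zero) Marshal.Release(firstUnknown);
            if (secondUnknown != IntPtr.Zero) Marshal.Release(secondUnknown);
        }
    }
}
```

Passing the same object twice should compare equal, while two separate instances should not, no matter which interfaces the references were originally typed as.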

Defining an Interop Interface

Enough talk, let's get to work. The first step in translating any COM interface is understanding how it's used. This will become very important as our interfaces get more complex, because there are often multiple ways to translate a single function, depending on how the method is called and/or implemented.

To start with, we'll want to have handy the IUnknown page on MSDN, which also tells us that IUnknown is defined in Unknwn.idl. (If you don't have the Windows SDK installed, you can get everything you need off MSDN, but you really should go get it.)

Another useful resource is the P/Invoke Wiki, which contains community-submitted translations of a large percentage of the SDK. In fact, if you're in a hurry you can often just head there and copy/paste their definitions into your code and go from there. Despite the name, P/Invoke includes an extensive set of COM Interop definitions in addition to the platform invoke signatures.

However, be careful using the translations from that site -- they are not vetted by anyone as being correct. More importantly, as mentioned, there are frequently multiple ways to translate a single interface, some of which are more effective than others. Since anyone can correct any errors, you rarely find a signature that is flat out wrong on the site. However, I often find the translations to be technically correct but practically useless for the situation I'm in. This is one reason I prefer to do them myself, so I can easily spot when other people make mistakes, or just alternate choices that I don't agree with. Mostly, I use P/Invoke as a quick check of my own work; if I see something significantly different between my version and theirs, I'll make sure I understand why before moving on.

So: IUnknown. As a reminder, this is the IDL for the interface, from Unknwn.idl (I'm omitting all the cpp_quote commands, as they're just included to make the C headers more useful):
[local, object, uuid(00000000-0000-0000-C000-000000000046), pointer_default(unique)]
interface IUnknown
{
    typedef [unique] IUnknown *LPUNKNOWN;
    HRESULT QueryInterface(
        [in] REFIID riid,
        [out, iid_is(riid), annotation("__RPC__deref_out")] void **ppvObject);
    ULONG AddRef();
    ULONG Release();
}
To translate this, we need an existing C# project; I have one specifically for COM Interop definitions, which I find convenient. So, I'll open up my KutuluWare.Framework.Interop project and add a new C# interface called IUnknown:
namespace KutuluWare.Framework.ComInterop
{
    public interface IUnknown { }
}
The next couple of steps are standard, boilerplate code you need to include on any interop definition. We need to tell the compiler that this is a COM interface, and identify exactly which interface we're talking about. We do this by adding three .NET attributes to our interface definition (plus a handy comment in case I want to refer back to the original source later):
namespace KutuluWare.Framework.ComInterop
{
    /// <summary>
    /// Translation of the IUnknown interface from the Windows SDK v7.1, Unknwn.idl
    /// </summary>
    [ComImport]
    [Guid("00000000-0000-0000-C000-000000000046")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IUnknown { }
}
All three attributes are found in the System.Runtime.InteropServices namespace, where almost all of the useful COM Interop classes are. Let's take these one at a time:

ComImportAttribute: This attribute is what tells the runtime that this is an interop definition of an interface already defined by COM. This triggers all of the important COM-like behaviors on method calls made through the interface, such as creating the RCW and inserting the IUnknown calls where needed.

GuidAttribute: We can apply a GuidAttribute to any type we want, but with a ComImport type it has a special meaning. Each COM interface is identified by a pre-defined, publicly known Interface Identifier, or IID, that is a globally unique identifier. (Note that most of the ID acronyms you see in COM, such as GUID, CLSID, and IID, are all just different special-case uses of a UUID; the only exception is ProgID, which is a string, and we won't see much of those.) The IID is used anywhere COM needs to know the "type" of an interface, so it is absolutely crucial that we get it right.

The correct value for the GUID attribute can always be found in the IDL; specifically, it's given by the uuid attribute on the interface:
    uuid(00000000-0000-0000-C000-000000000046)
Applying this attribute to our interface does two things for us. The immediate benefit is that the runtime uses this GUID when it needs to identify the interface to unmanaged code (in particular with QueryInterface, which we'll see next time). A nice bonus, however, is that the GUID is now part of the managed type's metadata, specifically the Type.GUID property. Since COM often requires us to pass IIDs to various functions, we can take advantage of this to avoid hard-coding GUIDs, and use expressions like typeof(IUnknown).GUID instead.
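
As a quick sketch of that last point, the snippet below reads the IID back out of the type's metadata instead of hard-coding it a second time. The IidDemo class and IidOf helper are hypothetical names made up for this example, and the interface is declared without a namespace just to keep the sample short.

```csharp
using System;
using System.Runtime.InteropServices;

// Same attributes as the interop definition discussed in this post.
[ComImport]
[Guid("00000000-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IUnknown { }

public static class IidDemo
{
    // Hypothetical helper: returns the IID recorded by the Guid attribute on T.
    public static Guid IidOf<T>() => typeof(T).GUID;

    public static void Main()
    {
        Guid iid = IidOf<IUnknown>();
        Console.WriteLine(iid);   // prints 00000000-0000-0000-c000-000000000046
    }
}
```

The GUID string now lives in exactly one place, so any API that wants an IID can be fed from the type itself.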

InterfaceTypeAttribute: In COM, there are three types of interfaces: IUnknown, IDispatch, and dual. IUnknown-type interfaces are the most common type in the Windows SDK itself -- they are "early-bound" interfaces where the compiler knows all the methods, types, etc. and produces code that calls into the interface directly. IDispatch-type interfaces are used mostly for script-based automation purposes -- they are "late-bound", in that the method names are looked up at run-time and dispatched by the IDispatch handler, and any type or name mismatches aren't detected until then. A dual interface is an interface that supports both types of method call, depending on the client. IDispatch-only interfaces are not too common, since it usually takes minimal effort to convert one to a dual interface and gain all the early-bound benefits that come with it. For this reason, most ActiveX and OLE Automation interfaces are of the dual type, and ComInterfaceType.InterfaceIsDual is the default for an imported COM interface. In this case, we aren't looking at dual interfaces, so we specify ComInterfaceType.InterfaceIsIUnknown to indicate an IUnknown-only interface.
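
To make the three flavors concrete, here is a small sketch that declares one interface of each kind and reads the setting back via reflection. Everything in it is made up for illustration: the interface names are hypothetical and the GUIDs are placeholders, not real IIDs.

```csharp
using System;
using System.Runtime.InteropServices;

// Early-bound only: calls go straight through the vtable.
[ComImport, Guid("11111111-1111-1111-1111-111111111111")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IEarlyBoundOnly { }

// Late-bound only: calls are dispatched by name through IDispatch.
[ComImport, Guid("22222222-2222-2222-2222-222222222222")]
[InterfaceType(ComInterfaceType.InterfaceIsIDispatch)]
public interface ILateBoundOnly { }

// No InterfaceType attribute at all: the runtime treats this as dual.
[ComImport, Guid("33333333-3333-3333-3333-333333333333")]
public interface IDualByDefault { }

public static class InterfaceKinds
{
    // Hypothetical helper: reports the declared kind, falling back to the
    // documented default (dual) when the attribute is absent.
    public static ComInterfaceType KindOf(Type interfaceType)
    {
        var attribute = (InterfaceTypeAttribute)Attribute.GetCustomAttribute(
            interfaceType, typeof(InterfaceTypeAttribute));
        return attribute?.Value ?? ComInterfaceType.InterfaceIsDual;
    }
}
```

Forgetting the attribute on an IUnknown-only interface silently gives you a dual declaration, which is why it belongs in the boilerplate for every SDK interface we translate.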

Notice that we've ignored the other MIDL attributes from the original definition; to C# they don't mean anything. The local attribute is used (with call_as) to generate remote-callable wrappers for remotable interfaces, which we aren't doing here. The object attribute simply marks this as a COM interface rather than a plain RPC interface, which is already implied for every interface we import into C#. The final one, pointer_default, describes the default pointer behavior for C++ clients; since we don't get to use pointers directly, this one is also irrelevant.

That's it for the basic definition of this interface. Next time, we'll finish this out by declaring the methods, and start looking at some of the interesting things the runtime marshaller does to make our lives easier.