Monday, January 02, 2012

COM Interop Part 4: A Look at IDL

Last time, I promised we would get to some real code. That was only slightly a lie: there's code in this post but it's not C#. Instead, I wanted to talk about one more aspect of COM development that we'll be seeing a lot of, once we get started.

As I mentioned previously, the definitions for most of the COM interfaces exposed and/or consumed by Windows are found in the Windows SDK header files. Technically, however, the header files are not the official source of those definitions. Instead, the headers are themselves auto-generated from a set of IDL files, which contain the definitive interface definitions.

A Brief History of IDL

IDL is a fairly generic term -- Interface Definition Language -- that describes any computer language used to specify a software system's external component interface. There is a wide variety of such IDLs -- WSDL, for example, is an IDL describing SOAP web service interfaces, while Facebook developed their own IDL named Thrift to describe their services. IDLs are particularly common when describing a system that includes remote procedure calls, or RPC, in which different systems, potentially running vastly different operating systems, are expected to communicate with each other. The IDL provides a language- and platform-neutral way of describing the interfaces, which can then be consumed by the appropriate tools for each development environment in use.

In our case, the IDL we are talking about is the Microsoft IDL (MIDL). The best-known IDL of that era is probably the OMG IDL specification, which was designed to describe CORBA, a vendor-neutral inter-process and inter-object technology that is conceptually very similar to COM/OLE, JavaBeans, or D-Bus. MIDL, however, descends from the IDL of a multi-vendor software system called the "Distributed Computing Environment", or DCE, which standardized many of the functions a networked computing environment needed to perform.

When Microsoft was building out the Windows NT domain networking model, they included a form of DCE/RPC. They used a slightly modified version of the DCE IDL to describe the interfaces into their networking system. When they

Interface Definitions in IDL

MIDL is a specification language that is very similar to C-style type definitions, but with the addition of a special syntax for adding metadata to those types via attributes. These attributes will look familiar to any C# programmer, as they work essentially the same way in MIDL. For example, here is a simple MIDL definition for the definitive COM interface:
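
    [local, object, uuid(00000000-0000-0000-C000-000000000046), pointer_default(unique)]
    interface IUnknown
    {
        typedef [unique] IUnknown *LPUNKNOWN;
        HRESULT QueryInterface(
            [in] REFIID riid,
            [out, iid_is(riid), annotation("__RPC__deref_out")] void **ppvObject);
        ULONG AddRef();
        ULONG Release();
    }
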
As you can see, if you ignore the elements inside the square brackets, this is almost a valid C++ interface definition. The first potential stumbling block here is the type names. The MIDL files use the same type name aliases as the C header files, but if you've never done any C/C++ programming for Windows before, the names might be confusing. The SDK headers generally avoid using built-in C type names (like "int" or "long") because their byte sizes can vary: the language standards mandate the minimum size for these types, but not the maximum. To avoid problems with binary incompatibility, Windows development quickly standardized on type aliases, either as typedefs or macros, that explicitly specify the size of the variables. (A quick rule of thumb: if the type name is in ALL CAPITALS, it's probably a typedef or macro.)
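
To make the sizing problem concrete, here is a minimal C++ sketch of how aliases can be pinned to exact widths. The alias names mirror the Windows ones, but they are declared by hand on top of `<cstdint>` inside a hypothetical `sketch` namespace; the real SDK headers define them differently, so treat this as an illustration of the idea and not as a copy of the headers.

```cpp
#include <climits>
#include <cstdint>

namespace sketch {  // hypothetical namespace; not part of the Windows SDK

// Hand-written stand-ins for a few Windows aliases, pinned to exact widths.
using BYTE  = std::uint8_t;   // 8-bit unsigned
using WORD  = std::uint16_t;  // 16-bit unsigned
using DWORD = std::uint32_t;  // 32-bit unsigned
using LONG  = std::int32_t;   // 32-bit signed
using ULONG = std::uint32_t;  // 32-bit unsigned

// Width of a type in bits on the compiler building this file.
template <typename T>
constexpr unsigned bits() {
    return static_cast<unsigned>(sizeof(T) * CHAR_BIT);
}

// The built-in names only come with a guaranteed minimum...
static_assert(bits<int>() >= 16, "int is at least 16 bits");
static_assert(bits<long>() >= 32, "long is at least 32 bits");
static_assert(bits<long long>() >= 64, "long long is at least 64 bits");

// ...while the aliases have the same width on every compiler.
static_assert(bits<BYTE>() == 8, "BYTE is always 8 bits");
static_assert(bits<WORD>() == 16, "WORD is always 16 bits");
static_assert(bits<DWORD>() == 32, "DWORD is always 32 bits");
static_assert(bits<LONG>() == 32, "LONG is always 32 bits");
static_assert(bits<ULONG>() == 32, "ULONG is always 32 bits");

}  // namespace sketch
```

The first group of checks can only use ">=", which is exactly why a binary interface cannot be described in terms of "int" and "long".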

This page has a nice table (under "Marshalling Arguments") listing the mapping from Windows type to C++ type to CLR type for most of the things you'll find in the IDL or header files, which you can also find near the top of the Wtypes.h header file. Mostly they are pretty self-evident, and translate into similarly-named types in C#. The only exceptions here are the LONG/ULONG types that are prevalent in COM. When these type names were devised, 32-bit Windows was the norm; in most 32-bit C compilers, "int" and "long" are the same size -- System.Int32 -- with the System.Int64 type being called "long long". A LONG therefore maps to a C# int, not a C# long, and a ULONG maps to a C# uint. For types that aren't listed in that table, you may need to look up their definition in Wtypes.h; we'll see how to do that later on.
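
The LONG trap is easier to avoid if you pick the CLR type from the size and signedness of the native type and ignore its name. Here is a rough C++ sketch of that rule. The `Clr` enum and the `clr_type` helper are invented for this example, and the three aliases are again hand-written stand-ins for the Windows ones.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {  // hypothetical helper, not part of any SDK

// Stand-ins for the Windows aliases, declared by hand for this example.
using LONG     = std::int32_t;
using ULONG    = std::uint32_t;
using LONGLONG = std::int64_t;

// The CLR integer types, named after their System.* counterparts.
enum class Clr { SByte, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Unknown };

// Chooses the CLR type from the size and signedness of T, never from its name.
// Assumes 8-bit bytes.
template <typename T>
constexpr Clr clr_type() {
    static_assert(std::is_integral<T>::value, "integer types only");
    switch (sizeof(T)) {
        case 1: return std::is_signed<T>::value ? Clr::SByte : Clr::Byte;
        case 2: return std::is_signed<T>::value ? Clr::Int16 : Clr::UInt16;
        case 4: return std::is_signed<T>::value ? Clr::Int32 : Clr::UInt32;
        case 8: return std::is_signed<T>::value ? Clr::Int64 : Clr::UInt64;
        default: return Clr::Unknown;
    }
}

// LONG is System.Int32 (C# int); only the 64-bit alias is System.Int64 (C# long).
static_assert(clr_type<LONG>() == Clr::Int32, "LONG maps to System.Int32");
static_assert(clr_type<ULONG>() == Clr::UInt32, "ULONG maps to System.UInt32");
static_assert(clr_type<LONGLONG>() == Clr::Int64, "LONGLONG maps to System.Int64");

}  // namespace sketch
```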

The real key to MIDL lies in the pieces inside the square brackets -- the MIDL attributes. They include the extra pieces of information that aren't strictly part of the interface definition, but describe properties of the interface. Note that metadata here can be applied to the interface itself, a function, or to individual parameters in a function. For example, the MIDL attributes here specify that this is a non-remoteable ("local") custom ("object") COM interface with the specified IID ("uuid") that has non-aliasable ("unique") pointers. They also designate that the first function has one input parameter and one output parameter, and that the IID of the second parameter is stored in the first one.
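
To show what the parameter attributes are describing, here is a small portable C++ sketch of the same interface with the brackets stripped, plus a toy implementation. Everything lives in a hypothetical `sketch` namespace: `HRESULT`, `GUID`, `REFIID` and the error codes are simplified stand-ins for the SDK definitions, and `Widget` and `demo` are invented for the example. The real headers also add calling-convention macros that are left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

namespace sketch {  // hypothetical stand-ins, not the Windows SDK definitions

using HRESULT = std::int32_t;
using ULONG   = std::uint32_t;
constexpr HRESULT S_OK          = 0;
constexpr HRESULT E_NOINTERFACE = static_cast<HRESULT>(0x80004002u);
constexpr HRESULT E_POINTER     = static_cast<HRESULT>(0x80004003u);

struct GUID {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};
using REFIID = const GUID&;

// 00000000-0000-0000-C000-000000000046, the uuid from the IDL above.
constexpr GUID IID_IUnknown = {0, 0, 0, {0xC0, 0, 0, 0, 0, 0, 0, 0x46}};

inline bool same_guid(REFIID a, REFIID b) {
    return std::memcmp(&a, &b, sizeof(GUID)) == 0;
}

// The interface with the bracketed MIDL attributes reduced to comments.
struct IUnknown {
    virtual HRESULT QueryInterface(REFIID riid,           // [in]
                                   void** ppvObject) = 0; // [out, iid_is(riid)]
    virtual ULONG AddRef() = 0;
    virtual ULONG Release() = 0;
protected:
    ~IUnknown() = default;  // objects are destroyed via Release(), not delete
};

// Toy object that supports IUnknown and nothing else.
class Widget final : public IUnknown {
public:
    HRESULT QueryInterface(REFIID riid, void** ppvObject) override {
        if (ppvObject == nullptr) return E_POINTER;
        if (same_guid(riid, IID_IUnknown)) {
            // iid_is(riid): the pointer written here has the type named by riid.
            *ppvObject = static_cast<IUnknown*>(this);
            AddRef();
            return S_OK;
        }
        *ppvObject = nullptr;
        return E_NOINTERFACE;
    }
    ULONG AddRef() override { return ++refs_; }
    ULONG Release() override {
        const ULONG left = --refs_;
        if (left == 0) delete this;
        return left;
    }
private:
    ~Widget() = default;
    ULONG refs_ = 1;  // the creator holds the first reference
};

// Walks through the protocol once and returns the final reference count.
inline ULONG demo() {
    IUnknown* first = new Widget;                            // count: 1
    void* raw = nullptr;
    HRESULT hr = first->QueryInterface(IID_IUnknown, &raw);  // count: 2
    assert(hr == S_OK && raw != nullptr);
    static_cast<IUnknown*>(raw)->Release();                  // count: 1
    return first->Release();                                 // count: 0, object freed
}

}  // namespace sketch
```

The `void**` by itself tells a caller nothing about what comes back; it is the `iid_is(riid)` attribute that records the link between the two parameters, which is why it matters so much once a marshaller (or the CLR) has to make sense of the call.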

Going Forward

In general, I am going to prefer the IDL definitions over the C headers whenever we translate a new interface. I do this for several reasons. Philosophically, the IDL is the "correct" place to go for the interface definitions (although the header files are what actually get used by C/C++ programs, so which one is "more right" is debatable). More importantly, the IDL syntax is much cleaner than the corresponding C header file definition. The purpose of IDL is not to be a functional type definition for any given language. Rather, it simply provides a syntax that can be used to generate functional type definitions from this simple interface description. The header files, by contrast, include all the scaffolding and bookkeeping needed to support multiple versions of multiple compilers for multiple languages (!). For example, every interface is defined twice in the header files -- once for C and once for C++, using syntax appropriate for the language.
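
To give a feel for the "defined twice" point, here is a simplified sketch of the two shapes for a made-up two-method interface. The names `IRefCountedCpp`, `IRefCountedC`, `Counter` and the helper functions are hypothetical; only the `lpVtbl` member name follows the convention of the generated headers, and the calling-convention and export macros the real headers wrap around everything are omitted.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {  // hypothetical names throughout

using ULONG = std::uint32_t;

// Shape 1: what a C++ translation unit sees, an abstract class.
struct IRefCountedCpp {
    virtual ULONG AddRef() = 0;
    virtual ULONG Release() = 0;
};

// Shape 2: what a C translation unit sees, a struct holding a pointer to a
// table of function pointers, each taking the object as an explicit argument.
struct IRefCountedC;
struct IRefCountedCVtbl {
    ULONG (*AddRef)(IRefCountedC* self);
    ULONG (*Release)(IRefCountedC* self);
};
struct IRefCountedC {
    const IRefCountedCVtbl* lpVtbl;
};

// A tiny C-style implementation of the second shape.
struct Counter {
    IRefCountedC iface;  // first member, so IRefCountedC* and Counter* coincide
    ULONG refs;
};

inline ULONG counter_add_ref(IRefCountedC* self) {
    return ++reinterpret_cast<Counter*>(self)->refs;
}

inline ULONG counter_release(IRefCountedC* self) {
    Counter* obj = reinterpret_cast<Counter*>(self);
    assert(obj->refs > 0);  // releasing more often than acquiring is a caller bug
    return --obj->refs;     // this toy version never frees anything
}

const IRefCountedCVtbl kCounterVtbl = {counter_add_ref, counter_release};

// Creates a counter that starts with one reference.
inline Counter make_counter() {
    return Counter{{&kCounterVtbl}, 1};
}

}  // namespace sketch
```

A C caller would write `p->lpVtbl->AddRef(p)` where a C++ caller writes `p->AddRef()`. The IDL says the same thing once, with none of this plumbing in the way.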

In future posts we'll look in much more detail at the IDL definitions, and cover what all of those attributes mean, and which ones you can safely ignore. Not all of them mean anything useful to C# (in fact, only four of the nine attributes in that snippet have C# equivalents, and three of them are optional), but the ones that do are often critical.

Next time, some C#, I promise.