Monday, February 20, 2012

We Are Marshal(ing)!

If you've been following my interop posts you've seen me toss around the word "marshaling" a lot, without really talking about what that means or how it works. This is an unfortunate oversight, since marshaling is possibly the most central concept behind interop programming. So, let's remedy that. Let's see what it means to marshal.

Marshaling vs. Serialization?

Marshaling (the second 'l' is optional), in general, occurs whenever data is being copied across some logical boundary. When an object is packaged up for transport, we say it is being marshaled; when it is extracted for use on the other side, it has been demarshaled. The whole process, from start to finish, is also referred to as marshaling.

In the C# world, we usually talk about marshaling in relation to the managed/unmanaged code boundary, but there are plenty of other kinds of marshaling going on. Essentially, marshalling happens whenever:
  1. An object is being copied from point A to point B, when
  2. The processes in charge of A and B cannot share the object directly, because
  3. There is some well-defined data boundary in the middle.
In .NET, the most common form of marshalling happens whenever you call into a web or WCF service. Parameters and return values have to be marshalled and demarshalled to cross the process or system boundary. In this case, we usually talk about "serialization", but the two are very closely related (and the distinction is somewhat fuzzy.) In a sense, serialization is a specialized case of marshaling; in another sense, it is the mechanism by which marshaling happens. The primary difference is one of degree, namely, how fully the original object is carried over to its new location.

When an object is serialized, its state is transformed into some form that is sent over the network, or to disk, or whatever. When it is deserialized, that state information is used to construct a new copy of the object in the same state as the original instance. Marshaled objects, on the other hand, are transferred intact, including any codebase associated with them, to whatever extent that is physically possible.

Interop Marshaling

For now, let's just focus on the kind of marshaling we are most interested in: marshaling objects between managed and unmanaged segments of memory. The runtime already does a lot of this for us, including the automatic marshaling of strings between System.String and BSTR or LPWSTR, the automatic marshaling of C# classes that expose COM interfaces, and the automatic marshaling of C-style structures into C#-style value types. If the default behavior of the runtime marshaler isn't quite right, the judicious application of MarshalAs attributes is usually enough to fix it.

But there are times when things are just too complex for the runtime to handle on its own, and we need to step in and take direct control. That's where the Marshal object comes into play.

Grokking The IntPtr

Unfortunately, at the risk of driving many of you running for cover, I now have to break some tragic news to you. Understanding the marshaling process means understanding pointers. Pointers are simple to define but complex to explain, and the worst kind of language feature: hard to get right, easy to get wrong, and catastrophic when you do.  A ridiculous percentage of bugs, across all applications, are the result of bad memory management due to a misunderstanding of how pointers work.

Fortunately, we don't have to know all about pointers to do interop, just enough about them to find the data we want. If you're truly interested, you can find good introductions to them here and here. For our purposes, we'll ignore most of the complexities of pointers, and stick with this very simplified definition: a pointer represents a chunk of memory. That memory could be 4 bytes (to hold an int), 16 bytes (to hold a decimal), or hundreds of megabytes (to hold a database).

Since (safe) C# code cannot deal directly with pointers, the BCL introduces a special type, the IntPtr, to represent them. IntPtr is defined as "A platform-specific type that is used to represent a pointer or a handle". The platform-specific part is the size, which matches the native pointer size of the process (32 or 64 bits). If you're familiar with C, an IntPtr is roughly the same as a void *, in both size and purpose. (They are also used to hold Windows API handles, like HWND, but those are mostly unrelated to marshaling.)

When used by the Marshal class, the intent is that your IntPtr represents a pointer to some block of unmanaged memory that you need to manipulate. Where do we get these IntPtrs from? Basically, three places:
  1. We received one as the result of a memory allocation operation, for whatever purpose we might need. 
  2. We received one as a parameter or return value from an interop function, such as an out or ref parameter, or a structured parameter.
  3. We calculate one by adding or subtracting from another IntPtr, to get back an offset into the original memory block.
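
To make the first and third sources concrete, here is a minimal sketch. The class and method names (IntPtrSources, ReadSecondInt) are made up for illustration; only the Marshal and IntPtr calls come from the framework. It allocates room for two integers, then reaches the second one by offsetting the original pointer:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class IntPtrSources
{
    // Writes two ints into unmanaged memory and reads the second one back
    // through a pointer calculated from the first.
    public static int ReadSecondInt()
    {
        // Source 1: a pointer returned by a memory allocation call.
        IntPtr block = Marshal.AllocHGlobal(2 * sizeof(int));
        try
        {
            Marshal.WriteInt32(block, 0, 10);
            Marshal.WriteInt32(block, sizeof(int), 20);

            // Source 3: a pointer calculated as an offset from another pointer.
            IntPtr second = IntPtr.Add(block, sizeof(int));
            return Marshal.ReadInt32(second);
        }
        finally
        {
            // Unmanaged memory is never garbage collected, so release it here.
            Marshal.FreeHGlobal(block);
        }
    }
}
```

Calling IntPtrSources.ReadSecondInt() should hand back the second value written. The second source (a pointer handed to us by unmanaged code) needs a real native function, so it is left out of this sketch.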

Manual Memory Allocation

In C#, memory management is mostly an afterthought. The compiler and runtime coordinate things so that memory is there when you need it, and cleaned up when you're done. But since we're interoperating here with languages that don't do that for us, sometimes we have to stoop to their level. To help out, the Marshal class provides two different ways for us to ask it for memory: one for COM, and one for P/Invoke. Once we have that memory, the easiest way to fill it is via the series of Copy overloads that copy a managed array into unmanaged memory. (We'll see other ways to fill this memory later on):
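byte[] buffer = FillAByteBuffer();
IntPtr comMemory = Marshal.AllocCoTaskMem(buffer.Length);
Marshal.Copy(buffer, 0, comMemory, buffer.Length);
comObject.MethodThatUsesBuffer(comMemory, buffer.Length);
Marshal.FreeCoTaskMem(comMemory);  // free with the matching COM method

int[] buffer2 = FillAnIntBuffer();
var pinvokeMemory = Marshal.AllocHGlobal(buffer2.Length * sizeof(int));
Marshal.Copy(buffer2, 0, pinvokeMemory, buffer2.Length);
PInvokeFunctionThatUsesIntegers(pinvokeMemory, buffer2.Length);
Marshal.FreeHGlobal(pinvokeMemory);  // free with the matching HGlobal method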
In both cases, we're asking for a block of unmanaged memory large enough to hold our array, and receiving a pointer to that memory. We pass that pointer in to our unmanaged function, then immediately free it. That last bit is key. Jamie Zawinski once famously stated, amid a rant about how awful Java is, that its one redeeming quality that completely overshadowed everything was the fact that it did not have free(). In other words, Java does not require you to manually allocate, or manually release, the memory you use; this is the hallmark of a garbage-collected language like Java or C#. But the garbage collector will not touch memory allocated by the Marshal class; if we don't do it, the memory will leak.

Also, note that you must always free memory with the method that matches how you allocated it (FreeCoTaskMem for AllocCoTaskMem, FreeHGlobal for AllocHGlobal), because the two allocation methods draw from different allocators. (The runtime also follows this rule: out or ref parameters in COM methods are freed using FreeCoTaskMem, for example.) Actually, it's more accurate to say that someone must free the memory; that might not always be you. As a general rule, the called function allocates memory for any out/ref/return values, and the calling function frees the memory when it's finished. But always read the documentation carefully: especially in older P/Invoke calls, you sometimes get back a static buffer that you must not free yourself.
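
One way to make sure the matching free always runs, even if the unmanaged call throws, is to wrap the work in try/finally. This is only a sketch: UnmanagedBuffer and RoundTrip are hypothetical names, and the copy back into a managed array stands in for whatever the unmanaged code would really do with the buffer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class UnmanagedBuffer
{
    // Copies a managed array into COM task memory and back out again.
    public static byte[] RoundTrip(byte[] source)
    {
        IntPtr memory = Marshal.AllocCoTaskMem(source.Length);
        try
        {
            // Managed array -> unmanaged block.
            Marshal.Copy(source, 0, memory, source.Length);

            // A real caller would pass `memory` to unmanaged code here.

            // Unmanaged block -> new managed array.
            byte[] result = new byte[source.Length];
            Marshal.Copy(memory, result, 0, source.Length);
            return result;
        }
        finally
        {
            // AllocCoTaskMem must be paired with FreeCoTaskMem.
            Marshal.FreeCoTaskMem(memory);
        }
    }
}
```

The same shape works for AllocHGlobal, as long as the finally block calls FreeHGlobal instead.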

The most common reason to need to allocate a chunk of memory is to hold string data, and the Marshal class has a variety of methods specifically for this purpose. In this case, we are asking the CLR to allocate as much memory as we need to hold the characters, using the appropriate allocator, then automatically copy the characters into it and return an IntPtr to the memory buffer:
var s = "Interop is fun!";

IntPtr comBstr = Marshal.StringToBSTR(s);

IntPtr comWideString = Marshal.StringToCoTaskMemUni(s);

IntPtr piWideString = Marshal.StringToHGlobalUni(s);
There are also ANSI and Auto variants of the Unicode functions, which mimic the behavior of the corresponding UnmanagedType enumeration value. It's rare for a Windows COM object not to use Unicode, however, so I tend to stick to that. Note that COM objects can return either a BSTR (an OLE-style string) or a simple C-style wchar_t *; be sure you don't confuse the two.

We can also marshal data in the opposite direction: given a pointer, we can get the underlying string data from it. This works just like you would expect:
IntPtr bstr = comObject.GetAComString();
string s1 = Marshal.PtrToStringBSTR(bstr);

IntPtr comLpwsz = comObject.GetAWideCharString();
string s2 = Marshal.PtrToStringUni(comLpwsz);

IntPtr piLpwsz = GetAPInvokeString();
string s3 = Marshal.PtrToStringUni(piLpwsz);
(There are Auto and Ansi variants of PtrToStringUni as well.) Notice that, when a COM object chooses to return a C-style string, it is nearly identical to a P/Invoke function doing so. The only meaningful difference between the two, from our perspective, is which function we use to free it.
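
Putting both directions together, the following sketch round-trips a managed string through each kind of unmanaged string and releases it with the matching free method. StringRoundTrip and its method names are hypothetical; in real interop code the pointer would come from, or go to, an unmanaged function rather than being read straight back.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class StringRoundTrip
{
    // BSTR: allocated by StringToBSTR, released by FreeBSTR.
    public static string ViaBstr(string text)
    {
        IntPtr bstr = Marshal.StringToBSTR(text);
        try { return Marshal.PtrToStringBSTR(bstr); }
        finally { Marshal.FreeBSTR(bstr); }
    }

    // Wide C-style string in COM task memory: released by FreeCoTaskMem.
    public static string ViaCoTaskMem(string text)
    {
        IntPtr wide = Marshal.StringToCoTaskMemUni(text);
        try { return Marshal.PtrToStringUni(wide); }
        finally { Marshal.FreeCoTaskMem(wide); }
    }

    // Wide C-style string in HGlobal memory: released by FreeHGlobal.
    public static string ViaHGlobal(string text)
    {
        IntPtr wide = Marshal.StringToHGlobalUni(text);
        try { return Marshal.PtrToStringUni(wide); }
        finally { Marshal.FreeHGlobal(wide); }
    }
}
```

Note that the two wide-string versions read the data with the same PtrToStringUni call and differ only in how the memory is freed.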

Why Bother?

The examples so far have been pretty contrived; all we've managed to do is replicate the same steps the runtime would do automatically, if we used properly-declared string parameters. The obvious follow-up question, then, is why would we ever want to do this manually?

The reason is that, particularly with older C code, it is often impossible for us to know ahead of time whether a given value will be a string or a number. Unlike C#, which enforces strict type-safety rules, C is notoriously lenient in the kind of "typecast tricks" you can use. One particularly common technique takes advantage of the fact that, at least on PC hardware, pointers and integers are the same size. A classic example of this is the Windows API's WNDPROC type, which is used to send messages to UI windows. All messages get the same two parameters, wParam and lParam, which are pointer-sized integers (32 bits wide on 32-bit Windows). But, in practice, those values could be integers, strings, or pointers to more complex structures. For example, if we needed to handle the WM_GETTEXT message in C#, where wParam holds the size of the caller's buffer in characters (including the terminating null) and lParam points to that buffer, we would need to do this:
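private IntPtr WindowProcedure(IntPtr handle, int message, IntPtr wParam, IntPtr lParam)
{
  if (message != Message.WM_GETTEXT || wParam == IntPtr.Zero) return IntPtr.Zero; // other messages omitted
  // Never copy more characters than the caller's buffer can hold.
  int count = Math.Min(this.Text.Length, wParam.ToInt32() - 1);
  Marshal.Copy(this.Text.ToCharArray(), 0, lParam, count);
  Marshal.WriteInt16(lParam, count * sizeof(char), 0); // terminating null
  return new IntPtr(count); // number of characters copied
}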
A similar problem occurs in COM with the PROPVARIANT type used to hold COM properties (e.g. the Title and Author of a Word document). This type is a union of over 70 distinct types, including a mix of value and reference types. This isn't legal in C#, even on explicitly laid-out structures, so we need to use an IntPtr for those fields (I plan to cover property storage later; for now focus on how we get the data out of the resulting property variant):
PropertySpec[] requested = MakePropSpec("Author");
PropertyVariant[] values = propSet.ReadMultiple(1, requested);

var author = Marshal.PtrToStringUni(values[0].OtherValue);
There is an additional use for manual string marshaling that is wholly unrelated to interop: the SecureString class. This class stores its string data, encrypted, in an unmanaged memory block (so it won't be GC'd, swapped out, found through reflection, etc.) To get back the string data, you need to manually marshal it back to a managed string:
var s = "Sooper Sekrit.";

var secure = new SecureString();
s.ToList().ForEach(c => secure.AppendChar(c));

IntPtr securePtr = Marshal.SecureStringToCoTaskMemUnicode(secure);
string insecure = Marshal.PtrToStringUni(securePtr);
Marshal.ZeroFreeCoTaskMemUnicode(securePtr);  // zeros the memory, then frees it
Again, there are Ansi versions of these Marshal methods. Also, notice that there is a dedicated method for freeing a secure string's memory, ZeroFreeCoTaskMemUnicode, that zeros the memory out first; you should always use this in lieu of the standard free method.

Value Type Marshaling

Strings aren't the only thing we may need to manually marshal. If you downloaded the code for the managed IStream wrapper from an earlier post, you've seen another common example of a C technique that C# cannot properly express: "optional" value parameters. These are out or ref parameters where the caller, if they aren't interested in the value, can simply pass NULL.

Recall from our epic COM Interop series that ref and out parameters are implemented in C through pointers: the unmanaged code receives a pointer to a block of memory large enough to hold a single value type and writes the value there. (This is done because C has no concept of "by reference" parameters: everything is passed by value, but that "value" could be a pointer.) If the parameter is required, then the runtime does this pointer indirection automatically. If the parameter is optional, however, we are forced to manually marshal the primitive type through an IntPtr value:
public void CopyTo(IStream stream, long count, IntPtr read, IntPtr written)
{
  // Do the copy work and track bytes read.

  // The caller passes NULL (IntPtr.Zero) when it does not want the value.
  if (read != IntPtr.Zero)
      iop.Marshal.WriteInt64(read, totalRead);
}
The Marshal class has Read and Write methods for the common integral sizes (Byte, Int16, Int32, and Int64) as well as for IntPtr itself. If you need something else, like a double, you will probably need to bring in the BitConverter class as well.
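
As a sketch of the BitConverter approach: a double occupies the same 8 bytes as an Int64, so we can reinterpret its bits and reuse the Int64 methods. The DoubleMarshal class and its method names are hypothetical, not part of the framework.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class DoubleMarshal
{
    // Writes a double by reinterpreting its bits as a 64-bit integer.
    public static void WriteDouble(IntPtr destination, double value)
    {
        Marshal.WriteInt64(destination, BitConverter.DoubleToInt64Bits(value));
    }

    // Reads a 64-bit integer and reinterprets its bits as a double.
    public static double ReadDouble(IntPtr source)
    {
        return BitConverter.Int64BitsToDouble(Marshal.ReadInt64(source));
    }

    // Demonstrates both helpers against a freshly allocated block.
    public static double RoundTrip(double value)
    {
        IntPtr block = Marshal.AllocHGlobal(sizeof(double));
        try
        {
            WriteDouble(block, value);
            return ReadDouble(block);
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }
}
```

Because only the bit pattern is moved, the value that comes back is exactly the one that went in, with no rounding.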

Complex Value Types

The most complex forms of manual marshaling come into play when you need to marshal a complex value type (a structure). Depending on how much of a structure you need, you have two options for extracting structure data from an IntPtr. The normal process is to translate the entire structure into C#, and use the PtrToStructure method to marshal the data intact. However, some complex structures may be more difficult to translate than is warranted. In these cases, you can manually extract individual fields from a structure by doing "pointer math": calculating offsets into a block of memory based on the sizes of the fields in the structure. This last technique is one of the most low-level operations you can perform in C# without going unsafe, so make sure you know what you're getting into.

Here's an example of extracting a field using both techniques; note that we're calling the same unmanaged function, which constructs a structure and returns a pointer to it:
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct Data
{
  public int field1;
  public int field2;
  public int field3;
  [MarshalAs(UnmanagedType.ByValTStr, SizeConst=256)] public string name;
}

// Technique 1: marshal the entire structure.
IntPtr result1 = GetDataStructure();
Data data = (Data)Marshal.PtrToStructure(result1, typeof(Data));
var name1 = data.name;

// Technique 2: use pointer math to reach a single field.
IntPtr result2 = GetDataStructure();
IntPtr namePtr = IntPtr.Add(result2, 3 * sizeof(int));
var name2 = Marshal.PtrToStringUni(namePtr);
In the second example, we need to calculate how far into our structure's "memory block" we can find the field we are interested in. Doing so requires a complete understanding of not just structure layout and data sizes, but also the alignment packing involved. Doing this by hand can sometimes be dangerous; fortunately we can get some help:
IntPtr result2 = GetDataStructure();
IntPtr offset = Marshal.OffsetOf(typeof(Data), "name");
IntPtr namePtr = IntPtr.Add(result2, offset.ToInt32());
var name = Marshal.PtrToStringUni(namePtr);
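
If you want to try the offset technique without an unmanaged library on hand, one option is to build the memory block yourself with StructureToPtr. The sketch below assumes a hypothetical Sample structure laid out like the one above, and a made-up OffsetDemo helper; it writes the structure into unmanaged memory, then reads only the name field back through its calculated offset.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structure, mirroring the layout discussed above.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct Sample
{
    public int a;
    public int b;
    public int c;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)] public string name;
}

// Hypothetical helper, for illustration only.
public static class OffsetDemo
{
    public static string ReadNameByOffset(Sample sample)
    {
        // Stand-in for a pointer that unmanaged code would normally return.
        IntPtr block = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Sample)));
        try
        {
            Marshal.StructureToPtr(sample, block, false);

            // Let the marshaler work out size, alignment, and packing for us.
            IntPtr offset = Marshal.OffsetOf(typeof(Sample), "name");
            IntPtr namePtr = IntPtr.Add(block, offset.ToInt32());

            // The field is an inline, null-terminated wide-character buffer.
            return Marshal.PtrToStringUni(namePtr);
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }
}
```

Passing in new Sample { a = 1, b = 2, c = 3, name = "interop" } should return the name without ever marshaling the three integer fields back.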

Manual Marshalling In Action

In upcoming posts, we'll occasionally get into situations where manual marshaling is required. The COM property system is a common case; structured storage also uses a data type called a "string name block" that requires some effort. Fortunately, the Windows API is mostly written for ease of language interoperability (including languages such as Pascal and VB), so these are rare. When using third-party unmanaged libraries, written by C developers for C developers, these problems tend to crop up more often. Hopefully, you're now equipped to handle them.


jeffery said...

If you know anything about marshaling see if you can create a simple variant type that can be turned into an array (must hold byte format). This is the problem: I can marshal object array as variant array containing bytes to unmanaged code. Here is the forum post:

I tried asking them and trying many variances.
