A Neat Stack Corruption, or Reverse P/Invoke Structure Packing with Output Parameters

August 18, 2015

3 comments

I know, I’m working hard on beating my record for longest post title ever. I also thought of adding a random Win32 API to the title, say CoMarshalInterThreadInterfaceInStream or AccessCheckByTypeResultListAndAuditAlarmByHandle. But I didn’t, so here we are. What was I saying? Oh yeah, a neat stack corruption I spent a couple of hours chasing last week.

I was doing my usual reverse P/Invoke where I call a Windows API and pass a delegate as a callback. There’s a bunch of APIs in Win32 that take callbacks, but for the sake of this post let’s take a look at a very simple example with an unmanaged library of my own. Here’s the C++ code:

struct CALLBACK_OUTPUT
{
	__int32* ErrorCodeArray;
	unsigned __int64 NumberOfErrors;
};

typedef __int32(__stdcall *FROB_WIDGET_CALLBACK)(
	__int32 OperationCode,
	CALLBACK_OUTPUT* CallbackOutput
	);

extern "C" __declspec(dllexport) void __stdcall FrobWidgetWithCallback(
	char const* FrobName,
	FROB_WIDGET_CALLBACK Callback
	)
{
	CALLBACK_OUTPUT CallbackOutput = { 0 };
	Frobber frobber(FrobName);
	if (0 == Callback(OPCODE_FROB_WITH_CB, &CallbackOutput))
	{
		frobber.FrobIt();
	}
}

OK, nothing particularly exciting. There’s a function called FrobWidgetWithCallback that takes a function pointer (a callback), invokes it, and if all goes well–frobs the widget. (Thanks Raymond Chen for the random frobbing, gizmos, and widgets.)

And here’s the matching P/Invoke declaration and how I called it:

struct CALLBACK_OUTPUT
{
    public IntPtr ErrorCodeArray;
    public ulong NumberOfErrors;
}

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
delegate int FROB_WIDGET_CALLBACK(
    int OperationCode,
    ref CALLBACK_OUTPUT CallbackOutput);

class NativeMethods
{
    [DllImport(@"C:\Temp\PInvokeBug\Release\Native.dll",
               CallingConvention = CallingConvention.StdCall)]
    public static extern void FrobWidgetWithCallback(
        string FrobName,
        FROB_WIDGET_CALLBACK Callback
        );
}

class Program
{
    static int Callback(int OperationCode, ref CALLBACK_OUTPUT CallbackOutput)
    {
        Console.WriteLine("Callback invoked with opcode {0}", OperationCode);
        CallbackOutput.ErrorCodeArray = IntPtr.Zero;
        CallbackOutput.NumberOfErrors = 0;
        return 0;
    }

    static void Main(string[] args)
    {
        FROB_WIDGET_CALLBACK callback = new FROB_WIDGET_CALLBACK(Callback);
        NativeMethods.FrobWidgetWithCallback("Simple Frob", callback);
    }
}

Nothing too complicated—redeclare the structure, the callback, and the function, and then just call it with a callback. The callback doesn’t do anything interesting—it just initialises the output struct to all zeros, and returns 0.

See anything fishy yet?

Well, the result at runtime was an access violation somewhere inside the unmanaged library, after the callback was called and returned successfully. Upon closer investigation, here’s the crash site:

Access violation in FrobIt -- m_FrobName is NULL

What? How did m_FrobName become NULL? We just initialized it in the constructor from a perfectly valid value… Or was it valid?

So the first and immediate suspect is that something went wrong with the marshaling of the frob name. The unmanaged signature says char const*, whereas the managed signature says string—could that be the fault here?

Let’s add some printing code to verify:

printf("Received frob name: %s\n", FrobName);

Sure enough, the frob name is received correctly. It somehow gets corrupted later down the line. But where? And by whom?

To analyze this kind of issue effectively, you have to resort to setting memory breakpoints. Let’s figure out where m_FrobName lives and set a memory breakpoint to find when it’s being modified. But oops—Visual Studio doesn’t support memory breakpoints when doing mixed debugging, so we need to turn to our trusty friend WinDbg:

0:000> bu Native!FrobWidgetWithCallback
0:000> g
Breakpoint 0 hit
Native!FrobWidgetWithCallback:
52bc1080 55              push    ebp
0:000> p
Native!FrobWidgetWithCallback+0x6:
52bc1086 ff7508          push    dword ptr [ebp+8]    ss:002b:00c4f0c4=00c4f0cc
0:000> p
Native!FrobWidgetWithCallback+0x20:
52bc10a0 8d45ec          lea     eax,[ebp-14h]
0:000> ba w4 @@(&frobber.m_FrobName)
0:000> g
Breakpoint 1 hit
00ea0702 33c0            xor     eax,eax

Aha! Someone’s just modified m_FrobName! The way data breakpoints work is that you stop on the following instruction, so let’s disassemble a couple of instructions backwards:

0:000> ub .
00ea06e4 a822            test    al,22h
00ea06e6 8e03            mov     es,word ptr [ebx]
00ea06e8 897a04          mov     dword ptr [edx+4],edi
00ea06eb e8b0039e68      call    mscorlib_ni+0x420aa0 (69880aa0)
00ea06f0 33d2            xor     edx,edx
00ea06f2 8916            mov     dword ptr [esi],edx
00ea06f4 c7460800000000  mov     dword ptr [esi+8],0
00ea06fb c7460c00000000  mov     dword ptr [esi+0Ch],0

Looks like the last instruction, mov dword ptr[esi+0Ch],0 is responsible for the write. Indeed, we can confirm that ESI+0c points to m_FrobName. But where are we, even? What is this code?

0:000> k 3
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
00c4f06c 00e5d0aa 0xea0702
00c4f098 52bc10a9 0xe5d0aa
00c4f0bc 00ea05af Native!FrobWidgetWithCallback+0x29

Could be managed code of some sort. Let’s ask SOS:

0:000> !clrstack
OS Thread Id: 0x3434 (0)
Child SP IP Call Site
00c4f038 00ea0702 Managed.Program.Callback(Int32, Managed.CALLBACK_OUTPUT ByRef)
00c4f044 00ea067c DomainBoundILStubClass.IL_STUB_ReversePInvoke(Int32, IntPtr)
00c4f0e0 00e5d0aa [InlinedCallFrame: 00c4f0e0] 
00c4f0cc 00ea05af DomainBoundILStubClass.IL_STUB_PInvoke(System.String, Managed.FROB_WIDGET_CALLBACK)
00c4f0e0 00ea0494 [InlinedCallFrame: 00c4f0e0] Managed.NativeMethods.FrobWidgetWithCallback(System.String, Managed.FROB_WIDGET_CALLBACK)
00c4f144 00ea0494 Managed.Program.Main(System.String[])
00c4f2b8 74841366 [GCFrame: 00c4f2b8]

Hmm. So it looks like the managed callback is corrupting the m_FrobName variable, let’s confirm:

0:000> !u .
Normal JIT generated code
Managed.Program.Callback(Int32, Managed.CALLBACK_OUTPUT ByRef)
Begin 00ea06d0, size 37

C:\Temp\Program.cs @ 32:
00ea06d0 57              push    edi
00ea06d1 56              push    esi
00ea06d2 8bf9            mov     edi,ecx
00ea06d4 8bf2            mov     esi,edx
00ea06d6 b9c43c9169      mov     ecx,offset mscorlib_ni+0x4b3cc4 (69913cc4) (MT: System.Int32)
00ea06db e8142af1ff      call    00db30f4 (JitHelp: CORINFO_HELP_NEWSFAST)
00ea06e0 8bd0            mov     edx,eax
00ea06e2 8b0da8228e03    mov     ecx,dword ptr ds:[38E22A8h] ("Callback invoked with opcode {0}")
00ea06e8 897a04          mov     dword ptr [edx+4],edi
00ea06eb e8b0039e68      call    mscorlib_ni+0x420aa0 (69880aa0) (System.Console.WriteLine(System.String, System.Object), mdToken: 06000a71)

C:\Temp\Program.cs @ 33:
00ea06f0 33d2            xor     edx,edx
00ea06f2 8916            mov     dword ptr [esi],edx

C:\Temp\Program.cs @ 34:
00ea06f4 c7460800000000  mov     dword ptr [esi+8],0
00ea06fb c7460c00000000  mov     dword ptr [esi+0Ch],0

C:\Temp\Program.cs @ 35:
>>> 00ea0702 33c0            xor     eax,eax
00ea0704 5e              pop     esi
00ea0705 5f              pop     edi
00ea0706 c3              ret

I’ve highlighted the instruction which performed the write to ESI+0c. The line of code this corresponds to is:

CallbackOutput.NumberOfErrors = 0;

Now, how exactly could this innocent line be responsible for corrupting m_FrobName? It’s just writing zeros to the output structure. Specifically, it’s a 64-bit zero, so it’s being written in two sittings:

C:\Temp\Program.cs @ 34:
00ea06f4 c7460800000000  mov     dword ptr [esi+8],0     ; lower half
00ea06fb c7460c00000000  mov     dword ptr [esi+0Ch],0   ; upper half

But the real question is something else. If the ESI register points to the beginning of the output structure, why is NumberOfErrors placed at offset 8 from the beginning of the structure? The first field is an IntPtr, which is a 4-byte quantity in 32-bit processes, so shouldn’t the NumberOfErrors field be placed at offset 4? It looks like—for some reason—we’re skipping four bytes and overwriting an unrelated four bytes of memory with zeros!

Now, this is starting to make sense. If the stack is organized such that the frobber local variable is placed immediately after the CallbackOutput variable, and the m_FrobName field is the first field of the Frobber object and the C# code is overwriting an extra four bytes on the stack, that explains how frobber.m_FrobName turns out to be zero after the callback returns.

Only one question remains—why is the structure laid out such that the NumberOfErrors field begins at offset 8? And the answer, ladies and gentlemen, is structure packing.

I conveniently omitted the fact that the C++ structure was defined in a project that used #pragma pack(4), much like the majority of the Win32 headers. The P/Invoke marshaler assumes a structure packing of 8 bytes by default. As a result, the C++ structure is assumed to be 12 bytes, but the marshaler thinks it’s actually 16 bytes: a 4-byte IntPtr followed by 4 bytes of padding, followed by an 8-byte ulong.

As always, the fix is very easy—just add packing instructions and the problem goes away:

[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct CALLBACK_OUTPUT
{
    public IntPtr ErrorCodeArray;
    public ulong NumberOfErrors;
}

If you’d like to experiment with this nasty bug first-hand, you can find the full code example on GitHub. Obviously, local variable order depends on the compiler and the specific optimizations; the example above was produced with Visual C++ 2015.

Add comment
facebook linkedin twitter email

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

*

3 comments

  1. OfekAugust 18, 2015 ב 1:01 PM

    Sasha – thanks for the nice writeup!
    I encountered a similar case a while ago, and (a) stuck with VS and just attached to the process as ‘native only’ debugger (to enable memory breakpoints), (b) made use of /d1reportClassLayout, exactly for inspecting packing/alignment differences among translation units.

    Reply
    1. Sasha Goldshtein
      Sasha GoldshteinAugust 18, 2015 ב 2:24 PM

      Oh you Visual Studio lover you.

      Reply
      1. OfekAugust 19, 2015 ב 10:41 AM

        Guilty as charged 🙁

        Reply