Manual Stack Walking

July 20, 2011


Corrupted stacks are no fun at all – when you get a crash dump or a live exception in an application, pretty much the first thing you do is take a look at the call stack. When the stack itself is corrupted, your primary investigation tool is taken away.

Still, it is sometimes possible to reconstruct the stack even in the face of corruption. I’ve been showing how in the .NET Debugging and C++ Debugging courses, but by popular demand I will show one example here as well.

You can follow along on your own with the dump file, symbol file, and sources from here.

Here we go – opening the dump file in WinDbg (32-bit) produces the following output:

User Mini Dump File: Only registers, stack and portions of memory are available
. . .

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(1ed0.870): Access violation – code c0000005 (first/second chance not available)
eax=00000000 ebx=00000001 ecx=73536122 edx=00000000 esi=002af37c edi=0000004e
eip=00000000 esp=002af1a8 ebp=00000000 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
00000000 ??              ???
0:000> k
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
002af1a4 00000000 0x0

This is already bad news – the current instruction is at address 0x00000000, which means the instruction pointer (EIP) has been corrupted. You can also see that EBP has been corrupted – its value is 0x00000000 as well, which is why the k command has nothing to report.

Fortunately, ESP seems to have a valid value – well, we can’t really tell if it’s valid or not from looking at it, but we can try reading the memory it points to. If we manage to read the memory, it is almost 100% certain that ESP still points to the stack – because this is a mini dump that contains (almost) only stack memory.

If ESP indeed points to the stack, we can look at the stack manually and try to find something that looks like a return address. Immediately before the return address (four bytes lower in memory) we should find a saved EBP value – unless the frame uses FPO, which I plan to discuss in a future post. This EBP value provides the foundation for walking the stack further back – EBPs are chained in the sense that each saved EBP points to the caller’s saved EBP at a higher address on the stack, which points to the even earlier saved EBP, and so on. (Refresh your memory on how an x86 stack is laid out.)
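To make the chaining concrete, here is a minimal sketch in Python of a toy stack. It assumes every function uses the standard `push ebp; mov ebp, esp` prologue (no FPO). The addresses, the `call` and `walk` helpers, and the dictionary-as-memory model are all made up for illustration and have nothing to do with the actual dump.

```python
# Toy model of an x86 stack: a dict mapping addresses to 32-bit values.
# All addresses and helper names are hypothetical, chosen for illustration only.

def call(memory, regs, return_address):
    """Model a `call` followed by the prologue `push ebp; mov ebp, esp`."""
    regs["esp"] -= 4
    memory[regs["esp"]] = return_address  # pushed by the call instruction
    regs["esp"] -= 4
    memory[regs["esp"]] = regs["ebp"]     # push ebp: this is the saved EBP
    regs["ebp"] = regs["esp"]             # mov ebp, esp


def walk(memory, ebp):
    """Follow the saved-EBP links, yielding (frame_ebp, return_address)."""
    while ebp in memory and ebp + 4 in memory:
        yield ebp, memory[ebp + 4]        # return address sits right above saved EBP
        ebp = memory[ebp]                 # hop to the caller's frame


memory = {}
regs = {"esp": 0x1000, "ebp": 0}
for ret in (0x401000, 0x401050, 0x4010A0):  # three nested calls
    call(memory, regs, ret)

frames = list(walk(memory, regs["ebp"]))
for frame_ebp, ret in frames:
    print(f"{frame_ebp:08x} {ret:08x}")
```

The walk starts at the innermost frame and stops when a saved EBP points outside the modeled memory, which mirrors what happens at the bottom of a real stack or at the edge of a mini dump.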

Here’s the raw stack contents from ESP (this would be a good time to set up the symbol path to include the folder which contains BatteryMeter.pdb):

0:000> dds ESP
002af1a8  00000000
002af1ac  002af120
002af1b0  00000000
002af1b4  014cfe90
002af1b8  002af0fc
002af1bc  742fd594 uxtheme!StreamInit+0x36
002af1c0  002af180
002af1c4  01850815
002af1c8  0000029e
002af1cc  00000000
002af1d0  00000000
002af1d4  737990fa
002af1d8  002af210
002af1dc  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]
002af1e0  00000004
002af1e4  77dbc290 mfc100u!AfxDlgProc […\dlgcore.cpp @ 22]
002af1e8  00000000
002af1ec  002af284
002af1f0  00000001
002af1f4  00a24a74
002af1f8  00a5ec90
002af1fc  00a24cf0
002af200  002af228
002af204  002af198
002af208  002af234
002af20c  73799332
002af210  002af248
002af214  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]

First of all, it’s nice to see that ESP points into a memory area that is included in the dump – which means we are looking at the stack. There are several things here that might be return addresses – and the values stored immediately before them are saved-EBP candidates. To eliminate candidates, we can peek at the memory location each one points to – if it’s on the stack, the candidate is viable.
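A cheap pre-filter can also be sketched in Python. This is not something WinDbg does for you, and the heuristic is my own assumption rather than part of the walkthrough: since the stack grows down, a saved EBP should be 4-byte aligned and point to a slightly higher address than the slot it is stored in. The rows are transcribed from the dds output above (source-line information dropped); `saved_ebp_candidates` and `max_span` are hypothetical names.

```python
# (address, value, symbol) rows transcribed from the dds output above.
# symbol is None where dds resolved nothing for the value.
rows = [
    (0x002AF1B8, 0x002AF0FC, None),
    (0x002AF1BC, 0x742FD594, "uxtheme!StreamInit+0x36"),
    (0x002AF1D8, 0x002AF210, None),
    (0x002AF1DC, 0x013719BE, "BatteryMeter!RecurseDeep+0x4e"),
    (0x002AF1E0, 0x00000004, None),
    (0x002AF1E4, 0x77DBC290, "mfc100u!AfxDlgProc"),
    (0x002AF210, 0x002AF248, None),
    (0x002AF214, 0x013719BE, "BatteryMeter!RecurseDeep+0x4e"),
]


def saved_ebp_candidates(rows, max_span=0x10000):
    """Return (slot, saved_ebp, symbol) for plausible saved-EBP/return-address pairs.

    Assumed heuristic: the value stored just below a symbolized address must be
    aligned and point upward, no more than max_span bytes above its own slot.
    """
    by_addr = {addr: value for addr, value, _ in rows}
    found = []
    for addr, _, symbol in rows:
        if symbol is None:
            continue                      # not a return-address candidate
        slot = addr - 4                   # where the saved EBP would live
        saved = by_addr.get(slot)
        if saved is None:
            continue
        if saved % 4 == 0 and slot < saved <= slot + max_span:
            found.append((slot, saved, symbol))
    return found


candidates = saved_ebp_candidates(rows)
for slot, saved, symbol in candidates:
    print(f"{slot:08x} -> {saved:08x}  {symbol}")
```

A pair that survives this filter is only plausible, not confirmed; reading the memory the candidate points to is still the real test.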

0:000> dd 002af0fc L1
002af0fc  ????????
0:000> dd 002af210 L1
002af210  002af248

The first attempt failed, but the second attempt succeeded – we might have a saved EBP on our hands. We can now proceed with manual reconstruction – the saved EBP points to another EBP, and immediately following it we should find another return address. Repeat several times to see if it makes sense:

0:000> dds 002af210 L2
002af210  002af248
002af214  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]
0:000> dds 002af248 L2
002af248  002af280
002af24c  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]
0:000> dds 002af280 L2
002af280  002af2b8
002af284  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]
0:000> dds 002af2b8 L2
002af2b8  002af2f0
002af2bc  013719be BatteryMeter!RecurseDeep+0x4e […\batterymeterdlg.cpp @ 135]
0:000> dds 002af2f0 L2
002af2f0  002af304
002af2f4  013719f7 BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged+0x27 […\batterymeterdlg.cpp @ 142]
0:000> dds 002af304 L2
002af304  002af318
002af308  77d92c8c mfc100u!_AfxDispatchCmdMsg+0x58 […\cmdtarg.cpp @ 112]

We could keep doing this for a while – reconstructing the stack (as long as we don’t run into an FPO frame) until we hit the bottom. So far we have the RecurseDeep function calling itself at least four times before we hit the stack corruption.
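The manual steps above can be sketched as a small loop. The memory contents below are copied from the six dds commands; everything else (the `walk_checked` name, the dictionary model, and the rule that each saved EBP must be strictly higher than the previous one) is an assumption I added so that the loop cannot cycle forever on a corrupted chain.

```python
# Stack slots copied from the dds output above: address -> 32-bit value.
memory = {
    0x002AF210: 0x002AF248, 0x002AF214: 0x013719BE,
    0x002AF248: 0x002AF280, 0x002AF24C: 0x013719BE,
    0x002AF280: 0x002AF2B8, 0x002AF284: 0x013719BE,
    0x002AF2B8: 0x002AF2F0, 0x002AF2BC: 0x013719BE,
    0x002AF2F0: 0x002AF304, 0x002AF2F4: 0x013719F7,
    0x002AF304: 0x002AF318, 0x002AF308: 0x77D92C8C,
}

RECURSE_DEEP_RET = 0x013719BE  # BatteryMeter!RecurseDeep+0x4e


def walk_checked(memory, ebp):
    """Follow saved-EBP links while they stay readable and keep moving upward."""
    frames = []
    while ebp in memory and ebp + 4 in memory:
        frames.append((ebp, memory[ebp + 4]))
        next_ebp = memory[ebp]
        if next_ebp <= ebp:   # assumed sanity rule: callers live at higher addresses
            break
        ebp = next_ebp
    return frames


frames = walk_checked(memory, 0x002AF210)
recursive_calls = sum(1 for _, ret in frames if ret == RECURSE_DEEP_RET)

for frame_ebp, ret in frames:
    print(f"{frame_ebp:08x} {ret:08x}")
print("returns into RecurseDeep:", recursive_calls)
```

The walk ends at 002af304 only because the transcription ends there; against the full dump, the same loop would keep going until it reached unreadable memory or an FPO frame.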

There is also a WinDbg command that can perform this reconstruction for us – we only need to give it a guess for EBP, ESP, and EIP – and it constructs a plausible call stack. Our EBP guess can be the first saved EBP we found on the stack, our EIP guess can be the return address immediately following it, and our ESP guess can be the same as EBP, producing the following output:

0:000> k = 002af210 002af210 013719be
ChildEBP RetAddr 
002af210 013719be BatteryMeter!RecurseDeep+0x4e
002af248 013719be BatteryMeter!RecurseDeep+0x4e
002af280 013719be BatteryMeter!RecurseDeep+0x4e
002af2b8 013719be BatteryMeter!RecurseDeep+0x4e
002af2f0 013719f7 BatteryMeter!RecurseDeep+0x4e
002af304 77d92c8c BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged+0x27
002af318 77d92e51 mfc100u!_AfxDispatchCmdMsg+0x58
002af334 77dc6d36 mfc100u!CCmdTarget::OnCmdMsg+0x124
002af358 77e1c4cb mfc100u!CPropertySheet::OnCmdMsg+0x1d
002af388 77e1bc7f mfc100u!CWnd::OnNotify+0x7b
002af454 002af478 mfc100u!CWnd::OnWndMsg+0x9e
… source information and the rest of the stack snipped for brevity

We have turned an impossible problem with very little information into a pretty decent call stack that points to the likely culprit for the stack corruption. Inspecting the sources for BatteryMeter!RecurseDeep drives the point home – the function corrupts the stack, but does so in a sneaky fashion. Instead of corrupting its own frame, it goes back several frames earlier on the stack and overwrites a small memory region with zeroes.
