Diagnosing Native Memory Leaks with ETW and WPA

December 2, 2014


As a follow-up to my previous post on native memory leaks, here’s a quick walkthrough for diagnosing memory leaks using Event Tracing for Windows (ETW). The process is fairly simple. The Windows heap manager is instrumented with ETW events for each memory allocation and deallocation. If you capture those over a period of time (while your application is leaking memory), you can get a nice report of which blocks were allocated during the trace period and haven’t been freed. If you also ask ETW to capture the call stack for allocation events, you can see where the application is allocating the memory that isn’t being freed.
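To make the idea concrete, here is a minimal sketch of the bookkeeping such a report implies. It is written in Python purely for illustration and is not what WPA runs internally. The event tuples, the frame names, and the two helper functions are hypothetical; real heap events carry many more fields.

```python
from collections import defaultdict

def outstanding_allocations(events):
    """Return blocks allocated during the trace that were never freed.

    `events` is an iterable of (kind, address, size, stack) tuples in
    trace order, where kind is "alloc" or "free". This layout is a
    hypothetical simplification of the heap events ETW records.
    """
    live = {}  # address -> (size, stack) for blocks still allocated
    for kind, address, size, stack in events:
        if kind == "alloc":
            live[address] = (size, stack)
        elif kind == "free":
            # A free for a block allocated before the trace started
            # has no matching entry, so it is simply ignored.
            live.pop(address, None)
    return live

def summarize_by_stack(live):
    """Group outstanding blocks by call stack: stack -> [count, bytes]."""
    summary = defaultdict(lambda: [0, 0])
    for size, stack in live.values():
        summary[stack][0] += 1
        summary[stack][1] += size
    return dict(summary)

# Made-up events: one block is freed, two from the same stack are not.
events = [
    ("alloc", 0x1000, 64, ("main", "LoadConfig")),
    ("alloc", 0x2000, 4096, ("main", "UpdaterThread")),
    ("free", 0x1000, 0, ()),
    ("alloc", 0x3000, 4096, ("main", "UpdaterThread")),
]
leaks = summarize_by_stack(outstanding_allocations(events))
print(leaks)
```

The stack that keeps showing up with a growing count and size is the one worth looking at first.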

So, here is the detailed workflow (I’m copying this almost verbatim from our C++ Debugging course, where there is a detailed lab covering leak detection).

First, run the leaking application, and only then start tracing. By attaching to an already running process, you’re also getting rid of any false positives you’d see from global allocations made during startup that remain in memory until the application shuts down. (If you desperately need to profile startup allocations as well, you can use the -PidNewProcess flag instead of the -Pids flag.)

Now, open an administrator command prompt, navigate to the Windows Performance Toolkit installation directory, and run the following commands to create a kernel session and a heap tracing session for the specific process:

xperf -on PROC_THREAD+LOADER
xperf -start heapsession -heap -pids 1234 -stackwalk HeapAlloc+HeapRealloc

Now, xperf is capturing allocations (with call stacks) for the process with ID 1234. After a while, when you’re convinced the leak has manifested itself, stop recording and merge the recorded traces using the following commands:

xperf -stop heapsession -d C:\temp\heap.etl
xperf -d C:\temp\kernel.etl
xperf -merge C:\temp\heap.etl C:\temp\kernel.etl C:\temp\combined.etl
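If you find yourself repeating this for different processes, the command lines can be assembled by a small script. The following is only a sketch: the function names are hypothetical, and the kernel flags passed to -on are an assumption, so adjust them to whatever kernel session you actually start. Everything else mirrors the commands in this post.

```python
import ntpath  # Windows path rules, wherever this script is read or run

def start_commands(pid, session="heapsession"):
    """Commands that start the kernel session and the heap session."""
    return [
        # Assumption: the kernel flags are illustrative; use whatever
        # kernel session you actually start.
        ["xperf", "-on", "PROC_THREAD+LOADER"],
        ["xperf", "-start", session, "-heap", "-pids", str(pid),
         "-stackwalk", "HeapAlloc+HeapRealloc"],
    ]

def stop_commands(out_dir, session="heapsession"):
    """Commands that stop both sessions and merge the two traces."""
    heap = ntpath.join(out_dir, "heap.etl")
    kernel = ntpath.join(out_dir, "kernel.etl")
    combined = ntpath.join(out_dir, "combined.etl")
    return [
        ["xperf", "-stop", session, "-d", heap],
        ["xperf", "-d", kernel],
        ["xperf", "-merge", heap, kernel, combined],
    ]

# Print the command lines instead of running them.
for command in start_commands(1234) + stop_commands(r"C:\temp"):
    print(" ".join(command))
```

To actually execute them, each list can be handed to subprocess.run(command, check=True) from an elevated prompt, with a pause between the start and stop halves while the leak reproduces.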

Now you’re good to go: open the combined recording in WPA (just double-click the combined.etl file). You need one of the memory graphs, e.g. Heap Allocations | Outstanding Size by Process and Handle.

WPA with the process and its heaps' growth over time

Pick the heap that’s growing and filter by that heap. In the summary table, put the Process, Type, and Stack columns to the left of the gold grouping bar, and put the Count and Size columns to the right.

Now, load symbols (Trace > Load Symbols) and grab a few cups of coffee. Maybe have lunch and visit some old friends. You should even have time for a quick romantic getaway, perhaps over a weekend or something. When you’re back: the allocation type you are looking for is AIFO, which stands for Allocated Inside Freed Outside. These are the allocations made during the recording of the trace that hadn’t been freed by the end of the recording. If you picked the right timespan to record, these are your leaking allocations.

WPA with the proper column configuration
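The naming of the Type column is easier to remember once you see it as two independent yes/no questions. Here is a small sketch of that classification; the function and its parameters are hypothetical, and it assumes the other labels (AIFI, AOFI, AOFO) follow the same Allocated/Freed, Inside/Outside pattern as AIFO.

```python
def allocation_type(alloc_time, free_time, start, end):
    """Classify a block relative to the time window [start, end].

    alloc_time is None when the block was allocated before the window,
    and free_time is None when no free was seen inside the window.
    Times are in arbitrary units; this is an illustration only.
    """
    allocated_inside = alloc_time is not None and start <= alloc_time <= end
    freed_inside = free_time is not None and start <= free_time <= end
    return ("AI" if allocated_inside else "AO") + ("FI" if freed_inside else "FO")

# Allocated during the trace and still alive at the end: a leak candidate.
print(allocation_type(alloc_time=5, free_time=None, start=0, end=10))
# Allocated and freed during the trace: not a leak.
print(allocation_type(alloc_time=5, free_time=8, start=0, end=10))
```

Only the first kind is interesting for leak hunting, which is why filtering down to AIFO removes most of the noise.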

Now, all that’s left is the simple matter of expanding the stack tree to see where you’re making these allocations. For example, in the following screenshot, you can see TemperatureAndBatteryUpdaterThread being responsible for more than 2,000 allocations totaling more than 8 MB.

WPA showing the allocation call stack for the leak
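If you are wondering how the numbers next to each frame in the stack tree come about, they are inclusive totals: every allocation is counted at each frame along its stack. Below is a rough sketch of that roll-up. The function names are hypothetical, and the frame names and sizes are made up for illustration rather than taken from the trace in the screenshot.

```python
from collections import defaultdict

def rollup(allocations):
    """Compute inclusive [count, bytes] for every stack prefix.

    `allocations` is an iterable of (stack, size) pairs, where stack is
    a tuple of frame names ordered from the root to the allocating frame.
    """
    totals = defaultdict(lambda: [0, 0])
    for stack, size in allocations:
        for depth in range(1, len(stack) + 1):
            node = totals[stack[:depth]]
            node[0] += 1
            node[1] += size
    return dict(totals)

def print_tree(totals):
    """Print the tree; sorted tuples list each parent before its children."""
    for path in sorted(totals):
        count, size = totals[path]
        indent = "  " * (len(path) - 1)
        print(f"{indent}{path[-1]}: {count} allocations, {size} bytes")

# Made-up sample data.
allocations = [
    (("ThreadStart", "TemperatureAndBatteryUpdaterThread", "operator new"), 4096),
    (("ThreadStart", "TemperatureAndBatteryUpdaterThread", "operator new"), 4096),
    (("ThreadStart", "UiThread", "operator new"), 128),
]
totals = rollup(allocations)
print_tree(totals)
```

Expanding a node in WPA corresponds to moving one level deeper in this tree, and the leak usually sits under the child that carries most of its parent's size.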

In closing, it might be worth mentioning that PerfView is also capable of recording heap allocation events. It’s even somewhat easier. Just run PerfView and, in either the Collect > Collect or the Collect > Run dialog, put the process name in the OS Heap Exe box or the process ID in the OS Heap Process box:

PerfView collection dialog with the native heap configuration applied

When the report is ready, you’ll have a NET OS Heap Alloc Stacks node that will produce a window similar to the last WPA screenshot above, which you can then filter and play around with.

PerfView report showing the allocation sources

Anyway, I’m not saying you shouldn’t be using Visual Studio 2015 for leak analysis, but you should be familiar with the alternatives. Plus, you can use ETW in a production environment, and even to collect traces on Windows RT devices, which can hardly be said about Visual Studio.

I am posting short links and updates on Twitter as well as on this blog. You can follow me: @goldshtn




  1. Alois Kraus, December 4, 2014 at 12:11 AM

    You should note that you need to add the CLR ETW providers to walk mixed-mode stacks. And on x64 you need Windows 8 to get call stacks through JITed (non-NGen'd) code. For more complex sporadic problems you can use PerfView to monitor your memory (e.g. the Private Bytes counter) to define a stop trigger while recording the heap data into a ring buffer. It also makes sense to add CPU profiling data to get an idea of what your application was up to besides allocating memory.
