The Windows heap manager was designed, among other things, to avoid the overhead of allocating virtual memory directly with VirtualAlloc. If you only need a 20-byte object, it’s a waste to call a system service (involving a user-kernel transition) and allocate a full page. The heap manager, which is implemented in ntdll.dll, avoids that overhead by managing large blocks of virtual memory in user mode.
However, when you allocate particularly large blocks of memory (>= 512KB at the time of writing), the heap manager doesn’t see a reason to interfere, so it just forwards your request to VirtualAlloc. It still knows about your allocation by virtue of additional heap headers, but a lot of debugging and profiling features associated with heap allocations, such as allocation stack trace collection, will not work for these large allocations.
Here’s what it looks like in an x86 process on Windows 10 (I allocated chunks of 512KB, 1MB, 2MB, 4MB, 8MB and 16MB):
    0:004> !heap -h 00fc0000
    Index   Address  Name      Debugging options enabled
      1:   00fc0000
    ...
        Virtual Alloc List:   00fc009c
            03c8d000: 00080000 [commited 81000, unused 1000] - busy (b), tail fill
            03d16000: 00100000 [commited 101000, unused 1000] - busy (b), tail fill
            03e2b000: 00200000 [commited 201000, unused 1000] - busy (b), tail fill
            0403f000: 00400000 [commited 401000, unused 1000] - busy (b), tail fill
            0445a000: 00800000 [commited 801000, unused 1000] - busy (b), tail fill
            04c65000: 01000000 [commited 1001000, unused 1000] - busy (b), tail fill
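If you’d rather not convert the hex sizes in the listing above by hand, here is a minimal Python sketch that parses the Virtual Alloc List entries and prints each block’s size in KB, along with how many more bytes were committed than the reported size. The regular expression is my own assumption about the line layout, based only on the six lines shown here, and the helper names (`parse_virtual_alloc_list`, `summarize`) are made up for illustration. Other debugger versions may format this output differently.

```python
import re

# Lines copied from the "Virtual Alloc List" section of the !heap output above.
VIRTUAL_ALLOC_LIST = """\
03c8d000: 00080000 [commited 81000, unused 1000] - busy (b), tail fill
03d16000: 00100000 [commited 101000, unused 1000] - busy (b), tail fill
03e2b000: 00200000 [commited 201000, unused 1000] - busy (b), tail fill
0403f000: 00400000 [commited 401000, unused 1000] - busy (b), tail fill
0445a000: 00800000 [commited 801000, unused 1000] - busy (b), tail fill
04c65000: 01000000 [commited 1001000, unused 1000] - busy (b), tail fill
"""

# Assumed layout: "<addr>: <size> [commited <n>, unused <n>]", all values in hex.
# "commited" (one "t") is the spelling the debugger itself prints.
ENTRY = re.compile(
    r"^(?P<addr>[0-9a-f]{8}): (?P<size>[0-9a-f]{8}) "
    r"\[commited (?P<committed>[0-9a-f]+), unused (?P<unused>[0-9a-f]+)\]"
)


def parse_virtual_alloc_list(text):
    """Return one dict of integers per recognized entry; skip other lines."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append({k: int(v, 16) for k, v in match.groupdict().items()})
    return entries


def summarize(entries):
    """Return (address, size in KB, committed minus size) for each entry."""
    return [(e["addr"], e["size"] // 1024, e["committed"] - e["size"]) for e in entries]


if __name__ == "__main__":
    for addr, size_kb, extra in summarize(parse_virtual_alloc_list(VIRTUAL_ALLOC_LIST)):
        print(f"{addr:08x}: {size_kb:>6} KB, {extra:#x} extra bytes committed")
```

For the six entries above, the sizes come out as 512, 1024, 2048, 4096, 8192 and 16384 KB, and every block has 0x1000 more bytes committed than its reported size.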
The “Virtual Alloc List” is a list of blocks that were directly allocated from VirtualAlloc, bypassing the standard free list management algorithms. Even though the user mode stack trace database is enabled, I can’t get an allocation stack for these blocks:
    0:004> !gflag
    Current NtGlobalFlag contents: 0x00001070
        htc - Enable heap tail checking
        hfc - Enable heap free checking
        hpc - Enable heap parameter checking
        ust - Create user mode stack trace database
    0:004> !heap -p -a 03e2b000
(The last command didn’t have any output.)
You can still get some poor man’s allocation tracing by simply putting breakpoints in the right places, or by using my tracer extension. For example, if you know that you’re chasing 512KB allocations, set the following breakpoints in ntdll!NtAllocateVirtualMemory:
    0:000> uf ntdll!NtAllocateVirtualMemory
    ntdll!NtAllocateVirtualMemory:
    77b68d40 b818000000      mov     eax,18h
    77b68d45 bab0d5b777      mov     edx,offset ntdll!Wow64SystemServiceCall (77b7d5b0)
    77b68d4a ffd2            call    edx
    77b68d4c c21800          ret     18h
    0:000> bp ntdll!NtAllocateVirtualMemory "r $t0 = dwo(poi(@esp+0n16)); .printf \"allocating %d bytes\", @$t0; .echo ----; k; gc"
    0:000> bp 77b68d4c "r $t0 = poi(poi(@esp+8)); .printf \"allocated block %x\", @$t0; .echo ----; gc"
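The first breakpoint, at the function’s entry, prints out the allocation size (read through the RegionSize pointer at esp+16) followed by the allocating call stack. The second breakpoint sits on the ret instruction, after the system call has returned, and prints out the allocated address (read through the BaseAddress pointer at esp+8). The address 77b68d4c is specific to this build of ntdll, so take it from your own uf output.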
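Once these breakpoints have been running for a while, the debugger log gets long. Here is a minimal Python sketch for post-processing it, assuming you saved the output to a text file (for example with .logopen). It relies only on the two .printf formats used in the breakpoints; everything else is an assumption. In particular, `SAMPLE_LOG` is a hand-written stand-in and not captured debugger output: its stack lines are simplified placeholders, and `app!hypothetical_caller` is a made-up frame. The sketch also assumes that each "allocating" line is followed by its own "allocated block" line, which may not hold if several threads allocate at the same time.

```python
import re

# The two message formats come from the .printf calls in the breakpoints.
ALLOCATING = re.compile(r"^allocating (\d+) bytes")
ALLOCATED = re.compile(r"^allocated block ([0-9a-fA-F]+)")

# Hand-written sample, NOT real debugger output; the stack lines are placeholders.
SAMPLE_LOG = """\
allocating 528384 bytes----
00 ntdll!NtAllocateVirtualMemory
01 app!hypothetical_caller
allocated block 3c80000----
allocating 4096 bytes----
00 ntdll!NtAllocateVirtualMemory
allocated block 1230000----
"""


def pair_allocations(log_text):
    """Pair each 'allocating' line with the next 'allocated block' line.

    Any non-empty lines in between are kept as the call stack.
    """
    records = []
    pending = None
    for line in log_text.splitlines():
        match = ALLOCATING.match(line)
        if match:
            pending = {"size": int(match.group(1)), "stack": [], "address": None}
            continue
        match = ALLOCATED.match(line)
        if match and pending is not None:
            pending["address"] = int(match.group(1), 16)
            records.append(pending)
            pending = None
            continue
        if pending is not None and line.strip():
            pending["stack"].append(line.strip())
    return records


def large_allocations(records, min_size=512 * 1024):
    """Keep only the records whose size is at least min_size bytes."""
    return [r for r in records if r["size"] >= min_size]


if __name__ == "__main__":
    for rec in large_allocations(pair_allocations(SAMPLE_LOG)):
        print(f"{rec['size']} bytes at {rec['address']:#x}")
        for frame in rec["stack"]:
            print("    " + frame)
```

I filter by a minimum size rather than an exact match because the size seen at this level may not equal the size you passed to the heap. The Virtual Alloc List earlier shows more bytes committed than the reported block size.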
Yet another option is using VMMap’s allocation tracing, which I covered in the past. It’s a bit more intrusive, but also a lot easier to set up and view results.