Large Win32 Heap Allocations Go Directly to VirtualAlloc

October 23, 2015


Among other things, the Windows heap manager was designed to spare you the overhead of allocating virtual memory directly with VirtualAlloc. If you only need a 20-byte object, it’s a waste to call a system service (involving a user-kernel transition) and allocate a full page. The heap manager, which is implemented in ntdll.dll, avoids that overhead by managing large blocks of virtual memory in user mode.

However, when you allocate particularly large blocks of memory (>= 512KB at the time of writing), the heap manager doesn’t see a reason to interfere, so it just forwards your request to VirtualAlloc. It still knows about your allocation by virtue of additional heap headers, but a lot of debugging and profiling features associated with heap allocations, such as allocation stack trace collection, will not work for these large allocations.
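One way to observe this from inside a process, without a debugger, is to allocate blocks of various sizes from the process heap and ask VirtualQuery which virtual memory region each block lives in. The following is a minimal sketch using Python’s ctypes and only runs its probe on Windows. The helper name `probe` and the list of sizes are illustrative choices, not anything from the heap manager itself. One would expect small blocks to report the base of a shared heap segment and large blocks to each report a region of their own, but the exact cutoff may vary with the Windows version and process bitness.

```python
import ctypes
import sys

KB = 1024


class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    # Mirrors the Win32 MEMORY_BASIC_INFORMATION layout. On 64-bit Windows
    # the PartitionId field falls into the padding after AllocationProtect.
    _fields_ = [
        ("BaseAddress", ctypes.c_void_p),
        ("AllocationBase", ctypes.c_void_p),
        ("AllocationProtect", ctypes.c_uint32),
        ("RegionSize", ctypes.c_size_t),
        ("State", ctypes.c_uint32),
        ("Protect", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
    ]


def probe(sizes):
    """Allocate each size from the process heap and record its region base.

    Hypothetical helper for illustration; Windows only.
    """
    k32 = ctypes.WinDLL("kernel32")
    k32.GetProcessHeap.restype = ctypes.c_void_p
    k32.HeapAlloc.restype = ctypes.c_void_p
    k32.HeapAlloc.argtypes = [ctypes.c_void_p, ctypes.c_uint32, ctypes.c_size_t]
    k32.HeapFree.argtypes = [ctypes.c_void_p, ctypes.c_uint32, ctypes.c_void_p]
    k32.VirtualQuery.restype = ctypes.c_size_t
    k32.VirtualQuery.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(MEMORY_BASIC_INFORMATION),
        ctypes.c_size_t,
    ]

    heap = k32.GetProcessHeap()
    rows = []
    for size in sizes:
        ptr = k32.HeapAlloc(heap, 0, size)
        if not ptr:
            raise MemoryError(f"HeapAlloc failed for {size} bytes")
        mbi = MEMORY_BASIC_INFORMATION()
        # AllocationBase is the start of the reservation containing ptr.
        if k32.VirtualQuery(ptr, ctypes.byref(mbi), ctypes.sizeof(mbi)):
            rows.append((size, ptr, mbi.AllocationBase))
        k32.HeapFree(heap, 0, ptr)
    return rows


if __name__ == "__main__" and sys.platform == "win32":
    for size, ptr, base in probe([64, 256 * KB, 512 * KB, 1024 * KB, 2048 * KB]):
        print(f"{size:>8} bytes  block {ptr:#x}  region base {base:#x}  "
              f"offset {ptr - base:#x}")
```

A small offset from the region base suggests the block sits right behind a header in its own reservation, which is the pattern the Virtual Alloc List in the debugger reflects.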

Here’s what it looks like in an x86 process on Windows 10 (I allocated chunks of 512KB, 1MB, 2MB, 4MB, 8MB and 16MB):

0:004> !heap -h 00fc0000 
Index   Address  Name      Debugging options enabled
  1:   00fc0000 
...
    Virtual Alloc List:   00fc009c
        03c8d000: 00080000 [commited 81000, unused 1000] - busy (b), tail fill
        03d16000: 00100000 [commited 101000, unused 1000] - busy (b), tail fill
        03e2b000: 00200000 [commited 201000, unused 1000] - busy (b), tail fill
        0403f000: 00400000 [commited 401000, unused 1000] - busy (b), tail fill
        0445a000: 00800000 [commited 801000, unused 1000] - busy (b), tail fill
        04c65000: 01000000 [commited 1001000, unused 1000] - busy (b), tail fill
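The numbers in these entries are hexadecimal, so the sizes are not obvious at a glance. The sketch below parses the six entries copied from the output above and converts them to kilobytes. It assumes that the first number after the address is the block size and that the bracketed values are the committed and unused byte counts, which is how the labels read; the `parse` helper and the regular expression are illustrative.

```python
import re

# Entries copied verbatim from the "Virtual Alloc List" output above.
ENTRIES = """\
03c8d000: 00080000 [commited 81000, unused 1000] - busy (b), tail fill
03d16000: 00100000 [commited 101000, unused 1000] - busy (b), tail fill
03e2b000: 00200000 [commited 201000, unused 1000] - busy (b), tail fill
0403f000: 00400000 [commited 401000, unused 1000] - busy (b), tail fill
0445a000: 00800000 [commited 801000, unused 1000] - busy (b), tail fill
04c65000: 01000000 [commited 1001000, unused 1000] - busy (b), tail fill
"""

# "commited" is spelled the way the debugger prints it.
LINE = re.compile(
    r"([0-9a-f]+): ([0-9a-f]+) \[commited ([0-9a-f]+), unused ([0-9a-f]+)\]"
)


def parse(text):
    """Return (address, size, committed, unused) tuples as integers."""
    rows = []
    for match in LINE.finditer(text):
        rows.append(tuple(int(group, 16) for group in match.groups()))
    return rows


for addr, size, committed, unused in parse(ENTRIES):
    print(f"{addr:08x}: {size // 1024:>6} KB, "
          f"committed - size = {committed - size:#x}")
```

The sizes come out as 512, 1024, 2048, 4096, 8192 and 16384 KB, and in every entry the committed figure exceeds the size by 0x1000 bytes, which is one 4KB page.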

The “Virtual Alloc List” is a list of blocks that were directly allocated from VirtualAlloc, bypassing the standard free list management algorithms. Even though the user mode stack trace database is enabled, I can’t get an allocation stack for these blocks:

0:004> !gflag
Current NtGlobalFlag contents: 0x00001070
    htc - Enable heap tail checking
    hfc - Enable heap free checking
    hpc - Enable heap parameter checking
    ust - Create user mode stack trace database
0:004> !heap -p -a 03e2b000

(The last command didn’t have any output.)

You can still get some poor man’s allocation tracing by simply putting breakpoints in the right places, or by using my tracer extension. For example, if you know that you’re chasing 512KB allocations, set the following breakpoints in ntdll!NtAllocateVirtualMemory:

0:000> uf ntdll!NtAllocateVirtualMemory
ntdll!NtAllocateVirtualMemory:
77b68d40 b818000000      mov     eax,18h
77b68d45 bab0d5b777      mov     edx,offset ntdll!Wow64SystemServiceCall (77b7d5b0)
77b68d4a ffd2            call    edx
77b68d4c c21800          ret     18h
0:000> bp ntdll!NtAllocateVirtualMemory "r $t0 = dwo(poi(@esp+0n16)); .printf \"allocating %d bytes\", @$t0; .echo ----; k; gc"
0:000> bp 77b68d4c "r $t0 = poi(poi(@esp+8)); .printf \"allocated block %x\", @$t0; .echo ----; gc"

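The first breakpoint prints the allocation size and the allocating call stack; the second, set on the function’s ret instruction, prints the address of the allocated block. Neither breakpoint filters by size, so look for the size you’re chasing in the output. The address 77b68d4c is specific to this debugging session; use the ret address from your own uf output.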
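The stack offsets in those breakpoint commands follow from the documented parameter order of NtAllocateVirtualMemory and the 32-bit calling convention, where every argument occupies a 4-byte stack slot and [esp] holds the return address on entry. Here is a short sketch of that arithmetic. The helper names are made up for illustration, and it assumes esp has the same value at the ret instruction as at function entry, which is what the second breakpoint relies on.

```python
# Documented parameter order of NtAllocateVirtualMemory.
PARAMS = [
    "ProcessHandle",
    "BaseAddress",     # pointer to the base address (in/out)
    "ZeroBits",
    "RegionSize",      # pointer to the region size (in/out)
    "AllocationType",
    "Protect",
]

SLOT = 4  # bytes per stack slot in a 32-bit process


def esp_offset(name):
    """Offset of a parameter from esp at function entry.

    [esp] holds the return address, so the first argument is at esp+4.
    """
    return SLOT * (1 + PARAMS.index(name))


def callee_cleanup_bytes():
    """Bytes of arguments the callee pops with its 'ret N' instruction."""
    return SLOT * len(PARAMS)


for name in PARAMS:
    print(f"esp+{esp_offset(name):#04x}  {name}")
print(f"ret {callee_cleanup_bytes():#x}")
```

RegionSize lands at esp+16 and BaseAddress at esp+8, matching the `@esp+0n16` and `@esp+8` expressions, and six 4-byte arguments account for the `ret 18h` in the disassembly. Because both parameters are pointers, each breakpoint dereferences twice: once to read the pointer off the stack and once to read the value it points to.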

Yet another option is using VMMap’s allocation tracing, which I covered in the past. It’s a bit more intrusive, but also a lot easier to set up and view results.

2 comments

  1. David Bar, December 24, 2015 at 12:22 AM

    Hi. Thanks for the articles (this one and the VMMap one). Great stuff.
    On Linux it’s possible to change the threshold above which the memory allocator passes the allocation request to the OS (using mmap). See M_MMAP_THRESHOLD.
    I’ve been searching very hard, and can’t seem to find an equivalent for Microsoft’s default memory allocator.
    Is there any way to change the threshold value?

    I’m asking because I have a Windows application (64-bit) that does frequent 1.5MB memory allocations, and I think I might gain some performance from avoiding the calls to the OS, the zeroing of the pages, the minor page faults, and those goodies…

    1. Sasha Goldshtein, January 2, 2016 at 6:19 PM

      No, there is no way to change the threshold value that I’m aware of.
