All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

Improving Cold Startup Performance (Prefetch)

A few months ago, Alon, Dima and I visited a customer whose client application was experiencing very slow cold startup times. In this post (and possibly another) I will look into some of the things we found out in the process. Some of the details have been slightly changed to protect the innocent, but the general findings might be useful nonetheless.

Elsewhere there is some good reading on cold startup performance in general. First there’s Vance Morrison’s theoretical model. Claudio Caldato’s MSDN Magazine article is geared more toward managed apps, but is quite useful in general.

When transitioning to the latest release of the application (lots of code added, of course, including managed code incorporated into a previously purely unmanaged application), the startup times went ballistic: from approximately 5-15 seconds with the old version to 30+ seconds with the new one. Now, a delay of 10 seconds is not very pleasant, but users kind of learn to live with it, especially when launching the app immediately after logging on. 30 seconds, on the other hand (which could be over a minute on slow machines), is intolerable.

We looked at the obvious things first, trying to understand what the application was doing during these dreaded 30 seconds. We installed the application on multiple hardware combinations, including natively on a Vista laptop and inside an XP virtual machine on another box.

Sidenote: Troubleshooting cold startup performance sucks because to reproduce the scenario you have to reboot the box, and preferably the first thing to do after log on is to launch the app, so you get little time to set up tools for diagnosis.

First we looked at the application in Process Explorer, and saw that it was loading over 300 different DLLs. Its total memory consumption after completing the startup sequence was around 250MB, most of it code. Seeing that most of these DLLs were signed, we immediately suspected that a large portion of the startup time was spent loading these DLLs from disk (which would explain why cold startup times are much worse than warm ones) and verifying their digital signatures. Most of these DLLs were bound at load time, with only some managed assemblies being loaded lazily.

To test this hypothesis, we wrote a small loader application that attempted to load the same DLLs in the same order (to obtain the load order we used Dependency Walker with its profiling option turned on). It took approximately 10 seconds to load them on a machine where the startup time was around 25 seconds, so we were fairly pleased with the result. Next, we tried launching the application after letting the loader app do its magic, and witnessed inconsistent behavior. In the VM, startup times went down proportionally to the time it took the loader app to load DLLs from disk; in the native installation, there was only an insignificant decrease in startup time.
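
The loader experiment can be approximated in a few lines of Python. This is a minimal sketch and not the tool we actually wrote: `preload` and its `load` parameter are hypothetical names, the list of paths is assumed to come from a profiling run such as the Dependency Walker one mentioned above, and `ctypes.CDLL` stands in for whatever loading call a real loader would use.

```python
import ctypes
import time


def preload(paths, load=ctypes.CDLL):
    """Load each library in the given order and time every load.

    Returns a list of (path, seconds, error) tuples, where error is None
    on success. The `load` callable is injectable so the timing logic can
    be exercised without real DLLs (an assumption made for this sketch).
    """
    results = []
    for path in paths:
        start = time.perf_counter()
        try:
            load(path)          # pulls the image (and its pages) in from disk
            error = None
        except OSError as exc:  # missing file, bad image, unresolved import...
            error = str(exc)
        elapsed = time.perf_counter() - start
        results.append((path, elapsed, error))
    return results


def total_seconds(results):
    """Sum the load times reported by preload()."""
    return sum(seconds for _, seconds, _ in results)
```

Running `preload` on the recorded load order right after a reboot, and then looking at `total_seconds`, gives a rough figure for how much of the cold startup is plain DLL loading. Failed loads are recorded rather than raised, so one missing DLL does not abort the measurement.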

Seeing that pre-fetching the application’s DLLs might have a positive effect on cold startup, we tried experimenting with the Windows prefetcher settings. As you might already know, Windows XP has a built-in prefetching mechanism which observes the first few seconds of your application’s startup, records whatever I/O requests you are making, and on subsequent invocations of your application tries to issue these I/O requests ahead of time, while your app is not utilizing the I/O path. (On Vista, by the way, this mechanism was supplemented by a newer one called SuperFetch, which is smarter because it tries to prefetch data even if no application needs it at the time.)

Dima found the registry settings for the Windows XP prefetcher and modified them so that the prefetcher records more than just the first few seconds of the application’s startup. This improved cold startup performance in a fairly consistent fashion, shaving off several seconds of the startup time even on relatively fast boxes.

The following are the old and new versions of the prefetcher settings in the registry. As always when tinkering with registry settings, bear in mind that “It works on my machine” and that you’re doing it at your own risk. (And also bear in mind that these are the XP settings; on Vista or Windows 7 things could be completely different.)

Before:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters]
"VideoInitTime"=dword:0000005a
"EnablePrefetcher"=dword:00000003
"AppLaunchMaxNumPages"=dword:00000fa0
"AppLaunchMaxNumSections"=dword:000000aa
"AppLaunchTimerPeriod"=hex:80,69,67,ff,ff,ff,ff,ff
"BootMaxNumPages"=dword:0001f400
"BootMaxNumSections"=dword:00000ff0
"BootTimerPeriod"=hex:00,f2,d8,f8,ff,ff,ff,ff
"MaxNumActiveTraces"=dword:00000008
"MaxNumSavedTraces"=dword:00000008
"RootDirPath"="Prefetch"
"HostingAppList"="DLLHOST.EXE,MMC.EXE,RUNDLL32.EXE"

After:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters]
"VideoInitTime"=dword:00000064
"EnablePrefetcher"=dword:00000003
"AppLaunchMaxNumPages"=dword:00009c40
"AppLaunchMaxNumSections"=dword:000006a4
"AppLaunchTimerPeriod"=hex:00,79,6c,fc,ff,ff,ff,ff
"BootMaxNumPages"=dword:0001f400
"BootMaxNumSections"=dword:00000ff0
"BootTimerPeriod"=hex:00,f2,d8,f8,ff,ff,ff,ff
"MaxNumActiveTraces"=dword:00000008
"MaxNumSavedTraces"=dword:00000008
"RootDirPath"="Prefetch"
"HostingAppList"="DLLHOST.EXE,MMC.EXE,RUNDLL32.EXE"

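The differences between the two exports are easier to see once the values are decoded. The sketch below is only an illustration: the helper names are made up, each `hex:` value is assumed to be a little-endian signed 64-bit integer, and reading it as a negative (relative) interval in 100-nanosecond units is an assumption based on the usual Windows kernel timer convention, not something documented for these settings.

```python
import struct


def decode_timer_period(reg_hex):
    """Decode 8 comma-separated hex bytes from a .reg 'hex:' value.

    Assumption: the bytes form a little-endian signed 64-bit integer.
    """
    raw = bytes(int(byte, 16) for byte in reg_hex.split(","))
    (value,) = struct.unpack("<q", raw)
    return value


def to_seconds(ticks):
    """Assumption: ticks are 100 ns units and the sign marks a relative time."""
    return abs(ticks) / 10_000_000


def changed(before, after):
    """Return {name: (old, new)} for every setting whose value differs."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


# Values copied from the two registry exports above.
BEFORE = {
    "VideoInitTime": 0x5A,
    "EnablePrefetcher": 0x3,
    "AppLaunchMaxNumPages": 0xFA0,
    "AppLaunchMaxNumSections": 0xAA,
    "AppLaunchTimerPeriod": decode_timer_period("80,69,67,ff,ff,ff,ff,ff"),
}
AFTER = {
    "VideoInitTime": 0x64,
    "EnablePrefetcher": 0x3,
    "AppLaunchMaxNumPages": 0x9C40,
    "AppLaunchMaxNumSections": 0x6A4,
    "AppLaunchTimerPeriod": decode_timer_period("00,79,6c,fc,ff,ff,ff,ff"),
}

for name, (old, new) in changed(BEFORE, AFTER).items():
    print(f"{name}: {old} -> {new}")
```

In decimal, the page limit goes from 4,000 to 40,000 and the section limit from 170 to 1,700, while `VideoInitTime` goes from 90 to 100. The timer value goes from -10,000,000 to -60,000,000, which under the 100-nanosecond assumption would be 1 second versus 6 seconds.
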
Comments

Klaus Brockmann said:

Hello Sasha,

application startup performance is always a great issue, so I appreciate your work.

Can you give me a hint where I can find more information about the registry values you changed, especially "VideoInitTime"?

Thanks,

Klaus

# July 24, 2009 9:52 AM

Alois Kraus said:

Hi Sasha,

I just tried these settings on my (XP) machine. The effect was that locking the screen and then trying to unlock it resulted in a deadlock (or at least it took several minutes). A kernel dump revealed that there was some problem with prefetching. So I rolled back to normal.

It may be faster but I do not want to reboot my machine every time I try to unlock it.

Yours,

 Alois Kraus

# July 28, 2009 11:02 AM

Indrani said:

Hi Sasha,

I searched the whole MSDN library, the Windows Internals book by Mark Russinovich and all other online resources but the recipe and workings of Microsoft's secret prefetcher sauce are unknown to the general public. If you can fish out some documentation on the following it would be great (or put me in touch with the authority on the subject).....

1) Meanings and the functions of the Registry keys under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters

2) How big can a .pf file swell into? I mean how many of those associated dll, exe entries can a .pf file hold? The Logical prefetcher chapter in the Windows internals book by Mark Russinovich does not provide information about any caps or size limit on each single .pf file. Also, if there is a limit is this limit applicable to the boot prefetch file (NTOSBOOT-B00DFAAD.PF)?

3) How does the prefetcher react to files which are listed in the boot prefetch file NTOSBOOT-B00DFAAD.PF but then are deleted? They continue to reside there. Is there any purging mechanism for this....

Thanks in Advance

Indrani

# August 27, 2009 9:02 AM
