Recently I’ve been getting several questions on hardware acceleration. Some people run performance profiling tools and notice that, although the tools indicate their application is rendering in hardware, the application is still taxing the CPU. This may seem confusing, so I will try to give some background in this post.
Windows developers have been using the same display technologies for more than 15 years. A standard Windows application relies on two well-worn parts of the Windows operating system to create its user interface:
- User32 – provides the familiar Windows look and feel for elements such as windows, buttons, text boxes, and so on.
- GDI/GDI+ – provides drawing support for rendering shapes, text, and images at the cost of additional complexity (and often lackluster performance).
Over the years, both technologies have been refined, and the APIs that developers use to interact with them have changed dramatically. But behind the scenes the same parts of the Windows operating system are used. Newer frameworks simply deliver better interaction wrappers.
WPF changes this by rendering through DirectX instead of GDI/GDI+. It still relies on User32 for certain services, such as handling and routing input and sorting out which application owns which portion of screen real estate.
DirectX began as a cobbled-together, error-prone toolkit for creating games on the Windows platform. Its design mandate was speed, and so Microsoft worked closely with video card vendors to give DirectX the hardware acceleration needed for complex textures, special effects such as partial transparency, and three-dimensional graphics.
DirectX is now an integral part of Windows, with support for all modern video cards. However, the programming API for DirectX still reflects its roots as a game developer’s toolkit, and therefore is rarely used in traditional types of Windows applications.
The internal architecture of WPF has two rendering pipelines, hardware and software.
Hardware Rendering Pipeline: One of the most important factors in determining WPF performance is that it is render bound: the more pixels you have to render, the greater the performance cost. However, the more rendering that can be offloaded to the graphics processing unit (GPU), the more performance benefits you can gain.
Software Rendering Pipeline: The WPF software rendering pipeline is entirely CPU bound. WPF implements an optimized, fully featured software rasterizer. (Rasterization is the task of taking an image described in a vector graphics format and converting it into a raster image, made of pixels or dots, for output on a video display or printer, or for storage in a bitmap file format.) WPF falls back to software seamlessly any time application functionality cannot be rendered using the hardware rendering pipeline.
The biggest performance issue you will encounter when rendering in software mode is related to fill rate, which is defined as the number of pixels that you are rendering. If you are concerned about performance in software rendering mode, try to minimize the number of times a pixel is redrawn. For example, if you have an application with a blue background, which then renders a slightly transparent image over it, you will render all of the pixels in the application twice. As a result, it will take twice as long to render the application with the image than if you had only the blue background.
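To put rough numbers on that example, here is a small sketch of the arithmetic. The 1024 x 768 window size and the OverdrawEstimate helper are my own assumptions for illustration, and the sketch only counts pixel writes; it does not measure real rendering time.

```csharp
using System;

static class OverdrawEstimate
{
    // Hypothetical helper: each layer that covers the whole window
    // writes every pixel once, so the cost scales with the layer count.
    public static long PixelsWritten(int width, int height, int fullWindowLayers)
    {
        return (long)width * height * fullWindowLayers;
    }

    static void Main()
    {
        // Assumed window size, purely for illustration.
        long backgroundOnly = PixelsWritten(1024, 768, 1); // blue background only
        long withImage = PixelsWritten(1024, 768, 2);      // background + translucent image

        Console.WriteLine(backgroundOnly);                     // 786432
        Console.WriteLine(withImage);                          // 1572864
        Console.WriteLine((double)withImage / backgroundOnly); // 2
    }
}
```

An opaque image would not necessarily help in practice, but trimming layers that are fully hidden or only barely visible is the kind of change this arithmetic argues for.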
Forcing Software Rendering
Depending on the machine configuration and the application, software-based rendering is sometimes faster than hardware.
A new API was introduced in .NET 3.5 SP1 that allows developers to force software rendering per window instead of using the GPU. This gives developers a much better alternative to setting the global registry key, which affects all WPF applications.
The RenderMode enumeration may be either Default (use hardware rendering, if possible, otherwise use software rendering) or SoftwareOnly (use software rendering).
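Here is a minimal sketch of how a single window might opt into software rendering. It assumes the per-window setting is the RenderMode property on the window's HwndTarget (in System.Windows.Interop); the SoftwareRenderedWindow class name is hypothetical.

```csharp
using System;
using System.Windows;
using System.Windows.Interop;

// Hypothetical window class used only to illustrate the per-window setting.
public class SoftwareRenderedWindow : Window
{
    protected override void OnSourceInitialized(EventArgs e)
    {
        base.OnSourceInitialized(e);

        // The HwndSource is only available once the native window handle exists,
        // which is why this runs in OnSourceInitialized rather than the constructor.
        HwndSource source = PresentationSource.FromVisual(this) as HwndSource;
        if (source != null)
        {
            // Ask WPF to render this window in software, leaving other windows alone.
            source.CompositionTarget.RenderMode = RenderMode.SoftwareOnly;
        }
    }
}
```

Setting the value back to RenderMode.Default should return the window to the normal behavior of using hardware when it is available.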
Display Driver Model
WPF provides better performance under the Windows Vista operating system, where it can take advantage of the new Windows Vista Display Driver Model (WDDM). WDDM offers several important enhancements beyond the Windows XP Display Driver Model (XPDM). Most importantly, WDDM allows several GPU operations to be scheduled at once, and it allows video card memory to be paged to normal system memory if you exceed what’s available on the video card.
WPF offers some level of hardware acceleration to all WDDM (Windows Vista) drivers and to XPDM (Windows XP) drivers that were created after November 2004, which is when Microsoft released new driver development guidelines.
Level of Support
The goal of WPF is to offload as much of the work as possible to the video card so that complex graphics routines are render-bound (limited by the GPU) rather than processor-bound (limited by your computer’s CPU).
WPF is intelligent enough to use hardware optimizations where possible, but it has a software fallback for everything. So if you run a WPF application on a computer with a legacy video card, the interface will still appear the way you designed it. Of course, the software alternative may be much slower, so you’ll find that computers with older video cards won’t run rich WPF applications very well, especially ones that incorporate complex animations or other intense graphical effects. You might want to consider a design that allows your application to seamlessly switch features when running on different hardware, so that it can take full advantage of each different hardware configuration.
To achieve this, WPF provides functionality to determine the graphics capability of a system at runtime. Graphics capability is determined by categorizing the video card as one of three rendering capability tiers. WPF exposes an API that allows an application to query the rendering capability tier. Your application can then take different code paths at run time depending on the rendering tier supported by the hardware.
The features of the graphics hardware that most impact the rendering tier levels are:
- Video RAM – The amount of video memory on the graphics hardware determines the size and number of buffers that can be used for compositing graphics.
- Pixel Shader – A pixel shader is a graphics processing function that calculates effects on a per-pixel basis. Depending on the resolution of the displayed graphics, there could be several million pixels that need to be processed for each display frame.
- Vertex Shader – A vertex shader is a graphics processing function that performs mathematical operations on the vertex data of the object.
- Multitexture Support – Multitexture support refers to the ability to apply two or more distinct textures during a blending operation on a 3D graphics object. The degree of multitexture support is determined by the number of multitexture units on the graphics hardware.
The pixel shader, vertex shader, and multitexture features are used to define specific DirectX version levels, which, in turn, are used to define the different rendering tiers in WPF.
The features of the graphics hardware determine the rendering capability of a WPF application. The WPF system defines three rendering tiers:
- Rendering Tier 0
A rendering tier value of 0 means that there is no graphics hardware acceleration available for the application on the device. At this tier level, developers should assume that all graphics will be rendered by software with no hardware acceleration. This tier’s functionality corresponds to a DirectX version that is less than 7.0.
- Rendering Tier 1
A rendering tier value of 1 means that there is partial graphics hardware acceleration available on the video card. The following features and capabilities are hardware accelerated for rendering tier 1:
- 2D rendering
- 3D rasterization
- 3D anisotropic filtering
- 3D mip mapping
The following graphics hardware features define rendering tier 1:
- DirectX version (7.0 <= version < 9.0)
- Video RAM (>= 30MB)
- Multitexture units (>= 2)
- Rendering Tier 2
A rendering tier value of 2 means that most of the graphics features of WPF should use hardware acceleration. The following features and capabilities are hardware accelerated for rendering tier 2:
- Tier 1 features
- Radial gradients
- 3D lighting calculations
- Text rendering
- 3D anti-aliasing
The following graphics hardware features define rendering tier 2:
- DirectX version (>= 9.0)
- Video RAM (>= 120MB)
- Pixel shader (version >= 2.0)
- Vertex shader (version >= 2.0)
- Multitexture units (>= 4)
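To make the thresholds above concrete, here is a small sketch that maps a set of hardware figures to a tier number. This is only my illustration of the published thresholds, not WPF's actual detection logic. The RenderTierSketch class and its parameters are hypothetical, and I am assuming that a card which reaches DirectX 9 but misses another tier 2 requirement is treated as tier 1.

```csharp
using System;

// Hypothetical helper: illustrates the tier thresholds, not WPF's real implementation.
static class RenderTierSketch
{
    public static int Classify(double directXVersion, int videoRamMb,
                               double pixelShaderVersion, double vertexShaderVersion,
                               int multitextureUnits)
    {
        // Tier 2: every listed requirement must be met.
        bool meetsTier2 = directXVersion >= 9.0
            && videoRamMb >= 120
            && pixelShaderVersion >= 2.0
            && vertexShaderVersion >= 2.0
            && multitextureUnits >= 4;
        if (meetsTier2)
        {
            return 2;
        }

        // Tier 1: assumed to also catch DirectX 9 cards that miss a tier 2 threshold.
        bool meetsTier1 = directXVersion >= 7.0
            && videoRamMb >= 30
            && multitextureUnits >= 2;
        if (meetsTier1)
        {
            return 1;
        }

        // Tier 0: everything is rendered in software.
        return 0;
    }
}
```

For example, Classify(9.0, 256, 3.0, 3.0, 8) gives 2, Classify(8.1, 64, 0.0, 0.0, 2) gives 1, and Classify(6.0, 16, 0.0, 0.0, 1) gives 0.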
The Tier property allows you to retrieve the rendering tier at application run time. You might choose to scale down complex effects in the user interface, depending on the level of hardware acceleration that’s available in the client.
There’s one trick: the rendering tier is stored in the high-order word of the Tier property, so to extract its value you need to shift the property value right by 16 bits:
int renderingTier = (RenderCapability.Tier >> 16);
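Here is a minimal sketch of how that value might drive a decision in your application. The EffectsPolicy class and the choice of tier 2 as the cutoff for rich effects are my own assumptions; RenderCapability.Tier and the RenderCapability.TierChanged event come from System.Windows.Media.

```csharp
using System;
using System.Windows.Media;

// Hypothetical helper that turns the rendering tier into a simple yes/no policy.
static class EffectsPolicy
{
    // The tier lives in the high-order word, so shift it down to get 0, 1, or 2.
    public static int CurrentTier()
    {
        return RenderCapability.Tier >> 16;
    }

    // Assumed policy: only enable expensive effects on tier 2 hardware.
    public static bool UseRichEffects()
    {
        return CurrentTier() >= 2;
    }

    // The tier can change while the application runs, so let callers react to that.
    // The handler stays subscribed for the lifetime of the application.
    public static void Watch(Action<int> onTierChanged)
    {
        RenderCapability.TierChanged += (sender, e) => onTierChanged(CurrentTier());
    }
}
```

A window could call EffectsPolicy.UseRichEffects() when it loads and, for instance, skip starting its animations when the result is false.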
The following features and capabilities are not hardware accelerated:
- Bitmap effects
- Printed content
- Rasterized content that uses RenderTargetBitmap
- Tiled content that uses TileBrush
- Surfaces that exceed the maximum texture size of the graphics hardware
- Any operation whose video RAM requirement exceeds the memory of the graphics hardware
- Layered windows (only on XP)
Graphics Rendering Registry Settings
WPF provides registry settings for troubleshooting, debugging, and product support purposes. However, because changes to the registry affect all WPF applications, your application should never alter these registry keys automatically, or during installation.
Disable Hardware Acceleration Option
The disable hardware acceleration option enables you to turn off hardware acceleration for debugging and test purposes by setting its DWORD value to 1. When you see rendering artifacts in an application, try turning off hardware acceleration. If the artifact disappears, the problem might be with your video driver.
A value of 0 enables hardware acceleration.
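If you want to check this setting from a diagnostic tool, a read-only sketch might look like the following. The key path and value name (HKEY_CURRENT_USER\Software\Microsoft\Avalon.Graphics, DisableHWAcceleration) are not given above; they are what I recall from the MSDN documentation, so treat them as an assumption and verify them before relying on this. The WpfGraphicsSettings class name is hypothetical.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical read-only helper; it never writes to the registry.
static class WpfGraphicsSettings
{
    // Assumed location of the WPF graphics settings (verify against the documentation).
    const string KeyPath = @"Software\Microsoft\Avalon.Graphics";

    public static bool IsHardwareAccelerationDisabled()
    {
        // OpenSubKey(string) opens the key read-only and returns null if it is missing.
        using (RegistryKey key = Registry.CurrentUser.OpenSubKey(KeyPath))
        {
            if (key == null)
            {
                return false; // no key means the default applies: acceleration enabled
            }

            // DWORD values come back as boxed ints; anything other than 1 counts as enabled.
            object value = key.GetValue("DisableHWAcceleration");
            return value is int && (int)value == 1;
        }
    }
}
```

This only reads the value, in keeping with the advice that an application should never change these keys itself.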
Maximum Multisample Value
The maximum multisample value enables you to adjust the maximum amount of anti-aliasing of 3-D content (0-16).
WPF defaults to 0 for XPDM drivers and 4 for WDDM drivers.
Required Video Driver Date Setting
In November, 2004, Microsoft released a new version of the driver testing guidelines; the drivers written after this date offer better stability. By default, WPF will use the hardware acceleration pipeline for these drivers and will fall back to software rendering for XPDM drivers published before this date.
The required video driver date setting enables you to specify an alternate minimum date for XPDM drivers. You should only specify a date earlier than November, 2004 if you are confident that your video driver is stable enough to support WPF.
Use Reference Rasterizer Option
The use reference rasterizer option enables you to force WPF into a simulated hardware rendering mode for debugging: WPF goes into hardware mode, but uses the Microsoft Direct3D reference software rasterizer, d3dref9.dll, instead of an actual hardware device.
WPF Performance Profiling Tools
The Windows SDK includes a suite of WPF performance profiling tools (WPFPerf.exe) that allow you to analyze the run-time behavior of your application and determine the types of performance optimizations you can apply.
- Event Trace – Use for analyzing events and generating event log files.
- Perforator – Use for analyzing rendering behavior.
- Trace Viewer – Record, display, and browse Event Tracing for Windows (ETW) log files in a WPF user-interface format.
- Visual Profiler – Use for profiling the use of WPF services, such as layout and event handling, by elements in the visual tree.
- Working Set Viewer – Use for analyzing the working set characteristics of your application.
DirectX Diagnostic Tool
The DirectX Diagnostic Tool (~\Windows\System32\Dxdiag.exe) is designed to help you troubleshoot DirectX-related issues.