UI Virtualization vs. Data Virtualization (Part 1) - Essential WPF

UI Virtualization vs. Data Virtualization (Part 1)

Being a LOB and composite-application infrastructure junkie lately, I've been working with several data-binding models and mechanisms to bind data to the view, efficiently of course. Sometimes it was easy, and sometimes it was not! I have had to find several workarounds to overcome both architectural and performance issues.

In this post I would like to concentrate on Data Virtualization and compare it with UI Virtualization. The two are very similar, yet they address different aspects of the same problem.

After reading this post you'll have an idea of what Data Virtualization is compared to UI Virtualization, and what it is good for.

UI Virtualization

To understand what UI Virtualization is, let's say that you want to display a HUGE amount of data entries in a WPF DataGrid, all at once. In that case, binding a simple WPF DataGrid to, let's say, 100,000 or more items means you'll have to wait one or two seconds for the initial binding.

public partial class Window1 : Window
{
    public Window1()
    {
        var hugeCollection = new HugeCollection();
        DataContext = hugeCollection;
        InitializeComponent();
    }
}

<Window ...>
    <Grid>
        <tk:DataGrid ItemsSource="{Binding}" />
    </Grid>
</Window>
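
The post does not show HugeCollection itself. A minimal sketch of what it might look like follows, assuming a simple item type with two properties; the Entry class, its properties, and the constructor taking a count are hypothetical and only illustrate a collection that holds every item in memory.

```csharp
using System.Collections.ObjectModel;

// Hypothetical item type; the original post does not show it.
public class Entry
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical stand-in for the HugeCollection used above:
// every item is created eagerly and kept in memory.
public class HugeCollection : ObservableCollection<Entry>
{
    public HugeCollection() : this(100000) { }

    public HugeCollection(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Add(new Entry { Id = i, Name = "Entry " + i });
        }
    }
}
```

With a collection like this, all 100,000 items exist in memory before the grid is shown; UI virtualization only limits how many rows are created for them.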

Well, it's really nothing!

Now let's change a little flag on the data grid.

<tk:DataGrid ItemsSource="{Binding}" VirtualizingStackPanel.IsVirtualizing="False" />

Now, before you run this app, if you use a laptop (like me… ;), connect it to an AC power adapter first. Trust me, your laptop will need that.

As you can see, the CPU is stuck at 100% and it takes an eternity to see the data (I just had to kill the process).

What’s wrong? It’s only a little flag: VirtualizingStackPanel.IsVirtualizing="False".

Well, it's indeed a little flag that makes a big difference. If this flag is true (which is the default), it instructs the VirtualizingStackPanel (which is the default layout panel in a DataGrid, ListView, ListBox and other controls) to use UI virtualization. That means: don't create data grid rows that are not visible to the user (in other words, are not in view). So this gives you a clue as to what happened when we turned off this flag. We told the VirtualizingStackPanel not to use UI Virtualization, hence the DataGrid control tried to create and lay out 100,000 rows! And this is a CPU killer, as you saw.
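
To get a feeling for how few rows are actually needed, here is a rough sketch of the arithmetic behind "only rows in view". It is a simplified model that assumes every row has the same fixed height. It is not WPF's actual implementation, and the ViewportMath class and its parameters are hypothetical names used for illustration only.

```csharp
using System;

public static class ViewportMath
{
    // Computes the index of the first visible row and the number of visible rows,
    // assuming every row has the same fixed height (a simplification).
    public static void VisibleRange(double scrollOffset, double viewportHeight,
        double rowHeight, int itemCount, out int first, out int count)
    {
        // First row that is at least partially inside the viewport.
        first = Math.Min(itemCount, Math.Max(0, (int)Math.Floor(scrollOffset / rowHeight)));

        // One past the last row that is at least partially inside the viewport.
        int lastExclusive = Math.Min(itemCount,
            (int)Math.Ceiling((scrollOffset + viewportHeight) / rowHeight));

        count = Math.Max(0, lastExclusive - first);
    }
}
```

Under these assumptions, a 600-pixel-high viewport with 20-pixel rows needs only 30 rows at a time, no matter whether the collection holds 100 items or 100,000.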

Data Virtualization

WPF internally implements UI Virtualization and it's great. In the previous case, UI Virtualization is just fine. We've got 100,000 items loaded in memory, and all we've had to do is bind them to the DataGrid.

But what if each entry takes more than 1MB?

100,000 * 1MB ≈ 97.7GB. Houston, we have a problem!

Well, the quick answer of course would be: don't load all items at once; load them in chunks instead. And this is exactly what Data Virtualization is!
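
To make the idea of chunks concrete, here is a minimal sketch of a source that knows the total item count up front but fetches items one page at a time, on first access. It is only an illustration and not the solution from the next post. The PagedSource class, the fetchPage delegate, and the LoadedPageCount property are hypothetical names, and the sketch ignores cache eviction, threading, and the list interfaces (such as IList) that a control would expect from a real item source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: loads items in fixed-size pages on demand.
public class PagedSource<T>
{
    private readonly int _count;
    private readonly int _pageSize;
    // (startIndex, size) -> the items in that range; supplied by the caller.
    private readonly Func<int, int, IList<T>> _fetchPage;
    // Pages fetched so far, keyed by page index. Never evicted in this sketch.
    private readonly Dictionary<int, IList<T>> _pages = new Dictionary<int, IList<T>>();

    public PagedSource(int count, int pageSize, Func<int, int, IList<T>> fetchPage)
    {
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        if (fetchPage == null) throw new ArgumentNullException("fetchPage");
        _count = count;
        _pageSize = pageSize;
        _fetchPage = fetchPage;
    }

    // Total number of items, reported without loading any of them.
    public int Count { get { return _count; } }

    // Number of pages fetched so far (useful for checking the laziness).
    public int LoadedPageCount { get { return _pages.Count; } }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException("index");

            int pageIndex = index / _pageSize;
            IList<T> page;
            if (!_pages.TryGetValue(pageIndex, out page))
            {
                // Fetch only the page that contains the requested item.
                int start = pageIndex * _pageSize;
                int size = Math.Min(_pageSize, _count - start);
                page = _fetchPage(start, size);
                _pages[pageIndex] = page;
            }
            return page[index % _pageSize];
        }
    }
}
```

For example, a PagedSource of 100,000 items with a page size of 100 reports a Count of 100,000 immediately, yet asking for item 250 triggers a single fetch of items 200 to 299.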

Implementing Data Virtualization raises several problems, and there is no out-of-the-box solution for Data Virtualization in WPF.

Well, the first problem is: How do you fake scrollbars, so that the user thinks all the data is there and is able to scroll?

Second problem: How do you filter, sort, group data that doesn’t exist?

Third problem: How do you search data (like this search) that doesn’t exist?

These are all tough questions.

So what do you think? Do you have an easy way or acceptable solution to solve this problem?

In the next post I'll introduce a unique solution, but until then I'll be very glad to hear your opinion on that.

Published Monday, September 07, 2009 12:42 AM by Tomer Shamam

Comments

# re: UI Virtualization vs. Data Virtualization

Monday, September 07, 2009 10:02 AM by Oleg

I am so intrigued!

Usually we make data virtualization the data layer's responsibility (SQL server or a special lightweight cache layer). In other words, the UI requests what it wants and the data layer provides it. In that case sorting/grouping relies on the data access layer. As I see it, this approach is straightforward but still has the grouping/sorting problem. You basically have two options.

1) Group on the server. In that case you have to write all the calculations in the SQL query, which is apparently not good from a business-logic duplication point of view. We prefer this because that's what data servers are for. Unfortunately, some calculations require 3rd party DLLs and it just does not work.

2) Group in the cache (or client). In that case you have an issue with updating the cache when data in storage changes (sometimes by some foreign process). Or, you forget about data virtualization.

I am really looking forward to seeing what approach you suggest.

# re: UI Virtualization vs. Data Virtualization

Monday, September 07, 2009 8:28 PM by Tomer Shamam

Hi Oleg,

What you've described sounds very familiar. Indeed, data virtualization is in many cases solved by working directly with the data collection (model), and synchronizing the client cache is always a problem, since by displaying the data to the user you already have a cache (in the presentation layer). So something has to update it either way.

As for the first approach, you still have the "Virtual Scrollbars" problem, and each time you group/sort/filter/scroll, you have to fetch fresh data from the server, and it costs a lot of time. So I really like the second approach, which conserves time and user patience.

As for the second approach, this is what Data Virtualization is (at least from my point of view). You change the data that is bound to the UI, but are still left with other problems such as those described in this post.

# re: UI Virtualization vs. Data Virtualization

Monday, September 14, 2009 6:44 PM by Odi Kosmatos

"there is no out-of-the-box solution for Data Virtualization in WPF."

Not true. Well, from Microsoft, no, but from Xceed, yes. And it works asynchronously (in the background) getting more data without stopping your UI experience.

Imagine your database was Bing, and you entered a search term that could produce millions of results. No problem. Even if Bing's API only returns a few results at a time, you can instantly make it look like all the data is available... and if the end-user scrolls too fast, a built-in loading indicator works.

We built an example of this data virtualization with Bing, to prove the point.

Look at this WPF demo (XBAP, no install, runs in your browser):

http://xceed.com/bingbling

Hope you find it good.

# re: UI Virtualization vs. Data Virtualization

Monday, September 14, 2009 9:03 PM by Tomer Shamam

Hi Odi,

As you said yourself, "Well, from Microsoft, no, but...", so we don't really have out-of-the-box Data Virtualization. At least not from Microsoft.

But thanks for sharing this valuable information.

I would expect the WPF DataGrid to have that too, as well as other controls such as ListView.

# re: UI Virtualization vs. Data Virtualization

Monday, November 16, 2009 7:06 AM by gromas

Another way to virtualize WPF data with attachable behaviour:

grominc.blogspot.com/.../wpf-data-virtualization.html

# re: UI Virtualization vs. Data Virtualization (Part 1)

Wednesday, January 06, 2010 4:44 PM by steve

Does anyone have a valid link for Vincent Van Den Berghe's data virtualization source code?

The link located in the "Data Virtualization in WPF and beyond" PDF file is invalid.

The home.scarlet.be/.../DataVirtualizationArticleCode.zip link does not contain the source code.

# re: UI Virtualization vs. Data Virtualization (Part 1)

Thursday, January 14, 2010 10:23 PM by ksurakka

Vincent's classes for virtualization seem to exist in this sample project:

bea.stollnitz.com/.../VirtualizationWPF.zip

in blog:

bea.stollnitz.com/blog

# re: UI Virtualization vs. Data Virtualization (Part 1)

Tuesday, May 10, 2011 11:41 AM by Alexei

People, the right solution is to give the user filtering capability.

If the user wants all records he uses no filter; if he knows what he wants, then he gets what he wants.

Other practices lead to "programming for the sake of programming" :)

And the example with Bing is not what you want. Bing has no filters, but your application does.

It is an ad for Xceed :)

# re: UI Virtualization vs. Data Virtualization (Part 1)

Wednesday, September 07, 2011 2:49 PM by Sridharan

Hi,

What will the scrolling performance be with UI Virtualization? The loading time will be reduced, but it will create the items on each scroll; won't this affect the scroll performance? And in normal mode (without UI Virtualization), will it also create/generate all the items on each scroll?

Please clarify my doubt

- Sri.

# re: UI Virtualization vs. Data Virtualization (Part 1)

Monday, September 19, 2011 4:33 AM by M Shoaib Sheikh

Hi everybody,

I might sound naive here. I came across a similar problem once, and the workaround I came up with was to paginate the control. Now there is a limit to the number of records that you can show in a page, and with that in mind, the number of rows can be at most what fits in the viewable area on screen, which gives you the freedom to load a chunk without worrying about scrolling and maintaining a cache. So that's how I worked with the WPF grid anyway... You can sort page by page. Anyway, this is not a bullet-proof solution, but it still helps the responsiveness by a fair margin.

# re: UI Virtualization vs. Data Virtualization (Part 1)

Thursday, September 22, 2011 3:09 PM by Tomer Shamam

Hi Sridharan,

When a ListBox, for example, uses virtualization in Recycling mode (VirtualizingStackPanel.VirtualizationMode="Recycling"), it does not create new items each time you scroll; instead it reuses the ListBoxItem already created and only changes the DataContext/Content to be the new item.

# re: UI Virtualization vs. Data Virtualization (Part 1)

Tuesday, March 20, 2012 6:30 PM by CrazyTasty

I recently had to parse a formatted log file that could range in size anywhere from 2k to 300MB+. It irks me when my code makes a GUI hang, so I buffer my reads asynchronously and update an ObservableCollection of "LogEntry" objects which is bound to my grid.

Overall this approach is not as performance savvy, but the user doesn't notice since the grid is updated dynamically every time the buffer fills and adds to the ObservableCollection.
