Report Builder 3.0 – Shared Data Sets
I’ve been trying the new Report Builder 3.0, which ships with the SQL Server 2008 R2 November CTP and among the various new features, I want to write a bit about the Shared Data Sets feature. Up until this version, report authors could create a data source, and when deploying the report could make the decision whether the data source for this report is shared among other users/authors or be linked only to the individual report they have created. If the report author decided to share the data source, new reports could be based on the shared data source, avoiding duplicate server objects and, even better, avoiding unnecessary work.
In Report Builder 3.0, the Shared Data Sets feature is introduced, where report authors can not only share data sources, but also share specific data sets, based on data sources. Sharing a data set can be used if you want to create several reports which share fairly similar data requirements, but perhaps, different design and parameters requirements.
But putting the reuse of data objects aside, one could even achieve better performing reports, by using the caching options for Shared Data Sets. So say, for example that a report is run at 09:00 in the morning, and a second, different report which uses the same shared data set is run at 10:00, the lucky user who is running the report at 10:00 will get to see the report faster, because the data set is already cached. You can also choose to schedule a cache refresh just before the day begins so every user will benefit from the caching option.
I decided to put this feature to the test, to see if and how this caching mechanism works.
To start, I’ll launch the Report Builder 3.0 from the report manager web page. As you can see, I’ve already set up a shared data source, pointing to the Adventure Works DW 2008 database.
Upon launch, we see a new welcome screen which prompts us either to create a new report or a new shared data source. I’ll choose the shared data source option.
For simplicity sake, I’ll choose the DimProduct table as the source for our data set.
Next, I’ll save the newly created data set to the report server. I’ll call it: DimProductSharedDS.
Looking back at the Report Manager webpage, we’ll be able to see our new shared data set. Clicking on the data source will bring us to the data set properties. We’re interested in the “caching” tab:
I’ll be choosing the “Cache Shared Dataset” option, with an expiry of 30 minutes. you can see that there is an additional cache related tab: “Cache Refresh Options”. On this tab you can define a scheduled cache plan, so the data set will be refreshed every morning, for example, instead of letting the first user in the morning perform the caching implicitly by running a related report. We’ll skip this option for now.
Next, I’ll create two different reports, one displaying product name and size, and the other displaying product name and weight, both use the shared data set I’ve just created and deployed. Running the first report and tracing SQL activity on Profiler, we can see the following SQL statement being issued:
The Report Server issued a SELECT statement which corresponds to the shared data set definition (i.e. all the columns defined in the data set, even though the report itself consumes only 2 of them).
Running the second report, I got to see no activity on Profiler. The Report Server used its cached data set to produce the second report, and we have just saved a trip to our operational/DWH server and hopefully saved us a bit of report execution time.
I then disabled the caching option and re-ran the two reports, and the results were as expected: on both reports I got to see the Report Server issuing the same SELECT query to the AdventureWorksDW database.
To conclude, this new feature looks like a real boon for both developers, who can save time on development of report objects, and DBAs who will get to see their operational databases being accessed less.
-Yaniv
(This is post number 3 for week number 3 of my SQL blog challenge)