PowerPivot – Part 4: what’s inside?
In this 4’th instalment of the PowerPivot blog posts series we’ll take a quick look at PowerPivot internals. Firstly, what you need to understand is that a PowerPivot database is similar to an Analysis Services cube, but it is not exactly the same, certainly not in the way it stores data internally within the Excel file.
There is an easy way to see what is inside an Excel file (thanks to the crowd at the recent Israeli BI User Group meeting…): simply change the Excel file extension from xlsx to zip and then open the “compressed” file to look at its content. What you will see there is a bunch of xml files as well as other files. What we are interested in is the PowerPivot data file, which can be located under the /xl/CustomData/ folder under the name “item1.data” (it may come in other names, I have not made any tests on this, though). You will see that this file is the file that takes up much of the disk space that the Excel file in general takes. So, when you open up an Excel file containing a PowerPivot model. this data file is read into memory and from this point onwards, all the slicing and dicing of data and calculations is done in-memory, which is how PowerPivot models are able to work really fast.
What I could not find in the Excel file zipped content is any trace of the familiar Analysis Services folder structure (such as the map, header files). So, the format is not the same but the behaviour of the PowerPivot model is the same: you can query a PowerPivot model using MDX, you can create named sets and so on. As important is the fact that you can deploy a PowerPivot model to Sharepoint, using the Sharepoint SSAS module and later on you can use this model as a source to an Analysis Services database.
There is an excellent post describing the client architecture of PowerPivot here. You can also read more about how Analysis Services server can hook into a PowerPivot model on BOL here. This pre-release article discusses how SSAS can connect to a Sharepoint PowerPivot model in a VertiPaq mode. The vertiPaq engine is the special PowerPivot data engine that enables the in-memory storage of data and the quick calculations and retrieval of data and aggregations.
-Yaniv
(This is post number 11 for week number 10 of my SQL blog challenge)