Aggregations play the central role for the user experience with large SSAS cubes. There are many important aspects for optimal aggregation design. The order in which the aggregation are defined is actually the one of them.
If a query can be answered using more than one aggregation – the first one from the aggregation definition will be taken – not the smallest one!
It is not a problem if a query produces just a few reads, but in case of complex calculations and large cell spaces with hundreds and even thousands reads the overall query performance can be dramatically different. Don’t forget that the fact data not only should be read from disk, but also aggregated in memory to the requested grain (if it does not match the request).
One more consideration: certain amount of CPU time is consumed when SSAS searches for the matching aggregation. The percentage of this time in the whole “give me data” operation (search-read-aggregate) is noticeably bigger for the small aggregations. Thus it makes sense to hold small aggregations in the head of the search list.
So optimally the cube aggregations should be sorted by size (ascending).
The info about aggregation size can be viewed using BIDS Helper (Physical Aggregation Sizes) or with the following DMV query:
,DATABASE_NAME = 'Adventure Works DW 2008'
,CUBE_NAME = 'Adventure Works'
,MEASURE_GROUP_NAME = 'Internet Sales'
,PARTITION_NAME = 'Internet_Sales_2003')
You can also use the DISCOVER XMLA command with RequestType=DISCOVER_PARTITION_STAT.
The aggregations to be reshuffled can be found in the solution in the .partitions file. Alternatively you could do it directly on the deployed cube using XMLA script with ALTER AGGREGATION DESIGN.
Technically our aim is to sort “Aggregation” XML nodes in the .partitions file (or in the ALTER-Script) using the size values. There are several ways to automate this optimization: using PowerShell, C# Script, XSLT (optionally in SSIS package).
Since you occasionally want to make it manually, here is the straightforward example.
For the sake of simplicity let’s place the size info into “Descr.” elements of Aggregation nodes. Thus we can sort XML nodes using the info that the nodes are featuring.
Let’s say we have following aggregation sizes in BIDS Helper.
You can directly copy it from this dialog to excel for the later usage.
Now we need an XML editor which is able to sort nodes and save them with the new order. We will use Stylus Studio (Professional) in this example.
1. Let’s open .partitions in Stylus Studio and navigate in Tree View to the Aggregations node in the Aggregation Design of interest.
Now we can edit Aggregations in “Table view”.
2. Attention! In order to have the Description visible, we need to have at least one non empty description for our aggregations. You can add it in Stylus Studio (“New element”) or do it directly in BIDS:
Now let’s copy-paste the aggregations from Stylus Studio (Copy As Tab-Delimited in Table View) to Excel and merge then with the size info (from BIDS Helper or DMV).
Note that we use “numeric” form for percentages since they will be sorted as text!
Now copy-paste it back to Table view in Stylus Studio.
Now let’s sort them by Description and press Save.
Note that Stylus Studio puts <!xml> header in the file. Just delete it manually before using the file.
Now after updating your cube and Process-Index for the measure group, here is how the agg sizes should look like:
Now the small aggregations have always preference!
The sorting of aggregations with size not only contributes to performance but also makes an aggregation usage review easier when you consider to delete unused large aggregations, which take a lot of processing and disk space. We recommend to have the sorting of aggregation as a part of your aggregation maintenance strategy.