Adventure Works bashing is quite a fun sport. It’s unfortunate Adventure Works makes such an easy target. In this post I want to pick on how the Adventure Works cube shows N/A for the Statistical Account subtotal.
Obviously it makes no sense to sum together Headcount and Units and Average Unit Price. Setting the Statistical Accounts subtotal to NA does make sense. Adventure Works uses the following code to do this:
// Assignment of constant for Statistical Accounts account member. // This will allow parent member to be visible by default. ( [Account].[Accounts].&, [Measures].[Amount] ) = "NA" ;
Unfortunately, this is not the best way for several reasons.
Strings Break Aggregations
First, assigning a string will null out any subtotals and grand totals above where you assigned the string. This problem isn’t visible in Adventure Works since IsAggregatable is set to false on the [Account].[Accounts] hierarchy which removes the grand total. But if we do the following assignment on a hierarchy with a grand total, we see the problem:
( [Product].[Product Categories].[Category].[Accessories], [Measures].[Internet Sales Amount] ) = "NA";
You can see that when we assigned a string to the Accessories subtotal, it nulled out the grand total making it disappear. And null grand totals are a major usability problem, as I’ll explain in a second.
The second reason the Adventure Works NA approach is flawed is that it eliminates the sparsity in the cube for the Statistical Accounts subtotal. For example, let’s say we put [Date].[Date] on columns, note a bunch of dates with no data except for NA show up. This hurts performance tremendously and makes it more difficult to find meaningful data.
Typically, fixing this problem involves a conditional assignment which ensures empty cells stay empty:
( [Account].[Accounts].&, [Measures].[Amount] ) = IIf(IsEmpty([Measures].[Amount]),null,"NA");
But in Adventure Works, all the Statistical Accounts have a ~ unary operator which means that Statistical Accounts subtotal is always null. So to preserve sparsity and fill in the Statistical Accounts subtotal with NA only where any statistical accounts have data, we would have to do something like:
( [Account].[Accounts].&, [Measures].[Amount] ) = IIf( IsEmpty( Sum( [Account].[Accounts].&.Children ,[Measures].[Amount] ) ) ,null ,"NA" );
Strings Are Less Efficient
The third reason the Adventure Works NA approach is flawed is the degradation in performance due to the use of strings. In talking with the SSAS team, Marius Dumitru summarized this problem by saying, “strings are larger than other data types, harder to manipulate, introduce data structure indirections and memory copies/allocations.”
Mosha Pasumansky mentioned how strings and sparsity impact performance in the discussion of the Ratio calculation in this old blog post.
Null Grand Totals
Recently some co-workers were building a PivotTable and no matter what they did, they could never get anything more than a blank PivotTable. After a bit of head scratching, I realized what was happening was that the grand total was nulled out, so they were never even able to start the PivotTable properly.
To demonstrate this problem, let’s pretend we want to null out the grand total on the Product Categories hierarchy:
( [Product].[Product Categories].[All Products], [Measures].[Internet Sales Amount] ) = null;
Now if we start a PivotTable and put something other than the Product Categories hierarchy on rows, you’ll see this null grand total prevents us from starting the PivotTable since no rows even show up:
Incidentally, you can tell from the “this will allow parent member to be visible” comment in the code at the top of this blog post that the developers of Adventure Works had this very thing in mind. Since the ~ unary operator caused Statistical Accounts to be null always, it made it impossible to expand that subtotal since it would never appear.
Of course, your users could check the “Show items with no data on rows” option under PivotTable Options… Display. But you can’t expect your users to know to do that. PivotTables are designed for top down analysis. And if the top is null, your PivotTable is not user friendly.
So how do we avoid all three Adventure Works problems created by assigning strings? And how do we avoid the problem of null subtotals and null grand totals?
You will note Mosha Pasumansky’s approach in the blog post above was to use NULL and a four-part format string like '#.#;;;NA'. The problem with that approach is that it doesn’t work in Excel PivotTables. Excel retrieves the VALUE and the FORMAT_STRING cell properties then does the formatting itself. But the formatting code in Excel ignores the fourth part, unfortunately. (As a reminder, format strings are semicolon separated. The first part is for positive numbers, the second part is for negative numbers, the third part is for zero, and the fourth part is for nulls.) So our solution will have to limit itself to three-part format strings which work in Excel.
My approach of choice looks like the following:
scope( [Account].[Accounts].&, [Measures].[Amount] ); this = IIf(IsEmpty(Sum([Account].[Accounts].&.Children, [Measures].[Amount])),null,0); Format_String(this) = ";;"; //format zeros as blank end scope;
It is assigning a zero to a cell if any statistical accounts have data in this slice, otherwise it’s assigning null. Then it’s using a custom format string just for the Statistical Accounts subtotal. The format string is just two semicolons. That three-part format string says to format zero as blank. Since the format string is inside the scope statement, we’re only customizing the format string for the Statistical Accounts subtotal.
If you prefer NA, then you would use the following format string code inside the scope statement:
Format_String(this) = ";;\N\A";
And as discussed above, typically IsEmpty(Sum([Account].[Accounts].&.Children, [Measures].[Amount])) would just be IsEmpty([Measures].[Amount]) but we had to use the more complex expression here because of the ~ unary operators in play.
This approach solves all the problems we were trying to avoid. It doesn’t null out the grand total. It preserves sparsity in the cube. It avoids using expensive strings. It avoids null subtotals and null grand totals. And it is usable in Excel.