Greg Galloway
LastNonEmpty Gotchas

I have been collecting a few advanced tidbits about semi-additive measures over the last few months, and it's time to share them.

Role-Playing Time Dimensions

Semi-additive aggregation functions like LastNonEmpty work just like Sum on all dimensions except for the Time dimension. How does it determine which is "the Time dimension"? It looks for the dimension marked Type=Time:

[Screenshot: dimension properties with Type set to Time]

But what happens when, like most cubes, you have multiple role-playing versions of that dimension in your cube? Which does it choose? Flip to the Dimension Usage tab in BIDS. For the column representing that measure group with the semi-additive measure, find the role-playing date dimensions. I've circled them in green below. Then find the topmost one that's connected to the measure group. I've circled that in orange below. That's the semi-additive dimension for this measure group. A LastNonEmpty measure in this measure group will look for the most recent day with data in that Date dimension.

[Screenshot: Dimension Usage tab with the role-playing date dimensions circled]

If you discover that the wrong dimension is the semi-additive dimension, it's very easy to fix. You just need to make the correct dimension the topmost connected role-playing date dimension by reordering the cube dimensions. To reorder the cube dimensions, flip to the Cube Structure tab, look at the list of cube dimensions in the bottom left, then drag and drop them into the correct order, similar to the following:

[Screenshot: reordering the cube dimensions on the Cube Structure tab]
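The selection rule described above can be sketched in a few lines of Python. This is only a model of the behavior I observed, not anything from the product: the CubeDimension class, its fields, and the dimension names are hypothetical, and the list order stands in for the order of cube dimensions on the Cube Structure tab.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CubeDimension:
    name: str        # cube dimension (role) name
    dim_type: str    # "Time" or "Regular"
    connected: bool  # True if related to the measure group in Dimension Usage


def semi_additive_dimension(cube_dimensions: List[CubeDimension]) -> Optional[str]:
    """Return the topmost Type=Time cube dimension connected to the measure group."""
    for dim in cube_dimensions:
        if dim.dim_type == "Time" and dim.connected:
            return dim.name
    return None


# Hypothetical cube: Ship Date is listed first but is not connected to this measure group.
dims = [
    CubeDimension("Customer", "Regular", True),
    CubeDimension("Ship Date", "Time", False),
    CubeDimension("Due Date", "Time", True),
    CubeDimension("Date", "Time", True),
]
before = semi_additive_dimension(dims)

# Dragging Date above the other role-playing dimensions changes the outcome.
reordered = [dims[0], dims[3], dims[1], dims[2]]
after = semi_additive_dimension(reordered)
print(before, "->", after)
```

In this made-up cube, the semi-additive dimension is Due Date until Date is moved above it.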

I suggest you arrange the cube so that the semi-additive dimension is the same for every measure group. This is probably a topic for another post, but I'm a big proponent of having a dimension simply called Date which is connected to every single measure group as the most common meaning of date. This dimension should be the semi-additive dimension.

I was warned that there have been some code changes in the Analysis Services product in this area at some point. So I tested this behavior in AS2005 RTM, SP1, SP2, SP3, and AS2008 RTM. I found it worked as I described above in all those versions. But test your cube thoroughly.

On a related note, the BIDS Helper Validate Aggs feature understands semi-additive measures and role-playing dimensions correctly. If you’ve got a measure group with only semi-additive measures, and if there are any aggs that don’t include the granularity attribute of the correct semi-additive dimension, then it warns you that those aggs will not be used in most situations.

Performance of LastNonEmpty vs. LastChild

On my largest cube, we decided to go with LastChild over LastNonEmpty. The driving reason was that LastNonEmpty hit more partitions than it needed to. To be specific, if we asked for a LastNonEmpty measure for 2008, it would hit every partition in 2008, not just the latest. LastChild only hits the latest partition. I would like to credit Robert Skoglund of Microsoft Consulting Services with making this observation and suggesting the solution.

The only hurdle was that the current month/quarter/year weren't complete, so the LastChild didn't have any data. That was easy to fix with an addition to the MDX script like the following:

// the most recent day with data
CREATE HIDDEN SET [Yesterday]
AS Exists([Date].[Calendar].[Date].Members, [Date].[Yesterday Flag].&[True]);

// The Inventory measures are AggregateFunction=LastChild.
// When we are in a current period, use the most recent day to
// get the inventory counts.
SCOPE(MeasureGroupMeasures("Inventory"));

  SCOPE(Exists([Date].[Calendar].[Month].Members, [Date].[Yesterday Flag].&[True]));
    this = [Yesterday].Item(0).Item(0);
  END SCOPE;

  SCOPE(Exists([Date].[Calendar].[Quarter].Members, [Date].[Yesterday Flag].&[True]));
    this = [Yesterday].Item(0).Item(0);
  END SCOPE;

  SCOPE(Exists([Date].[Calendar].[Year].Members, [Date].[Yesterday Flag].&[True]));
    this = [Yesterday].Item(0).Item(0);
  END SCOPE;

  [Date].[Calendar].[All] = [Yesterday].Item(0).Item(0);

END SCOPE;


Another reason for going with LastChild over LastNonEmpty is that the results are often more intuitive in situations where not every single item is snapshotted every single period. Teo Lachev explained this problem very well in his post Last Non Empty Affairs. Depending on the business requirements, you could work around the problem as he did with MDX that looks at each Customer, or you could change the ETL to snapshot every single item every period, or you could use LastChild to ensure that your results come from a consistent time period. The LastChild trick won't work in every situation, but it is a handy trick to have in your bag.
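To make the difference concrete, here is a small Python model of the two aggregation functions over daily snapshots. It is a simplified sketch, not the engine's implementation: the items, days, and quantities are made up, and a missing day simply means no fact row exists for that item on that day.

```python
from typing import Dict, List, Optional

# snapshots[item][day] = quantity; a missing day means no fact row
Snapshots = Dict[str, Dict[int, float]]


def day_total(snapshots: Snapshots, day: int, items: List[str]) -> Optional[float]:
    """Sum across items for one day, or None if no item has data that day."""
    values = [snapshots[i][day] for i in items if day in snapshots[i]]
    return sum(values) if values else None


def last_child(snapshots: Snapshots, days: List[int], items: List[str]) -> Optional[float]:
    """Value on the last day of the period, even if that day is empty."""
    return day_total(snapshots, days[-1], items)


def last_non_empty(snapshots: Snapshots, days: List[int], items: List[str]) -> Optional[float]:
    """Value on the most recent day that has any data for the requested items."""
    for day in reversed(days):
        total = day_total(snapshots, day, items)
        if total is not None:
            return total
    return None


# Item B was last snapshotted on day 28; item A on day 30.
snapshots = {"A": {28: 10.0, 30: 12.0}, "B": {28: 5.0}}
days = list(range(1, 31))

lne_a = last_non_empty(snapshots, days, ["A"])
lne_b = last_non_empty(snapshots, days, ["B"])
lne_all = last_non_empty(snapshots, days, ["A", "B"])
lc_b = last_child(snapshots, days, ["B"])
lc_all = last_child(snapshots, days, ["A", "B"])
print(lne_a, lne_b, lne_all, lc_b, lc_all)
```

With LastNonEmpty, each item reports its own latest value, but the total comes only from day 30, so it does not equal the sum of the items. With LastChild, everything comes from day 30, so item B is empty and the total is consistent with what is shown.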

Long Navigation Parameters

I was shocked recently when my nice report that was executing in 2 seconds suddenly took 22 seconds to finish after I enabled navigation (i.e. drillthrough links) in every textbox in the report. The point of this post is to explain what’s going on and suggest workarounds to improve performance.


The Symptoms

This particular report was an MDX report in Reporting Services 2005. If your MDX reports are anything like mine, you end up having an obscene number of multi-select parameters to give the end user maximum flexibility. This report was no exception. In SQL reports, the parameter values are often integers. However, in MDX reports, the parameter values are usually MDX unique member names such as [Product].[Category].&[Bikes]. You can already see the problem. If you have many large multi-select parameters, the concatenated list of selected parameter values becomes large. In fact, for this report, that list ended up being about 30KB long. Multiply that by 3,000 textboxes in the report, and your speedy report is now a 90MB beast.
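The arithmetic is easy to reproduce. The following Python sketch is a rough estimate only: it assumes every textbox carries the full concatenated parameter list and ignores whatever overhead SSRS adds, and the function name and the 1,000-member Product selection are made up for illustration.

```python
from typing import List


def total_link_bytes(selected_values: List[str], textbox_count: int) -> int:
    """Bytes of parameter text if every textbox carries the full selection."""
    per_link = len(",".join(selected_values).encode("utf-8"))
    return per_link * textbox_count


# The figures from this report: about 30KB of parameter values, 3,000 textboxes.
report_total = 30_000 * 3_000
print(report_total / 1_000_000, "MB")

# Hypothetical multi-select of 1,000 products, as unique names versus bare keys.
unique_names = ["[Product].[Product].&[%d]" % k for k in range(1, 1001)]
keys = [str(k) for k in range(1, 1001)]
long_total = total_link_bytes(unique_names, 3_000)
short_total = total_link_bytes(keys, 3_000)
print(long_total, short_total)
```

The second half of the sketch shows why the length of the selected values, rather than the number of parameters, is what drives the size.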

Another interesting thing I noticed is that of the TimeDataRetrieval, TimeProcessing, and TimeRendering columns in ReportServer.dbo.ExecutionLog, the only one that changed when I turned on navigation was the TimeProcessing column.

Also, you might guess that shortening the name of your “Geography” report parameter to “G” would help, but in fact, the length of the report parameter name does not matter (just the length of the selected parameter values).

The final interesting thing I noticed is that the Drillthrough chunk in ReportServerTempDB was massive. Clearly, part of the expense of lots of huge drillthrough links was in constructing and saving that chunk to ReportServerTempDB. The following query shows you the chunks and their sizes:

select sd.SnapshotDataID
,sd.CreatedDate
,sd.ExpirationDate
,s.ReportPath
,s.Expiration
,s.CreationTime
,c.ChunkName
,datalength(c.content) as ChunkBytes
from ReportServerTempDB.dbo.SnapshotData sd
join ReportServerTempDB.dbo.SessionData s
on s.SnapshotDataID = sd.SnapshotDataID
join ReportServerTempDB.dbo.ChunkData c
on c.SnapshotDataID = sd.SnapshotDataID
order by sd.CreatedDate, c.ChunkName

I did a little research into how drillthrough works in SSRS 2005. Here is my understanding based solely on observation of the HTML in Report Manager and watching SQL Profiler events against the ReportServer database. When a report is rendered in Report Manager, it does not pass the entire URL for each drillthrough link to the web browser. All it does is capture the cell coordinates you clicked when you click a drillthrough link. Then it posts that event back to the server. Then the server runs a few SQL queries against ReportServerTempDB to look up info from that ChunkData table. I was rather impressed with how efficient this lookup was, as it did not require reading the entire chunk. From what I can tell, that drillthrough chunk is some sort of binary structure that is internally indexed somehow. When the server goes to look up the full drillthrough URL for the link you just clicked, it grabs the first few segments of that drillthrough chunk (which I’m guessing is loading the “index” for the entire chunk) and then it skips right to the segment of that chunk which has the info it needs. So my conclusion is that in SSRS 2005, they have done a good job optimizing the click of a drillthrough link. However, the up-front processing required to optimize that click is rather expensive.

Indications so far are that SSRS 2008 will improve this scenario. I tried this in the November CTP of SSRS 2008 and the report with the navigation rendered the first page 2x faster than on SSRS 2005 SP2. Interestingly enough, the time difference between a report with navigation and a report without navigation now shows up in SSRS 2008 under the TimeRendering column, not the TimeProcessing column. Also, the chunks are stored differently in SSRS 2008. Hopefully someone from Microsoft will fill us in on how they were able to achieve this performance improvement. (Update: Robert Bruckner from Microsoft responded here.)


Workaround #1

Short of waiting for SSRS 2008, what is the workaround? I wrote an MDX sproc called ASSP.KeysStrToSet. You can refer to all the technical details, get some usage examples, and download some sample reports at that link. But let me explain the workaround at a high level. There are two parts.

First, you need to find a way to shorten the parameter values. In general, this is pretty easy as you can just use the surrogate key (i.e. use “1” as the key instead of “[Product].[Product].&[1]”). For optimal performance, change the parameter datatype to Integer if possible. However, there are a couple of more advanced scenarios such as composite keys or nullable keys which add some complexity. The link to the Analysis Services Stored Procedure project above explains all these advanced scenarios and the solution.

Second, you need to figure out a way to let MDX accept those shortened parameter values. That’s where the ASSP.KeysStrToSet function comes into play. Here’s an example. The following two expressions return the same set:

StrToSet("{ [Product].[Product].&[1],[Product].[Product].&[2] }")
ASSP.KeysStrToSet("[Product].[Product]", "1,2")
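To illustrate the idea only, the following Python sketch expands a short key list into the same set text that StrToSet would need. It is not the ASSP implementation, which runs inside Analysis Services and also handles composite and nullable keys; the function name keys_to_set_text is hypothetical.

```python
def keys_to_set_text(hierarchy: str, keys: str) -> str:
    """Expand a comma-separated key list into MDX set text of key references."""
    members = []
    for key in keys.split(","):
        key = key.strip()
        if not key:
            continue  # skip empty entries such as a trailing comma
        # A closing bracket inside a bracketed MDX name is escaped by doubling it.
        members.append("%s.&[%s]" % (hierarchy, key.replace("]", "]]")))
    return "{ " + ",".join(members) + " }"


expanded = keys_to_set_text("[Product].[Product]", "1,2")
print(expanded)
```

The report only has to carry "1,2" in every drillthrough link; the long unique names exist only on the server when the query runs.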

How much improvement should you expect from this workaround? Our tests on reports with lots of navigation-enabled textboxes have shown anywhere from a 2x to an 8x performance improvement in the time it takes to render to the browser in Report Manager. Of course, that depends upon how many textboxes have navigation enabled and on how large the drillthrough links are for each textbox.

I suspect utilizing this workaround with SSRS 2008 will make the performance improvement even greater. (Unfortunately, I can’t test that theory on the November CTP at the moment because of a bug.)

If you agree with this approach, please vote for this feature suggestion so that KeysStrToSet functionality becomes available in a built-in MDX function.


Workaround #2

Another possible workaround is to add a “Show Drillthrough?” parameter that controls whether or not to enable drillthrough on the textboxes. Then in the “Jump to Report” property, put the following expression:
=IIf(Parameters!ShowDrill.Value,"MyDrillReportName",Nothing)

In SSRS 2005 (it’s not necessary in SSRS 2008 apparently), you also have to wrap each parameter you hook up to the drill report in the same way:
=IIf(Parameters!ShowDrill.Value,Parameters!City.Value,Nothing)

You should probably use workaround #2 in conjunction with workaround #1 to achieve good performance when they flip the “Show Drillthrough?” parameter to True.

Note you may have to make that “Show Drillthrough?” parameter a string to work around a bug.


Conclusions

Enabling drillthrough on every cell in a report can drastically slow report rendering if the drillthrough parameters are very long. However, you can adequately work around this problem by shortening the parameter values and dynamically disabling drillthrough when it is not needed.

Retrieving Cell Properties

The point of this post is to discuss how to retrieve cell properties. Specifically, I want to call attention to the ReturnCellProperties setting on an Analysis Services connection and command. I will also talk about how you can use cell properties in Reporting Services reports.

It must be said that Teo Lachev beat me on publishing this tip. I had already drafted most of this post and had a couple of extra things to say on the topic, so I’m publishing it anyway.

In an MDX query, you’ve got dimension properties and cell properties. Dimension properties are intrinsic member properties such as MEMBER_CAPTION in addition to member properties you define in your dimension (such as the Customer attribute having a member property called Email). Cell properties are properties such as VALUE, FORMATTED_VALUE, and FORMAT_STRING which are simply extra information about a cell.

In most cubes, an effort has been made to properly set the FORMAT_STRING for all measures so that percents are formatted as percents and currency is formatted as currency. (Sometimes, the calc script even customizes the FORMAT_STRING for particular slices of the cube.) Therefore, it’s important to know how to retrieve the FORMATTED_VALUE cell property so that you can fully leverage all the cube provides you.

When executing an MDX query, you can retrieve the results in two-dimensional (DataReader/DataSet) or multi-dimensional (CellSet) format. The CellSet object lets you easily access any cell properties the query returns. On the other hand, it is not obvious how to retrieve any cell property other than VALUE in a DataReader/DataSet.

Putting the ReturnCellProperties=True property on the connection string changes the columns you get in a DataReader/DataSet:

AdomdConnection conn = new AdomdConnection();
conn.ConnectionString =
"Data Source=(local);Initial Catalog=YourDB;ReturnCellProperties=True";

Similarly, setting the ReturnCellProperties property on a command object will produce the same effect:

AdomdCommand cmd = new AdomdCommand();
cmd.Properties.Add("ReturnCellProperties", true);

This is obviously relevant if you’ve got a custom .NET application that executes MDX queries because it allows you to retrieve cell properties. It’s also relevant if you’re using MDX queries in Reporting Services. If you’re using the OLE DB provider to connect to Analysis Services (which is less common in SSRS2005), you can retrieve cell properties by adding ReturnCellProperties=True to the connection string which causes the extra columns to show up. However, a “Microsoft SQL Server Analysis Services” data source in Reporting Services automatically sets the ReturnCellProperties property (and the DbpropMsmdFlattened2 property, which I won’t discuss) to true on the underlying command object. That data source also hides the extra columns from you when you’re previewing your dataset, so you just have to trust that they are there.

select [Measures].[Internet Sales Amount] on 0,
[Product].[Category].Members on 1
from [Adventure Works]

If your MDX query for your report does not specify a CELL PROPERTIES clause (as in the query above), then you get the VALUE and FORMATTED_VALUE properties by default in Reporting Services. Obviously, =Fields!FieldName.Value retrieves the VALUE cell property. You can retrieve the FORMATTED_VALUE cell property with the =Fields!FieldName.FormattedValue expression. More information on all the field properties you can use in SSRS expressions is available here.

select [Measures].[Internet Sales Amount] on 0,
[Product].[Category].Members on 1
from [Adventure Works]
CELL PROPERTIES VALUE, FORMATTED_VALUE, FORMAT_STRING

If your MDX query for your report does specify a CELL PROPERTIES clause (as in the query above), you override the defaults. This is useful for instructing Reporting Services to retrieve fewer properties for performance reasons. (For instance, a best practice is to specify “CELL PROPERTIES VALUE” so you don’t retrieve FORMATTED_VALUE if you’re not going to use the FormattedValue expression in your report.) Overriding the default CELL PROPERTIES is also useful to pull back extra properties such as FORMAT_STRING. Then in a report you can retrieve the FORMAT_STRING for a cell by using the =Fields!FieldName("FORMAT_STRING") expression.

There is a known limitation such that you cannot use the SSRS Aggregate expression and FormattedValue together. This limits the usefulness of the FormattedValue property. A workaround is to format the textbox in Reporting Services instead of using the FormattedValue property. This formatting could be hardcoded, but sometimes measures have different format strings for different slices of the cube. So if the format string is not constant, you can make the SSRS formatting dynamic by putting a =Fields!FieldName("FORMAT_STRING") expression on the Format property of a textbox. For footers, it would need to be =First(Fields!FieldName("FORMAT_STRING")). Note that .NET format strings (which SSRS uses) aren’t quite the same as SSAS format strings, so you may need to write a small function in the Code tab of your report and do replacements such as changing Currency to c0.
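The replacement function mentioned above could look something like the following. In a real report it would be written in VB in the Code tab; this Python version only sketches the logic. The function name is hypothetical, the Currency entry comes from the example above, and the Percent entry is my assumption about a reasonable .NET equivalent.

```python
# Named SSAS formats mapped to .NET format strings.
# Only "Currency" -> "c0" comes from the text; "Percent" -> "p2" is an assumption.
NAMED_FORMATS = {
    "Currency": "c0",
    "Percent": "p2",
}


def to_dotnet_format(ssas_format: str) -> str:
    """Translate a named SSAS format; pass any other format string through unchanged."""
    cleaned = ssas_format.strip()
    return NAMED_FORMATS.get(cleaned, cleaned)


currency = to_dotnet_format("Currency")
custom = to_dotnet_format("#,##0.00")
print(currency, custom)
```

Custom format strings pass through untouched here, so test any that your cube actually uses before relying on them.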

As a side note, Darren Gosbell blogged about an easy way to see the connection string properties for SSAS, but unfortunately, ReturnCellProperties is one of several properties that don’t appear in that list. You have to look at this page to see all the available properties. Mosha says that you can use all the XMLA properties on the connection string.

Reporting Services SP2 and MDX Summary Rows

If you have built any Reporting Services reports which use an Analysis Services datasource, SQL Server 2005 SP2 contains a fairly major change to be aware of. I’m going to describe the way it worked before SP2, then I’ll describe the change.

Prior to SP2

There’s been a good deal of frustration about how not all rows from an MDX query are shown as detail rows when used in an SSRS report. For instance, the following query returns two rows.

select [Measures].[Internet Sales Amount] on 0,
{[Product].[Category].[Bikes], [Product].[Category].[All]} on 1
from [Adventure Works]

But if used in an SSRS report, you only get one detail row because SSRS ignores any rows which contain an All member in the tuple defining that row. To explain that differently, the leftmost columns in the dataset originate from the “on rows” or “on 1” axis of your MDX query. If any of those contain an All member, that row will not be shown as a detail row. Notice how the All member is rendered as a null when previewing the dataset:

[Screenshot: dataset preview with the All member shown as null]

Of course, you can still use the summary rows from the MDX query by putting the following expression in a table/matrix group or in the header/footer:

=Aggregate(Fields!Internet_Sales_Amount.Value)

This behavior is sometimes helpful and sometimes annoying, especially for experienced MDX writers who expected the report to match their MDX query. It turns out that you now have an option with SP2.

With SP2

If you reference any field from that dataset as the parameter for the Aggregate function in an expression, then the old behavior continues. But if you don’t, then all rows from the MDX query become detail rows in the report. This is on a per dataset basis, not a per table/matrix basis.

If your reports were built with correct usage of the Aggregate expression, they will work fine in SP2. The only way SP2 could break your report is if your MDX dataset contains summary rows which you never intended to be used. Obviously, that should never be the case, but it is easy to see how a developer who didn’t understand how it worked before SP2 could build a report that might break with SP2. For instance, let’s say you built a report off the query mentioned above and set up the Layout tab as follows:

[Screenshot: report layout using Sum instead of Aggregate]

When this report is run prior to SP2 it looks like:

[Screenshot: report preview prior to SP2]

When this same report is run with SP2 you get totally bogus numbers:

[Screenshot: report preview with SP2]
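The rule can be modeled in a few lines of Python. This is a sketch of the behavior described in this post, not of the SSRS internals: the row layout, the is_summary flag, and the sales amounts are made up for illustration.

```python
from typing import Dict, List

# Two rows, as returned by the Bikes / All query above (amounts are made up).
rows: List[Dict] = [
    {"Category": "Bikes", "Internet_Sales_Amount": 100.0, "is_summary": False},
    {"Category": None, "Internet_Sales_Amount": 250.0, "is_summary": True},  # All member
]


def detail_rows(rows: List[Dict], sp2: bool, dataset_uses_aggregate: bool) -> List[Dict]:
    """Rows the report treats as detail rows under the rule described above."""
    if sp2 and not dataset_uses_aggregate:
        return list(rows)  # SP2 without Aggregate: every MDX row is a detail row
    return [r for r in rows if not r["is_summary"]]  # summary rows are held back


def sum_footer(rows: List[Dict], sp2: bool, dataset_uses_aggregate: bool) -> float:
    """What a =Sum(...) footer would show over the detail rows."""
    return sum(r["Internet_Sales_Amount"]
               for r in detail_rows(rows, sp2, dataset_uses_aggregate))


pre_sp2 = sum_footer(rows, sp2=False, dataset_uses_aggregate=False)
with_sp2 = sum_footer(rows, sp2=True, dataset_uses_aggregate=False)
with_sp2_and_aggregate = sum_footer(rows, sp2=True, dataset_uses_aggregate=True)
print(pre_sp2, with_sp2, with_sp2_and_aggregate)
```

The Sum footer is only wrong in the middle case, where the All row is counted as a detail row on top of the row it already summarizes.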

It’s worth carefully testing before you apply SP2 to your production servers.