The concept of incremental processing is well documented by T.K. Anand's fabulous whitepaper on processing (which you should probably digest before continuing), but there is a severe lack of full working example XMLA commands. The purpose of this post is to provide full working examples of ProcessAdd against Adventure Works.
1. ProcessAdd Partition (download ProcessAdd Partition.xmla)
This example uses an out-of-line query binding to run a ProcessAdd on the 2004 partition of the Internet Sales measure group. Real-life use for such an XMLA command would require that you specify an out-of-line SQL query which retrieves rows that currently do not exist in the cube. No such rows were available in AdventureWorksDW, so I just specified a query which would add duplicate rows to the partition for illustration purposes. This XMLA command can be created in Management Studio by right-clicking on a partition, choosing Process, changing to Process Incremental, clicking Configure, setting a query, then clicking the Script button:
2. ProcessAdd Dimension (download ProcessAdd Dimension.xmla and ProcessAdd Dimension 2008.xmla)
This example uses an out-of-line data source view to run a ProcessAdd on the Sales Territory dimension. Because dimensions can be built off of multiple tables from the DSV, you cannot use an out-of-line query binding like what was used for example #1. Instead, you have to use an out-of-line DSV. Real-life use for such an XMLA command would entail specifying a named query in the out-of-line DSV which returned only rows which did not exist in the Analysis Services dimension. Rows which already exist will be ignored, so it is wasteful to allow your query to return them. No such rows were available in AdventureWorksDW, so I just specified a query which returned rows that already exist in the dimension for illustration purposes. With the example out-of-line DSV in “ProcessAdd Dimension.xmla” using the query it does, the Sales Territory dimension will not actually get any extra members. However, the example in “ProcessAdd Dimension 2008.xmla” uses a query that will add Texas to the Sales Territory dimension.
Update: Note that the default error configuration changes in AS2008 so that duplicate key errors are not ignored during processing. Because of this design change, if you created your dimension in the development tools for AS2008, you have to override the ErrorConfiguration in the Batch in which you’re doing ProcessAdd on dimensions unless the rows you’re adding contain completely new values for every attribute in the dimension. In the “ProcessAdd Dimension 2008.xmla” example, we are adding a new region “Texas” to an existing country “United States”. Because “United States” already exists, we had to override the ErrorConfiguration and set KeyDuplicate to IgnoreError.
There is no UI for creating this XMLA command in Management Studio. So I recommend the following steps if you wish to build your own ProcessAdd command for a dimension.
- Right-click on the dimension in Management Studio, choose Process, change to ProcessUpdate, and click the Script button. Change ProcessUpdate to ProcessAdd in the XMLA script.
- In Business Intelligence Development Studio (BIDS), right-click in Solution Explorer on the Data Source object that currently provides the connection for processing this dimension, and choose View Code. Copy that XML. Paste it into the <Process> tag. Remove all the properties of the DataSource except for ID, Name, and ConnectionString. Note that even if you're not changing the connection string, it is still necessary to include an out-of-line data source with the out-of-line DSV. This is a known issue that will hopefully be fixed sometime after SP2.
- In BIDS, open the data source view which your dimension is currently processed off of. Right-click on the table(s) your dimension currently uses, and choose "Replace Table... With New Named Query..." Then edit the query specifying the appropriate where clause. Do not save the DSV changes. Right-click on the .dsv file in Solution Explorer and choose View Code. Copy the entire XML and paste it into the <Process> tag. Remove the CreatedTimestamp and LastSchemaUpdate properties (to avoid a warning message). There are other large sections (such as the Annotations) you can remove if you like, but they will not hurt anything.
Note that if you have more than one <Process> command, the out-of-line DSV must contain the tables required for all dimensions or partitions being processed which use that DSV normally. Here is such an example: Process Multiple Objects With Out-of-Line DSV.xmla
Normal processing involves Analysis Services querying a SQL database for the necessary data. Analysis Services also supports push mode processing where the data itself is provided in the XMLA command. Integration Services uses push mode processing whenever you have a Partition or Dimension Destination:
3. Push Mode ProcessAdd for Partition (download Push Mode ProcessAdd for Partition.xmla)
This example shows how to process a partition using push mode by specifying an out-of-line rowset. Generally push mode processing is done using Integration Services. If you need to do it without the aid of SSIS, I still recommend you build a simple package in SSIS, then run Profiler while you run the SSIS package. That XMLA query you will see in Profiler will not have the rowset schema or the actual data, but it will get you started. Note that in SSIS when you're using a Partition or Dimension destination, if a column is used twice (such as in the key and the name), you need to duplicate that column and put it in the pipeline twice. This is only a restriction of the UI in SSIS, not a restriction of push mode processing.
4. Push Mode ProcessAdd for Dimension (download Push Mode ProcessAdd for Dimension.xmla)
This example shows how to process a dimension using push mode by specifying an out-of-line rowset. See above for more information on push mode processing.
- If a dimension or partition is unprocessed, you can't do ProcessAdd on it. So make sure your processing procedures check this and do a ProcessFull the first time.
- The MemoryUsage class of ASSP uses push mode processing to load a cube incrementally without reading from a SQL database.
- This post goes hand-in-hand with another post which examines the performance of ProcessAdd.
- Push mode processing is only allowed on partitions or dimensions which use one table from the DSV. So that eliminates snowflaked dimensions or measure groups with reference dimensions.