Wednesday 16 December 2015

Exception aggregation modeling with Graphical Calc view

The business scenario is quite common in the BI reporting. Consider a KPI, required to be calculated at a low level of granularity, but the reporting is required at the higher level of granularity. With the default aggregation behavior of the calc view, the constituents of the Calculated KPI might also be aggregated at the reporting granularity level, resulting in incorrect value for the Calculated KPI.

To elaborate more on this, consider the following data model for the Sales Transaction. The transaction data is captured for Product Sold on a given date in a store. The table structure can be defined as follows:

( STORE_ID    nvarchar(10),  --- Store Id
  PROD_ID    nvarchar(10),  --- Product Id
  PROD_CAT    nvarchar(10),  --- Product Category
  PROD_PRICE  decimal(10,2), --- Product Price
  SOLD_QTY    integer,      --- Product Quantity Sold in a Sales transaction
  TRANS_DATE  nvarchar(8),  --- Sales Transaction Date
  STORE_REG  nvarchar(10)  --- Region in which the Store is located

The business rule could be:
The product price may vary each day, hence the Net Sales need to be calculated for the Product Price on the transaction date and Quantity Sold on the day.
*** In ideal scenario, the Net Sales should be calculated at the time of data loading and should be persisted in the data model. This can have positive impact on the performance and is also a recommended modeling option. The following modeling option explained below should be implemented, when the calculation persistence is not feasible due to more complex requirements or different sources / data stores for Product price and Sales Transaction.

The sample Sales transaction data is as follows:

SAP HANA Certifications and Material

In the above example, consider the Net Sales in the Sales transaction data with calculation as Product Price * Quantity Sold. The Net sales need to be calculated at the transaction level granularity.

The standard Calc view model, based on a standard Analytic view using the transaction table is as follows:

SAP HANA Exception aggregation modeling

Please note that the Multidimensional Reporting property is set to True which adds default aggregation node to the Calc view. The default behavior of all the base KFs is set to SUM. The Calculation of the Net Sales is defined as below hence executed as “Calculate After aggregation” resulting in the following output.

Exception aggregation modeling with Graphical Calc view

The output of the query is as follows:

Exception aggregation modeling with Graphical Calc view

The Product Price is shown for the explanation purpose. The Product price at Product category level does not really make any business sense.

The expected correct output for the above query is as follows:

Exception aggregation modeling with Graphical Calc view

To achieve the above output, we need to make the following changes to the model.
  1. Create the model with the Multidimensional property set to False, hence adding the default Projection node to the model.
  2. Add an aggregation node on the Analytic view to define the aggregation behavior.
  3. Set the Keep Flag property for all the attributes which defines the Exception aggregation criteria. In this case, the aggregation is expected to happen at Store, Product and Transaction date irrespective of the reporting granularity. The Keep flag will always add these columns in the group by clause even if they are not selected in the query. Please note that the added columns in aggregation will impact the performance as each column join is added to the resultset.
  4. Define the Calculated KF for Net Sales in the Aggregation node. The calculation remains same as shown in the earlier screenshot.
SAP HANA Certifications and Material

Exception aggregation modeling with Graphical Calc view

The query executed on this model will result in the required output.

Exception aggregation modeling with Graphical Calc view

Well, now that works as expected..

Performance impact:

Please note that this implementation has performance impacts. If we look into the visual execution plans for both the queries, then we can see that the second query includes all the columns in the SELECT statement along with all the columns with Keep flag property set to true.

Execution for the first query on ZGCV_PROD_SALES looks something like:

Exception aggregation modeling with Graphical Calc view

The execution plan shows only those columns required in the SELECT statement.

Execution for the second query on ZGVC_PROD_SALES_EXCP_AGGR (with Exception aggregation) looks like:

SAP HANA Certifications and Material

As we can see from the execution plan above, the performance gets impacted due to additional column joins resulting in higher data volume, 20 rows, being passed from one engine to another, as compared to 3 rows in the earlier execution. The generated resultset depends upon the granularity specified with the Keep flag setting.

SAP HANA Certifications and Material

As you can see from the screenshot above, all the columns in the SELECT statement and with Keep flag set to true are used in column joins to generate the result set.

As mentioned earlier, the better option is to have such values persisted in the database. In case, due to complex business or technical requirements such persistence is not feasible, then the above option should work fine. But at the cost of some performance impact.


No comments:

Post a Comment