Thursday 26 November 2020

Tuning SAP IQ NLS in SAP BW on HANA and BW/4HANA

Introduction

In many SAP BW on HANA and BW/4HANA implementations, customers use an NLS (near-line storage) solution to offload and archive data from the main system.  The offering from SAP, as well as those from a few partners, uses SAP IQ as the database for the NLS solution.  Invariably, situations arise when performance against data in the NLS system just seems slower than it should be.

The focus of this blog is to point you toward the different areas that can be the root cause of any performance issues.  As always, we must be prepared to adjust our expectations.  On countless occasions, I have had customers demand that the NLS system perform at the same level as the HANA system.  That is just not realistic and should never be expected.  SAP HANA operates on data that is in memory and was designed to be an in-memory platform.  SAP IQ is a general-purpose database that was designed for disk-based storage, albeit highly optimized.

Before we begin looking into SAP IQ and the NLS infrastructure, it is important to reset HANA and the queries to their default behavior.  By this, I simply mean that if we are using any hints to force certain behavior, we must undo them.  We can't properly gauge performance and improvement if we are overriding the decisions that the system will automatically make.

The most commonly used hints for SAP IQ NLS queries are added to the end of the HANA SELECT statement:

SELECT * FROM T1 WITH HINT( REMOTE_JOIN_RELOCATION, REMOTE_JOIN );


There are variations of these options that can be set in other places.  SAP Note 2142945 has more details on when and how to set these options.

SAP IQ Hardware and OS Monitoring

For many years, I have been a huge proponent of looking at the hardware rather than SAP IQ to gauge performance issues.  This has always been my first step to diagnosing issues with performance.  When looking at the hardware there are two main areas to look at: CPU usage and IO patterns.

First, choose an appropriate tool to monitor the host, such as htop, top, dstat, iostat, and/or vmstat.  They are all great tools, and each offers a slightly different view of what's happening on the host.  While these are mainly Linux commands, if your SAP IQ NLS host runs a different OS, there are always variants to use.

When an NLS query is running, focus first on the CPU usage on the host.  If the system is using 100% of the CPUs (cores), then the issue is likely that the hardware is not large enough for the workload.  Most situations are not this simple, though, so we would not want to stop at this point and assume that the hardware is undersized.

While monitoring the CPU usage, also pay attention to the IO.  Typically, IQ can push 50-100 MB/sec of IO per core on the host.

For details on hardware sizing, see this pdf.

For most BW queries, if just a single query is running, it is not normal to see a significant amount of IO.  High IO likely means that SAP IQ is not properly tuned or optimized for the workload.  At this point, we must look at some of the application and database internals:

◉ HANA statistics for the NLS objects

◉ Indexes on the SAP IQ NLS objects

◉ SQL formatting

SAP HANA Statistics

At this point, it is assumed that SAP IQ NLS has been configured and is in use.  First, though, let's verify that the tables are actually remote tables that point to the SAP IQ NLS system.  That can be checked with this query:

SELECT * FROM VIRTUAL_TABLES
WHERE SCHEMA_NAME = 'SCHEMA_NAME'
AND TABLE_NAME = 'TABLE_NAME';

Replace SCHEMA_NAME with the proper HANA schema owner of the NLS virtual table and TABLE_NAME with the virtual table name.
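
For example, to see where each virtual table points, the remote source and remote object columns of the view can be selected as well.  This is a sketch only; BWSCHEMA is a hypothetical schema name, and the exact column set of VIRTUAL_TABLES may vary by HANA revision:

SELECT TABLE_NAME, REMOTE_SOURCE_NAME, REMOTE_OBJECT_NAME
FROM VIRTUAL_TABLES
WHERE SCHEMA_NAME = 'BWSCHEMA';

The REMOTE_SOURCE_NAME returned should be the SDA remote source configured for the SAP IQ NLS server.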

Now, to verify whether or not statistics exist for an NLS table, this sample query can be used:

SELECT * FROM DATA_STATISTICS
WHERE DATA_SOURCE_SCHEMA_NAME = 'SCHEMA_NAME'
AND DATA_SOURCE_OBJECT_NAME = 'TABLE_NAME';

Replace SCHEMA_NAME with the proper NLS schema owner of the NLS virtual table and TABLE_NAME with the NLS table name.

If a row exists, then some statistics exist on the table.  In the following steps, we want to make sure that there are HISTOGRAM statistics on the relevant columns (look at the DATA_SOURCE_COLUMN_NAMES column).  If no rows are returned, no statistics exist.
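
To narrow that check to histogram statistics only, the same view can be filtered on its type column.  This is a hedged sketch; BWSCHEMA and MY_TABLE are hypothetical names, and the DATA_STATISTICS_TYPE column is assumed to be available in your HANA revision:

SELECT DATA_SOURCE_COLUMN_NAMES, DATA_STATISTICS_TYPE
FROM DATA_STATISTICS
WHERE DATA_SOURCE_SCHEMA_NAME = 'BWSCHEMA'
AND DATA_SOURCE_OBJECT_NAME = 'MY_TABLE'
AND DATA_STATISTICS_TYPE = 'HISTOGRAM';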

At this point we know that virtual tables exist and point to SAP IQ.  We also know whether or not any statistics exist on those objects.

SAP IQ Indexes

As part of gathering background information, we want to check to see if there are any indexes on the SAP IQ objects.  To check SAP IQ for the existence of indexes, this query can be used:

select
    table_name
    ,table_owner
    ,column_name
    ,index_type
    ,index_name
from sp_iqindex( table_name='TABLE_NAME'
    ,table_owner='SCHEMA_OWNER' )
where index_type <> 'FP';

Replace SCHEMA_OWNER with the proper NLS schema owner of the NLS virtual table and TABLE_NAME with the NLS table name.

If no rows are returned, there are no indexes on the table besides the default index (FP).  If indexes are present, note the column name for the next steps.

We have the final bit of background filled in now.  We covered making sure that we were using SDA for the NLS virtual objects and determining whether or not those objects have statistics.  We have now determined whether or not the SAP IQ NLS objects have indexes.

Recommended Tuning

Indexes

The blog series on SAP IQ indexing refers to a use case that is much broader than the SAP BW use case.  For SAP IQ NLS indexing, I would strongly recommend the following:

◉ Only on tables used for NLS queries.  This should be all tables, but there are scenarios where NLS shares resources with non-SAP BW systems.

◉ Only on columns in the WHERE, GROUP BY, and ORDER BY clauses.  This is important.  There is no need to index all columns; focus only on those used to search, filter, or process data.

Once you have identified which columns are used, we want to put an HG index on each of them.  While there are other advanced indexes that could be used, the HG index will provide the best all-around improvement.

create HG index HG_INDEX_NAME
    on SCHEMA_NAME.TABLE_NAME ( COLUMN_NAME );

Replace SCHEMA_NAME with the owner of the NLS object, TABLE_NAME with the proper table name, and COLUMN_NAME with the column used in the WHERE, GROUP BY, or ORDER BY clauses.  HG_INDEX_NAME is a unique name for the index; this name must be unique for the entire SAP IQ server.  Generally, I use an index name of the format SCHEMA_NAME_TABLE_NAME_COLUMN_NAME_HG:

create HG index BWSCHEMA_MY_TABLE_COLNAME_HG
    on BWSCHEMA.MY_TABLE ( COLNAME );

If you wish to build all indexes in a single statement to SAP IQ, there is a way to have them created in parallel:

begin parallel iq
    create HG index table_a_date_field_HG on SCHEMA_NAME.TABLE_NAMEA ( COLUMN_NAME );
    create HG index table_b_date_field_HG on SCHEMA_NAME.TABLE_NAMEB ( COLUMN_NAME );
    create DTTM index table_a_date_field_DTTM on SCHEMA_NAME.TABLE_NAMEA ( COLUMN_NAME );
    create DTTM index table_b_date_field_DTTM on SCHEMA_NAME.TABLE_NAMEB ( COLUMN_NAME );
end parallel iq;

One side note related to the use of indexes in SAP IQ may show up in query plans or SQL as you investigate a performance issue.  Generally, adding an index to SAP IQ gives the optimizer more paths through the data and more efficient ways to access it.

One interesting development in SQL execution has come to light over the past year or so.  I have seen more and more HANA queries that use functions in a WHERE or GROUP BY clause to convert, prune, trim, and massage data.  SAP HANA, as expected, will pass these functions along to SAP IQ NLS to be processed next to the NLS data.  This is a normal occurrence and happens as part of the SDA functional mapping and compensation stages of query optimization.

The abnormal part is the types of functions that users are putting in the SQL.  Functions like CAST, CONVERT, and TO_DATE are properly sent to IQ, and IQ will continue to use indexes on these columns.  For functions that manipulate the data, like RTRIM, LTRIM, SUBSTRING, and REPLACE, we must understand how SAP IQ can optimize them.  Because they change the data, they invalidate the use of an index; an index can only operate on unaltered data.

This HANA query (very cut down) will result in SAP IQ using indexes:

select * from MY_NLS_TABLE
where TO_DATE( DATE_FIELD ) = '2020-10-01';

This HANA query, though, will force IQ to scan the column data and run much slower:

select * from MY_NLS_TABLE
where RTRIM( IDCOL ) = '12345';

Be on the lookout for any code that uses functions to manipulate or massage data, as it will likely force a less-than-optimal path through the data that cannot be indexed.
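
Before rewriting such a query, it can be worth verifying whether the function is needed at all.  This quick check (using the placeholder names from the example above) counts the rows where trimming would actually change the value; if it returns zero, the RTRIM can be dropped from the predicate so that the index can be used:

select count(*) from MY_NLS_TABLE
where IDCOL <> RTRIM( IDCOL );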

Create and Update statistics

Once the indexes are created, it is important to create or update the HANA statistics for these virtual objects.  Statistics should be created or updated after the indexes simply because the indexes can be used to access the column data more quickly.

There is no need to update statistics for all objects.  I would recommend the following best practice:

◉ Only on tables used for NLS queries.  This should be all virtual tables, but there are scenarios where SAP BW shares resources with non-SAP BW systems.

◉ Only on columns in the WHERE, GROUP BY, and ORDER BY clauses.  This is important.  There is no need to create statistics on all columns; focus only on those used to search, filter, or process data.

To create statistics on all columns of a table, this sample syntax can be used.

CREATE STATISTICS ON SCHEMA_NAME.TABLE_NAME TYPE HISTOGRAM;

If statistics already exist, you can refresh them for all columns of the table that have statistics:

REFRESH STATISTICS ON SCHEMA_NAME.TABLE_NAME CASCADE TYPE ALL;

Statistics can also be created on a single column:

CREATE STATISTICS ON SCHEMA_NAME.TABLE_NAME (COLUMN_NAME) TYPE HISTOGRAM;

And a sample command to refresh statistics for a single column:

REFRESH STATISTICS ON SCHEMA_NAME.TABLE_NAME (COLUMN_NAME) CASCADE TYPE ALL;

At this point, we have covered two of the major components to proper tuning of SAP IQ NLS: statistics and indexes.  We can now begin testing and comparison of pre- and post-tuning execution times.

Rinse and Repeat

Once you have built the proper indexes and updated statistics in HANA, it is time to test performance.  Begin iterating through the test cases to determine whether performance improved.  Don't forget to continue monitoring the hardware so that you can determine whether IQ is CPU or IO bound and whether the hardware may be undersized.

If you wish to collect more information on what is happening inside SAP IQ, this can be done by enabling query plans and having them dumped to a filesystem on the SAP IQ host.

The following options are recommended to capture HTML based query plans in SAP IQ:

--
-- Query Plan options
--
set option public.query_plan = 'off';
set option public.query_detail = 'on';   -- turn off when not needed
set option public.query_timing = 'on';   -- turn off when not needed
set option public.dml_options10 = 'on';  -- turn off when not needed
set option public.query_plan_as_html = 'on';   -- turn off when not needed
set option public.query_plan_after_run = 'on';
set option public.query_plan_after_run_directory = '/tmp';

These options will allow SAP IQ to generate HTML based query plans.  By default, these files end up in the same directory as the SAP IQ catalog file (.db extension).  The option query_plan_after_run_directory overrides this behavior and can be used to redirect the files to a specific directory.

Query plans can be quite chatty.  To limit the number of generated plans, it is recommended that you use the option query_plan_min_time to specify the minimum amount of time, in milliseconds, that a query must run before a plan is generated.
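
For example, to generate plans only for queries that run longer than five seconds, the option can be set like this (the value is in milliseconds; quoting conventions may vary by SAP IQ version):

set option public.query_plan_min_time = 5000;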

SAP IQ query plans will provide a wealth of information about what is happening for a particular query and prove invaluable when debugging further performance issues.
