Friday 26 March 2021

Run Modes of SAP BW Streaming Process Chains

This blog post focuses on the run modes of streaming process chains and provides URLs to additional sources at the end. I use the general term “SAP BW”, which refers to both SAP BW 7.5 and SAP BW/4HANA (any release).

In general, acquiring data as fast as possible should not be regarded as the first choice. Instead, SAP recommends leveraging virtualization and data federation wherever possible. In combination with SAP HANA, BW provides multiple suitable modeling options, mainly based on SAP HANA integration services (SDA, SDI), SAP HANA modeling options (Calculation Views), and Open ODS Views or CompositeProviders as virtual InfoProviders on the BW application side.

Loading data in “real-time” into SAP BW should be evaluated as a serious alternative only if these approaches do not meet your business requirements. Introduced in 2006 with NetWeaver 7.0, “Real-Time Data Acquisition (RDA)” was the related concept for InfoCubes and DSOs (classic). With the introduction of ADSOs, however, SAP decided to retire RDA and to enhance the existing framework of process chains instead.

Streaming is an operation mode of BW process chains which can be set in the properties of each process chain. When a process chain is running, each step takes a certain amount of time. The interesting question is what happens when the chain is started again during this period (maybe even several times) and the new execution reaches a process that is still active from a previous execution:

◉ A classic process chain is scheduled in the background. The second execution waits up to 10 minutes for the previous execution to complete. If the previous run is not completed within this period, the second execution is canceled, the process chain is interrupted, and it fails with an error.

◉ A process chain in streaming mode is executed using a new Queued Task Manager Framework. This enables a flexible number of “worker” jobs that get the next process from a queue. If a second, third, or further execution reaches the currently active process, these requests to start the process once more are only written to a queue. No additional resources are used, and the system does not wait. As soon as the current process is finished, the worker job receives all of the start requests for that process and restarts the process just once: it takes the most recent chain execution from the queue and executes it. This means that the older queued executions (for example, the second and third) are dropped, and only the most recent chain execution is continued.
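
The queueing behavior described above can be illustrated with a small toy model. This is only a sketch in Python, not SAP code: the class `StreamingProcess` and its methods `request_start` and `finish` are hypothetical names chosen for illustration, and the model assumes a single process with a single worker.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamingProcess:
    """Toy model of one process in a streaming chain (hypothetical, not an SAP API)."""
    active_run: Optional[int] = None                    # chain execution currently running
    pending: List[int] = field(default_factory=list)    # queued start requests
    executed: List[int] = field(default_factory=list)   # executions that actually ran

    def request_start(self, run_id: int) -> None:
        """A chain execution reaches this process and asks to start it."""
        if self.active_run is None:
            self.active_run = run_id
        else:
            # Process is busy: the request is only queued, nothing waits or fails.
            self.pending.append(run_id)

    def finish(self) -> None:
        """The active run completes; continue with the most recent queued request only."""
        if self.active_run is None:
            return
        self.executed.append(self.active_run)
        self.active_run = None
        if self.pending:
            latest = self.pending[-1]   # most recent chain execution wins
            self.pending.clear()        # older queued executions are dropped
            self.active_run = latest


process = StreamingProcess()
for run_id in (1, 2, 3, 4):   # four chain executions arrive while run 1 is active
    process.request_start(run_id)
process.finish()              # run 1 ends, run 4 is picked up, runs 2 and 3 are dropped
process.finish()              # run 4 ends
print(process.executed)       # [1, 4]
```

In the classic mode, runs 2 to 4 would each wait and possibly fail; in this model they cost nothing beyond a queue entry.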

In a nutshell, the main added value of the streaming mode is that it facilitates frequent data transfer from a source system to SAP BW without scheduling errors and without an unreasonable waste of resources. As a result, data can be acquired in SAP BW in almost real-time. It is not 100% real-time, because the realistic latency is measured in minutes in the best case. So do not expect latencies of milliseconds to seconds, which are common in SAP SLT replication scenarios, for example.

Three run modes are distinguished: one Push and two Pull run modes. The main difference between them is whether the consuming system or the source system triggers the data transfer.

A) The PUSH Run Mode

Main Characteristics:

SAP BW receives a trigger from a so-called ODP Daemon in the source system if a new record is generated there. This event starts the immediate execution of the process chain which transfers the new data to the SAP BW InfoProvider.

This is the preferred mode for ODP sources, because the process chain runs only when new data is available in the source. The process chain is scheduled once in streaming mode; defining periodic repetitions is not required.

The daemon has to be scheduled as a regular job ODQ_DAEMON_CLIENT_xxx in the SAP source system (transaction ODQMON, GoTo/Real-Time Daemon…). With the default scheduling, it checks for updates every 15 seconds and sends an event to BW if there are any.

Preconditions:

1. This mode is only available for ODP DataSources of types SAP Extractors, ABAP CDS Views, BW InfoProviders and SLT.

2. Another precondition is that DataSources are Delta-capable:

◉ ODP_SAP or ODP_CDS: The delta-capability is dependent on the implementation.

◉ ODP_BW: The delta-capability depends on the source InfoProvider. Supported sources are DSOs (classic), InfoCubes and ADSOs.

◉ ODP_SLT: All DataSources are delta-capable in general.

3. Finally, the DataSources need to support Streaming (“Streaming can be triggered by this DataSource“):

◉ ODP_CDS, ODP_BW, ODP_SLT: All delta-capable DataSources support this property by default.

◉ ODP_SAP: This property depends on the individual implementation of the extractor. For example, LO Cockpit DataSources (2LIS_*) are generally not streaming-enabled due to their complexity (e.g. due to the queued delta), but other SAP-delivered DataSources might support this feature. However, most SAP-delivered DataSources are not streaming-enabled in the delivered version. Some corrections are provided (see SAP notes 1692484, 1780297, 1782173), but in most cases the so-called real-time flag has to be set manually in the SAP source system (e.g. by a customer report updating ROOSOURCE-REALTIME to value ‘X’; see also SAP note 905089, which is currently not released and therefore SAP-internal only). Keep in mind that this only makes sense for DataSources which are able to provide new data in real time, meaning that a delta is available at every point in time.

Customer-developed DataSources do not have this property initially either. It must be enabled explicitly in the DataSource definition (transaction RSO2). The corresponding property is called Real-Time enabled and is available in the customizing settings for the Delta mode.

B) The Intelligent PULL Run Mode

Main Characteristics:

SAP BW checks very frequently (every 60 seconds) whether a new record has been generated in the source. If this check is successful, the process chain is started immediately to transfer the new data to the target InfoProvider. The process chain is scheduled once in streaming mode; defining periodic repetitions is not required.

This is the preferred mode for HANA sources, because the process chain runs only when new data is available in the source system.

Preconditions:

1. This mode is available for HANA DataSources only.

2. The DataSources need to be Delta-capable.

3. If the source system type is HANA_LOCAL or HANA_MDC (BW/4HANA) / HANA with connection type “Local SAP HANA Database Schema” (BW 7.5), no further precondition applies.

4. If the source system type is HANA_SDA (BW/4HANA) / HANA with connection type “SAP HANA Smart Data Access” (BW 7.5), it is important which type of remote connection is in use:

◉ SAP HANA Smart Data Access (SDA) does not support this run mode in general.

◉ SAP HANA Smart Data Integration (SDI): The intelligent PULL mode is available only for adapters which support “Real-time Change Data Capture” as documented in the PAM functional overview of SAP HANA SDI 2.0 (pages 20-22). Currently those are: ASEAdapter, ASEECCAdapter, DB2ECCAdapter, DB2ECCLogReaderAdapter, MssqlECCAdapter, MssqlLogReaderAdapter, OracleECCAdapter, OracleLogReaderAdapter, PostgreSQLLogReaderAdapter, HanaAdapter, TeradataAdapter, AbapAdapter, CloudDataIntegrationAdapter and TwitterAdapter.

In a nutshell, the SDI adapter replicates each change to a generated BW table (/BIC/CMT*) in real-time. You can identify this table in the BW DataSource definition via the button called SAP HANA objects. This table is the location that the streaming process chain regularly checks for new data.

Note: For BW DataSources related to source system HANA_* the Streaming property is always off (“Streaming cannot be triggered by this DataSource“). Do not let this confuse you; I was confused myself for quite a long time. It seems that this property only refers to the PUSH mode and thus to ODP sources.

C) The Scheduled PULL Run Mode

Main Characteristics:

SAP BW executes a process chain which leverages the streaming framework, regardless of whether there is new data in the source or not. The streaming process chain is scheduled as a periodic job and can be executed very frequently if required (up to once per minute). If updates are available, they are loaded immediately to SAP BW. If there is no new data, an empty request is created in the target InfoProvider.
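
The practical difference from the Intelligent PULL mode (B) is which checks lead to a request in the target. The following Python sketch illustrates this under simplifying assumptions: each list entry stands for one check or scheduled run and holds the number of new source records at that moment. The function names `intelligent_pull` and `scheduled_pull` are hypothetical and do not correspond to any SAP API.

```python
from typing import List


def intelligent_pull(new_records_per_check: List[int]) -> List[int]:
    """Request sizes created when the chain starts only if the check finds new data."""
    return [count for count in new_records_per_check if count > 0]


def scheduled_pull(new_records_per_run: List[int]) -> List[int]:
    """Request sizes created when the chain runs on every schedule tick.

    Ticks without new data still produce a request, but an empty one (size 0).
    """
    return list(new_records_per_run)


checks = [0, 5, 0, 0, 2]           # new source records seen at five consecutive checks
print(intelligent_pull(checks))    # [5, 2]
print(scheduled_pull(checks))      # [0, 5, 0, 0, 2]
```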

Preconditions:

1. This mode is available for ODP DataSources which are not streaming-enabled.
2. This mode is also available for HANA DataSources which rely on SAP HANA SDA or are based on SAP HANA SDI connections which do not support “Real-time Change Data Capture” (all adapters which are not listed above).
3. Another precondition is that the DataSource is Delta-capable.

Note: The fact that the preconditions for the PUSH mode (A) or the Intelligent PULL mode (B) are not met does not mean that you cannot run process chains in streaming mode. You can still leverage the queued task manager framework to run them with high frequency for Delta-capable DataSources. In this case, the process chain needs to be scheduled as a periodic job that pulls data from the source as often as once per minute, instead of being triggered only when new source data is available.
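
The preconditions of the three run modes can be summarized as a small decision function. This is a hedged sketch in Python, not an SAP API: the names `RunMode` and `select_run_mode` and the parameters are hypothetical, and the function assumes that a local HANA source (HANA_LOCAL or HANA_MDC) needs nothing beyond delta capability for the Intelligent PULL mode.

```python
from enum import Enum
from typing import Optional


class RunMode(Enum):
    PUSH = "A) PUSH"
    INTELLIGENT_PULL = "B) Intelligent PULL"
    SCHEDULED_PULL = "C) Scheduled PULL"


# ODP contexts for which the PUSH mode is available according to the text above.
ODP_CONTEXTS = {"ODP_SAP", "ODP_CDS", "ODP_BW", "ODP_SLT"}


def select_run_mode(
    source_type: str,
    delta_capable: bool,
    streaming_enabled: bool = False,   # "Streaming can be triggered by this DataSource"
    sdi_realtime_cdc: bool = False,    # SDI adapter supports Real-time Change Data Capture
) -> Optional[RunMode]:
    """Return the most suitable streaming run mode, or None if none applies."""
    if not delta_capable:
        return None  # every run mode requires a delta-capable DataSource
    if source_type in ODP_CONTEXTS:
        return RunMode.PUSH if streaming_enabled else RunMode.SCHEDULED_PULL
    if source_type in {"HANA_LOCAL", "HANA_MDC"}:
        return RunMode.INTELLIGENT_PULL  # assumption: no further precondition
    if source_type == "HANA_SDA":
        # Plain SDA, or SDI without real-time CDC, falls back to the scheduled pull.
        return RunMode.INTELLIGENT_PULL if sdi_realtime_cdc else RunMode.SCHEDULED_PULL
    return None  # source types not covered in this post


print(select_run_mode("ODP_CDS", delta_capable=True, streaming_enabled=True).value)
print(select_run_mode("HANA_SDA", delta_capable=True).value)
```

The first call yields the PUSH mode, the second the Scheduled PULL mode, because no SDI adapter with real-time change data capture is assumed.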
