Friday 31 December 2021

Estimating SAP HANA Replication Network Requirements in an Optimized Way

The network connection between the primary and the secondary system impacts the overall performance of the SAP HANA systems if it is not configured correctly. For SAP HANA system replication it is critical that the network throughput (bandwidth) and network latency are sufficient to fulfill the replication requirements.

In this article I explain how to measure the optimal network bandwidth requirement for a HANA replication setup, along with other important considerations for network throughput and latency.

Let’s take an example of 3-tier multi-target replication setup.

From DC1 -> DC2 {SYNC | logreplay | delta_datashipping}

From DC1 -> DC3 {ASYNC | logreplay | delta_datashipping | compression enabled for data and logs}

[Figure: 3-tier replication setup]

For network latency, the SAP HANA replication mode plays an important role.

◉ If the distance between DC1 and DC2 is less than 100 km, use one of the synchronous replication modes: SYNC or SYNCMEM. In the example above we use SYNC for the SAP HANA HA scenario (< 100 km).
◉ If the data centers are more than 100 km apart (DC1 -> DC3), the asynchronous replication mode ASYNC is recommended. In the example above we use ASYNC for the DR scenario (> 100 km). For this scenario, make sure a reasonable TCP window size is configured.

Here we enable enable_log_compression and enable_data_compression for the DC3 site. Compression reduces the required network bandwidth. Please refer to the last section of this article for more details.

Basic Formula to calculate Network Throughput:

Bandwidth required for the initial HANA data shipping in an acceptable time + [bandwidth required to ship the generated HANA logs in an acceptable time (logreplay) OR bandwidth required to ship the delta HANA data snapshots in an acceptable time (delta_datashipping)].
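The formula above can be sketched in Python. This is a minimal sketch; the function names are my own, and decimal units (1 GB = 8000 Mbit) are assumed:

```python
def required_mbit_per_s(volume_gb: float, hours: float) -> float:
    """Sustained throughput (Mbit/s) needed to ship volume_gb within hours.
    Assumes decimal units: 1 GB = 8000 Mbit."""
    return volume_gb * 8000.0 / (hours * 3600.0)

def total_required_mbit(persistence_gb: float, daily_ship_gb: float,
                        initial_window_h: float = 24.0) -> float:
    """Initial full data shipping within initial_window_h, plus the ongoing
    daily shipping volume (log volume for logreplay, delta snapshot volume
    for delta_datashipping), per the formula above."""
    return (required_mbit_per_s(persistence_gb, initial_window_h)
            + required_mbit_per_s(daily_ship_gb, 24.0))
```

The individual terms of this estimate are worked through in the sections below.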

The network throughput requirements are influenced by the selected operation mode (logreplay | delta_datashipping), as it determines the size of the data shipped over the network.

To estimate the required throughput, proceed as follows:

◉ Calculate the size of the data and logs that are generated during your daily workload, including peak-load periods, e.g. month-end loads.

◉ Execute HANA_Replication_SystemReplication_Bandwidth.txt from the SQL Statements.zip file attached to SAP Note 1969700 and aggregate by DAY (a sample output of the script is shown below).

[Sample output of HANA_Replication_SystemReplication_Bandwidth.txt]

◉ Executing the script statement returns the result shown above.
    ◉ This output is the starting point for calculating the HANA network bandwidth.
◉ The following column descriptions provide more detail:

– SNAPSHOT_TIME: Time of snapshot
– HOST: Host name
– PERSISTENCE_GB: (Current) persistence data size (GB)
– DATA_SIZE_GB: Total amount of data written to disk (GB)
– LOG_SIZE_GB: Total amount of logs generated (GB)
– TOTAL_SIZE_GB: Total amount of data and logs generated (GB)
– LOG_PCT: Log compared to total (%)
– AVG_BANDWIDTH_MBIT: Average required network bandwidth to replication side (Mbit), only available for certain TIME_AGGREGATE_BY values
– SIMPLE_BANDWITH_MBIT: Simple network bandwidth calculation (Mbit) based on the formula that it should be possible to ship the persistence once per day

Bandwidth required to ship HANA logs (operation mode: logreplay)

◉ The LOG_SIZE_GB column in the output above indicates the total amount of logs generated (GB).
◉ The network bandwidth must be sufficient to ship one day's worth of logs from the primary system to the secondary systems within one day (from DC1 to DC2 and from DC1 to DC3), focusing on peak values such as 110.41 GB in the example above.
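The peak daily log volume translates into a sustained link rate as follows. A minimal sketch, assuming decimal units (1 GB = 8000 Mbit) and an even distribution over the day; real log generation is bursty, so peaks will be higher:

```python
def log_ship_mbit_per_s(daily_log_gb: float) -> float:
    """GB of log per day -> average Mbit/s (1 GB = 8000 Mbit, 86400 s/day)."""
    return daily_log_gb * 8000.0 / 86400.0

peak = log_ship_mbit_per_s(110.41)  # peak day from the sample output
print(f"{peak:.1f} Mbit/s")  # prints: 10.2 Mbit/s
```

So shipping the peak log volume of 110.41 GB within one day needs roughly 10 Mbit/s of sustained bandwidth per target site.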

Bandwidth required to do the initial HANA data shipping in an acceptable time: 

◉ It must be possible to do the initial transfer of the complete data in an acceptable time.
◉ The PERSISTENCE_GB column in the output above represents the currently valid size of the persistently stored data.
◉ Alternatively, the following statement can be used to determine the size of the used persistence (the three divisions by 1024 yield GB):
select host, port, data_volume_name, round(sum(used_block_count * page_size) / 1024 / 1024 / 1024, 3) as "Persistent Data Size GB" from m_data_volume_page_statistics group by host, port, data_volume_name

Example: given 4.3 TB of persistently stored data that needs to be transferred within 1 day.
Required throughput: 4.3 TB/day, which is ~50 MByte/s
=> ~0.5 Gbit/s connection required (from help.sap.com)
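The help.sap.com figures can be reproduced with a small sketch (decimal units assumed, i.e. 1 TB = 10^12 bytes):

```python
def initial_ship_rates(persistence_tb: float, days: float = 1.0):
    """Return (MByte/s, Gbit/s) needed to ship persistence_tb within days."""
    bytes_total = persistence_tb * 1e12   # 1 TB = 10^12 bytes (decimal)
    seconds = days * 86400.0
    mbyte_s = bytes_total / seconds / 1e6
    gbit_s = mbyte_s * 8.0 / 1000.0
    return mbyte_s, gbit_s

mb, gb = initial_ship_rates(4.3)
print(f"{mb:.0f} MByte/s, {gb:.2f} Gbit/s")  # prints: 50 MByte/s, 0.40 Gbit/s
```

The exact figure is ~0.4 Gbit/s; the documentation rounds up to a ~0.5 Gbit/s connection, which leaves headroom for protocol overhead.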

Bandwidth required to ship delta HANA data snapshots in an acceptable time (only in operation mode delta_datashipping)

◉ This is the bandwidth required for the regular delta data transfer, which happens every 10 minutes.
◉ As this is also related to the rate of changed data, it can very roughly be estimated like the "bandwidth required to ship HANA logs".
◉ It can also be assumed that it must be possible to transfer the complete database to the secondary site over a period of 1 day (assuming that, over 1 day, all blocks will have been modified by merges).

Compression in Replication:

You can activate compression of log and / or data using the following SAP HANA parameters:

global.ini -> [system_replication] -> enable_log_compression [Enables compression of log when it is sent to the secondary site]
global.ini -> [system_replication] -> enable_data_compression [Enables compression of data when it is sent to the secondary site]

◉ Activating compression reduces the required network bandwidth, but at the same time adds some CPU overhead for compressing and decompressing data and logs.

◉ Using compression is particularly useful for long distances between the primary and secondary site or in case of bandwidth limitations, mainly in DR scenarios.

◉ By default, compression of content is disabled.
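The bandwidth saving depends on the compression ratio actually achieved, which varies with the workload and is not fixed by HANA. A sketch with a purely hypothetical 2:1 ratio, chosen only for illustration:

```python
def compressed_mbit_per_s(raw_mbit_per_s: float, compression_ratio: float) -> float:
    """Effective link rate needed if replication traffic shrinks by
    compression_ratio (2.0 means the payload halves). The ratio is an
    assumption for illustration, not a measured HANA value."""
    return raw_mbit_per_s / compression_ratio

# e.g. a ~10.2 Mbit/s log-shipping requirement would drop to ~5.1 Mbit/s at 2:1,
# at the cost of extra CPU on both the primary and the secondary site.
```

Measure the ratio on your own workload before sizing a DR link around it.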
