Monday 29 July 2019

HANA Backup Recovery Strategy On AWS

Purpose


The purpose of this blog is to highlight a backup solution that can be designed for the HANA database on AWS without the overhead of a third-party tool. I specifically mention AWS because I am using AWS Simple Storage Service (S3) for this solution; otherwise, you can incorporate this approach with any other cloud provider (Azure, GCP) or even on-premise.

NOTE: A backup/recovery solution is generally designed based on business RTO/RPO. This blog describes just one of the many backup/recovery solutions that can be used for the HANA database.

Enough has been said about backup/recovery with reference to RTO/RPO, so I won't repeat that here; let's dig directly into the solution highlights.

Solution Highlights


You have installed the HANA database on AWS (in one of the Availability Zones of a Region) and now you want to design a backup and recovery strategy. Below are the solution highlights that can be incorporated to achieve backup/recovery in AWS.


1) HANA Backup on Local EBS Volume: Perform/schedule complete, incremental, differential and log backups on a local EBS volume (/backup). To save some bucks, you can use Throughput Optimized HDD (st1) for the /backup file system.

You can schedule HANA database backups using any of the options below –

Option 1: Configure a job in DB13. For SYSTEMDB or the tenant DB of a Java system, you need to select one ABAP system, create a DBCO entry providing the hostname and port number of SYSTEMDB, and configure the job in DB13 by selecting the database connection you created in DBCO.


Option 2: With each new Cockpit version (released quarterly), SAP incorporates new functionality, and with the latest Cockpit version you can schedule "Backup" jobs and enable a retention policy, which makes the Cockpit a central tool for backups.


Option 3: Configure the database backup using a script and schedule a cron job for it (a minimal sketch is shown below).
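Purely as an illustration: assuming an hdbuserstore key called BACKUPKEY (pointing to SYSTEMDB with backup privileges), a tenant named DB1, and a script path of /usr/sap/scripts/hana_backup.sh – all placeholders – such a script could look roughly like this:

#!/bin/bash
# hana_backup.sh – minimal sketch; adjust key, tenant name and backup prefix to your landscape
TIMESTAMP=$(date +%F_%H-%M)
# complete data backup of SYSTEMDB, executed via the hdbuserstore key BACKUPKEY
hdbsql -U BACKUPKEY "BACKUP DATA USING FILE ('${TIMESTAMP}_databackup')"
# complete data backup of the tenant DB1, triggered from SYSTEMDB
hdbsql -U BACKUPKEY "BACKUP DATA FOR DB1 USING FILE ('${TIMESTAMP}_databackup')"

The script can then be scheduled with a cron entry such as:

0 1 * * 0 /usr/sap/scripts/hana_backup.sh >> /var/log/hana_backup.log 2>&1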

2) Schedule an Incron Job: Instead of scheduling a time-based cron job to move files from the local file system to S3, you can use an event-driven incron job that sets real-time inotify kernel watches on the backup directories and executes a given task (the S3 upload) for the given file.


Data and log backups are written to /backup (the local file system). So, when a log backup is generated in /backup/log/SYSTEMDB, it triggers an event whose action is defined in the incrontab job, which copies the file from /backup/log/SYSTEMDB/<filename> to the S3 bucket.

The advantage of using incrontab is that your backups are in sync with S3 at all times, and you have a copy of each backup in case your /backup (local file system) gets corrupted or deleted for some reason.

3) Adjust the S3 Lifecycle: Once the files are in S3, you can adjust the storage class based on what is feasible for the business.

In my opinion there is no need to keep log backups in S3 after a certain time, so it is advisable to delete them directly, whereas for data backups you can move the files to a less expensive storage class after a certain number of days and delete them as well after X days, based on the retention policy defined in the organization (see the lifecycle sketch below).
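Purely as an illustration (bucket name, prefixes and day counts are placeholders), such a lifecycle can be expressed as a JSON rule set, saved for example as lifecycle.json:

{
  "Rules": [
    {
      "ID": "expire-log-backups",
      "Filter": { "Prefix": "backup/log/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    },
    {
      "ID": "tier-and-expire-data-backups",
      "Filter": { "Prefix": "backup/data/" },
      "Status": "Enabled",
      "Transitions": [ { "Days": 30, "StorageClass": "STANDARD_IA" } ],
      "Expiration": { "Days": 90 }
    }
  ]
}

and applied to the bucket with:

# aws s3api put-bucket-lifecycle-configuration --bucket bucketname --lifecycle-configuration file://lifecycle.json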

Solution Implementation


Even though SAP HANA is an in-memory database, i.e. the data resides in memory, it has its own persistence. Data is regularly saved from memory to disk at savepoints, and changes are written to redo log files, so in case of a failure these can be used for recovery.

First, gather information on backup frequency and retention policy in tabular format to get a picture of what you are implementing.

Data Backup
    Schedule: Incremental data backup – daily; complete data backup – weekly
    Path on local file system: SYSTEMDB – /backup/data/SYSTEMDB; tenant DB – /backup/data/DB_<SID>
    Retention period on /backup: 7 or 15 days
    Retention period on S3: depends on business

Log Backup
    Schedule: Automatic – every 15 minutes
    Path on local file system: SYSTEMDB – /backup/log/SYSTEMDB; tenant DB – /backup/log/DB_<SID>
    Retention period on /backup: 7 or 15 days
    Retention period on S3: depends on business*

*There is no use in keeping log backups on S3 for long.

Method to Move Backup to S3


Incron is like cron, but instead of running a command based on time, it triggers commands when a file/directory event occurs (e.g., a modification to a file or to a file's permissions). This makes it an outstanding tool for monitoring folders; in our case we will monitor the folders under the /backup file system.

Below is the process to install incron on your system. For SUSE Linux Enterprise Server, incron (incrond/incrontab) is installed from the openSUSE build repository. Check whether incron is installed on your server with the command below:

# rpm -qa | grep -i incron


NOTE: The incron version differs depending on the operating system release and version.

Check whether the incron service is running with the command below.

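On systemd-based SLES/RHEL releases the unit is typically called incrond (the exact name can vary by distribution, so treat it as an assumption):

# systemctl status incrond
# systemctl enable --now incrond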

If it is not installed, you can run the command below:

SLES # zypper install incron

RHEL # yum install incron

Configuration


As you may have surmised, using incrontab is like using crontab. You edit your incrontab file with the command incrontab -e, which opens the incrontab file for editing.

The format of the incrontab looks like:

<path> <mask> <command>

<path> – This is the path to the directory you want to watch. Do note that incron is not capable of watching subdirectories; only files directly within the path are monitored. That is, if you want the tenant database backup directory to be monitored, you have to provide /backup/data/TENANT_DB/. Any folder inside it will not be monitored (only files).

<mask> – This is one of several options:
  • IN_ACCESS File was accessed (read)
  • IN_ATTRIB Metadata changed (permissions, timestamps, extended attributes, etc.)
  • IN_CLOSE_WRITE File opened for writing was closed
  • IN_CLOSE_NOWRITE File not opened for writing was closed
  • IN_CREATE File/directory created in watched directory
  • IN_DELETE File/directory deleted from watched directory
  • IN_DELETE_SELF Watched file/directory was itself deleted
  • IN_MODIFY File was modified
  • IN_MOVE_SELF Watched file/directory was itself moved
  • IN_MOVED_FROM File moved out of watched directory
  • IN_MOVED_TO File moved into watched directory
  • IN_OPEN File was opened
We will be using the "IN_MOVED_TO" mask to perform our operation.

<command> – This is the command that will run when an event is triggered. Within the command you can use wildcards, which expand to basic information about the event. The available wildcards are:

◈ $$ – Prints a dollar sign
◈ $@ – Add the watched filesystem path
◈ $# – Add the event-related file name
◈ $% – Add the event flags (textually)
◈ $& – Add the event flags (numerically)

# incrontab -l


Make sure you perform this activity as the root user, as we will be maintaining the incron table for root. Initially there won't be any entry maintained in the table for root.

To maintain an entry in incron, use the command below (the incrontab command works just like crontab):

# incrontab -e (this will open an editor)


Insert the lines below (a separate line must be maintained for the data and log directories of SYSTEMDB and of each tenant):

/backup/data/SYSTEMDB IN_MOVED_TO aws s3 cp $@/$# s3://bucketname$@/$# --sse
/backup/data/DB_SID IN_MOVED_TO aws s3 cp $@/$# s3://bucketname$@/$# --sse
/backup/log/SYSTEMDB IN_MOVED_TO aws s3 cp $@/$# s3://bucketname$@/$# --sse
/backup/log/DB_SID IN_MOVED_TO aws s3 cp $@/$# s3://bucketname$@/$# --sse

Decoding the Entry

/backup/data/SYSTEMDB is the directory that is monitored.

IN_MOVED_TO is the mask, which identifies a file being moved into the watched directory.

aws s3 cp <source> <destination> is the command that runs when a file is moved into the watched directory.

◈ $@/ – /backup/data/SYSTEMDB/
◈ $# – COMPLETE_DATA_BACKUP_databackup_0_1
◈ s3://bucketname$@/$# – s3://bucketname/backup/data/SYSTEMDB/COMPLETE_DATA_BACKUP_databackup_0_1
◈ --sse – AWS S3 server-side encryption

IMPORTANT NOTE: Existing files in the local folder won't be transferred to S3; they need to be moved manually. But once the incrontab job is set up, whenever a new file is created in a watched directory it will be copied to S3.
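For that one-time catch-up, a simple sketch using aws s3 sync (bucketname being the same placeholder as in the incrontab entries) would be:

# aws s3 sync /backup/data/SYSTEMDB s3://bucketname/backup/data/SYSTEMDB --sse
# aws s3 sync /backup/log/SYSTEMDB s3://bucketname/backup/log/SYSTEMDB --sse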

S3 Bucket Permission


Define a policy, assign it to an IAM role, and attach this IAM role to the HANA EC2 instance. The policy below allows Get and Put on the S3 objects in the log and data buckets. This way you can restrict permissions on the S3 buckets to only the HANA hosts in a given landscape.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowListAllBuckets",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Sid": "sc4AllowGetandPut",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::bucketname*",
                "arn:aws:s3:::bucketname*"
            ]
        }
    ]
}
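Attaching the role to the instance is usually done once from the console; as a hedged CLI equivalent (instance ID and instance profile name are placeholders), it looks like:

# aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 --iam-instance-profile Name=hana-backup-profile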

Monitoring Copy of Backup to S3


You can find the logs for the backup transfer to S3 in /var/log/messages; the message will look something like the one below –

May 9 18:36:34 <hostname> incrond[92053]: (root) CMD (aws s3 cp /backup/log/DB_SID/log_backup_2_0_13968158912_13968278400.1557455793999 s3://bucketname/backup/log/DB_<SID>/log_backup_2_0_13968158912_13968278400.1557455793999 --sse)
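To quickly review the most recent transfers, a simple check such as the following is enough:

# grep incrond /var/log/messages | tail -20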

Checking S3 Bucket from Command Line


With the command below you can list the folders inside the bucket; each folder belongs to a separate tenant or database, and passing a folder to the command lists all the files it contains in S3.

# aws s3 ls s3://bucketname/backup/log/


As I have maintained the policy in such a way that I cannot see the production bucket from the sandbox server, doing so returns an access denied error.


So, bucket visibility and access depend on how you define the policy.

Restore File from S3


With the commands below you can see all the backups that have been moved to S3 and copy the one you need back to the local file system:

# aws s3 ls s3://bucketname/backup/data/SYSTEMDB/


# aws s3 cp s3://bucketname/backup/data/SYSTEMDB/2017-08-06_18-00_databackup_0_1 /tmp

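From this point recovery itself follows the standard HANA procedure (HANA Studio, Cockpit or SQL). Purely as a hedged sketch – DB1 is a placeholder tenant name, BACKUPKEY the hdbuserstore key assumed earlier, and /tmp/2017-08-06_18-00 a placeholder backup prefix for the files copied back from S3 – a file-based tenant recovery executed against SYSTEMDB could look like:

# hdbsql -U BACKUPKEY "RECOVER DATA FOR DB1 USING FILE ('/tmp/2017-08-06_18-00') CLEAR LOG"

The tenant database has to be stopped before the RECOVER DATA statement runs.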

Local Backup File Retention


As described in the "Solution Highlights" section, configure a retention policy to delete local /backup files older than X days. There are three ways to configure retention for local backup files, and each has its pros and cons, so apply the one best suited to your landscape.

HANA Cockpit: As highlighted in "Solution Highlights", with the new version of the Cockpit we can schedule a retention policy. It also takes care of the housekeeping of the backup catalog and the deletion of the respective local backup files.

HANACleaner: Configure HANA cleanup with SAP HANACleaner, an expert tool provided by SAP Support. With this script you can perform housekeeping of the backup catalog and backups, as well as several other cleanup tasks such as removal of trace files, audit logs, etc.

Custom Script: You can develop a custom script to handle a situation like "the incrond service is not running and backups are not syncing to S3". Options 1 and 2 do not check whether your backup has actually been moved to S3; they will straight away delete the local backup files. So in this case you can develop a script that syncs your local backup folder with the S3 bucket before it deletes the local files (a minimal sketch is shown below).
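Purely as an illustration (bucket name, directory list and the 15-day retention are placeholders), such a script could sync first and only then delete:

#!/bin/bash
# sync local HANA backups to S3, then delete local files older than 15 days
set -e
for DIR in /backup/data/SYSTEMDB /backup/data/DB_SID /backup/log/SYSTEMDB /backup/log/DB_SID; do
    aws s3 sync "$DIR" "s3://bucketname$DIR" --sse
done
# the delete runs only if the syncs above succeeded (set -e aborts the script on any error)
find /backup/data /backup/log -type f -mtime +15 -delete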

NOTE: I repeat, there might be many better ways to perform backup/recovery for the HANA database in the cloud (based on business RTO/RPO), but I find this solution good as it does not involve any third-party tool.

Also, this backup/recovery solution can be used in other clouds as well as on-premise.
