Monday 8 February 2021

Data Federation between SAP HANA Cloud and Amazon S3 to Blend Business Data with External Data

This post shows how to federate data between SAP HANA Cloud and an Amazon S3 bucket, so that external data can be blended with business data while both data sets stay where they are. We focus on the integration part, which requires installing and running the SAP Data Provisioning Agent in a Linux environment. In the resulting architecture, SAP HANA Cloud reaches the CSV data in S3 through the Data Provisioning Agent on an EC2 instance, where the bucket is mounted as a file share through Storage Gateway.

Note that our demo uses an EC2 instance to host the DPAgent and a Storage Gateway file share as virtual storage mounted on that EC2 instance. Your own setup might differ; for instance, you might host the DPAgent on an on-premises system. (The DPAgent, or Data Provisioning Agent, hosts all SAP HANA smart data integration adapters and acts as the communication interface between the adapters and HANA, on which SAP Data Warehouse Cloud is built.)

Part 1: Creating S3 bucket and Storage Gateway 

Since this part is already well documented elsewhere, we only list what we need here. First, we need a private S3 bucket with our CSV data uploaded. Second, we create a Storage Gateway in NFS mode and then a file share connected to our S3 bucket.
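The bucket part of this step can also be scripted. The following is a minimal sketch, assuming the AWS CLI is installed and configured; the bucket name and CSV file name are hypothetical placeholders, and the Storage Gateway and file share are still created as described above. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: create a private S3 bucket and upload a CSV file with the AWS CLI.
# The bucket and file names used here are hypothetical placeholders.

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_private_bucket() {
  local bucket="$1" csv_file="$2"
  # Create the bucket in the CLI's default region.
  run aws s3 mb "s3://${bucket}"
  # Block every form of public access so the bucket stays private.
  run aws s3api put-public-access-block --bucket "${bucket}" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
  # Upload the CSV data that will later be exposed through the file share.
  run aws s3 cp "${csv_file}" "s3://${bucket}/"
}

create_private_bucket "my-federation-bucket" "sales.csv"
```

Setting DRY_RUN=0 runs the same three commands against your AWS account instead of printing them.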


Note the commands that show how to connect to the file share, as we will use them in a later section.

Part 2: Installation and Configuration of SAP Data Provisioning Agent  

Next, we create an Ubuntu EC2 instance with inbound rules that allow all traffic. This is for testing purposes only and must not be used in production, where you need more restrictive settings that fit your requirements.

After creating the instance, we go to the SSH client tab and follow the instructions to log in to the Ubuntu instance.

On your local machine (macOS in our case), go to the SAP Software Download Center and download the latest patch of the DPAgent for Linux.

Now we copy the installation file to our EC2 instance using the following command (this and all later commands are run from the macOS terminal):

scp -i s3_linux_dpa_key_pair.pem <path/downloaded DPAgent>  ec2-user@<your ec2 host>.compute-1.amazonaws.com:/home/ec2-user 

Log in to the Ubuntu EC2 instance, unzip the file we transferred, and change to the folder that contains the hdbinst installer. After creating three folders (/usr/sap/dataprovagent, /usr/sap/dataprovagent/s3, and /usr/sap/dataprovagent/agentconfi), we use the following command to install the DPAgent:

./hdbinst --silent --batch  --path="/usr/sap/dataprovagent" --agent_listener_port=5050 --agent_admin_port=5051 
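The three folders mentioned before the install command can be created in one step, prior to running the installer. This is a minimal sketch; the optional prefix argument is a hypothetical addition that lets you try the layout in a scratch directory before touching /usr/sap, where elevated permissions may be needed.

```shell
#!/usr/bin/env bash
# Sketch: create the three folders used in this walkthrough.
# The prefix argument is a hypothetical knob for trying the layout in a
# scratch directory; leave it empty to target /usr/sap directly.

create_agent_dirs() {
  local prefix="${1:-}"
  local base="${prefix}/usr/sap/dataprovagent"
  # -p creates missing parents and does not fail if a folder already exists.
  mkdir -p "${base}" "${base}/s3" "${base}/agentconfi"
  # Print the base folder so callers can reuse the path.
  echo "${base}"
}
```

For example, `create_agent_dirs "$(mktemp -d)"` builds the layout under a temporary directory and prints the resulting base path.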

Don’t forget to mount our S3 file share to /usr/sap/dataprovagent/s3. 
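The mount step typically looks like the sketch below, which only prints the command so you can compare it with the connection commands noted in Part 1. The gateway IP and share name are hypothetical placeholders, and on Ubuntu we assume an NFS client package such as nfs-common is already installed.

```shell
#!/usr/bin/env bash
# Sketch: build the NFS mount command for the Storage Gateway file share.
# The gateway IP and share name are hypothetical; copy the real values from
# the file share's connection commands noted in Part 1.

build_mount_cmd() {
  local gateway_ip="$1" share="$2"
  local mount_point="${3:-/usr/sap/dataprovagent/s3}"
  # nolock,hard follow the Linux example shown for NFS file shares.
  printf 'sudo mount -t nfs -o nolock,hard %s:/%s %s\n' \
    "${gateway_ip}" "${share}" "${mount_point}"
}

build_mount_cmd "10.0.0.25" "my-federation-bucket"
# prints: sudo mount -t nfs -o nolock,hard 10.0.0.25:/my-federation-bucket /usr/sap/dataprovagent/s3
```

After running the printed command, `df -h /usr/sap/dataprovagent/s3` is one way to confirm that the share is mounted.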

Now, at the command line, navigate to <DPAgent_root>/bin, which in our case is /usr/sap/dataprovagent/bin. Then we run

./agentcli.sh --configAgent 

to open the DPAgent Configuration Tool: 

We first choose option 2 to start the agent, and then option 6 to connect to our HANA Cloud instance.

Choose option 1 and follow the instructions, entering the information needed to connect to our HANA Cloud instance. If you do not have a HANA user for agent messaging, that is fine, as the tool will create one for you.

Next, we register our agent with the HANA instance. The agent host name is the public IP of your EC2 instance.

Before we start to register our FileAdapter, we need to configure it.  We use 

./agentcli.sh --configAdapters

to do so. If you run into problems, you can instead edit the file /usr/sap/dataprovagent/configuration/com.sap.hana.dp.adapterframework/FileAdapter/FileAdapter.ini directly.

Next, we will set up an access token for our FileAdapter using 

./agentcli.sh --setSecureProperty

Choose option 11 and enter a token, for example 'Accesstokentest12345'.
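Outside of a quick test, a random token is preferable to a guessable example string. The sketch below generates one with standard tools; the 32-character hex format is our own assumption, so check that it satisfies whatever the configuration tool accepts.

```shell
#!/usr/bin/env bash
# Sketch: generate a random access token instead of a guessable one.
# The 32-character hex format is an assumption; this post only shows a
# plain example string.

generate_token() {
  # Read 16 random bytes and print them as 32 lowercase hex characters.
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

generate_token
```

Keep the generated value at hand, since the same token is needed again when the remote source is created in Part 3.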

Now use the configuration tool to register our FileAdapter. 

Don't forget to generate a CSV config file with the following command and then move it to /usr/sap/dataprovagent/agentconfi:

./createfileformat.sh -file <path of folder which contains your csv data file> -cfgdir <path of folder which contains your csv data file> -format "CSV" -colDelimiter ,
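The move into the agentconfi folder can be sketched as a small helper. We assume here that the generated file-format files end in .cfg and sit next to the CSV data, because the command above uses the same folder for -file and -cfgdir; verify the extension and both paths on your system.

```shell
#!/usr/bin/env bash
# Sketch: move generated file-format configs into the agent config folder.
# Assumes the generated files end in .cfg and sit next to the CSV data;
# the extension and both paths are assumptions to verify locally.

move_format_configs() {
  local src_dir="$1"
  local dst_dir="${2:-/usr/sap/dataprovagent/agentconfi}"
  local moved=0 f
  for f in "${src_dir}"/*.cfg; do
    # Skip the unexpanded pattern when no .cfg file exists.
    [ -e "${f}" ] || continue
    mv "${f}" "${dst_dir}/"
    moved=$((moved + 1))
  done
  echo "moved ${moved} file(s) to ${dst_dir}"
}
```

For example, `move_format_configs /usr/sap/dataprovagent/s3` moves any .cfg files from the mounted share folder into the default agentconfi folder and leaves the CSV files in place.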
 
If everything goes well, you can now see your adapter and agent under your SAP HANA Cloud instance.

Part 3: Creating Virtual Table in SAP HANA Cloud  

In the HANA Database Explorer, under Catalog -> Remote Sources, choose Add Remote Source.

Set the access token to the one we created earlier.

Choose the CSV file name and click Create Virtual Object(s).

If the virtual table is created successfully, you can find it under Tables in the catalog and run SELECT queries against it.
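The same result can usually be reached with SQL instead of the UI. The sketch below prints a CREATE VIRTUAL TABLE statement and a test query that you could paste into a SQL console. All four names are hypothetical, and we assume the remote object name matches the file-format name shown under the remote source.

```shell
#!/usr/bin/env bash
# Sketch: print the SQL equivalent of the UI steps in Part 3.
# The schema, virtual table, remote source, and remote object names are
# hypothetical; replace them with the names shown in Database Explorer.

virtual_table_sql() {
  local schema="$1" vtable="$2" remote_source="$3" remote_object="$4"
  # "<NULL>" fills the database and owner levels, which are assumed unused here.
  cat <<EOF
CREATE VIRTUAL TABLE "${schema}"."${vtable}" AT "${remote_source}"."<NULL>"."<NULL>"."${remote_object}";
SELECT COUNT(*) FROM "${schema}"."${vtable}";
EOF
}

virtual_table_sql "DEMO" "V_SALES" "S3_FILE_SOURCE" "sales"
```

The query reads the CSV through the agent at execution time, so the data itself remains in the S3 bucket.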

In summary, we have shown how to federate CSV data in an Amazon S3 bucket with SAP HANA Cloud without replicating the data.
