Friday 13 August 2021

Integrating SAP HANA Data Lake with Google BigQuery – DL2BQ

A Simple Architecture:


Pre-requisites: You must have your SAP BTP trial account up and running, your Data Lake instance running as well, and your credentials ready for an ODBC (open database connectivity) connection.

You should also have your GCP trial account ready, and make sure you have downloaded the GCP credentials (service account key) in JSON format to your local system.

In short, both cloud accounts must be up and running.
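
Later in this post the Data Lake table is handed over to BigQuery as a pandas DataFrame (df). Here is a minimal sketch of how that DataFrame could be read over the ODBC connectivity mentioned above; the DSN, user, and source table name are placeholders of mine, not values from this post:

## Sketch: read a Data Lake table into pandas over ODBC (pyodbc assumed)
import pyodbc
import pandas as pd

## "HDL_DSN", the user, and the HOTEL table name are placeholders -- adjust to your instance
conn = pyodbc.connect("DSN=HDL_DSN;UID=HDLADMIN;PWD=<your-password>")
df = pd.read_sql("SELECT * FROM HOTEL", conn)   ## this df is what gets pushed to BigQuery below
conn.close()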

Data Lake Instance:


GCP Instance & Big Query:


◉ Now open a command prompt and install the library

pip install hdltobq

◉ If the installation is successful, you will be able to import it

import hdltobq

◉ After installation, try the imports below; if they work, you are good to go

## Import the libraries below...
import hdltobq
from hdltobq.hdltobq import BQConnect

◉ Methods for connecting to GCP BigQuery, creating datasets, creating tables, and transporting data

Sample Inputs
### You should have your project & credentials ready for migrating data from Data Lake to BQ
bq_dataset     = 'bigquery-public-data:hacker_news'    ## Your BQ dataset, if already created; otherwise create one below
bq_project     = 'igneous-study-316208'                ### This is mandatory
bq_credentials = r'C:\Users\ABC\Downloads\igneous-study-316208-d66aebfd83ea.json'  ## Mandatory: path to your GCP service account key

## Initialize BQ and open the connection
bq = BQConnect(bq_dataset, bq_project, bq_credentials)
bq_client, bq_ds = BQConnect.connect2bq(bq)
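
As a quick sanity check (assuming connect2bq returns a standard google-cloud-bigquery Client, which is my assumption rather than something the library states), you can print the project and list the datasets the client can see:

## Sanity check, assuming bq_client is a google.cloud.bigquery Client
print(bq_client.project)                 ## should print your GCP project id
for ds in bq_client.list_datasets():
    print(ds.dataset_id)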

◉ Create Dataset

### Create a new dataset for your tables first.
lv_ab = BQConnect.create_dataset(bq_client, 'HANADL')

Output
Creating DataSet.....
Created.. Thanks
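
For reference, the same step with the native BigQuery client would look roughly like this. It is only a sketch of the equivalent call, not the library's actual implementation, and the location is my assumption:

## Equivalent dataset creation with the native client (sketch)
from google.cloud import bigquery

dataset = bigquery.Dataset(f"{bq_client.project}.HANADL")
dataset.location = "US"                              ## assumption: choose the location you need
bq_client.create_dataset(dataset, exists_ok=True)    ## no error if it already exists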

◉ Create Table 

### Create the target table from the Data Lake DataFrame (df)
BQConnect.create_tab(bq_client, df, 'HOTEL')

Output:
Started Creating table.....
igneous-study-316208.HANADL.HOTEL
Preparing Schema...
Ready.....
CRITICAL:root:Dataset igneous-study-316208.HANADL.HOTEL already exists, not creating.
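
The CRITICAL line simply means the table already existed from an earlier run, so it was not recreated. If you want a clean reload instead of appending, you could drop the table first (again assuming bq_client is a native BigQuery client):

## Optional: drop the existing table for a clean reload (sketch)
bq_client.delete_table(f"{bq_client.project}.HANADL.HOTEL", not_found_ok=True)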
 
◉ Finally, to transport the data to BQ

#### Command for the BQ insert (df.to_gbq comes from the pandas-gbq package)
df.to_gbq('HANADL.HOTEL', project_id=bq_client.project, if_exists='append')
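
Once the insert finishes, a quick row count confirms that the data landed (a sketch, using the table created above):

## Verify the load with a row count (sketch)
query = f"SELECT COUNT(*) AS cnt FROM `{bq_client.project}.HANADL.HOTEL`"
for row in bq_client.query(query).result():
    print(row.cnt)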

Data Preview from Data Lake


GCP BQ Output


◉ That brings us to the end: we have successfully transferred data from SAP HANA Data Lake to BigQuery. In the next post we will probably look at the reverse direction, from BigQuery to SAP HANA Data Lake. Until then, take care and keep learning.
