Monday 1 April 2019

The tale of SAP HANA, SAP Analytics Cloud, and Brexit

This blog I wanted to show you an end to end example of getting unstructured JSON data, loading it into SAP HANA, enriching with geo-spatial attributes and exposing to SAP Analytics Cloud for further analysis.

The problem with most tutorials usually – they are focused on some randomly generated abstract data (like SFLIGHT or EPM Model data) and for some people this doesn’t really mean much, so I thought a real life example of a real up to date data analysis would be very beneficial for everyone.

Now, there are many ways to automate most of the tasks shown in this post, but I just wanted to show quick and simple example of data consumption, transformation and analysis which can be done in a few hours over the weekend. Now let’s begin.

If we go to a certain petition page we would see a page which looks like this:
SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

That doesn’t mean much and we can’t easily extract any meaningful data. But there is a way – if we change the link to https://petition.parliament.uk/petitions/241584.json we would be able to see the data in JSON format (quite unstructured on the screenshot):

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

There is a JSON view plugin for Chrome to help us see the formatted version:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Install the extension and reload the page:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Ok, now we are getting somewhere. The next step is – how do we get this data into our SAP HANA system? I decided to create a set of SQL statements to create the necessary unstructured table and insert data into it. To do that, I firstly had to find a way to easily create the SQL DDL and DML statements without going manually one by one through each line of JSON file.

There is another great online service for that: Sqlify.io – it allows you to feed it JSON or CSV files and creates a set of DDL and DML statements for you, which after some tweaking can be executed in SAP HANA system.

Let’s go to SQLify.io and enter our link from above:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Click “Convert to SQL” and see the result:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

As I mentioned, we will end up with the table of unstructured data, which is fine with us as we would then create the necessary HANA Calculation Views on that table. We will just go ahead and remove the unnecessary fields.  After tiding up we will end up with just a handful of meaningful fields:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Click “Save schema and continue”. Save the resulting SQL and open it locally (I use VS Code for any SQL, JS and sometimes ABAP development work):

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

It looks great (and perfectly unstructured), but hardly ready for SAP HANA SQL. What needs to be done is a little makeover which results in the following script which we can run in HANA. I have created a new schema for this blog post, so that everything is neatly contained in one place (the link to the full script can be found at the end of this post):

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Open SQL console of your HANA DB (use the default tenant or whichever tenant you like) and run the SQL script to create petitions data table with all the values.

Check the created tables and values:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Ok, so far so good. Now, in order for us to have Geo-analysis we need to have geo coordinates table somewhere, I have found the table with countries and longitude and latitude details online and also converted it to the SQL script for HANA:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Run the script and check the resulting table:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

All good with Countries. A bit harder with UK Constituencies coordinates. I went to the Office for National Statistics website and downloaded the full Postcode Directory (a bit of an overkill for this small exercise, but very useful in future). You can download the full archive from here:

http://geoportal.statistics.gov.uk/items/ons-postcode-directory-february-2019

Create the table definition using SQL command:

CREATE TABLE ONSPD_FEB_2019 (
pcon VARCHAR(10) NULL,
lat FLOAT NULL,
long FLOAT NULL
);

After the table is created, go to Eclipse and choose File->Import

Note: I switched to Eclipse here, because the file is massive and SAP HANA Web-based Development Workbench won’t be able to handle it. You can use Eclipse to complete all the steps in this post, if you prefer.

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Choose Import data from local file:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Pick up the file from the downloaded archive and make other relevant selections:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Create the mappings and click “Finish” to start the upload (which would take quite a while if you are using the remote HANA system, so go grab some lunch or a cuppa):

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

After the upload is completed, check the data in the table:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

The problem is that ONSPD goes into a finer granularity than we need, so that all the coordinates go way down to Ward and Parish, but we only need one set of coordinates per constituency. Therefore, I have decided to take the distinct value of ONS code from PCON field with average coordinates and use the coordinates of that record for the constituency, which isn’t really precise or ideal, but… There might be an easier way to find the coordinates based on the ONS code from Petitions site, but I haven’t found it and would love to hear your comments on this in the comments section below.

Now, before we proceed, I strongly suggest you check the following SAP Note:

https://launchpad.support.sap.com/#/notes/0002395407

In order to create ST_Geometry Location coordinates the steps above must be completed, otherwise the scripts below would fail.

After you have completed the prerequisites, run the script to create Location tables which are based on the countries and constituencies tables above. It creates the location definitions which can be consumed by SAP Analytics Cloud:

CREATE COLUMN TABLE "SAC_SOURCE"."Countries_Location" (
"Country_Code_LD" VARCHAR(2) PRIMARY KEY, "Location" ST_GEOMETRY(3857));
UPSERT "SAC_SOURCE"."Countries_Location" ("Country_Code_LD")
SELECT "COUNTRY_CODE" FROM "SAC_SOURCE"."COUNTRIES" GROUP BY
"COUNTRY_CODE";
UPDATE "SAC_SOURCE"."Countries_Location"
SET "Location" = new ST_GEOMETRY('POINT(' || "LONGITUDE" || ' ' || "LATITUDE" ||
')', 4326).ST_Transform(3857)
FROM (
SELECT MIN("LATITUDE") "LATITUDE", MIN("LONGITUDE") "LONGITUDE", "COUNTRY_CODE"
FROM "SAC_SOURCE"."COUNTRIES" GROUP BY
"COUNTRY_CODE"),
"SAC_SOURCE"."Countries_Location"
WHERE "COUNTRY_CODE" = "Country_Code_LD";

CREATE COLUMN TABLE "SAC_SOURCE"."Constituencies_Location" (
"ONS_CODE_LD" VARCHAR(10) PRIMARY KEY, "Location" ST_GEOMETRY(3857));
UPSERT "SAC_SOURCE"."Constituencies_Location" ("ONS_CODE_LD")
SELECT DISTINCT "PCON" FROM "SAC_SOURCE"."ONSPD_FEB_2019" WHERE
"PCON" IN ( SELECT "CONSTITUENCY_ONS_CODE" FROM "SAC_SOURCE"."PETITION_241584_DATA" WHERE "CONSTITUENCY_ONS_CODE" <> '' )
GROUP BY
"PCON";
UPDATE "SAC_SOURCE"."Constituencies_Location"
SET "Location" = new ST_GEOMETRY('POINT(' || "LONG" || ' ' || "LAT" ||
')', 4326).ST_Transform(3857)
FROM (
SELECT DISTINCT "PCON", AVG("LAT") "LAT", AVG("LONG") "LONG"
FROM "SAC_SOURCE"."ONSPD_FEB_2019" WHERE
"PCON" IN ( SELECT "CONSTITUENCY_ONS_CODE" FROM "SAC_SOURCE"."PETITION_241584_DATA" WHERE "CONSTITUENCY_ONS_CODE" <> '' ) GROUP BY
"PCON"),
"SAC_SOURCE"."Constituencies_Location"
WHERE "PCON" = "ONS_CODE_LD";

This would create two location supporting tables:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Check the contents of any:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Lovely, just what we are looking for!

Now let’ carry on building 4 very simple HANA Calculation views – 2 scripted and 2 graphical (just to illustrate different options).

Both are very simple containing a single table with just a different output selections:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Add petitions table to the aggregation node:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Map just three required fields from the source:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Set Country_Code as the key in Semantics node:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Save and run:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Create another view similarly by adding Constituency relevant fields:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Set ONS code as the key:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Save and run:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Marvellous.

Now we only need to create location supporting views for Country and Constituency in special package SAP_BOC_SPATIAL (this is very important).

I have decided to create them as scripted CVs to just show how they look.

The first one is Country Location:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Create two columns on the right and paste the code:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Set Country_Code_LD as the key and run the view:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Great. Define the second scripted view ZCV_CONSTITUENCY_LOCATION:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Set ONS_CODE_LD to key and run the view:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Okay, job’s done on HANA side. We now have 4 views and they are ready to be consumed by SAP Analytics Cloud:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

SAP Analytics Cloud


We will be creating Live HANA connection from SAC to our tenant, if you don’t have it yet, it’s not that trivial to set up (this involves SSL).

I have created the Live connection already so we can proceed.

Go to your SAC tenant and create new model:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Select “Get data from a datasource” and pick up your HANA Live connection:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Click on Data Source selector and choose our Country Calculation View:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Call the Model “Signatures_by_country” and confirm.

Check that the measure and dimensions are present:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material
SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Now we need to add location information to our model. Remember, we even have the supporting view for that!

Click “Create a location dimension” button:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

And select the values as follows:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

What we did is we actually told the system where the location information actually resides and ordered system to join that location info with our model based on the relationship between COUNTRY_CODE and COUNTRY_CODE_LD (Country Code Location).

Confirm and save the model.

Now before we proceed further, let’s create another model “Signatures_by_constituency” and add the relevant views to it – repeat all the previous steps with the following selections:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Map the location view like this:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Confirm and save the model.

Now let’s get to the cherry on the cake (finally!). Create new story:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

With the canvas page:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Add Geo Map to canvas:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

You can leave it light grey or choose many options for the base layer:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

I chose “Streets” because it’s more colourful! Add new layer, call it “World” and add our “Country” model as the data source:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Switch layer to “Heat map” and add our Location dimension and Signatures as the measures:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Confirm, and see the results right away!

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Play with different modes of maps until you’ve got the view that is most suitable for your analysis. For example, switch map to “Bubble layer”. It gives you the details when you mouse over:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Now go and create another Layer and call it “UK” and choose signatures by constituency:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Zoom in on the UK:

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

It gives you the heat map based on the UK constituencies now. Neat!

If you zoom out a bit you will see both layers –  Bubble Layer showing the World and Heat map showing the UK :

SAP HANA, SAP Analytics Cloud, SAP HANA Certifications, SAP HANA Guides, SAP HANA Learning, SAP HANA Tutorial and Material

Now we can add another page to our story and select “Grid”:


Insert models to the different pages:


This way we can have the map and two grids with raw data for our analysis:



Save it as the “Petitions story”:


Now you can try and enhance the base model with whatever you like, such as 2016 referendum data per constituency to see the correlations between voters who voted Remain and signatories of the petition to revoke Article 50. Go ahead and play!

Key takeaways and epilogue


◈ SAP Analytics Cloud Story is fully dynamic and based on live date from HANA system. If you change the value in the table the story will update right away.
◈ The end to end example did not use… any BW!
◈ HANA Calculation views can be expanded to read other data, such as referendum, election results, poverty, homelessness figures, etc. and all that extra data can be analysed quickly and efficiently on the fly using the provided example framework.

No comments:

Post a Comment