Thursday 6 February 2020

Customizing Time Zones and DST Adjustments in HANA

I am building a HANA version of an accounting system for an electric power grid. The design comes from a legacy system which has been in operation for over 40 years. The system stores MWh power output readings for 250 generating units for 77 power plants. MW values are integrated continuously over the hour and are stored at the end of the hour. For example, MW values collected between 00:00:01 and 01:00:00 are stored as a single MWh value with a timestamp of 01:00:00.

It sounds so simple… However, this type of time reference introduces some big time-related problems in HANA.

There is very little information and very few published solutions for handling time zones and daylight savings in general, and the HANA documentation for time zones offers no specific examples. Hence, the purpose of this article.

Configuring Defaults


We have agreement from the engineers on two things. First, the data will be stored in UTC time, and second, the USA eastern time zone is either 4 or 5 hours away from UTC time depending on the time of year. Configuring HANA to run in UTC time is straightforward, as explained in TROUBLE WITH TIME? After configuring the system, we can verify our time base by noting that current_time() and current_utctime() return the same time:

SAP HANA Prep, SAP HANA Learning, SAP HANA Certifications, SAP HANA Study Materials

select current_time, current_utctime from dummy;
16:07  16:07

Now we verify the local time offsets. If you are using the HANA Express Edition then beware of indexserver.ini. There are three copies of this file for the installation:

/hana/shared/HXE/global/hdb/custom/config/indexserver.ini
/hana/shared/HXE/global/hdb/custom/config/DB_HXE/indexserver.ini
/hana/shared/HXE/exe/linuxx86_64/HDB_2.00.040.00.1553674765_c8210ee/config/indexserver.ini

The first two files do not contain a time zone configuration. The third file does. This mistakenly led me to believe that the ‘sap’ data set would be honored by utctolocal() and localtoutc() after loading the SystemDB SYSTEM.TTZ* tables. The third file is actually junk and the first two are important. It turns out that since the first two files have a missing configuration, HANA defaults the data set to ‘platform’, which takes time translations from the operating system files in the /usr/share/zoneinfo folder. No matter how hard you try to configure the TTZ* tables, nothing happens. I finally found the problem by running these commands:

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM') SET ('global','timezone_default_data_schema_nane') = 'SYSTEM' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM') SET ('global','timezone_dataset') = 'sap' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini','SYSTEM') SET ('global','timezone_default_data_client_name') = '001' WITH RECONFIGURE;

This magically sets the time zone configuration in first indexserver.ini file:

[global]
timezone_default_data_schema_name = SYSTEM
timezone_default_data_client_name = 001
timezone_dataset = sap

The ‘sap’ data set is now honored, and utctolocal() and localtoutc() take their time conversions from the TTZ* tables as expected. I am using this ‘sap’ configuration rather than ‘platform’ because I do not trust the quality of the binary data in the /usr/share/zoneinfo folder. I see recent changes from Time Zone News so I’d much rather examine the rules in the TTZ* tables directly and get updates from SAP as they become available.

Running a Standard Query


We want to run a MWh application query in SystemDB first since we now know that the default time zones are working here. With some test data for 2020, we select data for November 1st, which is the 25-hour fall DST day:

select
    to_varchar(utctolocal(utctime,'EST'),'DD-MON-YYYY  HH24:MI') "LOCTIME",
    UNIT1 "UNIT_1_MWH", UNIT2 "UNIT_2_MWH", UNIT2 "UNIT_2_MWH"
from PlantMWhView
where utctolocal(utctime,'EST') > '2020-11-01 00:00:00'
and utctolocal(utctime,'EST') <= '2020-11-02 00:00:00'
and plant = 29
order by utctime;

LOCTIME            UNIT_1_MWH  UNIT_2_MWH  UNIT_3_MWH
01-NOV-2020 01:00  48          67          98
01-NOV-2020 01:00  50          64          96
01-NOV-2020 02:00  46          62          84
01-NOV-2020 03:00  40          61          76
01-NOV-2020 04:00  38          61          72
01-NOV-2020 05:00  41          60          68
01-NOV-2020 06:00  43          61          67
01-NOV-2020 07:00  47          62          69
01-NOV-2020 08:00  48          64          71
01-NOV-2020 09:00  51          65          72
01-NOV-2020 10:00  55          66          72
01-NOV-2020 11:00  58          67          72
01-NOV-2020 12:00  60          70          72
01-NOV-2020 13:00  63          72          75
01-NOV-2020 14:00  65          74          78
01-NOV-2020 15:00  67          73          83
01-NOV-2020 16:00  64          72          85
01-NOV-2020 17:00  64          71          87
01-NOV-2020 18:00  60          71          94
01-NOV-2020 19:00  59          70          93
01-NOV-2020 20:00  57          69          96
01-NOV-2020 21:00  52          70          97
01-NOV-2020 22:00  51          71          98
01-NOV-2020 23:00  50          69          97
02-NOV-2020 00:00  49          68          96

It might look good, but we do have problems with this. HANA shows the last hour of the day as the zero hour of the next day. We see 02-NOV-2020 00:00, but what we really want to see is 01-NOV-2020 24:00. This is a simple fix with some formatting.

Next, we have a showstopper. The repeated hour is shown as 01-NOV-2020 01:00. Under normal circumstances, this would be correct. The DST period in the United States begins at 02:00 local time, so the hour from 2:00:00 to 02:59:59 does not exist in the night of the switch. It is skipped as clocks spring forward from 1:59:59 standard time to 3:00:00 Daylight Saving Time. For our hour-end accounting, this is not correct because data collected from 01:00:01 to 02:00:00 is stored at 2 AM, not 1 AM. Therefore, we want to show 2 AM the repeated hour, not 1 AM.

In addition, we don’t know which is the first hour of DST and which is the second, unless we bring the UTC Time into the query for ordering. Even then, the users won’t have a visual indicator without extra formatting.

What we really want is a system where we can (a), move the repeated hour to a different time, (b), identify the first and second repeated hour for both local-to-UTC and UTC-to-local time conversion, and (c), show the last hour of the day as 24, not 00.

Reconfiguring Time Zones


First, let’s move the repeated hour from 1 AM to 2 AM. We will do this only in the tenant database, which is where our POWERGRID schema resides, and we will leave the System DB and all other databases in their CLIENT 001 standard configuration. To do this, we create CLIENT 002 for the tenant database by adding a time zone configuration to /hana/shared/HXE/global/hdb/custom/config/DB_HXE/indexserver.ini:

[global]
timezone_default_data_schema_name = SYSTEM
timezone_default_data_client_name = 002
timezone_dataset = sap

CLIENT 002 is now configured for this tenant database. Now we create the TTZ* tables in the SYSTEM schema on the tenant database and populate them as follows:

insert into TTZZ values ('002','EST','M0500','USA','X');
insert into TTZR values ('002','M0500','050000','-','X');
insert into TTZDV values ('002','USA','1900','04','1','1','030000','10','1','5','030000');
insert into TTZDV values ('002','USA','2007','03','1','2','030000','11','1','1','030000');
insert into TTZD values ('002','USA','010000','X');

The DST hour is now 030000 instead of 020000. The final configuration has unmodified CLIENT 001 TTZ* table data in the SystemDB SYSTEM schema, and modified CLIENT 002 TTZ* table data in the tenant database SYSTEM schema. The global indexserver.ini specifies that all databases run as CLIENT 001, and the DB_HXE indexserver.ini file overrides this by specifying CLIENT 002.

Now we fix the first and second DST hour indicator as well as the format of the last hour. To do this, we come up with a custom version of utctolocal() and localtoutc():

create function my_utctolocal(utctime timestamp, timezone nvarchar(6), timezone_dataset nvarchar(8))
returns loctime timestamp, he char(2), dst char(1)
language sqlscript reads sql data as
begin
    declare last timestamp;
    loctime := utctolocal(utctime,timezone,timezone_dataset);
    he := lpad(to_varchar(hour(ADD_nano100(utctolocal(utctime,timezone,timezone_dataset),-1))+1),2);
--  If the hour just before this one has the same local time, then this is the 2nd DST hour
    select utctolocal(add_seconds(utctime,-60*60),timezone,timezone_dataset) into last from dummy;
    if :last = :loctime
    then
      dst := '2';
    else
      dst := ' ';
    end if;
end;

create function my_localtoutc(loctime timestamp, timezone nvarchar(6), timezone_dataset nvarchar(8), dst char(1))
returns utctime timestamp
language sqlscript reads sql data as
begin
    declare next timestamp;
    utctime := localtoutc(loctime,timezone,timezone_dataset);
--  If the next local hour is the same as this one, and dst = 2, then return the next hour
    select utctolocal(add_seconds(utctime,60*60),timezone,timezone_dataset) into next from dummy;
    if :next = :utctime and dst = '2'
    then
      utctime := next;
    end if;
end;

Running a Custom Query


Here is the custom query in the tenant database which is now 100% acceptable to our users:

select
    my_utctolocal(utctime,'EST','sap').he "HE",
    my_utctolocal(utctime,'EST','sap').dst "DST",
    UNIT1 "UNIT_1_MWH", UNIT2 "UNIT_2_MWH", UNIT2 "UNIT_2_MWH"
from PlantMWhView
where my_utctolocal(utctime,'EST',’sap’) > '2020-11-01 00:00:00'
and my_utctolocal(utctime,'EST',’sap’) <= '2020-11-02 00:00:00'
and plant = 29
order by utctime;

HE   DST      UNIT_1_MWH  UNIT_2_MWH  UNIT_3_MWH
01            48          67          98
02            50          64          96
02   2        46          62          84
03            40          61          76
04            38          61          72
05            41          60          68
06            43          61          67
06            47          62          69
08            48          64          71
09            51          65          72
10            55          66          72
11            58          67          72
12            60          70          72
13            63          72          75
14            65          74          78
15            67          73          83
16            64          72          85
17            64          71          87
18            60          71          94
19            59          70          93
20            57          69          96
21            52          70          97
22            51          71          98
23            50          69          97
24            49          68          96

In HANA, this type of customization is not hard at all, given two nice HANA features that let us (a), create a custom time zone for a particular database, and (b), create a user-defined function that has the ability to return multiple properties (he, utc, and dst). Good job.

What is this HANA Time Zone Data?


Even though the TTZ* tables solve our custom time zone problem, they are not useful at all beyond utctolocal() and localtoutc(). This may become a problem for other applications in our system. The TTZ* tables define rules about time zones. They are easy to configure but cannot be used for SQL query purposes. They do not contain timestamp values to be used as join search arguments to application data, in order to apply time offsets from UTC.

Fortunately, the TTZ* rules originates from the public domain Time Zone Database. This database contains descriptive rules which can be converted into any format you choose. For HANA, SAP has produced the TTZ* reference model. The other SAP database products (ASE, IQ, SQL Anywhere, Ultralite) do not have time zone support. Therefore. if you want to translate between local and UTC then you must invent your own reference model. This has been done in the past with Handling Time Zones and Daylight Savings Time Changes. The solution described there produces joinable tables.

If you find yourself doing custom development in HANA, you can start with public domain TZ database and finish with either the TTZ* reference model which is good for the HANA built-in functions localtoutc() or utctolocal(), or you might design an entirely different relational reference model which is good for joins. All three models are variations of the same data:

1. TZ database (northamerica file excerpt):

# Rule  NAME  FROM    TO    TYPE  IN  ON      AT      SAVE    LETTER/S
Rule    US    1974    only    -   Jan 6       2:00    1:00    D
Rule    US    1975    only    -   Feb lastSun 2:00    1:00    D
Rule    US    1976    1986    -   Apr lastSun 2:00    1:00    D
Rule    US    1987    2006    -   Apr Sun>=1  2:00    1:00    D
Rule    US    2007    max     -   Mar Sun>=8  2:00    1:00    D
Rule    US    2007    max     -   Nov Sun>=1  2:00    0       S

2. TTZ reference model (TTZZ, TTZR, TTZDV, TTZD tables for EST):

TTZZ values  ('002','EST','M0500','USA','X');
TTZR values  ('002','M0500','050000','-','X');
TTZDV values ('002','USA','2007','03','1','2','030000','11','1','1','030000');
TTZD values  ('002','USA','010000','X');

3. Relational reference model for joins (TimeTran table for EST):

Zone  UTCStart             UTCStop                     DST  Offset  ZoneAsc
 372  Mar 10 2019  07:00   Nov  3 2019  06:59:59.999        -240    EDT
 372  Nov  3 2019  07:00   Nov  3 2019  07:59:59.999   2    -300    EST
 372  Nov  3 2019  08:00   Mar  8 2020  06:59:59.999        -300    EST
 372  Mar  8 2020  07:00   Nov  1 2020  06:59:59.999        -240    EDT
 372  Nov  1 2020  07:00   Nov  1 2020  07:59:59.999   2    -300    EST
 372  Nov  1 2020  08:00   Mar 14 2021  06:59:59.999        -300    EST

In my view, doing custom time zone development in HANA works out well. In addition to getting the benefit of HANA’s TTZ* data, I can also come up with my own solution if needed.

No comments:

Post a Comment