Monday 11 March 2019

[HANA] Unleash the performance of your VM

Most performance issues I have worked on turned out to be basic problems with HANA/Linux parameters and the configuration of the hypervisor. Virtualization is a frequently chosen architecture in HANA environments, regardless of system size. If you want to ensure good performance and learn how to check it in your environment, keep on reading.

Most systems run on VMware, but more and more systems are planned or already running on IBM Power. Here I only talk about on-premise installations, because you can't really influence the ones in the cloud from hyperscalers like Azure (Hyper-V), AWS (custom KVM) or GCP (custom KVM). For the biggest instances there are bare-metal offerings, which make the NUMA configuration pretty easy. The HANA application itself is NUMA-aware.

NUMA is a good keyword to start with, because it is one of the most ignored and least transparent performance issues. What is NUMA, and why should you pay attention to it when you install HANA on a hypervisor?

NUMA – Non-uniform Memory Access

“NUMA is a method of configuring a cluster of microprocessors in a multiprocessing system so that they can share memory locally, improving performance and the ability of the system to be expanded.”

=> OK, this doesn't sound very self-explanatory, does it?
Let’s take an example with a picture:

[Figure: local vs. remote memory access between NUMA nodes]

The performance impact depends on the type of CPU, the vendor (topology) and the number of sockets.

This means a local access is always 2-3 times faster than a remote one. But how can you influence the placement of a VM (virtual machine)?
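On the Linux level you can inspect this penalty yourself: `numactl --hardware` prints a distance matrix where the diagonal is local access and everything else is remote. A minimal sketch that parses a captured sample of that matrix (the sample numbers below are illustrative, not from a specific host):

```shell
# Captured sample of the distance matrix from `numactl --hardware`;
# the diagonal (10) is local access, off-diagonal (21) is remote.
sample='node distances:
node   0   1
  0:  10  21
  1:  21  10'

# Extract local and remote distance for node 0.
local=$(echo "$sample" | awk '/^  0:/ {print $2}')
remote=$(echo "$sample" | awk '/^  0:/ {print $3}')

# A ratio > 1 means remote memory is proportionally slower than local.
awk -v l="$local" -v r="$remote" 'BEGIN {printf "remote/local distance ratio: %.1f\n", r/l}'
```

A ratio of around 2 matches the 2-3x rule of thumb mentioned above.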

The hypervisor should normally take care of this. But in special cases, like big HANA VMs or wrong default VM settings, you have to adjust it manually. This should be done for all productive HANA servers. Normally the person who installed the HANA should be aware of this, but experience shows that in 90% of installations nobody cares.

IBM Power (PPC)


On IBM Power an optimization is pretty easy with the latest HMC versions:

# on ssh shell of the HMC
# Listing of all servers
$ lssyscfg -r sys -F name


# dynamic platform optimizer (DPO) => NUMA optimization
$ lsmemopt -m <pServer Name> -r lpar -o currscore

$ lsmemopt -m pserv1 -r lpar -o currscore
lpar_name=hana1,lpar_id=1,curr_lpar_score=100
lpar_name=hana2,lpar_id=2,curr_lpar_score=100
lpar_name=hana3,lpar_id=3,curr_lpar_score=none
lpar_name=hana4,lpar_id=4,curr_lpar_score=100
lpar_name=hana5,lpar_id=5,curr_lpar_score=none
lpar_name=hana6,lpar_id=6,curr_lpar_score=100
lpar_name=hana8,lpar_id=8,curr_lpar_score=32 << improvable LPAR


# on ssh shell of the HMC
# use DPO for optimization
$ optmem -m <Power Server Name> -o start -t affinity -p <name(s) of improvable LPAR(s)>

$ optmem -m pserv1 -o start -t affinity -p hana8

# check running background activities
$ lsmemopt -m <Power Server Name> 
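If you manage many Power servers, the score listing can be filtered automatically. A minimal sketch, assuming the `lsmemopt` output is saved as text in the format shown above; the threshold of 80 is my own assumption, not an official value:

```shell
# Flag LPARs whose NUMA affinity score is below a chosen threshold.
# Scores come from `lsmemopt ... -o currscore`; "none" means not scored.
scores='lpar_name=hana1,lpar_id=1,curr_lpar_score=100
lpar_name=hana3,lpar_id=3,curr_lpar_score=none
lpar_name=hana8,lpar_id=8,curr_lpar_score=32'

# Split on "=" and ",": field 2 = LPAR name, field 6 = score.
echo "$scores" | awk -F'[=,]' '$6 != "none" && $6+0 < 80 {print $2 " (score " $6 ")"}'
```

The flagged LPAR name (`hana8` here) is then what you would pass to `optmem` as shown above.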

VMware


On VMware this is trickier than on IBM Power, because the sizing rules also differ.

With VMware you can use half-socket sharing, or, if your VM is bigger than one NUMA node/socket, you have to round up and allocate the full socket. This leads to some resource wasting.

Here is a picture from ©VMware:

[Figure: ©VMware – VM sizes mapped to NUMA nodes/sockets]

Every VM that is bigger than one socket is called a ‘wide VM’.

Here is one example for you, which can also be checked in your own environment by using the shell on your ESXi host.

Alternatively, I’m sure you will find a way to contact me.

Example – remote memory access / overprovisioning

####################
ESXi host
Intel Xeon E5-2695 v4
18 cores per socket
2 sockets
72 vCPUs
1 TB RAM
####################

HANA sizing:

600 GB RAM
36 vCPUs

Current setup:

768 GB RAM
36 vCPUs

Sizing rules:

1 TB RAM (=> 2 sockets, because one NUMA node has 512 GB and we need more than that)
72 vCPUs

This is currently one of the most common mistakes, which I see in about 60% of all environments, because the VM admin is not aware of the SAP HANA sizing rules, and most of them are not aware of the influence their VM settings can have on the topology and the resulting performance. So, pay attention to placement and overprovisioning.
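The round-up rule behind those numbers can be written down as simple shell arithmetic. A sketch with the values from this example; the factor of 2 vCPUs per physical core (hyper-threading) is an assumption for this CPU model:

```shell
# Round-up rule: a VM that needs more RAM than one NUMA node holds
# must allocate whole sockets (values from the example above).
node_ram_gb=512      # RAM per NUMA node
cores_per_socket=18  # physical cores per socket
sized_ram_gb=600     # HANA sizing result

# Ceiling division: sockets needed so all memory can stay local.
sockets=$(( (sized_ram_gb + node_ram_gb - 1) / node_ram_gb ))
vcpus=$(( sockets * cores_per_socket * 2 ))  # 2 = hyper-threading factor
echo "allocate ${sockets} socket(s), $(( sockets * node_ram_gb )) GB RAM, ${vcpus} vCPUs"
```

With 600 GB sized, this lands on 2 full sockets (1 TB RAM) and 72 vCPUs, exactly the sizing-rule result above.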

ESX view


groupName           groupID    clientID    homeNode    affinity     nWorlds   vmmWorlds    localMem   remoteMem  currLocal%  cummLocal%
 vm.78924              58029           0           0         0x3          16          16    73177088           0         100          99
 vm.78924              58029           1           1         0x3          16          16    72204288           0         100         100
 vm.1237962         76880487           0           0         0x3          16          16    18254012   250242884           6          53
 vm.1237962         76880487           1           0         0x3          16          16   267603968      831488          99          66
 vm.1237962         76880487           2           0         0x3           4           4   145781060   121605820          54          56

Here we see an ESXi host with 2 VMs. 1237962 is our hdb01 HANA DB, which has 16+16+4 vCPUs (3 sockets), and we can see it consumes remote memory. Wait a moment – 3 sockets? Our physical server has only 2. Yes, this is possible with VMware, but it is additional overhead and costs performance. You can even create an 8-socket server within a 2-socket ESXi host, but it doesn’t make sense in the context of HANA. There are other applications where this feature is useful.
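Finding such VMs by eye gets tedious on a bigger host; the remoteMem column can be filtered automatically. A minimal sketch, assuming the scheduler statistics are saved as text in the column order printed above (the sample keeps only two rows; the memory value is left exactly as the tool reports it):

```shell
# Flag NUMA clients that hold remote memory (column 9 = remoteMem,
# column order as in the scheduler stats shown above).
stats='vm.78924   58029 0 0 0x3 16 16 73177088         0 100  99
vm.1237962 76880487 0 0 0x3 16 16 18254012 250242884   6  53'

echo "$stats" | awk '$9 > 0 {print $1 " client " $3 ": remoteMem=" $9}'
```

Any line this prints is a NUMA client that has to cross sockets for part of its memory.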

But all of these “virtual sockets” are located on the physical socket 0. This leads to an overprovisioning of this node, because the other VM additionally uses some of its resources.

nodeID        used        idle    entitled        owed  loadAvgPct       nVcpu     freeMem    totalMem
     0        5408       30591        5356           0          14          52    26703288   536736256
     1        1574       34426         926           0           3          16    85939588   536870912

Socket 0 is using 52 vCPUs and socket 1 only 16? It seems this ESXi host is a little unbalanced and overprovisioned.

vmdumper -l | cut -d \/ -f 2-5 | while read path; do egrep -oi "DICT.*(displayname.*|numa.*|cores.*|vcpu.*|memsize.*|affinity.*)= .*|numa:.*|numaHost:.*" "/$path/vmware.log"; echo -e; done

DICT                  numvcpus = "36"
DICT                   memSize = "786432"
DICT               displayName = "hdb01"
DICT        sched.cpu.affinity = "all"
DICT        sched.mem.affinity = "all"
DICT      cpuid.coresPerSocket = "4"
DICT      numa.autosize.cookie = "360001"
DICT numa.autosize.vcpu.maxPerVirtualNode = "16"
DICT        numa.vcpu.preferHT = "TRUE"
numaHost: NUMA config: consolidation= 1 preferHT= 1
numaHost: 36 VCPUs 3 VPDs 3 PPDs
numaHost: VCPU 0 VPD 0 PPD 0
numaHost: VCPU 1 VPD 0 PPD 0
numaHost: VCPU 2 VPD 0 PPD 0
numaHost: VCPU 3 VPD 0 PPD 0
numaHost: VCPU 4 VPD 0 PPD 0
numaHost: VCPU 5 VPD 0 PPD 0
numaHost: VCPU 6 VPD 0 PPD 0
numaHost: VCPU 7 VPD 0 PPD 0
numaHost: VCPU 8 VPD 0 PPD 0
numaHost: VCPU 9 VPD 0 PPD 0
numaHost: VCPU 10 VPD 0 PPD 0
numaHost: VCPU 11 VPD 0 PPD 0
numaHost: VCPU 12 VPD 0 PPD 0
numaHost: VCPU 13 VPD 0 PPD 0
numaHost: VCPU 14 VPD 0 PPD 0
numaHost: VCPU 15 VPD 0 PPD 0
numaHost: VCPU 16 VPD 1 PPD 1
numaHost: VCPU 17 VPD 1 PPD 1
numaHost: VCPU 18 VPD 1 PPD 1
numaHost: VCPU 19 VPD 1 PPD 1
numaHost: VCPU 20 VPD 1 PPD 1
numaHost: VCPU 21 VPD 1 PPD 1
numaHost: VCPU 22 VPD 1 PPD 1
numaHost: VCPU 23 VPD 1 PPD 1
numaHost: VCPU 24 VPD 1 PPD 1
numaHost: VCPU 25 VPD 1 PPD 1
numaHost: VCPU 26 VPD 1 PPD 1
numaHost: VCPU 27 VPD 1 PPD 1
numaHost: VCPU 28 VPD 1 PPD 1
numaHost: VCPU 29 VPD 1 PPD 1
numaHost: VCPU 30 VPD 1 PPD 1
numaHost: VCPU 31 VPD 1 PPD 1
numaHost: VCPU 32 VPD 2 PPD 2
numaHost: VCPU 33 VPD 2 PPD 2
numaHost: VCPU 34 VPD 2 PPD 2
numaHost: VCPU 35 VPD 2 PPD 2

Here we can see that the mapping of VPD to PPD is 1:1, but there is no physical third socket in an E5 server.

First of all, we have a wide VM, which means preferHT should be disabled. Another point is the limit of 16 vCPUs per virtual node (numa.autosize.vcpu.maxPerVirtualNode), which leads to this 3-socket setup: 36/16 = 2.25, rounded up to 3.
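The rounding that produces the three virtual nodes can be checked quickly:

```shell
# Why 36 vCPUs end up on 3 virtual NUMA nodes when a virtual node
# may hold at most 16 vCPUs (numa.autosize.vcpu.maxPerVirtualNode).
vcpus=36
max_per_node=16
nodes=$(( (vcpus + max_per_node - 1) / max_per_node ))  # ceiling division
echo "virtual NUMA nodes: ${nodes}"
```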

OK, such numbers are fine, but here are some pictures to illustrate what exactly this means:

[Figure: VM-to-socket placement on the 2-socket ESXi host]

[Figure: 768 GB of VM memory spilling beyond the 512 GB NUMA node]

In the last picture you can see that the 768 GB doesn’t fit into 512 GB, so remote access is used to satisfy the demand. In addition, the other VM should not be spread over two NUMA nodes. This has a bad effect on the HANA performance.

So, in the end you have two options:

◈ Reduce the size of your HANA and resize the VM so that it fits into one NUMA node
◈ Move the second VM away, so that the whole ESXi host can be used by the HANA VM

It is not allowed to share the socket of a productive HANA VM with another VM (regardless of whether it is an SAP application or not). This also means that overprovisioning is not allowed.

The shown example is unsupported in several ways. SAP can discontinue support; I haven’t heard from customers or colleagues that this has ever happened, but what often happens is that VMware support gets contacted, and you can be pretty sure they will find such a misconfiguration, and your issue will only be processed if you have a supported setup.
