Getting your virtual machine dimensions right
Tailor Made
Big Enough Buffer
Availability and performance are crucial, especially for monster VMs. By choosing VM dimensions that are slightly smaller than the available NUMA node or host resources, you can prevent competition between the VM and the hypervisor. The kernel functions listed earlier are hypervisor functions, which means they take priority over the VM.
If you were to size VMs such that all the RAM and CPU were used up and the VMkernel then had to perform another task (e.g., a vMotion migration), the VMs would be deprioritized in CPU scheduling. For the guest operating system and the application, this would at best mean performance hits and, in the worst case, a freeze. Correct VM sizing prevents this situation.
What does correct sizing look like? In other words, how much CPU and RAM does the hypervisor actually need? The answer is: It depends on which features you are using on the ESXi host. If you use the hypervisor out of the box, including clustering (DRS/HA), you can expect it to consume between five and seven percent of the host's RAM and CPU resources. If you add vSAN, you will naturally require more RAM and more CPU time, which can quickly push the share to 10 percent or more.
An example shows how to determine the CPU overhead:
- The server has four 28-core CPUs (i.e., 56 hyperthreads per CPU).
- Five percent of 28 cores is 1.4 cores.
- Seven percent of 28 cores is 1.96 cores.
If you reserve two cores per socket for the kernel and virtualization overhead, you are on the safe side. Therefore, you have 26 cores or 52 hyperthreads per NUMA node for the VM. Figure 2 shows a four-socket server with 28 cores per socket and 6TB of RAM.
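As a quick cross-check, the per-node CPU budget works out as follows:
28 cores – 2 reserved cores = 26 cores per NUMA node
26 cores x 2 hyperthreads = 52 hyperthreads per NUMA node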

Sizing VMs Correctly
The question is how to size monster VMs correctly, given the above rules. In Figure 3 you can see the server illustrated in Figure 2 with 6TB of RAM and four sockets, each with 28 cores or 56 hyperthreads. If you now create a VM, you could get the dimensions completely wrong and go for 56 vCPUs and 1.5TB of vRAM (shown in the figure as the red monster VM2). On the other hand, you could do it the right way: 26 cores or 52 hyperthreads per NUMA node. I've been generous here and allotted 10 percent for the RAM overhead. The resulting clean VM size is 1.35TB of RAM and 52 vCPUs (i.e., the number of hyperthreads that correlates to 26 cores). This VM is monster VM1, shown in green.
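The RAM figure is derived the same way:
6TB ÷ 4 NUMA nodes = 1.5TB per node
1.5TB – 10 percent overhead = 1.35TB of vRAM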

When talking about monster VMs, you will often see significantly larger VMs being built: VMs that are larger than a NUMA node. Figure 3 also shows a dual-NUMA-node VM that spans nodes 2 and 3. The same rule applies here: two cores per NUMA node as a reserve for the ESXi host and 10 percent of the RAM as overhead. The resulting size is 104 vCPUs and 2.7TB of RAM. If you inflate the VM to the max (call it monster VM4), it hogs all four NUMA nodes and is sized at 5.4TB with 208 vCPUs:
4 x (28 – 2) cores x 2 hyperthreads = 208 hyperthreads
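The RAM calculation is analogous: 4 nodes x 1.5TB x 0.9 = 5.4TB of vRAM.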
So far, so good. As you will remember, the goal for SAP HANA, Oracle, and other business-critical monster VMs is worry-free operation, maximum availability, and maximum performance. A few more arrangements are in order: Reserve all of the VM's RAM and its entire configured CPU performance, which also pays off later during vMotion operations.
Note that this reservation is not made as a CPU count, but as the corresponding megahertz/gigahertz value. Returning to NUMA alignment: If the VM is larger than a NUMA node, it is advisable not just to give the VM vCPUs, but also to adapt the virtual CPU topology accordingly (i.e., to work with virtual cores and sockets). If you create a new VM, the default for the CPU topology is Assigned at power on. You could also say that ESXi selects the appropriate topology.
However, for multi-NUMA-node VMs, explicitly determining the topology makes sense in the VM Options tab, where you enter the appropriate number in the Cores per socket field. For example, if 16 vCPUs are assigned to the VM and it is a dual-NUMA-node VM, you would enter eight cores per socket here. You now automatically get to see Sockets: 2 and a performance warning; however, the settings are correct, and you can ignore the warning with a clear conscience.
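To make this concrete, the corresponding vmx entries could look like the following sketch. The values are placeholders for a hypothetical 16-vCPU, 256GB VM on 2.2GHz cores: memsize and sched.mem.min are specified in megabytes and sched.cpu.min in megahertz, so reserving everything means setting the memory reservation equal to the memory size and the CPU reservation to 16 x 2,200MHz. The same reservations can, of course, be entered in the vSphere Client instead:
numvcpus = 16
cpuid.coresPerSocket = 8
memsize = 262144
sched.mem.min = 262144
sched.cpu.min = 35200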
Controlling NUMA Alignment
If you now apply these VM options to monster VM4 from the previous example, it would be configured with 208 vCPUs and 52 cores per socket, resulting in four virtual sockets. The VM therefore has a NUMA layout that perfectly matches that of the physical server.
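In vmx terms (again just a sketch), monster VM4's topology boils down to:
numvcpus = 208
cpuid.coresPerSocket = 52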
Back at the smaller monster VM (single or dual NUMA node), you have already configured the perfect size, but how do you ensure that the VM is really restricted to the physical NUMA node and not unfavorably distributed across several nodes? If a VM fits perfectly on a NUMA node, it is referred to as "NUMA aligned;" if not, it is "NUMA unaligned." You will want to avoid the second case, because it causes issues with remote memory access and unwanted latency, which I referred to earlier.
Monster VMA in Figure 4 is therefore perfectly aligned, whereas VMB is unaligned. How can you remedy this situation? To begin, you want the vCPUs to use the hyperthreads on the local socket and not switch to other cores on other sockets, which you can achieve with a VM advanced setting in the VM's vmx file:
numa.vcpu.preferHT=TRUE
An additional parameter will ensure that VMA in the example is really restricted to NUMA node 0:
numa.nodeAffinity = 0
If you also want to clean up the monster VMB NUMA alignment and assign it to NUMA node 1, you would need the following configuration for VMA:
numa.vcpu.preferHT=TRUE
numa.nodeAffinity = 0
and this configuration for VMB:
numa.vcpu.preferHT=TRUE
numa.nodeAffinity = 1
In Figure 5, monster VMC is assigned to the remaining two NUMA nodes and now spans two nodes (NUMA nodes 2 and 3). This VM needs the following configuration parameters:
numa.vcpu.preferHT=TRUE
numa.nodeAffinity = 2,3
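To verify that these settings have the desired effect, a quick check with esxtop on the host is one option: Switch to the memory view (m) and enable the NUMA statistics fields (via f; the exact field letter can vary between ESXi releases). The NHN column then shows each VM's NUMA home node(s), and N%L shows the percentage of memory accessed locally; for the aligned VMs above, NHN should match the configured node affinity and N%L should stay close to 100 percent.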