Optimizing Windows Server 2016 performance
Torque Booster
Windows Server 2016's default settings might not always meet your network requirements. Depending on its purpose, the server requires different tweaks to unleash its true performance. In this article, I will be optimizing the RAM, CPU, cache, and storage media.
The performance of Windows Server 2016 depends on the underlying hardware. Ideally, you should use the most up-to-date hardware available for Server 2016 [1] (e.g., 64-bit processors). The cores should run at the highest possible frequency, because a processor with fewer cores but twice the clock speed can be significantly faster than a processor with more cores and a normal clock speed. More cores, then, do not automatically mean more performance.
Above all, Hyper-V performance benefits from clock speed, because the hypervisor distributes a server's resources to the virtual machines (VMs), which can share a core. In this case, a higher clock speed is preferable to having more cores that the VM in question does not use at all.
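The trade-off between clock speed and core count can be illustrated with a deliberately idealized model. The function name and the numbers below are assumptions for illustration only; the model ignores cache effects, instructions per cycle, and scheduling overhead, and it simply assumes that a workload can use no more cores than it has runnable threads.

```python
def estimated_runtime(work_cycles, clock_ghz, cores, threads):
    """Seconds needed for a workload under an idealized model (assumption):
    only as many cores as there are runnable threads contribute."""
    usable_cores = min(cores, threads)
    cycles_per_second = clock_ghz * 1e9 * usable_cores
    return work_cycles / cycles_per_second

# Hypothetical VM workload: 12 billion cycles spread over two threads.
few_fast_cores = estimated_runtime(12e9, clock_ghz=4.0, cores=4, threads=2)
many_slow_cores = estimated_runtime(12e9, clock_ghz=2.0, cores=16, threads=2)

print(f"4 cores at 4GHz:  {few_fast_cores:.1f}s")
print(f"16 cores at 2GHz: {many_slow_cores:.1f}s")
```

In this sketch, the 14 additional cores of the second machine stay idle, so the machine with fewer but faster cores finishes in half the time.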
The RAM performance and size, as well as the storage media's I/O performance, must match the processor. Even if obsolete server hardware can be upgraded with new processors, bottlenecks can still occur quickly if the rest of the server's hardware does not match the new processor's performance.
If you are using Hyper-V, the processor must be able to handle Second Level Address Translation (SLAT). The function is integrated into Intel processors in the form of Extended Page Tables (EPT) and into AMD processors as Nested Page Tables (NPT). The function can be read out with systeminfo.exe and is shown as Second Level Address Translation. SLAT allows the hypervisor to accelerate memory access.
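The SLAT check can also be scripted. The following is a minimal sketch, assuming an English-language systeminfo.exe whose output reports the feature as a "Yes" or "No" value after the label; the function names are hypothetical. The line can be missing altogether, for example when a hypervisor is already running on the machine, so the parser returns None in that case rather than guessing.

```python
import subprocess

SLAT_LABEL = "Second Level Address Translation"

def parse_slat(systeminfo_text):
    """Return True or False for the SLAT line, or None if it is absent.

    Assumes English output with a "Yes"/"No" value after the last colon.
    """
    for line in systeminfo_text.splitlines():
        if SLAT_LABEL in line:
            value = line.rsplit(":", 1)[-1].strip().lower()
            return value == "yes"
    return None

def query_slat():
    """Run systeminfo.exe on the local Windows machine and parse its output."""
    result = subprocess.run(
        ["systeminfo.exe"], capture_output=True, text=True, check=True
    )
    return parse_slat(result.stdout)
```

On a Windows server, calling query_slat() would run the tool; parse_slat() can be used on its own with output that was collected earlier.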
Measuring in the Right Place
In general, before optimizing, you should use performance monitoring to measure exactly where the server's performance bottlenecks originate and whether they are attributable to the processor or memory (Figure 1). The processor can appear to be the bottleneck when the real cause is too little main memory for the workload. After all, swapping out pages is bound to affect the processor.
CPU usage does not pose a problem if it is above 90 percent for a short period of time; however, if it stays at this level for an extended period of time, it does become a problem. In multiprocessor systems, the focus is on the System object performance indicators in performance monitoring. Information from several system components is summarized there.
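The difference between a short spike and a sustained load can be expressed in a few lines. This is only a sketch: the 90 percent threshold comes from the text, whereas the number of consecutive samples that counts as "extended" (30 here) and the function names are assumptions you would adapt to your sampling interval.

```python
def longest_run_above(samples, threshold=90.0):
    """Length of the longest run of consecutive samples above the threshold."""
    longest = current = 0
    for value in samples:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def is_sustained(samples, threshold=90.0, min_consecutive=30):
    """True if usage stayed above the threshold for min_consecutive samples."""
    return longest_run_above(samples, threshold) >= min_consecutive

short_spike = [20.0, 95.0, 97.0, 30.0, 25.0]
long_plateau = [40.0] + [95.0] * 40 + [35.0]

print(is_sustained(short_spike))
print(is_sustained(long_plateau))
```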
The Processor Time performance indicator of the Processor object is also of interest. If many different processes are running, fairly even load distribution is important. In a single process, it is important to divide the load into balanced threads. A thread is a process execution unit. If a process uses several threads, they can be executed on different processors. The distribution is based on the utilization of the individual CPUs by the system. A long processor queue means that several threads are ready for computing, but the system has not yet assigned them any computing time. The rule of thumb for this value is that it should not be greater than 2 too frequently. However, if the CPU usage is relatively low on average, this value plays only a minor role.
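The rule of thumb for the queue length can be sketched as a small check. The limit of 2 is taken from the text; what counts as "too frequently" (more than 20 percent of the samples) and as "relatively low" average CPU usage (below 50 percent) are assumptions chosen purely for illustration, as is the function name.

```python
def queue_is_concern(queue_samples, cpu_samples,
                     queue_limit=2, max_fraction=0.2, busy_cpu=50.0):
    """Apply the queue-length rule of thumb to collected samples.

    max_fraction and busy_cpu are illustrative assumptions, not fixed values.
    """
    if not queue_samples or not cpu_samples:
        return False
    over_limit = sum(1 for q in queue_samples if q > queue_limit)
    too_frequent = over_limit / len(queue_samples) > max_fraction
    average_cpu = sum(cpu_samples) / len(cpu_samples)
    # A long queue matters little while the CPU is mostly idle on average.
    return too_frequent and average_cpu >= busy_cpu

print(queue_is_concern([1, 4, 5, 3, 6], [85.0, 90.0, 88.0, 92.0, 95.0]))
print(queue_is_concern([1, 4, 5, 3, 6], [10.0, 15.0, 12.0, 8.0, 20.0]))
```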
A constantly high CPU utilization rate clearly shows that the processor in a server is overloaded. The Windows Server 2016 Performance Monitor shows you the performance indicator Processor: %Processor Time, which is the percentage of time the CPU spends executing threads that are not idle. A constant value of 80 to 90 percent is too high. For multiprocessor systems, you need to monitor a separate instance of this performance counter for each processor. This value represents the sum total of processor time for a specific processor.
Additionally, you can monitor the processor via Processor: %Privileged Time, which gives you the percentage of the total time the processor takes to execute Windows kernel commands, such as processing I/O requests. A further important indicator is Processor: %User Time, which returns the percentage of the total time required by the processor to run user processes. System: Processor Queue Length shows the number of threads waiting for CPU time. A processor bottleneck occurs if a process's threads require more processor cycles than are available. If many processes are constantly competing for processor time, you need to install a faster processor.
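Besides the graphical Performance Monitor, the counters named above can be sampled on the command line with typeperf and evaluated in a script. The sketch below only builds the command and parses the CSV text that typeperf prints; the function names are hypothetical, and the parser assumes the usual layout of a header row followed by one row per sample, with status lines before and after that it skips.

```python
import csv
import io

# Counter paths for the indicators discussed in the text.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Processor(_Total)\% Privileged Time",
    r"\Processor(_Total)\% User Time",
    r"\System\Processor Queue Length",
]

def typeperf_command(samples=5, interval_s=1):
    """Build a typeperf call: -sc is the sample count, -si the interval."""
    return ["typeperf", *COUNTERS, "-sc", str(samples), "-si", str(interval_s)]

def parse_typeperf(csv_text):
    """Map each counter column to its list of numeric samples."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if len(row) > 1]
    if not rows:
        return {}
    header, data = rows[0], rows[1:]
    result = {name: [] for name in header[1:]}
    for row in data:
        for name, cell in zip(header[1:], row[1:]):
            try:
                result[name].append(float(cell))
            except ValueError:
                pass  # skip status text and empty cells
    return result
```

On the server itself, the command list could be passed to subprocess.run() and the captured output handed to parse_typeperf().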
Optimizing RAM and Processor Cache
Microsoft recommends processors with the largest possible L2 and L3 caches. Some CPUs also offer an additional L4 cache as the last-level cache (LLC). A larger cache can often increase a server's processor performance more than a higher clock speed can. Even with Windows Server 2016, you should install as much RAM as possible in the server. If the amount of memory is not sufficient to run a server application, Windows Server 2016 pages data out from memory to disk. Even if the machine is equipped with SSDs or flash storage, this paging significantly affects performance.
The best way to monitor memory on servers is to monitor performance during operation. First, pay attention to the value of Memory: Available Bytes, which shows how many bytes of RAM are currently available for use by processes. Low values can indicate that the total amount of memory available on the server is too low or that an application does not free up memory. Second, use Memory: Pages/sec to determine the number of pages that were read from or written to the disk because of page faults in order to free up space. A high value can indicate excessive swapping. Finally, monitor Memory: Page Faults/sec to check whether disk activity is caused by swapping out.
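The three memory counters can be combined into a rough assessment. This is a sketch under stated assumptions: the thresholds (512MB of available memory, 1,000 pages or faults per second) and the function name are hypothetical and need to be tuned to the server in question. It relies on the fact that Page Faults/sec also counts soft faults that are resolved in memory, whereas Pages/sec only reflects pages that actually go to or come from disk.

```python
def assess_memory(available_bytes, pages_per_sec, page_faults_per_sec,
                  min_available_bytes=512 * 1024**2, max_rate=1000.0):
    """Return a list of findings; both thresholds are illustrative assumptions."""
    findings = []
    if available_bytes < min_available_bytes:
        findings.append("low available memory")
    if pages_per_sec > max_rate:
        findings.append("heavy paging to disk")
    elif page_faults_per_sec > max_rate:
        # Many faults but little disk paging: mostly resolved in memory.
        findings.append("mostly soft page faults")
    return findings

GIB = 1024**3
print(assess_memory(8 * GIB, 5.0, 20.0))
print(assess_memory(100 * 1024**2, 5000.0, 6000.0))
print(assess_memory(8 * GIB, 10.0, 5000.0))
```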
Fast Disks and PCIe
Of course, Windows Server 2016 should be able to access the fastest possible storage systems. Microsoft recommends PCI Express (PCIe) interfaces for connecting the server's primary storage and also for connecting the network adapters. You should use at least PCIe x8 and network adapters with 10Gbps or more.
Hard disks in servers should have the highest possible revolutions per minute. The more revolutions per minute, the lower the access times. In general, drives with 15,000rpm are recommended. Here, 2.5-inch enterprise disks often offer shorter access times than their 3.5-inch counterparts. In general, Microsoft recommends the use of SSD or flash memory. NVMe SSDs in particular offer very high performance.
In Windows Server 2016, three storage tiers can be used in the storage spaces: NVMe, SSD, and HDD. NVMe storage is used for caching data, whereas SSDs and HDDs are used for traditional data storage and archiving. However, you can create different combinations of storage tiers with these three media types. On servers, it makes sense to use different types of storage to get the best possible performance.