Fault tolerance – SMP-FT, reservations and shares
Released in early 2015, vSphere 6.0 is the latest VMware virtualization platform. It presents a high-performance, reliable and resilient environment that allows users and companies to centralize their resources and management.
Up until vSphere 5.5, only virtual machines with a single virtual CPU could be protected by Fault Tolerance (FT), which limited its usefulness for most production workloads. vSphere 6.0 lifts that restriction to reduce downtime on business-critical VMs and applications. One of its brand-new features is SMP-FT (Symmetric Multi-Processor Fault Tolerance), which allows VMs with up to 4 vCPUs and 64GB of RAM to keep running with zero downtime when a host fails.
To enable Fault Tolerance, the following requirements must be met:
– A vSphere Standard or Enterprise license (for up to 2 vCPUs) or an Enterprise Plus license (for up to 4 vCPUs)
– Enough free space in all the datastores taking part in the Fault Tolerance process
– A 10Gb connection (both in the hosts and in the VM)
– An FT port group created across all hosts taking part in the process
– A VMkernel adapter on each host, connected to the FT port group
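The sizing limits in this list can be turned into a quick pre-flight check. The sketch below is only an illustration: the function name, the edition labels and the parameters are made up for this article and are not part of any VMware API. It covers the license, memory and network limits mentioned above, and it does not look at datastore space or port group configuration.

```python
# Limits taken from the requirements listed above (vSphere 6.0).
# Edition labels are hypothetical names chosen for this sketch.
MAX_FT_VCPUS = {
    "standard": 2,
    "enterprise": 2,
    "enterprise_plus": 4,
}
MAX_FT_MEMORY_GB = 64
MIN_FT_NETWORK_GBPS = 10


def ft_blockers(edition, vcpus, memory_gb, network_gbps):
    """Return the reasons a VM cannot be protected by FT (empty list if none)."""
    problems = []
    limit = MAX_FT_VCPUS.get(edition)
    if limit is None:
        problems.append(f"unknown license edition: {edition}")
    elif vcpus > limit:
        problems.append(f"{edition} allows at most {limit} vCPUs, VM has {vcpus}")
    if memory_gb > MAX_FT_MEMORY_GB:
        problems.append(f"VM has {memory_gb}GB of RAM, limit is {MAX_FT_MEMORY_GB}GB")
    if network_gbps < MIN_FT_NETWORK_GBPS:
        problems.append(f"FT network is {network_gbps}Gb, {MIN_FT_NETWORK_GBPS}Gb required")
    return problems


# A 4 vCPU VM on Enterprise Plus with a 10Gb link passes every check.
print(ft_blockers("enterprise_plus", 4, 12, 10))
# The same VM on a Standard license with a 1Gb link fails two of them.
for reason in ft_blockers("standard", 4, 12, 1):
    print(reason)
```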
Shares and reservations
Shares and reservations are two different ways of tuning how resources are allocated to a particular VM. This section explains the key concepts and the differences between them.
Reservations: A reservation is a guarantee on either memory or CPU for a VM. It works in a different way depending on the resource.
For CPU, the reservation is a guarantee of clock cycles and is defined in MHz. The CPU scheduler will give at least that amount of resources to the selected VM. If the VM does not use these resources, they are made available to other VMs.
For memory, the reservation is a guarantee of physical RAM. The total amount of a virtual machine’s memory is equal to the sum of the reservation and the swap file. If a VM has a reservation of 0MB, the swap file will be as big as the VM’s memory. For example, if a 4GB VM has a reservation of 1GB, the swap file will be 3GB. That is, the VM will always have 1GB of memory available, and the other 3GB will depend on the host’s load.
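The relationship between configured memory, reservation and swap file is simple subtraction, and a few lines of Python make it easy to check other sizes. The function name below is hypothetical and exists only for this illustration.

```python
def swap_file_size_mb(configured_mb, reservation_mb=0):
    """Swap file size for a VM: configured memory minus the memory reservation."""
    if not 0 <= reservation_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and the configured memory")
    return configured_mb - reservation_mb


# The 4GB VM with a 1GB reservation from the example above.
print(swap_file_size_mb(4096, 1024))  # 3072
# With no reservation, the swap file is as big as the VM's memory.
print(swap_file_size_mb(4096))        # 4096
```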
Shares: Shares determine how much access a VM gets to a resource relative to the other VMs competing for it. By default, each VM has 1000 CPU shares per vCPU. Shares can be set for memory as well, but the example here uses CPU. It’s important to mention that shares only come into play when contention is present, in other words, when VMs compete for the same resources. To picture this, suppose VM A has 3000 shares and VM B has 1000. Under contention, VM A will get 75% of the CPU time and VM B only 25%. When there is no contention, shares have no effect and each VM can use as much physical CPU as it demands.
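The split under contention is simply each VM’s shares divided by the total shares of the competing VMs. The sketch below reproduces that arithmetic. It is a simplification that ignores reservations, limits and what each VM actually demands, and the function name is hypothetical.

```python
def cpu_split_under_contention(shares):
    """Map each VM to the fraction of CPU it is entitled to when all VMs compete."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {vm: value / total for vm, value in shares.items()}


# VM A with 3000 shares against VM B with 1000 shares.
print(cpu_split_under_contention({"VM A": 3000, "VM B": 1000}))
# {'VM A': 0.75, 'VM B': 0.25}
```

Adding a third VM with 1000 shares to the dictionary shows how the existing entitlements shrink: VM A drops to 60% and the other two get 20% each.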
Shares and reservations alongside Fault Tolerance
In this article I’ll explain how to enable Fault Tolerance and how shares and reservations interact with it.
For this demonstration, I created a lab with 2 virtual ESXi 6.0 hosts, each with 4 vCPUs and 12GB of RAM.
To begin with, I’ll use a virtual machine running Windows Server 2008 R2 with VMware Tools installed. It’s important to note that all actions must be performed from the vSphere Web Client, as the new features included in vSphere 6.0 and beyond are only available through it.
After enabling Fault Tolerance, the VM settings cannot be modified, so shares and reservations must be set first, before powering the VM on.
CPU properties.
Memory properties.
When we enable Fault Tolerance, a wizard appears to guide us through the resource allocation process.
At this stage, we’ll have to decide where the shadow VM will reside.
For simplicity, I will place it in the same datastore as the original one.
Now, select the host. Remember, it can’t be the host that contains the original VM.
After a few seconds, we’ll see that Fault Tolerance has been enabled on the virtual machine. FT-protected VMs are easy to identify by the color of their icon, which is a darker blue than that of ordinary VMs.
We’ll monitor the state of the VM when Fault Tolerance kicks in and brings up the shadow VM on another host. To do this, we’ll run a continuous ping against the VM while interacting with the VM console.
To simulate the host failure, we’ll shut down the nested host that contains the running VM.
As we can see, the failover took place and only 2 packets were lost. In other words, we barely noticed it.
If we check the vCenter events, we’ll see that the FT failover completed successfully on the VM. It is important to note that the VM settings are maintained through the process, and the VM keeps the same shares and reservations it had before.
Fault Tolerance comes in handy when a hardware failure occurs, as it can prevent virtual machines from going offline. However, it doesn’t protect against software or application failures of any kind, as every problem that happens in the original VM is mirrored in the shadow VM.
This is all for today. I’d like to thank you for reading. For more information about Fault Tolerance and its capabilities, please refer to this KB.