Saturday, October 4, 2014

How VMware vMotion works?



VMware vMotion enables live migration of a running virtual machine from one physical server to another with zero downtime, continuous service availability, and complete transaction integrity. This makes it possible to perform hardware maintenance without disrupting business operations.

VMware vMotion is enabled by three underlying technologies:

1)  The entire state of a virtual machine is encapsulated in a set of files stored on shared storage.

2)  The active memory and precise execution state of the virtual machine are rapidly transferred over a high-speed vMotion network in an iterative pre-copy, allowing the virtual machine to switch from the source host to the destination host. Memory pages modified during the copy (dirty memory) are tracked in a memory bitmap; the bitmap holds the addresses of those pages, not their contents. Once the bulk of memory and the system state have been copied, the source virtual machine is quiesced and the bitmap is transferred, and the destination host reads the addresses in the bitmap and requests the contents of those pages from the source host. The virtual machine then resumes on the destination ESXi host. The final switchover typically takes less than 2 seconds.
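The iterative pre-copy described above can be sketched in a few lines of Python. This is a hypothetical model for illustration only, not VMware's implementation; the page-dirtying input and the switchover threshold are assumptions made to keep the example self-contained.

```python
def precopy_migrate(memory, dirty_after_pass, max_passes=10, switchover_limit=4):
    """Illustrative model of iterative pre-copy migration.

    memory: dict of page_id -> contents on the source host.
    dirty_after_pass: list of sets; pages dirtied by the guest during each pass.
    Returns the destination copy and the number of passes taken.
    """
    target = {}
    to_send = set(memory)                  # first pass copies every page
    for i in range(max_passes):
        for page in to_send:
            target[page] = memory[page]    # transfer over the vMotion network
        dirtied = dirty_after_pass[i] if i < len(dirty_after_pass) else set()
        if len(dirtied) <= switchover_limit:
            # Remaining dirty set is small: quiesce the source VM, send the
            # final dirty pages, and resume on the destination host.
            for page in dirtied:
                target[page] = memory[page]
            return target, i + 1
        to_send = dirtied                  # the "memory bitmap": only dirty pages
    raise RuntimeError("pre-copy did not converge")
```

Each pass only re-sends pages the guest dirtied during the previous pass, which is why the final quiesced switchover can be so short.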

3)  The network is also virtualized, so the virtual machine's network identity and active network connections are preserved across the migration. vMotion manages the virtual MAC address: once the destination virtual machine is activated, vMotion sends a RARP broadcast to the physical switch so it learns the new physical location of the virtual MAC address. After the virtual machine is successfully running on the destination host, its memory is freed on the source host.
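To make the RARP announcement concrete, here is a minimal sketch of what such a frame looks like on the wire (EtherType 0x8035, broadcast destination, with the VM's MAC as the source so switches relearn its port). This is an illustrative byte-layout exercise, not VMware code.

```python
import struct

def build_rarp_announce(vm_mac: bytes) -> bytes:
    """Sketch of a RARP broadcast frame like the one vMotion sends so that
    physical switches relearn the VM's MAC on the destination host's port."""
    broadcast = b"\xff" * 6
    ethertype = struct.pack("!H", 0x8035)              # RARP EtherType
    # RARP payload: hw type 1 (Ethernet), proto 0x0800, hlen 6, plen 4,
    # opcode 3 (reverse request)
    header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 3)
    payload = header + vm_mac + b"\x00" * 4 + vm_mac + b"\x00" * 4
    return broadcast + vm_mac + ethertype + payload
```

Because the destination address is the Ethernet broadcast, every switch on the segment sees the frame and updates its MAC table immediately, so in-flight TCP connections keep flowing to the new host.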

VMware vMotion also carries the virtual machine's resource allocations (CPU and memory) over to the destination host. At worst, a continuous ping (ping -t) against the virtual machine may show the loss of a single packet during switchover, and most applications can tolerate the loss of a packet or two without disruption.

VMware vMotion requirements:

1) Shared storage

2) Gigabit Ethernet NIC with a VMkernel port enabled for vMotion on each ESXi host

A successful vMotion relies on the following host conditions:

1) The source and destination hosts must be configured with identical virtual switches, with a vMotion-enabled VMkernel port.

2) All port groups to which the virtual machine is connected must exist on both ESXi hosts; port group names are case-sensitive, and VLAN settings must match.

3) The processors of the two hosts must be compatible.

A successful vMotion relies on the following virtual machine conditions:

1) The virtual machine must not be connected to any physical device available on only one ESXi host.

2) The virtual machine must not be connected to an internal-only virtual switch.

3) The virtual machine must not have CPU affinity set to a specific physical CPU.

4) All of the virtual machine's files must reside on a VMFS or NFS datastore accessible to both ESXi hosts.
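The virtual machine conditions above lend themselves to a simple pre-flight check. The sketch below is illustrative only; the dictionary fields are hypothetical stand-ins, not the vCenter API.

```python
def vmotion_precheck(vm, source, dest):
    """Return a list of reasons the VM cannot vMotion (empty list = OK).

    vm, source, dest are plain dicts (hypothetical fields, for illustration):
    vm: port_groups (set), local_devices (list), internal_only_switch (bool),
        cpu_affinity (None or CPU id), datastores (set)
    source/dest: port_groups (set), datastores (set)
    """
    errors = []
    missing = vm["port_groups"] - dest["port_groups"]   # names are case-sensitive
    if missing:
        errors.append(f"port groups missing on destination: {sorted(missing)}")
    if vm["local_devices"]:
        errors.append("VM is connected to a device available on only one host")
    if vm["internal_only_switch"]:
        errors.append("VM is connected to an internal-only virtual switch")
    if vm["cpu_affinity"] is not None:
        errors.append("VM has CPU affinity set to a specific physical CPU")
    if not vm["datastores"] <= (source["datastores"] & dest["datastores"]):
        errors.append("VM files are not on storage shared by both hosts")
    return errors
```

Collecting every violation rather than failing on the first mirrors how vCenter's own compatibility check reports all problems at once.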


A high-priority migration does not proceed if the resources to be reserved for it are unavailable. A standard-priority migration may proceed slowly, and may fail to complete, if sufficient resources are not available.

At 14%, a pause occurs while the hosts establish communication and gather information about the memory pages to be migrated.

At 65%, another pause occurs when the source virtual machine is quiesced and the remaining dirty memory pages are fetched from the source host.

Sometimes a vMotion migration fails at a specific percentage. Common causes by percentage are:

1) 9% – an issue with the ESXi NICs; upgrade the NIC firmware to resolve it.

2) 10% – the datastore containing the VM was mounted read-only; read/write access is required.

3) 10% – the log.rotateSize value in the virtual machine's .vmx file is set so low that the vmware.log file rotates before the destination host can acquire the VMFS lock on it; unable to obtain a proper file lock, the destination host fails the migration.

4) 14% – multiple VMkernel ports exist on the same network, or the wrong VMkernel interface is selected for vMotion.

5) 78% – NFS storage path UUID mismatch between the hosts.

6) 82% – incorrect datastore information or out-of-date virtual hardware.

7) 90% – attempting to vMotion a 64-bit virtual machine from an ESX/ESXi host with VT enabled in the BIOS to a host with VT disabled.

8) 90% – the host's Migrate.MemChksum value is set to 1 while the other hosts are set to 0.
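When troubleshooting, it can help to keep the table above in a quick-lookup form. This is just the list above restated as a tiny helper; the function name is mine, not a VMware tool.

```python
# Quick-reference lookup of the common vMotion stall percentages listed above.
VMOTION_FAILURE_HINTS = {
    9:  ["ESXi NIC issue - upgrade the NIC firmware"],
    10: ["datastore mounted read-only - remount with read/write access",
         "log.rotateSize in the .vmx set too low - destination cannot "
         "acquire the vmware.log VMFS lock"],
    14: ["multiple VMkernel ports on the same network, or wrong VMkernel "
         "interface selected for vMotion"],
    78: ["NFS storage path UUID mismatch"],
    82: ["incorrect datastore information or out-of-date virtual hardware"],
    90: ["VT enabled in BIOS on source but not on destination (64-bit VMs)",
         "Migrate.MemChksum set to 1 on one host but 0 on the others"],
}

def failure_hints(percent: int):
    """Return the known common causes for a failure at this percentage."""
    return VMOTION_FAILURE_HINTS.get(
        percent, ["no known common cause at this percentage"])
```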
