Saturday, October 4, 2014

What's new in vSphere 5.5?

  • Hot-Pluggable SSD PCI Express (PCIe) Devices - ability to hot-add or hot-remove the solid-state disks.
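  • Support for Reliable Memory Technology - ESXi runs in memory, so if an uncorrectable memory error occurs, the host crashes and its VMs go down with it. To protect against memory errors, ESXi takes advantage of hardware-vendor-enabled Reliable Memory Technology, through which the hardware reports the memory regions it considers more reliable so that ESXi can place critical components such as the VMkernel there.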
  • Enhancements for CPU C-States.
  • Storage - Support for 62TB VMDK - vSphere 5.5 increases the maximum size of a virtual machine disk file (VMDK) to 62TB (the maximum VMFS volume size is 64TB, while the maximum VMDK file size is 62TB). The maximum size of a virtual-mode Raw Device Mapping (RDM) has also been increased to 62TB.
  • Expanded vGPU Support - 5.1 was limited to Nvidia GPUs; 5.5 also supports Intel and AMD GPUs.
  • Doubled Host-Level Configuration Maximums.
  • 16Gb End-to-End Support – In vSphere 5.5 16Gb end-to-end FC support is now available.  Both the HBAs and array controllers can run at 16Gb as long as the FC switch between the initiator and target supports it.
  • Graphics acceleration now possible on Linux Guest OS.
  • vSphere App HA - works in conjunction with vSphere HA host monitoring and VM monitoring to improve application uptime. It can be configured to restart an application service when an issue is detected, and it can also reset the VM if the application fails to start.
  • For hosts with different CPU generations in a cluster (vMotion still requires CPUs from the same vendor):
    • Per-Virtual-Machine CPU masking - hide or show the NX/XD bit (No Execute / Execute Disable).
    • VMware Enhanced vMotion Compatibility (EVC) - On the hardware side, Intel and AMD added functions to their CPUs that allow the CPU ID value returned by the CPU to be modified. Intel calls this functionality FlexMigration; AMD embedded it into the AMD-V virtualization extensions. On the software side, VMware created software that takes advantage of this hardware functionality to create a common CPU ID baseline for all servers within the cluster. EVC was introduced in ESX/ESXi 3.5 Update 2.
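
The idea of a common CPU ID baseline can be pictured as the set of features that every host in the cluster has. The sketch below is only an illustration of that idea: the host names and feature flags are made up, and real EVC picks from predefined per-generation baselines rather than computing an arbitrary intersection.

```python
def evc_baseline(host_features):
    """Return the CPU features that every host in the cluster exposes.

    host_features maps a (hypothetical) host name to its set of feature flags.
    """
    feature_sets = [set(flags) for flags in host_features.values()]
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


# Hypothetical cluster with three CPU generations from the same vendor.
hosts = {
    "esx01": {"sse3", "ssse3", "sse4.1", "sse4.2", "aes"},
    "esx02": {"sse3", "ssse3", "sse4.1"},
    "esx03": {"sse3", "ssse3", "sse4.1", "sse4.2"},
}

baseline = evc_baseline(hosts)
print(sorted(baseline))
```

Every VM in the cluster would then see only the baseline features, which is why it can move to any host, at the cost of hiding newer instructions on the newer hosts.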

How Storage vMotion works

Storage vMotion enables live migration of running virtual machine disk files from one storage location to another with no downtime or service disruption.

  • Simplifies storage array migrations and storage upgrades.
  • Dynamically optimizes storage I/O performance.
  • Uses storage efficiently and helps manage capacity.
  • Lets you balance the storage load manually.

Storage vMotion process:

  1. vSphere copies the non-volatile files that make up the VM: the .vmx file, swap file, logs and snapshots.
  2. vSphere starts a ghost (shadow) VM on the destination datastore. Because the ghost VM does not yet have a virtual disk (it has not been copied over yet), it sits idle waiting for its virtual disks.
  3. Storage vMotion creates the destination disk, then sets up a mirror device: a driver that mirrors I/Os between the source and the destination.
  4. With I/O mirroring in place, vSphere makes a single-pass copy of the virtual disks from source to destination. As changes are made to the source, the mirror driver ensures that they are also reflected at the destination.
  5. When the virtual disk copy completes, vSphere quickly suspends and resumes the VM in order to transfer control to the ghost VM on the destination datastore.
  6. Files on the source datastore are deleted.
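
To see why a single copy pass is enough once writes are mirrored, here is a minimal simulation of steps 3 and 4. It is a sketch, not VMware's implementation: disks are plain Python lists, and the function name, the random guest writes and the `writes_per_block` parameter are all assumptions made for illustration.

```python
import random


def storage_vmotion_copy(source, writes_per_block=1, seed=0):
    """Copy `source` to a new destination disk in one pass while the guest keeps writing.

    Every guest write is applied to both disks, which is the job of the mirror driver.
    """
    rng = random.Random(seed)
    dest = [None] * len(source)            # step 3: empty destination disk

    for block in range(len(source)):       # step 4: single-pass copy
        dest[block] = source[block]

        # The VM is still running, so writes land anywhere on the disk.
        for _ in range(writes_per_block):
            target = rng.randrange(len(source))
            value = rng.randrange(1000)
            source[target] = value         # normal write to the source
            dest[target] = value           # mirrored write to the destination

    return dest


disk = list(range(64))
copy = storage_vmotion_copy(disk)
print(copy == disk)
```

A write to a block that has already been copied is mirrored immediately; a write to a block that has not been copied yet is picked up when the copy pass reaches it. Either way no second pass is needed.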

Virtual mode RDM Storage vMotion - if you want to migrate only the VMDK mapping file, select “Same format as source”.

Physical mode RDMs are not affected.

How VMware DRS works

VMware DRS aggregates the computing capacity across a collection of servers and intelligently allocates the available resources among the virtual machines based on predefined rules. When the virtual machine experiences increased load, DRS evaluates its priority.

VMware DRS allows you to control the placement of virtual machines on the hosts within the cluster by using affinity rules. By default, VMware DRS checks every 5 minutes to see whether the cluster's workload is balanced. DRS must be enabled on the cluster before resource pools can be created in it.

DRS is also invoked by certain actions in the cluster:
  • adding or removing the ESXi host
  • changing resource settings on the VM

Automatic DRS mode determines the best possible distribution of virtual machines, while manual DRS mode provides recommendations for the optimal placement of virtual machines and leaves the decision to the system administrator.

Manual – every time you power on a VM, the cluster prompts you to select the ESXi host where the VM should run. DRS only recommends migrations.

Partially Automated – every time you power on a VM, DRS automatically selects the ESXi host. DRS only recommends migrations.

Fully Automated – every time you power on a VM, DRS automatically selects the ESXi host and also performs migrations automatically. The migration threshold scales from Conservative to Aggressive:
  • Apply priority 1 recommendations - affinity rules & host maintenance
  • Apply priority 1 & 2 recommendations - promise significant improvement to cluster load balance
  • Apply priority 1, 2 & 3 recommendations - promise at least good improvement to cluster load balance
  • Apply priority 1, 2, 3 & 4 recommendations - promise moderate improvement to cluster load balance
  • Apply all recommendations - promise even a slight improvement to cluster load balance. 
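
Put differently, the threshold is a cut-off on recommendation priority. A minimal sketch, assuming recommendations are simple `(priority, description)` tuples with priorities 1 to 5 (this data shape and the function name are made up for illustration):

```python
def applied_recommendations(recommendations, migration_threshold):
    """Keep the recommendations DRS would apply at the given threshold.

    migration_threshold: 1 (Conservative) to 5 (Aggressive).
    A recommendation is applied when its priority is <= the threshold.
    """
    if not 1 <= migration_threshold <= 5:
        raise ValueError("migration threshold must be between 1 and 5")
    return [rec for rec in recommendations if rec[0] <= migration_threshold]


# Hypothetical recommendation list.
recs = [
    (1, "move vm-a: host entering maintenance mode"),
    (2, "move vm-b: significant balance improvement"),
    (4, "move vm-c: moderate balance improvement"),
    (5, "move vm-d: slight balance improvement"),
]

for rec in applied_recommendations(recs, 3):
    print(rec)
```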

There are three major elements here:
  1. Migration Threshold
  2. Target host load standard deviation
  3. Current host load standard deviation

When you change the “Migration Threshold”, the value of the “Target host load standard deviation” (THLSD) also changes. For example, a two-host cluster with the threshold set to three has a THLSD of 0.2, and a three-host cluster has a THLSD of 0.163.

While the cluster is imbalanced (Current host load standard deviation > Target host load standard deviation), DRS selects a VM to migrate based on specific criteria, simulates the move, re-computes the “Current host load standard deviation” and adds the move to the migration recommendation list. If the cluster is still imbalanced after that, it repeats the procedure.

How does DRS select the best VM to move?

For each VM, DRS checks whether a vMotion to each of the hosts that are less utilized than the source host would result in a less imbalanced cluster and would meet the Cost Benefit and Risk Analysis criteria.
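
The loop described above can be sketched in a few lines. This is a simplified illustration, not the actual DRS algorithm: host load is taken as total VM demand divided by host capacity, the population standard deviation is assumed, the Cost Benefit and Risk Analysis step is left out, and all names and numbers are hypothetical.

```python
from statistics import pstdev


def host_loads(placement, vm_demand, capacity):
    """Normalized load per host: total demand of its VMs / host capacity."""
    return {
        host: sum(vm_demand[vm] for vm in vms) / capacity[host]
        for host, vms in placement.items()
    }


def load_sd(placement, vm_demand, capacity):
    """Current host load standard deviation for a given placement."""
    return pstdev(list(host_loads(placement, vm_demand, capacity).values()))


def recommend_migrations(placement, vm_demand, capacity, target_sd, max_moves=10):
    """Greedily pick moves until the current SD drops to the target SD."""
    placement = {host: list(vms) for host, vms in placement.items()}  # work on a copy
    moves = []
    while len(moves) < max_moves:
        loads = host_loads(placement, vm_demand, capacity)
        current_sd = pstdev(list(loads.values()))
        if current_sd <= target_sd:
            break  # cluster is balanced enough

        best = None  # (resulting_sd, vm, source_host, destination_host)
        for src, vms in placement.items():
            for vm in vms:
                for dst in placement:
                    if loads[dst] >= loads[src]:
                        continue  # only consider less-utilized hosts
                    # Simulate the move and re-compute the standard deviation.
                    trial = {h: list(v) for h, v in placement.items()}
                    trial[src].remove(vm)
                    trial[dst].append(vm)
                    sd = load_sd(trial, vm_demand, capacity)
                    if sd < current_sd and (best is None or sd < best[0]):
                        best = (sd, vm, src, dst)

        if best is None:
            break  # no single move improves the balance
        _, vm, src, dst = best
        placement[src].remove(vm)
        placement[dst].append(vm)
        moves.append((vm, src, dst))
    return moves, placement


# Hypothetical two-host cluster; demand and capacity in MHz.
placement = {"esx01": ["vm1", "vm2", "vm3"], "esx02": ["vm4"]}
vm_demand = {"vm1": 3000, "vm2": 2000, "vm3": 1000, "vm4": 1000}
capacity = {"esx01": 10000, "esx02": 10000}

moves, balanced = recommend_migrations(placement, vm_demand, capacity, target_sd=0.2)
print(moves)
print(load_sd(balanced, vm_demand, capacity))
```

The `target_sd` argument plays the role of the THLSD, so a more aggressive migration threshold corresponds to passing a smaller value and accepting more moves.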

How VMware vMotion works

VMware vMotion enables live migration of running virtual machines from one physical server to another with zero downtime, continuous service availability and complete transaction integrity. This makes it possible to carry out hardware maintenance without disrupting business.

VMware vMotion is enabled by three underlying technologies:

1)  The entire state of the virtual machine is encapsulated by a set of files stored on shared storage.

2)  The active memory pages and system state of the virtual machine are rapidly transferred (pre-copied) over a high-speed vMotion network, which allows the virtual machine to switch from the source host to the destination host. While the copy runs, vMotion keeps track of ongoing memory changes in a memory bitmap. The bitmap does not hold the contents of memory; it holds the addresses of the pages that changed (also called dirty memory). Once the entire memory and system state have been copied to the destination host, the source virtual machine is quiesced and the bitmap is copied to the target host. The target host reads the addresses in the bitmap and requests the contents of those addresses from the source host, and the virtual machine resumes on the target ESX host. This final switchover typically takes less than two seconds.

3)  The network is also virtualized, which ensures that the virtual machine's network identity and network connections are preserved after the migration. VMware vMotion manages the virtual MAC address. Once the destination machine is activated, vMotion sends a RARP message to the physical switch to make sure it is aware of the new physical location of the virtual MAC address. After the virtual machine is running successfully on the target host, its memory on the source host is released.

VMware vMotion also migrates the resource allocation (CPU and memory) from one host to the other. On a bad day, a continuous ping (ping -t) against the virtual machine may lose one packet during the switchover. Applications can withstand the loss of more than a packet or two.
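
The memory part of the process (pre-copy, dirty-page bitmap, final fetch) can be illustrated with a small simulation. It follows the description above and is not VMware's implementation: memory is a Python list of page values, and the function name, the random guest writes and the `guest_writes` parameter are assumptions for the sake of the example.

```python
import random


def vmotion_memory_copy(source_mem, guest_writes=2, seed=0):
    """Pre-copy memory while the VM runs, then fetch dirty pages after quiescing.

    Returns the destination memory and the number of dirty pages fetched at the end.
    """
    rng = random.Random(seed)
    pages = len(source_mem)
    dest_mem = [None] * pages
    dirty = [False] * pages  # the bitmap holds addresses only, never page contents

    # Pre-copy: the VM keeps running on the source host and keeps dirtying pages.
    for page in range(pages):
        dest_mem[page] = source_mem[page]
        for _ in range(guest_writes):
            addr = rng.randrange(pages)
            source_mem[addr] = rng.randrange(1_000_000)
            dirty[addr] = True

    # The source VM is quiesced here: no more writes happen.
    # The bitmap goes to the destination, which requests each dirty page by address.
    dirty_addresses = [addr for addr, is_dirty in enumerate(dirty) if is_dirty]
    for addr in dirty_addresses:
        dest_mem[addr] = source_mem[addr]

    return dest_mem, len(dirty_addresses)


memory = list(range(256))
copied, fetched = vmotion_memory_copy(memory)
print(copied == memory, fetched)
```

The bitmap is small compared with the memory it describes, which is why only the short final phase has to happen while the VM is quiesced.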

VMware vMotion requirements:

1) Shared storage

2) A Gigabit Ethernet NIC with a VMkernel port defined for vMotion on each ESXi host

A successful vMotion relies on the following host conditions:

1) The source and destination hosts must be configured with identical virtual switches that have a vMotion-enabled port group.

2) All port groups to which the virtual machine being migrated is attached must exist on both ESXi hosts. Port group names are case sensitive, and the VLANs must match.

3) Processors must be compatible.

A successful vMotion relies on the following virtual machine conditions:

1) Virtual machine must not be connected to any physical device available only to one ESXi host.

2) Virtual machine must not be connected to an internal-only virtual switch.

3) Virtual machine must not have CPU affinity set to a specific CPU.

4) Virtual machine must have all its files on a VMFS or NFS datastore accessible to both ESXi hosts.
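
The host and virtual machine conditions above lend themselves to a simple pre-flight check. The sketch below is purely illustrative: the dictionary fields (`vmotion_portgroup`, `cpu_baseline`, `host_local_devices` and so on) are made-up names, not a vSphere API, and processor compatibility is reduced to comparing a label.

```python
def vmotion_precheck(vm, src_host, dst_host):
    """Return a list of problems; an empty list means no listed condition is violated."""
    problems = []

    # Host conditions
    for host in (src_host, dst_host):
        if not host["vmotion_portgroup"]:
            problems.append(f"{host['name']}: no vMotion-enabled port group")
    # Set membership is case sensitive, just like port group names.
    missing = sorted(set(vm["port_groups"]) - set(dst_host["port_groups"]))
    if missing:
        problems.append(f"{dst_host['name']}: missing port groups {missing}")
    if src_host["cpu_baseline"] != dst_host["cpu_baseline"]:
        problems.append("processors are not compatible")

    # Virtual machine conditions
    if vm["host_local_devices"]:
        problems.append(f"VM uses host-local devices {vm['host_local_devices']}")
    if vm["on_internal_only_vswitch"]:
        problems.append("VM is connected to an internal-only virtual switch")
    if vm["cpu_affinity"]:
        problems.append("VM has CPU affinity set")
    unreachable = sorted(set(vm["datastores"]) - set(dst_host["datastores"]))
    if unreachable:
        problems.append(f"{dst_host['name']}: cannot access datastores {unreachable}")

    return problems


# Hypothetical inventory: the destination has "prod" instead of "Prod".
src = {"name": "esx01", "vmotion_portgroup": True, "port_groups": {"Prod", "vMotion"},
       "cpu_baseline": "gen-a", "datastores": {"ds1"}}
dst = {"name": "esx02", "vmotion_portgroup": True, "port_groups": {"prod", "vMotion"},
       "cpu_baseline": "gen-a", "datastores": {"ds1"}}
vm = {"port_groups": ["Prod"], "host_local_devices": [], "on_internal_only_vswitch": False,
      "cpu_affinity": None, "datastores": ["ds1"]}

for problem in vmotion_precheck(vm, src, dst):
    print(problem)
```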

High priority migration does not proceed if the resources aren’t available to be reserved for the migration. Standard priority migration might proceed slowly and might fail to complete if enough resources aren't available.

At 14%, a pause occurs while the hosts establish communications and gather information for pages in memory to be migrated. 

And at 65%, another pause occurs when the source virtual machine is quiesced and dirty memory pages are fetched from the source host.

Sometimes a vMotion fails at a specific progress percentage. The percentage hints at the cause:

1) 9% - issue with the ESXi NICs – upgrade the NIC firmware to resolve the issue.

2) 10% - the VM's datastore was mounted read-only – read/write access is needed.

3) 10% - the log.rotateSize value in the virtual machine's .vmx file is set to a very low value. This causes the vmware.log file to rotate so quickly that by the time the destination host requests the vmware.log file's VMFS lock, the file has already been rotated. The destination host is then unable to acquire a proper file lock, and this causes the vMotion migration to fail.

4) 14% - there are multiple VMkernel ports in the same network, or incorrect VMkernel interfaces are selected for vMotion.

5) 78% - NFS storage path UUID mismatch.

6) 82% - caused by incorrect datastore information or out-of-date virtual hardware.

7) 90% - attempting to vMotion 64-bit virtual machines from an ESX/ESXi host with VT enabled in the BIOS to an ESX/ESXi host with VT not enabled in the BIOS.

8) 90% - occurs if the Migrate.MemChksum value is set to 1 on one host but set to 0 on the other hosts.
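
For quick reference during troubleshooting, the list above fits in a small lookup table. This is just a convenience sketch built from the causes listed in this post; the names `VMOTION_FAILURE_HINTS` and `failure_hints` are made up, and the table is not an exhaustive or official mapping.

```python
# Failure percentage -> possible causes, as listed in this post.
VMOTION_FAILURE_HINTS = {
    9: ["ESXi NIC issue: upgrade the NIC firmware"],
    10: [
        "VM datastore mounted read-only: read/write access is needed",
        "log.rotateSize in the .vmx file is set too low",
    ],
    14: ["multiple VMkernel ports in the same network, or wrong VMkernel interface selected for vMotion"],
    78: ["NFS storage path UUID mismatch"],
    82: ["incorrect datastore information or out-of-date virtual hardware"],
    90: [
        "64-bit VM moving from a host with VT enabled in the BIOS to one without",
        "Migrate.MemChksum differs between hosts",
    ],
}


def failure_hints(percent):
    """Return the possible causes for a vMotion that failed at `percent`."""
    return VMOTION_FAILURE_HINTS.get(percent, ["no cause listed for this percentage"])


for hint in failure_hints(10):
    print(hint)
```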