Tuesday, August 6, 2013

vMotion of virtual machines fails at 82%

vMotion of a virtual machine fails at 82% with the error “A general system error occurred: Source detected that destination failed to resume.” As a result, the virtual machine cannot be migrated.

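The issue is caused by a datastore UUID mismatch: one host identifies the shared datastore by a different UUID than the other hosts. For an NFS datastore, this typically means that host mounted the export with a different server name or folder string (for example, a hostname instead of an IP address).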

To confirm this, run the command below on every ESX host in the cluster.

# vdf -h

The output appears similar to:

[ESX01 ~]# vdf -h
/vmfs/volumes/1dd794c6-cc279de7
600G 438G 161G 73% /vmfs/volumes/NFS01

[ESX02 ~]# vdf -h
/vmfs/volumes/36132c1c-6f72083e
600G 438G 161G 73% /vmfs/volumes/NFS01

[ESX03 ~]# vdf -h
/vmfs/volumes/1dd794c6-cc279de7
600G 438G 161G 73% /vmfs/volumes/NFS01

[ESX04 ~]# vdf -h
/vmfs/volumes/1dd794c6-cc279de7
600G 438G 161G 73% /vmfs/volumes/NFS01

[ESX05 ~]# vdf -h
/vmfs/volumes/1dd794c6-cc279de7
600G 438G 161G 73% /vmfs/volumes/NFS01

Compare the datastore UUID reported by each ESX host. It is the last component of the /vmfs/volumes/ path on the line above the NFS01 entry.

Here, the second host, ESX02, sees the datastore with a different UUID than the other ESX hosts.

For ESX01, ESX03, ESX04 and ESX05, the UUID of the datastore is the same (shown below). Hence a virtual machine is recognized on any of these hosts when it is vMotioned between them.

/vmfs/volumes/1dd794c6-cc279de7

For ESX02, the UUID of the datastore is different (shown below), so a virtual machine migrated to or from this host cannot be resumed.

/vmfs/volumes/36132c1c-6f72083e
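
With more than a handful of hosts, the comparison above is easy to get wrong by eye. The following is a minimal Python sketch of the same check, not part of the original procedure. It assumes the `vdf -h` text from each host has already been collected into a dictionary (the sample data is copied from the output above), that the UUID line is immediately followed by the usage line for the same datastore, and that the UUID seen by the majority of hosts is the correct one. The function names are hypothetical.

```python
import re
from collections import Counter

# Sample `vdf -h` text per host, copied from the output shown above.
VDF_OUTPUT = {
    "ESX01": "/vmfs/volumes/1dd794c6-cc279de7\n600G 438G 161G 73% /vmfs/volumes/NFS01",
    "ESX02": "/vmfs/volumes/36132c1c-6f72083e\n600G 438G 161G 73% /vmfs/volumes/NFS01",
    "ESX03": "/vmfs/volumes/1dd794c6-cc279de7\n600G 438G 161G 73% /vmfs/volumes/NFS01",
    "ESX04": "/vmfs/volumes/1dd794c6-cc279de7\n600G 438G 161G 73% /vmfs/volumes/NFS01",
    "ESX05": "/vmfs/volumes/1dd794c6-cc279de7\n600G 438G 161G 73% /vmfs/volumes/NFS01",
}

# A line that holds only a /vmfs/volumes/<UUID> path (hex groups joined by dashes).
UUID_LINE = re.compile(r"^/vmfs/volumes/([0-9a-f]+(?:-[0-9a-f]+)+)$")


def datastore_uuids(vdf_text):
    """Return {datastore name: UUID} parsed from one host's `vdf -h` text."""
    uuids = {}
    pending_uuid = None
    for raw in vdf_text.splitlines():
        line = raw.strip()
        match = UUID_LINE.match(line)
        if match:
            pending_uuid = match.group(1)
            continue
        if pending_uuid and line:
            # The mount point is the last field, e.g. /vmfs/volumes/NFS01.
            mount_point = line.split()[-1]
            uuids[mount_point.rsplit("/", 1)[-1]] = pending_uuid
            pending_uuid = None
    return uuids


def mismatched_hosts(outputs, datastore):
    """Return hosts whose UUID for `datastore` differs from the majority."""
    per_host = {host: datastore_uuids(text).get(datastore)
                for host, text in outputs.items()}
    majority_uuid, _ = Counter(per_host.values()).most_common(1)[0]
    return sorted(host for host, uuid in per_host.items()
                  if uuid != majority_uuid)


if __name__ == "__main__":
    print(mismatched_hosts(VDF_OUTPUT, "NFS01"))
```

Treating the majority UUID as the good one is only a heuristic; if the hosts are split evenly, check the mount settings on each side by hand.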

To resolve this, run the “esxcfg-nas -l” command on one of the ESX hosts that is working correctly and make a note of the NFS path.

Then unmount the datastore from the faulty host (ESX02) and remount it using exactly the NFS path noted from the working ESX host.

Steps to unmount and remount the datastore

1)    Change the cluster DRS setting to “Manual”. Alternatively, put the ESX host into maintenance mode, move it out of the ESX cluster, and then exit maintenance mode.

2)    Make sure there are no powered-on virtual machines on ESX02.

3)    Log in to ESX02 using the vSphere Client.

4)    Go to the “Configuration” tab.

5)    Select “Storage”.

6)    Right-click the "NFS01" datastore and click “Unmount”.

7)    Repeat step 6 for each NFS datastore until none remain mounted.

8)    Then click “Add Storage”.

9)    Select “Network File System” and click “Next”.

10) Take the values from the output of the “esxcfg-nas -l” command run on the working ESX host. For example:

NFS01 is /vol/pecs_esx_nfs_vol01/esx_nfs_vol01_q from 10.xx.xx.xx mounted

Datastore Name: NFS01

Folder: /vol/pecs_esx_nfs_vol01/esx_nfs_vol01_q

Server: 10.xx.xx.xx

11) Fill in the properties in the “Add Storage” dialog box with exactly the information above.

12) Click “Next” -> Click “Finish”

13) Repeat steps 8 through 12 to mount all the NFS datastores.

14) Once done, compare the output of “esxcfg-nas -l” from ESX02 with that of the other ESX hosts. The server and folder shown for each datastore should match. Running “vdf -h” again on ESX02 should now show the same UUID as the other hosts.

15) Perform a vMotion of a test virtual machine to ESX02. The migration should be successful.

16) If the migration of the test virtual machine is successful, change the cluster DRS setting back to “Automatic”. If you removed the host from the cluster in step 1, instead put the ESX host into maintenance mode, move it back into the ESX cluster, and exit maintenance mode.
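
The values entered in steps 10 and 11 come straight from one line of “esxcfg-nas -l” output, so copying them by hand is where typos creep in. The sketch below is not from the original procedure. It splits such a line into datastore name, folder and server, and also prints the service-console commands (“esxcfg-nas -d” to remove a mount, “esxcfg-nas -a -o <server> -s <folder> <name>” to add one) as a command-line alternative to the vSphere Client dialog. The helper names are hypothetical, and the line format is assumed to be exactly the one shown in step 10.

```python
import re

# Matches lines such as:
#   NFS01 is /vol/pecs_esx_nfs_vol01/esx_nfs_vol01_q from 10.xx.xx.xx mounted
NAS_LINE = re.compile(
    r"^(?P<name>\S+) is (?P<share>\S+) from (?P<host>\S+) (?P<state>.+)$"
)


def parse_nas_line(line):
    """Split one `esxcfg-nas -l` line into name, share, host and state."""
    match = NAS_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized esxcfg-nas -l line: %r" % line)
    return match.groupdict()


def remount_commands(entry):
    """Build the unmount and mount commands for the faulty host."""
    return [
        "esxcfg-nas -d %s" % entry["name"],
        "esxcfg-nas -a -o %s -s %s %s"
        % (entry["host"], entry["share"], entry["name"]),
    ]


def differing_fields(good, suspect):
    """Return the fields (host, share) on which two entries disagree."""
    return [key for key in ("host", "share") if good[key] != suspect[key]]


if __name__ == "__main__":
    good = parse_nas_line(
        "NFS01 is /vol/pecs_esx_nfs_vol01/esx_nfs_vol01_q from 10.xx.xx.xx mounted"
    )
    print("Datastore Name:", good["name"])
    print("Folder:", good["share"])
    print("Server:", good["host"])
    for command in remount_commands(good):
        print(command)
```

`differing_fields` can be used for the comparison in step 14: parse the same datastore's line from ESX02 and from a working host, and an empty list means the server and folder strings are identical.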
