vSphere 6.5: Additional considerations when migrating to VMFS-6 – Part 1

For those who use Virtual Machine File System (VMFS) datastores, one of the steps when upgrading to vSphere 6.5 is to migrate them to VMFS-6.

VMware provides a detailed overview of VMFS-6 on the StorageHub, as well as an example of how the migration from VMFS-5 can be automated using PowerCLI.

However, there are three edge cases that require extra steps before the migration can continue. They apply when the datastore holds any of the following:

  • the system swap;
  • the persistent scratch location;
  • the diagnostic coredump file.

If any of these objects exist, they prevent the ESXi host from unmounting the datastore, so they need to be moved to a new location before the migration continues. The steps required to relocate them are reviewed below.

Relocating the system swap

The system swap location can be checked and set via vSphere Client in Configure > System > System Swap settings of the ESXi host.

Alternatively, the system swap settings can be retrieved via PowerCLI, and the same approach can be used to create the system swap files on a new datastore.

Note: A host reboot is not required to apply this change.

Moving the persistent scratch location

A persistent scratch location helps when investigating host failures. It preserves the host log files on a shared datastore, so they remain reachable for troubleshooting even if the host has experienced a Purple Screen of Death (PSOD) or gone down.

To identify the persistent scratch location, open Settings > System > Advanced System Settings of the ESXi host in vSphere Client and filter the Key column by the word ‘scratch’.

You only need to point the ScratchConfig.ConfiguredScratchLocation setting to a new location and reboot the host for this change to take effect.

Note: Before making any changes, make sure that the .locker folder has been created on the desired datastore. The folder should be unique for each configured host to avoid mixing or overwriting data. If the folder does not exist, the persistent scratch location remains unchanged.

To review and modify advanced host parameters, including the persistent scratch location, via PowerCLI, use the Get-AdvancedSetting and Set-AdvancedSetting cmdlets. This procedure is well documented in KB 1033696.

How to automate the relocation of the diagnostic coredump file will be covered in Part 2 of this series later this month. I will keep you posted!

vSphere 6.5: Failed to deploy OVF package from the Content Library

While automating virtual machine (VM) provisioning from an OVF template in a content library in vSphere 6.5 Update 2, I ran across an interesting behaviour. Some virtual machines were created without any issues, whereas others failed with the error message ‘Failed to deploy OVF package.’

In the vpxd.log on vCenter Server, this error message looks as follows:

info vpxd[7F83D22C5700] [Originator@6876 sub=Default opID=SelectResourcePageMediator-validate-335519-ngc:70031913-ae-71-7e-92-01] [VpxLRO] -- ERROR task-51355 -- TEST-VM-01 -- ResourcePool.ImportVAppLRO: vim.fault.OvfImportFailed:
--> Result:
--> (vim.fault.OvfImportFailed) {
--> faultCause = (vim.fault.NotFound) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "com.vmware.ovfs.ovfs-main.ovfs.object_not_found",
--> arg = (vmodl.KeyAnyValue) [
--> (vmodl.KeyAnyValue) {
--> key = "0",
--> value = "/TEST-VM-01/nvram"
--> }
--> ],
--> message = "The specified object /TEST-VM-01/nvram could not be found."
--> }
--> ]
--> msg = "The specified object /TEST-VM-01/nvram could not be found."
--> },
--> faultMessage = <unset>
--> msg = ""
--> }
--> Args:
-->

The NVRAM file contains information such as BIOS settings. As VMware states here, a new NVRAM file is created ‘when a virtual machine is migrated, migrated with vMotion, cloned or deployed from a template.’

Looking through the knowledge base articles on the vendor’s website, I found the following hint in KB 2108718:

Error 2: Could not find the file

This issue occurs if the NVRAM file mentioned in the virtual machines configuration file(*.vmx) is not available anymore.

As you might know, templates are stored in OVF format in the content library. Two files compose a template: an OVF descriptor (.ovf) and a virtual disk file (.vmdk). During the VM provisioning phase, the settings listed in the OVF descriptor form the virtual machine’s configuration file (.vmx). So my assumption was that one of those settings caused a conflict that prevented vSphere from creating those VMs.

When we clone a template to the content library, only the minimum required settings are stored in the OVF descriptor. However, there is an option to include extra configuration.

I decided to compare a vanilla OVF descriptor with the one with extra settings. While doing this comparison, one particular line in the file with extra configuration caught my eye:

<vmw:ExtraConfig ovf:required="false" vmw:key="firmware" vmw:value="efi"/>

For some reason, the provisioning task is not able to succeed in creating new virtual machines with the EFI firmware from the template in vSphere 6.5. This issue happens regardless of the guest operating system version and content library type (local or subscribed).
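To spot affected templates without reading each descriptor by hand, the check can be scripted. The following is only a minimal sketch using Python's standard library: the sample descriptor is a heavily trimmed, hypothetical fragment, and find_firmware is an illustrative helper name rather than part of any VMware tooling.

```python
import xml.etree.ElementTree as ET

# Namespace URI used for the vmw: prefix in vSphere OVF descriptors.
VMW_NS = "http://www.vmware.com/schema/ovf"

# Hypothetical, heavily trimmed descriptor used only for illustration.
SAMPLE_OVF = """<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:vmw="http://www.vmware.com/schema/ovf">
  <VirtualSystem ovf:id="TEST-VM-01">
    <VirtualHardwareSection>
      <vmw:ExtraConfig ovf:required="false" vmw:key="firmware" vmw:value="efi"/>
    </VirtualHardwareSection>
  </VirtualSystem>
</Envelope>
"""


def find_firmware(ovf_text):
    """Return the value of the 'firmware' ExtraConfig entry, or None if absent."""
    root = ET.fromstring(ovf_text)
    # Namespaced tags and attributes are addressed as {uri}name in ElementTree.
    for element in root.iter("{%s}ExtraConfig" % VMW_NS):
        if element.get("{%s}key" % VMW_NS) == "firmware":
            return element.get("{%s}value" % VMW_NS)
    return None


print(find_firmware(SAMPLE_OVF))  # efi
```

A descriptor exported without extra configuration would simply return None here, which matches the vanilla case described above.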

Surprisingly, I do not see any problem with deploying VMs with the EFI firmware from the content library in vSphere 6.0. I need some time to test this functionality in vSphere 6.7, before opening a support request with VMware.

Workaround: Set the firmware type to EFI after provisioning a new VM, for example with a PowerCLI script.

vSphere 6.0: The operation failed due to The operation is not allowed in the current state.

While working on a PowerCLI script to automate VM provisioning in vSphere 6.0, I ran across the error message quoted in the title of this post.

The script was simply creating a new virtual machine from the Content Library template.

New-VM -Name $vmName -VMHost $vmHost -Location $vmLocation -Datastore $vmDatastore -ContentLibraryItem $vmTemplate

Investigating this behaviour further, I found that it was related to a failing cross vMotion placement, which also showed up as errors in vSphere Client.

I then concluded that it might be related to DRS being disabled for that cluster. Setting DRS to Manual mode resolved the issue.

The next step was to check whether this problem could be reproduced for other templates. Surprisingly, the VM provisioning from some of them went smoothly, regardless of the DRS status.

Recalling my previous articles on the Content Library, I decided to compare the OVF descriptors (*.ovf) of both templates. The only difference was the following storage settings, which exist only in the faulty templates:

<vmw:StorageGroupSection ovf:required="false" vmw:id="group1" vmw:name="EMC VNX VM Storage Policy">
<Info>Storage policy for group of disks</Info>
<vmw:Description>The EMC VNX VM Storage Policy storage policy group</vmw:Description>
</vmw:StorageGroupSection>

<vmw:StorageSection ovf:required="false" vmw:group="group1">
<Info>Storage policy group reference</Info>
</vmw:StorageSection>

As expected, removing those blocks from the OVF file and re-uploading it to the Content Library resolved the provisioning issue completely.
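The manual edit can also be scripted. Below is a minimal sketch, assuming the descriptor has already been exported to a string; the sample XML is a trimmed, hypothetical fragment and strip_storage_policy is an illustrative name. Note that ElementTree rewrites the namespace prefixes on output, so the result is equivalent XML rather than a byte-for-byte copy of the original.

```python
import xml.etree.ElementTree as ET

OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"
VMW_NS = "http://www.vmware.com/schema/ovf"

# Keep readable prefixes when the descriptor is serialised again.
ET.register_namespace("ovf", OVF_NS)
ET.register_namespace("vmw", VMW_NS)

# Hypothetical, trimmed descriptor containing both storage policy blocks.
SAMPLE_OVF = """<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:vmw="http://www.vmware.com/schema/ovf">
  <vmw:StorageGroupSection ovf:required="false" vmw:id="group1" vmw:name="EMC VNX VM Storage Policy">
    <Info>Storage policy for group of disks</Info>
    <vmw:Description>The EMC VNX VM Storage Policy storage policy group</vmw:Description>
  </vmw:StorageGroupSection>
  <VirtualSystem ovf:id="TEST-VM-01">
    <Info>A virtual machine</Info>
    <vmw:StorageSection ovf:required="false" vmw:group="group1">
      <Info>Storage policy group reference</Info>
    </vmw:StorageSection>
  </VirtualSystem>
</Envelope>
"""


def strip_storage_policy(ovf_text):
    """Remove storage policy sections; return (cleaned_xml, sections_removed)."""
    root = ET.fromstring(ovf_text)
    unwanted = {
        "{%s}StorageGroupSection" % VMW_NS,
        "{%s}StorageSection" % VMW_NS,
    }
    removed = 0
    # ElementTree has no parent pointers, so walk every element as a
    # potential parent and drop matching children from it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in unwanted:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


cleaned, removed = strip_storage_policy(SAMPLE_OVF)
print(removed)  # 2
```

The cleaned descriptor still has to be uploaded to the Content Library by whatever means you normally use; this sketch only covers the XML edit.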

A couple of notes:

  • There is no way to identify any problems with the VM configuration using the Content Library configuration page or ‘New VM from Library…’ dialogue in vCenter Web Client.
  • Even if a modified OVF-file has been uploaded to the local Content Library, this change is not going to propagate to the subscribed ones. You need to repeat this process for all subscribed Content Libraries manually.

VMware: StorageHub Portal Refresh

For those of us who have been interested in getting explicit information about VMware vSAN, Site Recovery Manager, and vSphere storage in general, VMware StorageHub was a unique source of technical documentation.

It is great to see the vendor improving this portal with a design and user interface refresh.

It is now possible to choose between English (US) and Mandarin for some of the articles.

It all seems quite logical, and I personally like the navigation and how fast the search works.

Well done, VMware!

vSAN 6.5: Virtual Machine with more than 64GB memory fails to Storage vMotion to vSAN cluster

VMware has just posted an article in the Virtual Blocks blog which describes this behaviour. It happens only when trying to Storage vMotion a virtual machine with a swap file larger than 64GB to the vSAN datastore.

When this happens, the migration task fails with an error.

There are two possible workarounds: either increase the maximum swap file size on the destination ESXi host or set a memory reservation on the virtual machine. The former is preferable, as it does not require a host reboot.
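For the reservation workaround, the arithmetic is straightforward if we assume the usual rule that a VM's swap file size equals its configured memory minus its memory reservation. The sketch below uses the 64GB threshold quoted above; the function names are hypothetical and the numbers are only an example.

```python
SWAP_LIMIT_GB = 64  # threshold quoted in the text above


def swap_file_gb(memory_gb, reservation_gb=0):
    """Swap file size, assuming it equals configured memory minus reservation."""
    return max(memory_gb - reservation_gb, 0)


def min_reservation_gb(memory_gb, limit_gb=SWAP_LIMIT_GB):
    """Smallest reservation that keeps the swap file at or below the limit."""
    return max(memory_gb - limit_gb, 0)


# Example: a VM configured with 96GB of memory.
print(swap_file_gb(96))                          # 96
print(min_reservation_gb(96))                    # 32
print(swap_file_gb(96, min_reservation_gb(96)))  # 64
```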

VMware provides KB 2150316 with “more log samples and specifics for identifying the issue as a cause of a migration failure”.

vSphere Next: A new beta refresh and more

Less than two months after VMware announced the availability of a vSphere beta, it has been refreshed with new features and bug fixes. To participate in the program, candidates should indicate their interest by filling out this simple form.

I personally think the Christmas holidays are the best time for tech geeks to set aside a few hours and get an understanding of what’s next.

The beta refresh is available both as downloadable media and as a hosted environment in the Hands-on-Labs.

For those folks who need access to the full range of technologies from VMware, the VMware User Group has just announced a 10% discount on the VMUG Advantage subscription. This offer is available until December 31st, 2017.

All this sounds like a great Christmas gift from the vendor. Thank you, VMware!

vSphere 6.0: Templates are shown as ‘Unknown’ in the local Content Library

Another day, another case… This time, I was surprised to see an empty list when provisioning a new virtual machine from a Content Library.

I went to check the Content Library status and found that all templates were shown as ‘Unknown’.

Funnily enough, this behaviour occurred only with the local Content Library. A subscribed one didn’t have any issues at all, and the synchronisation between the two was still working.

More interestingly, the objects of other types were not affected at all.

There is little information available on how to troubleshoot the Content Library in vSphere 6.0. Some of the diagnostic files can be found in the /var/log/vmware/vdcs directory on the vCenter Server Appliance (VCSA). Unfortunately, they are not that informative.

So I opened a case with VMware GSS (SR # 17504701707), and the response was that “this issue is occurring as there is a corrupted or stale PID for the content library service which has not been cleared from the previous running state.”

VMware is working on a fix, but there is no ETA at the moment.

A workaround provided by VMware:

  1. Connect to the vCenter Server Appliance using SSH and root credentials.
  2. Navigate to /var/log/vmware/vdcs.
  3. Create a new folder to move the PID file to.
  4. Move the vmware-vdcs.pid file to the folder created in step 3.
  5. Reboot the vCenter Server Appliance (in the case of an external PSC, reboot the PSC first and then vCenter).
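Steps 2 to 4 above can be sketched as a small script. This is only an illustration, assuming a Python interpreter is available on the appliance: the backup folder name pid-backup and the helper name park_pid_file are hypothetical, and the reboot in step 5 is deliberately left as a manual action.

```python
import shutil
from pathlib import Path


def park_pid_file(log_dir="/var/log/vmware/vdcs",
                  pid_name="vmware-vdcs.pid",
                  backup_dir_name="pid-backup"):
    """Move the PID file into a backup folder; return its new path or None."""
    log_path = Path(log_dir)
    pid_file = log_path / pid_name
    if not pid_file.is_file():
        return None  # nothing to move

    # Step 3: create a folder to hold the PID file (name is hypothetical).
    backup_dir = log_path / backup_dir_name
    backup_dir.mkdir(exist_ok=True)

    # Step 4: move the PID file into that folder.
    target = backup_dir / pid_name
    shutil.move(str(pid_file), str(target))
    return target


if __name__ == "__main__":
    moved_to = park_pid_file()
    if moved_to is None:
        print("No PID file found, nothing moved.")
    else:
        print("PID file moved to", moved_to)
```

Moving the file rather than deleting it keeps a copy around in case support asks for it later.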

I personally found that restarting VCSA resolves this issue. However, it reappears after some time.