A tip when using the vCenter Server Converge Tool

As you might know, VMware is dropping support for the external Platform Services Controller (PSC) deployment model with the next major release of vSphere.

To make a smooth transition, you have to use a command-line interface of the vCenter Server Converge Tool to do the migration to an embedded PSC. This functionality is available starting from vCenter Server 6.7 Update 1 and also in vCenter Server 6.5 Update 2d and onward.

With the upcoming release of vSphere 6.7 Update 2, there will be an option to complete the whole migration using the vSphere Client – super easy!

Meanwhile, the process of moving from an external PSC deployment to the embedded one using CLI consists of two manual steps – converge and decommission. A detailed instruction of how to prepare for and execute each of those steps is documented in the David Stamen’s post ‘Understanding the vCenter Server Converge Tool‘.

What I found tricky was running the converge step when the external PSC had been previously joined to the child domain in Active Directory. In this case, the vCenter Server Converge Tool precheck ran with the default parameters generates the following error message in vcsa-converge-cli.log:

2019-04-06 03:08:15,979 – vCSACliConvergeLogger – ERROR – AD Identity store present on the PSC:root.domain.com
2019-04-06 03:08:15,979 – vCSACliConvergeLogger – INFO – ================ [FAILED] Task: PrecheckSameDomainTask: Running PrecheckSameDomainTask execution failed at 03:08:15 ================
2019-04-06 03:08:15,980 – vCSACliConvergeLogger – DEBUG – Task ‘PrecheckSameDomainTask: Running PrecheckSameDomainTask’ execution failed because [ERROR: Template AD info not providded.], possible resolution is [Refer to the log for details]
2019-04-06 03:08:15,980 – vCSACliConvergeLogger – INFO – =============================================================
2019-04-06 03:08:16,104 – vCSACliConvergeLogger – ERROR – Error occurred. See logs for details.
2019-04-06 03:08:16,105 – vCSACliConvergeLogger – DEBUG – Error message: com.vmware.vcsa.installer.converge.prechecksamedomain: ERROR: Template AD info not providded.

In this example, the root.domain.com refers to the root domain; whereas, the computer object for PSC is in the child domain.

To workaround this issue, I had to use the –skip-domain-handling flag to skip the AD Domain related handling in both precheck and actual converge.

By doing this, the vCenter Server Appliance should be joined to the correct AD domain manually after the converge succeed and before the external PSC will be decommissioned.

‘Response with status: 401 Unauthorized for URL’ while performing Update Manager tasks on Firefox in vSphere 6.5

Updating to vCenter Server 6.5 Update 2d has brought long-awaited support for the vSphere Update Manager (VUM) functionality to vSphere Client.

However, you should be aware of some incompatibility between a VUM plug-in in vSphere Client and Mozilla Firefox.

Update Manager gives a ‘Response with status: 401 Unauthorized for URL‘ error message at random occasions when used with the recent versions of this web-browser (tested with Firefox 63.x and 64.x):

Searching on the VMware’s web-site led me to the KB 59696 that describes the similar issue in vSphere 6.7 and the workaround provided worked for me.

First you need to clear the browser cache:

Then stop Firefox from caching pages by modifying the browser’s default configuration:

According to VMware, they are currently working towards resolving this issue in future releases.

IMPORTANT: Removing a virtual machine folder from inventory deletes the VMs within it from disk when using the vSphere Client

VMware has recently released a new KB 65207 warning about this issue.

Normally, all powered off VMs would be unregistered from vCenter when you remove a parent folder that includes those objects using the vSphere Web Client. The same is expected in vSphere Client.

vCenter logs similar events when the folder object was removed in vSphere Web Client or vSphere Client.

In my case, vCenter generated two events – one for deleting the folder, and another one for removing TEST-VM-02 from the inventory.

The only difference is that vCenter deleted TEST-VM-02 from the corresponding datastore when removing TEST-Folder from the inventory. Moreover, no events are generated during the delete operation!!!

This issue affects both VMware vCenter Server 6.5.x and VMware vCenter Server 6.7.x.

Workaround (provided by the vendor): Use the vSphere Web Client (Flex) for unregistering all virtual machines in a VM folder or move the VMs from that folder before removing it.

22/01/2019 – Update 1: This issue has been resolved in VMware vCenter Server 6.7 Update 1b. Please see the release notes for more details.

URGENT: VMware Tools 10.3.0 was recalled

VMware has just announced that VMware Tools 10.3.0 was recalled due to a functional issue with 10.3.0 in ESXi 6.5.

VMware-Tools-1030-Issue

As per KB 57796, the VMXNET3 driver released with VMware Tools 10.3.0 can result in a Purple Diagnostic Screen (PSOD) or guest network connectivity loss in certain configurations. Those configurations include:

  • VMware ESXi 6.5 hosts
  • VM Hardware version 13
  • Windows 8/Windows Server 2012 or higher guest operating system (OS).

As a workaround, VMware recommends uninstalling VMware Tools 10.3.0 and then reinstalling VMware Tools 10.2.5 for affected systems.

The vendor will be releasing a revised version of the VMware Tools 10.3 family at some point in the future.

More information is available in VMSA-2018-0017.

25/09/2018 – Update 1: VMware Tools were updated to version 10.3.2 to resolve the issue with VMXNET3 driver. VMware recommends to install VMware Tools 10.3.2, or VMware Tools 10.2.5 or an earlier version of VMware Tools.

VMware ESXi 6.0-6.5: Low network receive throughput for VMXNET3 on Windows VM

VMware has just released a new KB 57358 named ‘Low receive throughput when receive checksum offload is disabled and Receive Side Coalescing is enabled on Windows VM‘. This requires attention when configuring the VMXNET3 adapter on Windows operating systems (OS). However, it only affects virtual environments with VMware ESXi 6.0 and ESXi 6.5 only.

VMware states that it happens when the following conditions are met:

  • Guest OS is Windows 2012 / Windows 8 or later
  • VM hardware version 11 or later
  • Virtual network adapter is VMXNET3
  • Receive Side Coalescing (RSC) is enabled on the VMXNET3 driver on the guest OS
  • Some or all of following receive checksum offloads have value Disabled or only Tx Enabled on the VMXNET3 driver on the guest operating system:
    • IPv4 Checksum Offload
    • TCP Checksum Offload (IPv4)
    • TCP Checksum Offload (IPv6)
    • UDP Checksum Offload (IPv4)
    • UDP Checksum Offload (IPv6).

This shouldn’t be a problem if the VMXNET3 driver has the default settings.

VMXNET3-RSC-Issue-00

For example, copying 3.4 gigabytes of data to the test VM via the 1Gbps link took me seconds.

VMXNET3-RSC-Issue-01

With TCP Checksum Offload (IPv4) set to Tx Enabled on the VMXNET3 driver the same data takes ages to transfer.

VMXNET3-RSC-Issue-02

VMware provides a workaround for this issue: you either need to disable RSC, if any of receive checksum offloads is disabled, or manually enable receive checksum offloads. The knowledge base includes the PowerShell commands that help to automate the latter.

vSphere 6.5: Failed to deploy OVF package from the Content Library

Working on automating the virtual machine (VM) provisioning from an OVF template in a content library in vSphere 6.5 Update 2, I ran across an interesting behaviour. Some virtual machines were created without any issues, whereas others have been failing with the error message ‘Failed to deploy OVF package.’

OVF-Template-Issue-00

OVF-Template-Issue-01

In the vpxd.log on vCenter Server, this error message looks as follows:

info vpxd[7F83D22C5700] [Originator@6876 sub=Default opID=SelectResourcePageMediator-validate-335519-ngc:70031913-ae-71-7e-92-01] [VpxLRO] — ERROR task-51355 — TEST-VM-01 — ResourcePool.ImportVAppLRO: vim.fault.OvfImportFailed:
–> Result:
–> (vim.fault.OvfImportFailed) {
–> faultCause = (vim.fault.NotFound) {
–> faultCause = (vmodl.MethodFault) null,
–> faultMessage = (vmodl.LocalizableMessage) [
–> (vmodl.LocalizableMessage) {
–> key = “com.vmware.ovfs.ovfs-main.ovfs.object_not_found”,
–> arg = (vmodl.KeyAnyValue) [
–> (vmodl.KeyAnyValue) {
–> key = “0”,
–> value = “/TEST-VM-01/nvram”
–> }
–> ],
–> message = “The specified object /TEST-VM-01/nvram could not be found.”
–> }
–> ]
–> msg = “The specified object /TEST-VM-01/nvram could not be found.”
–> },
–> faultMessage = <unset>
–> msg = “”
–> }
–> Args:
–>

The NVRAM file contains information such as BIOS settings. As VMware states here, a new NVRAM file is created ‘when a virtual machine is migrated, migrated with vMotion, cloned or deployed from a template.’

Looking through the knowledge base articles on the vendor’s website, I have found the following hint in KB 2108718:

Error 2: Could not find the file

This issue occurs if the NVRAM file mentioned in the virtual machines configuration file(*.vmx) is not available anymore.

As you might know, templates are stored in OVF format in the content library. Two files compose a template: an OVF descriptor (.ovf) and a virtual disk file (.vmdk). During the VM provisioning phase, the settings listed in the OVF descriptor form the virtual machine’s configuration file (.vmx). So my assumption was that one of those settings caused a conflict that prevented vSphere from creating those VMs.

When we clone a template to the content library, only minimum required settings are stored in the OVF descriptor. However, there is an option available to include extra configuration.

OVF-Template-Issue-02

I decided to compare a vanilla OVF descriptor with the one with extra settings. While doing this comparison, one particular line in the file with extra configuration caught my eye:

<vmw:ExtraConfig ovf:required=”false” vmw:key=”firmware” vmw:value=”efi”/>

For some reason, the provisioning task is not able to succeed in creating new virtual machines with the EFI firmware from the template in vSphere 6.5. This issue happens regardless of the guest operating system version and content library type (local or subscribed).

Surprisingly, I do not see any problem with deploying VMs with the EFI firmware from the content library in vSphere 6.0. I need some time to test this functionality in vSphere 6.7, before opening a support request with VMware.

Workaround: Feel free to use a PowerCLI script below to set the firmware type to EFI after provisioning a new VM.

 

vCenter Converter 6.2: Error 1053: The service did not respond to the start or control request in a timely fashion

I couldn’t believe something as simple as vCenter Converter Agent installation would cause me a headache. The service was properly registered on the remote machine. However, it was failing to start with the error message as follows:

Converter-Issue-01

For a moment I thought about any dependencies that VMware vCenter Converter Standalone Agent service could have, but it was a false assumption.

Converter-Issue-02

Looking for any hint from the community, I thought it might worth time to check the release notes for VMware vCenter Converter Standalone. To my great surprise, this issue has been documented as the known issue:

Converter-Issue-03

Who can imagine that the Internet access can be a requirement for this tool?

Gladly, it can be fixed with a workaround provided by VMware:

  1. Open HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control in the Windows Registry Editor and create or modify a ServicesPipeTimeout DWORD32 decimal value to at least 300000 (5 minutes).
  2. Restart the system for the changes to take effect.

16/08/2018 – Update 1: If any issues with VMware vCenter Converter Standalone, it makes sence to look into VMware KB 1016330 ‘Troubleshooting checklist for VMware Converter’ for possible solutions.