vCSA 6.x: WinSCP fails with the error ‘Received too large SFTP packet’

Back to basics… When you try connecting to vCenter Server Virtual Appliance 6.x (vCSA) using WinSCP, the error message ‘Received too large (1433299822 B) SFTP packet‘ might appear.

vCSA6x-WinSCP-01

This is due to the configuration of vCSA when the default shell used for the root account set to the Appliance Shell.

To fix this issue, VMware recommends switching the vCSA 6.x to the Bash Shell. This can be done in the SSH session with the following command:

chsh -s /bin/bash root

Note: You need to log out from the Appliance Shell and log in back again for the changes to take effect.

vSphere 6.x: The datastore file browser converts VMDKs to thick-provisioned when copy/move data to the target VMFS datastore

It came as a surprise to me that the datastore file browser in vSphere 6.5 (and all other versions) would convert VMDK disks to thick-provisioned when you copy or move them across from one VMFS datastore to another. This is a default behaviour, even if the initial VMDK file was thin-provisioned and the target datastore supports thin provisioning.

This can potentially cause an outage to the virtual machines resided on the Virtual Machine File System. Unfortunately, when you initiate a copy/move operation in the datastore file browser, the system doesn’t warn you about this change. So you need to remember about it and calculate the required disk space ahead of transferring data.

What is more interesting, I haven’t been able to find any reference to this in the official documentation to vSphere. It actually states quite opposite:

“Virtual disk files are moved or copied without format conversion.”

To illustrate the observed behaviour, I had created a virtual machine named TEST-VM with one thin-provisioned disk of 10 GB.

VMFS-Thick-Issue-01

After the VM was powered on, it reported the following data usage in the datastore file browser:

VMFS-Thick-Issue-02

The Inflate button on the image above indicates that “you can convert the thin disk to a virtual disk in thick provision format.”

The PowerCLI commands below helped me to show the used and provisioned space for TEST-VM:

Get-VM -Name “TEST-VM” | Select Name,@{N=’Used Space (GB)’;E={[math]::Round($_.UsedSpaceGB,2)}},@{N=’Prov. Space (GB)’;E={[math]::Round($_.ProvisionedSpaceGB,2)}} | Format-List

The result was as expected:

VMFS-Thick-Issue-03

I powered off the VM and copied the TEST-VM folder via the datastore file browser to another VMFS datastore.

VMFS-Thick-Issue-04

After this task completes, the resulted VMDK looked different:

VMFS-Thick-Issue-05

Please note the Inflate button is greyed out now. This means the virtual disk is thick provisioned.

I have looked on the Internet for any information and found a few community threads from 2009 discussing this issue – here and here. So the problem exists for a while. According to those threads, during the copy/move operation initiated via the datastore file browser, the underlying vmkfstools utility executes with the default settings creating a thick provisioned disk.

The only workaround is to use the following command to convert the VMDK to thin again:

vmkfstools -i <source.vmdk> <destination.vmdk> -d thin

If the intent is to replace the source thick-provisioned VMDK with a new thin-provisioned one, make sure to use vmkfstools utility for that operation. It will change names for both *.vmdk and *flat.vmdk files, as well as the extent description value in *.vmdk.

vSphere 6.5: Additional considerations when migrating to VMFS-6 – Part 1

For those who use the Virtual Machine File System (VMFS) datastores, one of the steps when upgrading to vSphere 6.5 is to migrate them to VMFS-6.

VMFS6-01

VMware provides a detailed overview of VMFS-6 on the StorageHub, as well as an example of how the migration from VMFS-5 can be automated using PowerCLI.

However, there are three edge cases that require extra steps to continue with the migration. They are as follows:

All those objects, if they exist, prevent the ESXi host from unmounting the datastore, and they need to be moved to a new location before migration continues. The required steps to relocate them will be reviewed in the paragraphs below.

Relocating the system swap

The system swap location can be checked and set via vSphere Client in Configure > System > System Swap settings of the ESXi host.

VMFS6-02

Alternatively, the system swap settings can be retrieved via PowerCLI:

The script above can be modified to create the system swap files on a new datastore:

Note: The host reboot is not required to apply this change.

Moving the persistent scratch location

A persistent scratch location helps when investigating the host failures. It preserves the host log files on a shared datastore. So they can be reachable for troubleshooting, even if the host experienced the Purple Screen of Death (PSOD) or went down.

To identify the persistent scratch location, filter the key column by the ‘scratch’ word in Settings > System > Advanced System Settings of the ESXi host in vSphere Client.

VMFS6-03

You only need to point the ScratchConfig.ConfiguredScratchLocation setting to a new location and reboot the host for this change to take effect.

Note: Before doing any changes, make sure that the .locker folder (should be unique for each configured host to avoid data mixing or overwrites) has been created on the desired datastore. Otherwise, the persistent scratch location remains the same.

To review and modify advanced host parameters including the persistent scratch location via PowerCLI, look for two cmdlets named Get-AdvancedSetting and Set-AdvancedSetting. This procedure is well-documented in KB 1033696.

An information about how to automate the diagnostic coredump file relocation will be covered in Part 2 or this series later this month. Keep you posted!

vSphere 6.5: Switching to Native Drivers in ESXi 6.5

The Native Device Driver architecture is not something new. Since its introduction more than five years ago, VMware encourages their hardware ecosystem partners to work on developing native drivers. A list of supported hardware is growing with every major release of ESXi, with the company’s aim to deprecate the vmkLinux APIs and associated driver ecosystem completely in the future releases of vSphere.

The benefits of using the native drivers are as follows:

  • It removes the complexity of developing and maintaining Linux derived drivers,
  • It improves the system performance,
  • It frees from the functional limitations of Linux derived drivers,
  • It increases the stability and reliability of the hypervisor, as native drivers are designed specifically for VMware ESXi.

Saying that one of the steps when upgrading to a new version of vSphere is to check that the hardware supports native drivers. By default, if ESXi identifies a native driver for a device it will be loaded instead of Linux derived driver. However, it is not always a case, and you need to check whether native drivers are in use after the system upgrade.

Following steps in KB 1031534 and KB 1034674, you can pinpoint PCI devices and corresponding drivers loaded for each of them:

  • To identify a storage HBA (such as a fibre card or RAID controller), run this command:

# esxcfg-scsidevs -a

  • To identify a network card, run this command:

# esxcfg-nics -l

  • To list device state and note the hardware IDs, run this command:

# vmkchdev -l

The /etc/vmware/default.map.d/ folder on ESXi host contains a full list of map files referring to the native drivers available for your system.

ESXi-Native-Driver-01

To quickly identify the driver version, you can run this command:

# esxcli software vib list | grep <native_driver_name>

In addition, information about available vSphere Installation Bundles (VIBs) in vSphere 6.5 can be found via the web client or PowerCLI session:

  • To view all installed VIBs in vSphere Client (HTML5), open Configure > System > Packages tab in the host settings:

ESXi-Native-Driver-02

  • To view all installed VIBs in VMware Host Client, open Manage > Packages tab in the host settings:

ESXi-Native-Driver-03

  • To list all installed VIBs in PowerCLI, run this command:

# (Get-VMHost -Name ‘<host_name>‘ | Get-EsxCli).software.vib.list() | select Name,Vendor,Version | sort Name

Comparing findings above with information in the IO Devices section in VMware Hardware Compatibility List, you would be able to find out whether native drivers available for your devices, as well as the recommended combination of the driver and firmware, tested and supported by VMware.

It worth reading the release notes for the corresponding drivers and search any reference to it on VMware and the third-party vendors’ websites, in case there are any known issues or limitations that might affect how device function.

If everything seems good, it is time to enable the native driver following steps in KB 2147565:

# esxcli system module set –enabled=true –module=<native_driver_name>

This change requires a host reboot and a thorough testing afterwards. The following commands can be quite helpful when troubleshooting native drivers:

  • To get the driver supported module parameters, run this command:

# esxcfg-module -i <native_driver_name>

  • To get the driver info, run this command:

# esxcli network nic get -n <vmnic_name>

  • To get an uplink stats, run this command:

# esxcli network nic stats -n <vmnic_name>

31/08/2018 – Update 1: After some feedback provided, I have decided to list well-known issues with the native drivers that exist currently. They are as follows:

  • The Mellanox ConnectX-4/ConnectX-5 native ESXi driver might exhibit performance degradation when its Default Queue Receive Side Scaling (DRSS) feature is turned on (Reference: vSphere 6.7 Release Notes),
  • Native software FCoE adapters configured on an ESXi host might disappear when the host is rebooted (Reference: vSphere 6.7 Release Notes),
  • HP host with QFLE3 Driver Version 1.0.60.0 experienced a PSOD or stuck at “Shutting down device drivers…” shutdown or restart (Reference: KB 55088),
  • ESXi 6.5 Storage Performance Issues and Fix (Reference: Anthony Spiteri’s blog).

vSphere 6.5: How to improve performance of the Content Library data transfers

I came across this subject by chance. However, it worth considering the KB 2112692 when planning for the content library design in vSphere 6.5.

For all operations related to content libraries that involve data transfer, which cannot be conducted by the ESXi hosts, VMware suggests the following:

  • Switch the traffic to run through non-secure HTTP connection. This is to improve the transfer speed for synchronisation tasks.

This can be done by enforcing HTTP connection for Library Sync operations for all the libraries (global approach) or on particular libraries (individual approach) in your vCenter Server instance.

At the time of writing this, there is no support for editing the Content Library Service settings via vSphere Client (HTML5). All changes can be implemented using the vSphere Web Client though.

To set this policy globally, you need to change ‘Force HTTP for Library Sync’ option in Administration > Deployment > System Configuration > Services > Content Library Service to true. This applies immediately and restart is not required.

vSphere65-CL-Performance-01

With the individual approach, you need to edit the URL of the subscribed library to start with http, instead of https. Again, no need to restart the Content Library Service.

  • Bypass the rhttpproxy. This is to improve the transfer speed for synchronisation tasks, importing and exporting items.

This change applies globally. It requires editing the vCenter Server configuration file vdc.properties, changing the firewall settings to allow connections to TCP port 16666, and restarting the Content Library Service.

VMware Tools 10.2.5: Changes to VMXNET3 driver settings

Last week VMware released a new version of VMware Tools.

VMware Tools 10.2.5

It might look like a minor upgrade. However, it includes important changes to the Receive Side Scaling (RSS) and Receive Throttle options in VMXNET3 driver which require attention and careful planning when implemented.

According to the vendor:

RSS is a mechanism which allows the network driver to spread incoming TCP traffic across multiple CPUs, resulting in increased multi-core efficiency and processor cache utilization. If the driver or the operating system is not capable of using RSS, or if RSS is disabled, all incoming network traffic is handled by only one CPU. In this situation, a single CPU can be the bottleneck for the network while other CPUs might remain idle.

Despite all benefits, this technology has been disabled on Windows 8 and Windows 2012 Server or later due to an issue with the vmxnet3 driver which affects Windows guest operating systems with VMware Tools 9.4.15 and later.

It was finally resolved in mid-2017 with the release of VMware Tools 10.1.7. However, only vmxnet3 driver version 1.7.3.7 in VMware Tools 10.2.0 was recommended by VMware for Windows and Microsoft Business Critical applications.

Few months after, VMware introduces the following changes to vmxnet3 driver version 1.7.3.8:

  • Receive Side Scaling is enabled by default,
  • The default value of the Receive Throttle is set to 30.

If you install VMware Tools 10.2.5 on a new virtual machine with Windows 8 and Windows 2012 Server or later, those settings will apply automatically; with the VMware Tools upgrade, they remain the same as it was before.

To check the current status of RSS and the Receive Throttle, you can execute the following PowerShell script inside the VM:

Get-NetAdapter | Where-Object { $_.InterfaceDescription -like “vmxnet3*” } | Get-NetAdapterAdvancedProperty | Where-Object { $_.RegistryKeyword -like “*RSS” -or $_.RegistryKeyword -like “RxThrottle” } | Format-Table -AutoSize

If you would like to edit those advanced options for all VMXNET3 NICs inside the VM, it can be done with the following two lines:

Get-NetAdapter | Where-Object { $_.InterfaceDescription -like “vmxnet3*” } | Set-NetAdapterAdvancedProperty -DisplayName “Receive Side Scaling” -DisplayValue “Enabled” -NoRestart
Get-NetAdapter | Where-Object { $_.InterfaceDescription -like “vmxnet3*” } | Set-NetAdapterAdvancedProperty -DisplayName “Receive Throttle” -DisplayValue “30” -NoRestart

Remember that after applying those settings, the virtual machine should be rebooted. As a result, the output will look similar to this:

VMXNET3-RSS

The only thing that is left is to perform thorough testing. Some ideas how to do it can be found in here.

25/04/2018 – Update 1: VMware released a knowledge base article about Windows 7 and 2008 virtual machines losing network connectivity on VMware Tools 10.2.0. To resolve this issue they recommend to upgrade to VMware Tools 10.2.5.

01/05/2018 – Update 2: VMware released VMware Tools 10.2.1. This minor update resolves an issue when ‘network ports are exhausted on Guest VM after a few days when using VMware Tools 10.2.0’.

A few noticeable changes to vROps 6.6 and Log Insight 4.5

With the recent releases of VMware vRealize Operations Manager 6.6 and Log Insight 4.5, VMware has made some changes to its products. They are as follows:

  • vRealize Operations Manager Plugin for vSphere Web Client should be removed after upgrading to vROps 6.6.vROps Plugin - 01
    In KB 2150394, VMware provides detailed instructions on how to do it for both vCenter for Windows and vCenter Virtual Appliance.
  • Native support for Active Directory in vRealize Log Insight is now deprecated.Log Insight - AD Integration - 01
    As per KB 2148976, VMware Identity Manager (VIDM) should be configured as an alternative. It is not confirmed yet whether VIDM is available for free for Log Insight users. More information will be posted on this thread on VMware Technology Network.I understand that this change widens the business scenarios for the product. However, for those of us who use Log Insight purely for collecting and analysing vSphere logs, it would be great to have Active Directory replaced with vCenter Single Sign-On (vCenter SSO). It sounds more logical.You can vote for the Log Insight integration with vCenter SSO on this link. It will be great if more people request this feature.

21/06/2017 – Update 1: VMware Identity Manager for Log Insight has been officially released and it is free! The VIDM virtual appliance is available on the Log Insight 4.5 download page.

27/06/2017 – Update 2: VMware has changed its policy and continues to provide support for the Active Directory integration in Log Insight 4.5. However, “it may be removed in a future version” (and probably will).