vSAN 6.5-6.6.1: An urgent hotfix ESXi650-201710401

VMware has just released a new hotfix for ESXi and vSAN (KB 2151081), urging customers who run all-flash configurations with deduplication enabled to upgrade their environments as soon as possible. This patch resolves a data corruption issue that might occur in rare circumstances.

ESXi650-201710401

The affected versions of vSAN include 6.5, 6.6, and 6.6.1.

06-10-2017 – Update 1: As listed in KB 2151042, a similar issue has been fixed for ESXi 6.0.

vSAN 6.6.1: vSAN Build Recommendation Engine Health fails

As you might already know, vSAN 6.6.1 is the first release with automated build recommendations for vSAN clusters for vSphere Update Manager, which should help to keep your hardware in a supported state by comparing information from the VMware Compatibility Guide and vSAN Release Catalog with information about the installed ESXi releases.

Obviously, this feature requires vSAN to have Internet access to update release metadata, as well as valid My VMware credentials to download ISO images for upgrades.

To help customers enable vSAN build recommendations, VMware embedded some health checks into vSAN 6.6.x that help resolve configuration issues. The build recommendation engine health check detects the following states:

  • Internet access is unavailable.
  • vSphere Update Manager (VUM) is disabled or is not installed.
  • VUM is not responsive.
  • vSAN release metadata is outdated.
  • My VMware login credentials are not set.
  • My VMware authentication failed.
  • Unexpected VUM baseline creation failure.

If the virtual environment sits behind a proxy, you should configure the proxy settings in the Internet Connectivity option in vSAN_Cluster > Configure > vSAN > General.

vSAN Health Engine Issue - 02

Those parameters are kept in /etc/vmware-vsan-health/config.conf. Be careful with the user password, as it is added to this file without any encryption.
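Since the password sits in that file in clear text, it is worth knowing who can read the file. The snippet below is a minimal sketch that only looks at the POSIX permission bits. The function name readable_by_others is my own, and it does not parse the file, because the layout of config.conf is not described here.

```python
import os
import stat


def readable_by_others(path):
    """Return True if the group or other users have read permission on path.

    Hypothetical helper: it inspects permission bits only and never opens
    the file, so it makes no assumption about the file's contents.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

Run on the machine that hosts the vSAN health service, readable_by_others("/etc/vmware-vsan-health/config.conf") would tell you whether accounts other than the file owner can read the stored credentials.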

To test access through the proxy, you can click the Get latest version online button in vSAN_Cluster > Configure > Health and Performance to update the HCL Database. If everything is set up correctly, it will generate the following lines in /var/log/vmware/vsan-health/vmware-vsan-health-service.log:

INFO vsan-health[ID] [<user_name> op=UpdateHclDbFromWeb obj=VsanHealthService] Update HCL database from Web
INFO vsan-health[ID] [VsanHclUtil::_getHttpResponse] Download via proxy
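If you prefer not to scroll through the log by hand, the check can be scripted. The following is a small sketch that looks for the two message texts quoted above. The helper names (find_markers, proxy_download_logged) are mine, and it assumes the messages appear verbatim in the log.

```python
# Message texts copied from the two log lines quoted in this post.
MARKERS = (
    "Update HCL database from Web",
    "Download via proxy",
)


def find_markers(lines, markers=MARKERS):
    """Return the set of markers that occur in at least one log line."""
    found = set()
    for line in lines:
        for marker in markers:
            if marker in line:
                found.add(marker)
    return found


def proxy_download_logged(lines):
    """Return True only if every expected message was seen in the log."""
    return find_markers(lines) == set(MARKERS)
```

To use it, open /var/log/vmware/vsan-health/vmware-vsan-health-service.log and pass the file object to proxy_download_logged(). Note that it scans the whole file, so entries from earlier update attempts will match as well.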

However, even if the Internet connection works, the vSAN Build Recommendation Engine Health test will produce a warning message as follows:

vSAN Health Engine Issue - 01

In the log file you will see lines like these:

WARNING vsan-health[healthThread-c3ad57ea-a3f1-11e7] [VsanCloudHealthUtil::checkNetworkConnection] Internet is not connected.

File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py", line 337, in run
profiler=self.profiler):
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py", line 279, in collectedResults
VsanCloudHealthCollector.updateManifestWithPerCluster(serviceInstance)
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py", line 230, in updateManifestWithPerCluster
cls._updateManifest()
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py", line 190, in _updateManifest
manifestVersion = cls._queryManifestVersion()
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py", line 174, in _queryManifestVersion
dataType='manifest_version', objectId=MANIFEST_VERSION_UUID)
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthConnector.py", line 209, in getClusterHealth
maxRetries=maxRetries, waitInSec=waitInSec)
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthConnector.py", line 247, in getObject
responseBody = self._getPhoneHomeResultsWithRetries(urlParams)
File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthConnector.py", line 279, in _getPhoneHomeResultsWithRetries
raise e
VsanCloudHealthConnectionException: <urlopen error [Errno 110] Connection timed out>

Apparently, this is a bug in the current version of vSAN, documented in VMware KB 2151692. Neither a fix nor a workaround is available at the time of writing this blog post.
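To tell quickly whether your own log shows the same failure, it helps to reduce the traceback to its call chain and final exception line. Here is a minimal sketch for that. The function name summarize_traceback is my own, and the pattern accepts both straight and typographic quotes around the file path, since copied logs often end up with the latter.

```python
import re

# Matches traceback frame lines such as:
#   File "/path/to/Module.py", line 279, in some_function
FRAME_RE = re.compile(
    r'File [“"](?P<path>[^”"]+)[”"], line (?P<line>\d+), in (?P<func>\w+)'
)


def summarize_traceback(lines):
    """Return (frames, last_line) for an iterable of traceback lines.

    frames is a list of (file name, line number, function name) tuples in
    call order; last_line is the final non-empty line, which in a Python
    traceback carries the exception type and message.
    """
    frames = []
    last_line = ""
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        match = FRAME_RE.search(text)
        if match:
            file_name = match.group("path").rsplit("/", 1)[-1]
            frames.append((file_name, int(match.group("line")), match.group("func")))
        last_line = text
    return frames, last_line
```

For the traceback above, the last frame is _getPhoneHomeResultsWithRetries in VsanCloudHealthConnector.py and the final line is the VsanCloudHealthConnectionException with Errno 110, which on Linux means the connection attempt timed out.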