vSAN 6.6.1: vSAN Build Recommendation Engine Health issue [RESOLVED]

In my previous post about vSAN Build Recommendation Engine Health test, I have concluded that it was a bug in vSAN 6.6.1 that prevented vSAN Health service from properly connecting to the Internet via proxy.

With vCenter Server Appliance 6.5 Update 1d release, I have noticed that one of two warning messages disappeared from the vSphere Web Client leaving that task in the ‘Unexpected vSphere Update Manager (VUM) baseline creation failure‘ state.

After checking vSAN configuration one more, I concluded the following:

  • Internet connectivity for automatic updates of the HCL database has been set up properly (vSAN_Cluster > Configure > vSAN > General):

vSAN-BRE-01

  • The HCL database is up-to-date and CEIP is enabled (vSAN_Cluster > Configure > vSAN > Health and Performance):

vSAN-BRE-02

vSAN-BRE-03

  • Update Manager has proxy settings configured and working (vSAN_Cluster > Update Manager > Go to Admin View > Manage > Settings > Download Settings):

vSAN-BRE-04

vSAN-BRE-05

At the same time, the proxy server replaces SSL certificates with its own one signed by the corporate CA when establishing HTTPS connection with the remote peer.

As a result, it causes an error message for the vSAN Build Recommendation Engine Health task as follows (extract from vmware-vsan-health-service.log):

INFO vsan-health[Thread-49] [VsanVumConnection::RemediateVsanClusterInVum] build = {u’release’: {u’baselineName’: u’VMware ESXi 6.5.0 U1 (build 5969303)’, u’isoDisplayName’: u’VMware ESXi Release 6.5.0, Build 5969303′, u’bldnum’: 5969303, u’vcVersion’: [u’6.5.0′], u’patchids’: [u’ESXi650-Update01′], u’patchDisplayName’: u’VMware ESXi 6.5.0 U1 (vSAN 6.6.1, build 5969303)’}}

INFO vsan-health[Thread-49] [VsanVumConnection::_LookupPatchBaseline] Looking up baseline for patch VMware ESXi 6.5.0 U1 (vSAN 6.6.1, build 5969303) (keys: [40], hash: None)…

INFO vsan-health[Thread-49] [VsanVumConnection::_LookupPatchBaseline] Looking up baseline for patch vSAN recommended patch to be applied on top of ESXi 6.5 U1: ESXi650-201712401-BG (keys: [], hash: None)…

ERROR vsan-health[Thread-49] [VsanVumConnection::RemediateAllClusters] Failed to remediate cluster ‘vim.ClusterComputeResource:domain-c61’

Traceback (most recent call last):
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVumConnection.py”, line 1061, in RemediateAllClusters
performScan = performScan)
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVumConnection.py”, line 876, in RemediateVsanClusterInVum
patchName, patchMap[chosenRelease])
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVumConnection.py”, line 373, in CreateBaselineFromOfficialPatches
baseline = self._LookupPatchBaseline(name, keys)
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVumConnection.py”, line 411, in _LookupPatchBaseline
result = bm.QueryBaselinesForUpdate(update = updateKeys)
File “/usr/lib/vmware-vpx/pyJack/pyVmomi/VmomiSupport.py”, line 557, in <lambda>
self.f(*(self.args + (obj,) + args), **kwargs)
File “/usr/lib/vmware-vpx/pyJack/pyVmomi/VmomiSupport.py”, line 362, in _InvokeMethod
list(map(CheckField, info.params, args))
File “/usr/lib/vmware-vpx/pyJack/pyVmomi/VmomiSupport.py”, line 883, in CheckField
raise TypeError(‘Required field “%s” not provided (not @optional)’ % info.name)
TypeError: Required field “update” not provided (not @optional)

INFO vsan-health[Thread-49] [VsanVumSystemUtil::AddConfigIssue] Add config issue createBaselineFailed

INFO vsan-health[Thread-49] [VsanVumConnection::_DeleteUnusedBaselines] Deleting baseline VMware ESXi 6.5.0 U1 (vSAN 6.6.1, build 5969303) (id 424) because it is unused

INFO vsan-health[Thread-49] [VsanVumSystemUtil::VumRemediateAllClusters_DoWork] Complete VUM check for clusters [‘vim.ClusterComputeResource:domain-c61’]

ERROR vsan-health[Thread-49] [VsanVumConnection::RemediateAllClusters] Failed to remediate cluster

Following the community advice, I decided to add Root CA and subordinate CA certificates (in *.pem format) to the local keystore on vCenter Server Appliance. After copying certificates to /etc/ssl/certs and running the c_rehash command, I added proxy servers to /etc/sysconfig/proxy and rebooted the server.

vSAN-BRE-07

To test that new configuration works, I used the wget command, and it all seemed to work smoothly.

vSAN-BRE-06

Regardless of all that changes, I still got error messages with the vSAN Build Recommendation Engine Health test, but this time they looked a bit different:

INFO vsan-health[Thread-11125] [VsanPyVmomiProfiler::InvokeAccessor] Invoke: mo=ServiceInstance, info=content

WARNING vsan-health[Thread-11125] [VsanPhoneHomeWrapperImpl::_try_connect] Cannot connect to VUM. Will retry connection

INFO vsan-health[Thread-11125] [VsanPyVmomiProfiler::InvokeAccessor] Invoke: mo=group-d1, info=name

INFO vsan-health[Thread-11125] [VsanPyVmomiProfiler::InvokeAccessor] Invoke: mo=group-d1, info=name

ERROR vsan-health[Thread-11125] [VsanCloudHealthDaemon::run] VsanCloudHealthSenderThread exception: Exception: HTTP Error 411: Length Required, Url: https://vcsa.vmware.com/ph/api/dataapp/send?_v=1.0&_c=VsanCloudHealth.6_5&_i=<Support_Tag&gt;, Traceback: Traceback (most recent call last):
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthUtil.py”, line 511, in getResponse
resp = proxyOpener.open(*args, **kwargs)
File “/usr/lib/python2.7/urllib2.py”, line 435, in open
response = meth(req, response)
File “/usr/lib/python2.7/urllib2.py”, line 548, in http_response
‘http’, request, response, code, msg, hdrs)
File “/usr/lib/python2.7/urllib2.py”, line 473, in error
return self._call_chain(*args)
File “/usr/lib/python2.7/urllib2.py”, line 407, in _call_chain
result = func(*args)
File “/usr/lib/python2.7/urllib2.py”, line 556, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 411: Length Required
Traceback (most recent call last):
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py”, line 353, in run
self._sendCloudHealthData(clusterUuid, data=data)
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthDaemon.py”, line 321, in _sendCloudHealthData
objectId=clusterUuid, additionalUrlParams=additionalUrlParams)
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthConnector.py”, line 156, in send
dataType=dataType, pluginType=pluginType, url=postUrl)
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthConnector.py”, line 139, in sendRawData
raise ex
VsanCloudHealthHTTPException: Exception: HTTP Error 411: Length Required, Url: https://vcsa.vmware.com/ph/api/dataapp/send?_v=1.0&_c=VsanCloudHealth.6_5&_i=<Support_Tag&gt;, Traceback: Traceback (most recent call last):
File “/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanCloudHealthUtil.py”, line 511, in getResponse
resp = proxyOpener.open(*args, **kwargs)
File “/usr/lib/python2.7/urllib2.py”, line 435, in open
response = meth(req, response)
File “/usr/lib/python2.7/urllib2.py”, line 548, in http_response
‘http’, request, response, code, msg, hdrs)
File “/usr/lib/python2.7/urllib2.py”, line 473, in error
return self._call_chain(*args)
File “/usr/lib/python2.7/urllib2.py”, line 407, in _call_chain
result = func(*args)
File “/usr/lib/python2.7/urllib2.py”, line 556, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 411: Length Required

INFO vsan-health[Thread-11125] [VsanCloudHealthDaemon::run] VsanCloudHealthSenderThread done.

vsan-health[Thread-9] [VsanCloudHealthDaemon::_sendExceptionsToPhoneHome] Exceptions for collection/sending exceptions

I thought that the vSAN Health service might try to contact vSphere Update Manager directly, and the proxy settings set on the OS level redirected this request to the Internet proxy instead.

I have added the local domain to the exception list in /etc/sysconfig/proxy and rebooted the server again.

vSAN-BRE-08

After reading about ‘HTTP Error 411’, the only idea was to add a domain service account and its password to HTTP_PROXY and HTTPS_PROXY lines in /etc/sysconfig/proxy. If the password has special characters, they should be added in ASCII encoding to work correctly.

To my great surprise, all communication issues have been resolved, and the vSAN Health service was able to synchronise data with vSphere Update Manager and online services correctly.

vSAN-BRE-09

vSAN-BRE-11

A few minutes later vSAN system baselines and baseline groups appeared in vSphere Update Manager.

vSAN-BRE-10

Of cause, those modifications in Photon OS configuration files are not supported by VMware and could be overwritten by future updates. Yet I hope engineers and developers are working on better integration between vSAN Health and vSphere Update Manager when vCenter resides behind the proxy.

23/02/2018 – Update 1: Per VMware documentation, a starting point to troubleshoot connectivity to the CEIP web server is to make sure the following prerequisites are met:

One thought on “vSAN 6.6.1: vSAN Build Recommendation Engine Health issue [RESOLVED]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s