I am in the process of building my home lab. I recently purchased two servers and installed ESXi 4.1 on them. In addition, I deployed a test vCenter Server instance so I could run VUM. With vCenter Server up, I attempted to add the two ESXi servers. The first one added without issue, but the second one failed with the error messages:
Cannot complete the configuration of the HA agent on the host. See the task details for additional information.
Misconfiguration in the host network setup
I verified that the hosts were in fact configured identically and then tried to add the host again, but the same error messages were displayed. Based on the error messages, I found the following KB article: http://kb.vmware.com/kb/1019200. Unfortunately, the article did not help.
Next, I removed the ESXi host from vCenter Server and tried to re-add it. This time I got a different error message:
A general system error occurred internal error vmodl.fault.HostCommunication
From this error message I found KB article: http://kb.vmware.com/kb/1012154. This article pointed to name resolution (i.e. DNS) being my issue. I know the importance of DNS with VMware products and was sure I had verified its configuration, but decided to double check. As I suspected, DNS was configured and working correctly.
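For anyone who wants to repeat the DNS double check, here is a minimal sketch of a forward and reverse lookup test in Python. It is not taken from the KB article. The function name check_dns and the host names in the usage note are hypothetical, and the resolver functions are passed in as parameters only so the check can be tried without a live DNS server.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare the names.

    Returns (ip, reverse_name, consistent). The forward/reverse arguments
    default to the standard library resolvers; they are parameters only so
    the function can be exercised with stand-in resolvers.
    """
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addresses)
    # Compare case-insensitively and ignore a trailing dot on either name.
    consistent = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ip, reverse_name, consistent
```

Calling check_dns("esxi02.lab.local") (a made-up name) from the vCenter Server machine would show whether the forward and reverse records agree. A failed lookup raises an OSError subclass, which is itself a useful answer.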
At this point, I decided to restart the management services, since that fixes the majority of ESX(i) issues. After doing so, I tried to add the ESXi server to vCenter Server again and received yet another new error message:
Unable to access the specified host, either it doesn’t exist, the server software is not responding, or there is a network problem
This error message pointed me to KB article: http://kb.vmware.com/kb/1003409.
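An error like this usually calls for a basic reachability test first. The following is a minimal sketch of a TCP port check in Python, not the procedure from the KB article. The function name port_open is hypothetical, and the ports to test (443 and 902 are the usual ones between vCenter Server and a host) are my assumption rather than something stated above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Running port_open against the ESXi host for each port of interest from the vCenter Server machine separates a network problem from a problem with the management agents. Note that this only tests TCP; it says nothing about UDP traffic.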
Again I tried everything suggested and was still receiving the same error message. At this point, I was frustrated. I decided to reboot the server just in case that fixed the issue. Upon restarting, the error message went back to the vmodl.fault.HostCommunication one.
What was going on and how could this be fixed?
At this point I was out of ideas and needed to log into the ESXi host to figure out what was going on. VMware typically recommends that if you need to log into the ESXi host then you should probably just reinstall it. I would have taken this advice, but I did not have shared storage in place and as such would need to manually move my VMs first.
Out of curiosity, and because I did not want to move my VMs, I decided to log into the host. Once logged in, I noticed the vpxa user existed, but saw no reference to aam (i.e. HA). I decided to force an uninstall of vpxa by running /opt/vmware/uninstallers/VMware-vpxa-uninstall.sh and then restarted the ESXi services with /sbin/services.sh. While I did receive an error, it appeared vpxa uninstalled completely. While running tail -f /var/log/vmware/hostd.log I tried to connect the host again and was still unsuccessful.
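Watching a busy log with tail -f makes it easy to miss the interesting lines. As a rough sketch, a copy of the log pulled off the host could be filtered with a few lines of Python. The function name matching_lines and the default keywords are hypothetical choices of mine, not anything hostd defines.

```python
def matching_lines(lines, keywords=("error", "fault", "vpxa")):
    """Yield lines containing any keyword, compared case-insensitively.

    The default keywords are illustrative guesses; adjust them to the
    problem being chased.
    """
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")
```

Opening the copied hostd.log and looping over matching_lines(file_object) prints only the lines worth reading. A plain substring match like this is crude and will also match unrelated lines that happen to contain a keyword.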
With no error messages left that I had not already researched, I decided to migrate all the VMs and reinstall the host. After the reinstall, the host connected to vCenter Server without issue. Unfortunately, I still have no idea what was causing the problem.
© 2011, Steve Flanders. All rights reserved.