Host is not responding

I was recently deploying a Cloud Foundry instance and kept hitting errors during the deployment. Across several failed deployments, I received the following BOSH error messages:

Error 100: Unable to communicate with the remote host, since it is disconnected.

mysql_node/54: Unable to communicate with the remote host, since it is disconnected.

Error 100: A general system error occurred: Server closed connection after 0 response bytes read; SSL(TCPClientSocket(this=000000000de625e0, state=CONNECTED, _connectSocket=TCP(fd=-1), error=(null)) TCPStreamWin32(socket=TCP(fd=23280) local=10.23.6.17:59642, peer=10.17.0.156:443))

Looking at the events on the ESXi host I saw the following:

Host is not responding

What was causing the issue and how can it be fixed?

Continue reading

Cannot install the vCenter agent service. Cannot upload agent

During a recent change ticket in a non-production environment I was called into an All Paths Down (APD) situation on some ESXi hosts. For those who do not know what an APD is, it is when an ESXi host loses all paths to its shared storage. The ESXi hosts impacted in my particular case were hosting several virtual vCenter Server (vCS) instances. The virtual vCS VMs were being used by a group of developers for SOAP calls in order to provision, modify, and delete VMs. Once all the vCS instances had been recovered and vSphere client sessions verified that the instances were operational, the instances were turned back over to development. The developers immediately began complaining about SOAP commands failing to the vCS instances.

What was going on?

Continue reading

A general system error occurred internal error vmodl.fault.HostCommunication

I am in the process of building my home lab. I recently purchased two servers and installed ESXi 4.1 on them. In addition, I deployed a test vCenter Server instance so I could run VUM. With vCenter Server up, I attempted to add the two ESXi servers. The first one added without issue, but the second one failed with the error messages:

Cannot complete the configuration of the HA agent on the host. See the task details for additional information.

Misconfiguration in the host network setup

I verified that the hosts were in fact configured identically and then tried to add the host again, but the same error messages were displayed. Based on the error messages, I found the following KB article: http://kb.vmware.com/kb/1019200. Unfortunately, the link did not help.

Next, I removed the ESXi host from vCenter Server and tried to re-add it. This time I got a different error message:

A general system error occurred internal error vmodl.fault.HostCommunication

From this error message I found KB article: http://kb.vmware.com/kb/1012154. This article pointed to name resolution (i.e. DNS) being my issue. I know of the importance of DNS with VMware products and was sure I had verified its configuration, but decided to double check. As suspected, DNS was configured and working as expected.

At this point, I decided to restart the management services as that fixes a majority of ESX(i) issues. Upon doing so and trying to add the ESXi server to vCenter Server, I received another new error message:

Unable to access the specified host, either it doesn’t exist, the server software is not responding, or there is a network problem

This error message pointed me to KB article: http://kb.vmware.com/kb/1003409.

Again I tried everything suggested and was still receiving the same error message. At this point, I was frustrated. I decided to reboot the server just in case that fixed the issue. Upon restarting, the error message went back to the vmodl.fault.HostCommunication one.

What was going on and how could this be fixed?

Continue reading

fault.CustomizationPending.summary

Quick issue for you today. I was attempting to clone a RHEL 5 VM with a guest customization and the operation kept failing with: fault.CustomizationPending.summary. It turns out the VM I was attempting to clone had previously been cloned with a guest customization but was never powered on, so that customization was still pending. The resolution is simply to power on the VM and, once it boots up, shut it down. Then you should be able to clone the VM with a guest customization without issue. VMware posted a KB article on the issue (http://kb.vmware.com/kb/1006809); it only refers to VirtualCenter 2.5.x, though I just experienced this issue on vCenter Server 4.1.

vCenter Server and MAC address conflicts

As I am sure you have heard by now, VMware recently released some updates including ESX/ESXi 4.1 Update 1, vCenter Server 4.1 Update 1 and vCloud Director 1.0 Update 1. Unlike many other bloggers, I opted not to write about the updates because I did not find any of the information all that important to talk about (i.e. it did not apply to me). However, while reading over the list of known issues with vCenter Server 4.1 Update 1, I came across an issue referring to vCenter Server instances and potential MAC address conflicts. Since I have seen a growing number of environments where multiple vCenter Server instances have been deployed, I thought I would talk about the issue and one of the best practices around deploying such an environment.

vCenter Server generates the MAC addresses of virtual machines in part from an instance ID that is randomly generated and assigned to a vCenter Server instance at installation time. While this ID cannot be set during installation, it can be changed afterwards. Why is this important? Well, if you only have one vCenter Server instance in your environment, then it probably is not. Even if you do have multiple vCenter Server instances running (e.g. one for development and one for production), you will likely not run into any problems. With that said, if you have more than one vCenter Server running in your environment and by chance any of them have the same instance ID configured, then virtual machines managed by those vCenter Server instances may end up with the same MAC address. As such, it is a best practice to ensure that each vCenter Server is configured with a unique instance ID.

Directions on how to check and set instance IDs are specified in the vCenter Server 4.1 Update 1 release notes and copied below.

Virtual machine MAC address conflicts

Each vCenter Server system has a vCenter Server instance ID. This ID is a number between 0 and 63 that is randomly generated at installation time but can be reconfigured after installation.
vCenter Server uses the vCenter instance ID to generate MAC addresses and UUIDs for virtual machines. If two vCenter Server systems have the same vCenter instance ID, they might generate identical MAC addresses for virtual machines. This can cause conflicts if the virtual machines are on the same network, leading to packet loss and other problems.

Workaround: If you deploy virtual machines from multiple vCenter Server systems to the same network, you must ensure that these vCenter Server systems have unique instance IDs.

To view or change the vCenter Server instance ID:

  1. Log in to vCenter Server using the vSphere Client, and select Administration > vCenter Server Settings.
  2. Select Runtime Settings.
    The vCenter Server Unique ID text box displays the current vCenter Server instance ID.
  3. If this ID is not unique, type a new unique value between 0 and 63 in the vCenter Unique ID text box and click OK.
  4. If you changed the vCenter Server instance ID, restart vCenter Server for the change to take effect.

If you have existing virtual machines with conflicting MAC addresses, edit the MAC addresses to make them unique:

  1. Ensure that the virtual machine is powered off.
  2. In the vSphere Client inventory, right-click the virtual machine and select Edit Settings.
  3. On the Hardware tab, select the virtual network adapter for the virtual machine.
  4. Under MAC Address, select Manual and specify a unique MAC address.
  5. Click OK.

Alternatively, you can force vCenter Server to generate a new MAC address for the virtual network adapter by configuring the virtual network adapter to use a Manual MAC address, and then reconfiguring it to Automatic.
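
To make the uniqueness requirement concrete, here is a minimal Python sketch that flags vCenter Server systems sharing an instance ID. The inventory names and IDs are hypothetical. The `mac_prefix` helper additionally assumes the VMware OUI allocation scheme, in which the fourth octet of a generated MAC address is hexadecimal 80 plus the instance ID; that detail is not stated in the release notes quoted above, so treat it as an assumption.

```python
from collections import defaultdict
from typing import Dict, List

VMWARE_OUI = "00:50:56"  # OUI used for vCenter-generated MAC addresses


def mac_prefix(instance_id: int) -> str:
    """Return the first four octets expected for a given instance ID.

    Assumption: the fourth octet is 0x80 + instance ID (VMware OUI allocation).
    """
    if not 0 <= instance_id <= 63:
        raise ValueError("instance ID must be between 0 and 63")
    return f"{VMWARE_OUI}:{0x80 + instance_id:02x}"


def find_conflicting_ids(instance_ids: Dict[str, int]) -> Dict[int, List[str]]:
    """Group vCenter Server names by instance ID and keep only shared IDs."""
    by_id = defaultdict(list)
    for name, instance_id in instance_ids.items():
        by_id[instance_id].append(name)
    return {i: sorted(names) for i, names in by_id.items() if len(names) > 1}


# Hypothetical inventory: vCenter Server name -> configured instance ID.
inventory = {"vc-prod": 12, "vc-dev": 12, "vc-lab": 40}

conflicts = find_conflicting_ids(inventory)
for instance_id, names in conflicts.items():
    print(f"ID {instance_id} ({mac_prefix(instance_id)}:xx:xx) shared by: {', '.join(names)}")
```

Any pair reported here would need one of its instance IDs changed through Runtime Settings as described above, followed by a vCenter Server restart.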

Unable to add an ESX host to vCenter

While this issue has been discussed at length both in the communities and in knowledge base articles (e.g. http://kb.vmware.com/kb/1003409), I cannot find a single KB article that lists every step I would perform to fix the issue and the order in which I would perform them.

There are two different kinds of ESX to vCenter connectivity problems that I would like to discuss:

  1. Initially adding an ESX host
  2. Reconnecting a disconnected ESX host

In my experience, the first problem is almost always caused by a DNS or network connectivity issue. To solve this, first try to add the ESX host by IP address instead of FQDN. If this works, ensure the ESX host and vCenter Server have the appropriate DNS servers configured, that they can resolve other hosts through them (e.g. with nslookup), and most importantly that they can resolve each other. Once DNS is working properly, test connectivity between the ESX host and vCenter Server via ping and SSH.
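
The name resolution checks above can be sketched in Python as a forward and reverse lookup comparison. This is only an illustration: the host names and addresses are hypothetical, and the demo uses fake lookup tables so it runs without touching a real DNS server.

```python
import socket
from typing import Callable, List


def check_name_resolution(
    fqdn: str,
    expected_ip: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> List[str]:
    """Return a list of problems found; an empty list means DNS looks consistent."""
    try:
        ip = forward(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [f"forward lookup of {fqdn} failed: {exc}"]

    problems = []
    if ip != expected_ip:
        problems.append(f"{fqdn} resolves to {ip}, expected {expected_ip}")

    try:
        name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addresses)
    except OSError as exc:  # socket.herror is a subclass of OSError
        problems.append(f"reverse lookup of {ip} failed: {exc}")
    else:
        if name.rstrip(".").lower() != fqdn.rstrip(".").lower():
            problems.append(f"{ip} reverse-resolves to {name}, expected {fqdn}")
    return problems


# Hypothetical records standing in for a real DNS server.
FAKE_A = {"esx01.lab.example": "192.0.2.10", "vcenter.lab.example": "192.0.2.20"}
FAKE_PTR = {"192.0.2.10": "esx01.lab.example"}  # the vCenter PTR record is missing


def fake_forward(name: str) -> str:
    if name not in FAKE_A:
        raise socket.gaierror(f"unknown host {name}")
    return FAKE_A[name]


def fake_reverse(ip: str) -> tuple:
    if ip not in FAKE_PTR:
        raise socket.herror("no PTR record")
    return (FAKE_PTR[ip], [], [ip])


for host, ip in FAKE_A.items():
    issues = check_name_resolution(host, ip, fake_forward, fake_reverse)
    print(host, "OK" if not issues else issues)
```

To check real DNS, call `check_name_resolution` without the last two arguments, and run it from both sides, since the ESX host and vCenter Server each need to resolve the other.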

In the case of a disconnected ESX host, the second problem listed above, simply restarting the management services on the host fixes the issue most of the time.

The important thing to note is that after restarting the management services you may need to wait several minutes to confirm whether the issue is resolved (i.e. the host reconnects to vCenter). If this fails, in some rare cases closing out of the VI session and establishing a new connection resolves the issue. If the issue is still not resolved, disconnect the ESX host from vCenter and then manually remove the vmware-vpxa rpm from the host.

Finally, reconnect the ESX host. As a last-ditch effort, if the host is still listed as disconnected or remains “In Progress” during the “Add Host” function, restart the vCenter Server service.

While the KB articles suggest a variety of other options, the above steps have always resolved the issue for me.
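
The order of operations above, including the "wait several minutes" check after each step, can be sketched as a small escalation loop in Python. Everything here is illustrative: the step actions are placeholder functions, `is_connected` stands in for however you check the host's state in vCenter, and the demo simulates a host that reconnects after the agent is removed.

```python
import time
from typing import Callable, List, Optional, Tuple


def wait_until(check: Callable[[], bool], timeout_s: float, interval_s: float,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or timeout_s seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


def run_escalation(steps: List[Tuple[str, Callable[[], None]]],
                   is_connected: Callable[[], bool],
                   timeout_s: float = 300, interval_s: float = 15) -> Optional[str]:
    """Run each step in order; return the name of the step that fixed the host."""
    for name, action in steps:
        action()
        if wait_until(is_connected, timeout_s, interval_s):
            return name
    return None


# Simulated host state for the demo; a real check would query vCenter.
state = {"connected": False}


def no_effect() -> None:
    pass  # placeholder for a step that does not help in this simulation


def agent_removed() -> None:
    state["connected"] = True  # pretend the host reconnects after this step


steps = [
    ("restart management services", no_effect),
    ("open a new VI client session", no_effect),
    ("disconnect host, remove vmware-vpxa rpm, reconnect", agent_removed),
    ("restart vCenter Server service", no_effect),
]

# timeout_s=0 makes the demo check once per step instead of waiting minutes.
fixed_by = run_escalation(steps, lambda: state["connected"], timeout_s=0, interval_s=0)
print("resolved by:", fixed_by)
```

Keeping the steps as an ordered list mirrors the point of this post: try the least disruptive fix first, give it time to take effect, and only then move to the next one.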