I hit an interesting issue the other day when attempting to reconfigure remote syslog on some ESXi hosts. What followed was an exercise in troubleshooting remote syslog on an ESXi host, and I wanted to share some tips.
It all started when I attempted to configure syslog on some ESXi hosts using PowerCLI. Here is what I saw:
    > Set-VMHostSyslogServer -VMHost esx01.matrix -SyslogServer "tcp://192.168.2.20"
    Set-VMHostSysLogServer : 4/25/2014 9:59:28 AM Set-VMHostSysLogServer Unable to configure syslog server information for host 'esx01.matrix'. Additional info: Message: Got no data from process /usr/lib/vmware/vmsyslog/bin/esxcfg-syslog --plugin=esxcli --loghost='tcp://192.168.2.10'; InnerText: Got no data from process /usr/lib/vmware/vmsyslog/bin/esxcfg-syslog --plugin=esxcli --loghost='tcp://192.168.2.10' EsxCLI.CLIFault.summary Check the inner exception for more details.
    At line:1 char:36
    + get-vmhost | set-VMHostSysLogServer <<<< -SysLogServer "tcp://192.168.2.20"
        + CategoryInfo : InvalidArgument: (esx01,matrix:VMHostImpl) [Set-VMHostSysLogServer], VimException
        + FullyQualifiedErrorId : Client20_SystemManagementServiceImpl_SetVmHostSysLogServer_ViError,VMware.VimAutomation.ViCore.Cmdlets.Commands.Host.SetVMHostSysLogServer
Confused by the error, I attempted to perform the command on the ESXi host using esxcli:
    # esxcli system syslog config get
       Default Network Retry Timeout: 180
       Local Log Output: /scratch/log
       Local Log Output Is Configured: false
       Local Log Output Is Persistent: true
       Local Logging Default Rotation Size: 1024
       Local Logging Default Rotations: 8
       Log To Unique Subdirectory: false
       Remote Host: tcp://192.168.2.10
    # esxcli system syslog config set --loghost="tcp://192.168.2.20"
    Got no data from process /usr/lib/vmware/vmsyslog/bin/esxcfg-syslog --plugin=esxcli --loghost='tcp://192.168.2.20'
This confirmed that the issue was not PowerCLI related. So what is going on and how can you fix it? Well, I was in a rush and had the luxury of rebooting the host, so that is what I did. Even after the reboot, the issue persisted! So next it was time to actually troubleshoot the issue.
What you may or may not realize is that there is a hidden log file just for the ESXi syslog daemon. I started by tailing this file:
    # tail /var/log/.vmsyslogd.err
    2014-04-25T02:24:13.785Z vmsyslog.loggers.network : ERROR ] 10.148.104.10:514 - socket init calls failed: <class 'socket.error'>
    2014-04-25T08:00:51.982Z vmsyslog.main : ERROR ] reloading (78683)
    2014-04-25T08:00:57.031Z vmsyslog.loggers.network : ERROR ] 10.118.9.60:514 - socket init calls failed: <class 'socket.error'>
    2014-04-25T08:01:02.032Z vmsyslog.loggers.network : ERROR ] 10.118.9.203:1514 - socket init calls failed: <class 'socket.error'>
    2014-04-25T08:01:07.047Z vmsyslog.loggers.network : ERROR ] 10.148.104.10:514 - socket init calls failed: <class 'socket.error'>
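When several remote targets are configured, it can help to see which ones fail most often. Here is a minimal sketch, assuming the error log keeps the line layout shown above and that the host's shell provides awk and sort. The function name count_socket_failures is my own invention for illustration, not something ESXi ships:

```shell
# Hypothetical helper (not part of ESXi): tally "socket init calls failed"
# errors per remote target in a vmsyslogd error log.
# Assumes the layout shown above, where the target follows the "]" field.
count_socket_failures() {
    log_file="${1:-/var/log/.vmsyslogd.err}"
    awk '/socket init calls failed/ {
        for (i = 1; i < NF; i++) {
            if ($i ~ /\]$/) { count[$(i + 1)]++; break }
        }
    }
    END { for (target in count) print count[target], target }' "$log_file" | sort -rn
}
```

Calling count_socket_failures with no argument reads /var/log/.vmsyslogd.err; pass a path as the first argument to inspect a copy of the log instead. Each output line is a failure count followed by the host:port it applies to.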
So it appears that the syslog process is stuck with a socket error. After some trial and error, I found that the following commands fixed the issue:
    kill -9 `ps -Cuv | grep syslog | awk '{print $1}'`
    esxcli system syslog config set --loghost="tcp://192.168.2.10"
    kill -9 `ps -Cuv | grep syslog | awk '{print $1}'`
    esxcli system syslog config set --loghost="tcp://192.168.2.10"
Yes, I know I am running the same two commands twice, but I found this to be necessary on some ESXi hosts for the change to take effect (possibly watchdog related). So what do the two commands do? First, the syslog daemon is killed; then syslog is reconfigured, which in turn starts the daemon automatically. After running the above commands, the syslog error log looked better:
    # tail /var/log/.vmsyslogd.err
    2014-04-25T16:35:58.213Z vmsyslog.loggers.network : ERROR ] 10.148.104.10:514 - socket init calls failed: <class 'socket.error'>
    2014-04-25T16:41:04.379Z vmsyslog.main : ERROR ] reloading (477791)
    2014-04-25T16:46:04.263Z vmsyslog.main : ERROR ] Watchdog 477790 fired (child 477791 died with status 9)!
    2014-04-25T16:46:04.354Z vmsyslog : CRITICAL] vmsyslogd daemon starting (479480)
    2014-04-25T16:46:04.363Z vmsyslog.main : ERROR ] Watchdog 477790 exiting
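The kill-and-reconfigure sequence can also be wrapped in a small loop so that it repeats only as often as needed. This is a rough sketch, not something I have validated across hosts: the function name reset_syslog_loghost and the default retry limit are made up for illustration, and it assumes that the Remote Host line of esxcli system syslog config get is a fair indicator of whether the change took effect.

```shell
# Hypothetical helper: kill the syslog daemon, re-apply the log host, and
# check whether the setting stuck. Repeats up to max_tries times, since a
# single pass was not always enough on some hosts.
reset_syslog_loghost() {
    loghost="$1"
    max_tries="${2:-3}"
    try=1
    current=""
    while [ "$try" -le "$max_tries" ]; do
        # Same pipeline as the manual fix; "grep -v grep" drops the grep process itself.
        pids=$(ps -Cuv | grep syslog | grep -v grep | awk '{print $1}')
        if [ -n "$pids" ]; then
            kill -9 $pids
        fi
        # Reconfiguring is what starts the daemon again.
        esxcli system syslog config set --loghost="$loghost"
        # Assumption: the "Remote Host:" line shows the value now in effect.
        current=$(esxcli system syslog config get | awk -F': ' '/Remote Host:/ {print $2}')
        if [ "$current" = "$loghost" ]; then
            echo "loghost set to $current after $try attempt(s)"
            return 0
        fi
        try=$((try + 1))
    done
    echo "loghost is still '$current' after $max_tries attempt(s)" >&2
    return 1
}
```

For example, reset_syslog_loghost "tcp://192.168.2.10" would try up to three times, and a second argument changes that limit. It is still worth tailing /var/log/.vmsyslogd.err afterwards, because a matching setting does not prove the socket errors are gone.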
Now, I still have not tracked down what causes the issue, but I have only seen it on ESXi 5.0/5.1 and only after a syslog misconfiguration or a host reconfiguration (e.g., a move from one vCenter Server to another). For more information about this issue and similar ESXi syslog issues, see these links:
- Cannot set Syslog.global.logHost either by GUI or PowerCLI
- Incorrect Syslog configuration in the /etc/vmsyslog.conf file causes multiple issues
© 2014, Steve Flanders. All rights reserved.