If you are upgrading or have upgraded to vSphere 6.0, you should be aware of a couple of syslog gotchas. These are especially important if you are running a central logging system like vRealize Log Insight. Read on to learn more.
vCenter Server / VCSA
If you are upgrading or have upgraded from vCenter Server / VCSA 5.x to 6.0 and had been sending logs to a remote syslog destination like Log Insight, be aware that log forwarding stops working after the upgrade! Luckily, this appears to be a one-time, post-upgrade issue with an easy fix: restart the syslog service. VMware has released a KB for this issue with the steps necessary to fix it from the vSphere client as well as the CLI.
ESXi
Credit: my colleague Alan Castonguay uncovered this issue and the details described below.
Once you upgrade to ESXi 6.0, you may notice that your forwarded hostd logs no longer follow the syslog RFC.
Important: This issue ONLY impacts hostd logs and not other logs like vpxd.
For example, when I sent them to my Log Insight instance they appeared like:
NoneZ esx03.matrix Hostd: 2015-03-17T01:14:32.870Z info hostd[59C40B70] [Originator@6876 sub=Vimsvc.TaskManager opID=3de02b57-61fe user=vpxuser] Task Created : haTask-ha-host-vim.host.NetworkSystem.commitTransaction-62674
Notice how the event starts with “NoneZ” instead of a timestamp. The good news is that the events do have a proper timestamp on ESXi itself (i.e. in /var/log/hostd.log), which means the issue only impacts forwarded events. Since the forwarded events do not comply with the syslog RFC, they can cause issues on remote logging products such as Log Insight. For example, the appname (Hostd) will not be properly extracted because the parser does not find a timestamp where it expects one.
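To make the parsing failure concrete, here is a minimal Python sketch of the kind of header check a remote logging product might perform. It is not how Log Insight actually parses events. It simply assumes a well-formed forwarded event begins with an ISO 8601 UTC timestamp, followed by the hostname and the appname. The `parse_header` function and the `healthy` sample are hypothetical and only for illustration; the `broken` sample is the event captured from ESXi 6.0.

```python
import re

# Assumption: a well-formed forwarded event starts with an ISO 8601 UTC
# timestamp (e.g. 2015-03-17T01:14:32.870Z), then the hostname, then "appname:".
HEADER = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z)\s+"
    r"(?P<hostname>\S+)\s+"
    r"(?P<appname>[^\s:]+):"
)


def parse_header(message):
    """Return the timestamp, hostname and appname, or None if the header is malformed."""
    match = HEADER.match(message)
    return match.groupdict() if match else None


# Event as forwarded by ESXi 6.0: "NoneZ" sits where the timestamp should be.
broken = ("NoneZ esx03.matrix Hostd: [LikewiseGetDomainJoinInfo:355] "
          "QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):")

# Hypothetical well-formed version of the same event, for comparison only.
healthy = ("2015-03-17T01:14:32.870Z esx03.matrix Hostd: [LikewiseGetDomainJoinInfo:355] "
           "QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):")

print(parse_header(broken))   # None: no timestamp, so no appname is extracted
print(parse_header(healthy))  # dict with timestamp, hostname and appname
```

Because the check anchors on the timestamp, a single bad leading token is enough to lose the hostname and appname fields as well, which matches the behavior described above.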
Now you might be wondering, how can you confirm this is an ESXi issue and not a remote logging product issue? Here is a little shell command you can run on ESXi to confirm:
[[email protected]:/var/log] tcpdump-uw -c 99999 -B 1500 -i vmk0 -s 1514 -vvv 'dst port 514 and ip host li01.matrix' | grep 'Hostd:'
tcpdump-uw: listening on vmk0, link-type EN10MB (Ethernet), capture size 1514 bytes
Msg: NoneZ esx03.matrix Hostd: [LikewiseGetDomainJoinInfo:355] QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):\0x0a
This issue was just uncovered during post-upgrade testing of vSphere 6 with Log Insight. I am not aware of a VMware KB at this time, but stay tuned!
© 2015, Steve Flanders. All rights reserved.
11 comments on “vSphere 6.0 Syslog Gotchas”
Just came across this while searching for RFC issues I’m seeing with syslog messages from ESX. Contrary to what your blog seems to say, though, hostd and especially hostd-probe messages from ESX 5.x were already blatantly violating the syslog RFC before 6.0, in even worse ways, like sending some unknown text (Section for VMware ESX) in front of the hostname, which makes the syslog server see the hostname as “Section”.
Thanks for the comment and you are absolutely right! My point was that previously some of the hostd events did comply and could be picked up by the vSphere content pack in Log Insight, but in vSphere 6.0 all hostd logs were missing the timestamp, so while some worked before, all failed in 6.0. ESXi logs have improved over time, but I have also seen multiple RFC issues with them.
That’s not all. My syslog server can handle the missing timestamp and skips it, but then barks because the next field has to be the hostname and instead contains a wide variety of nonsense data. Like a vmnic up/down message starting with the MAC address instead of the hostname. Or the above-mentioned “Section for VMware...” junk.
All in all, I’m pretty dumbstruck at how bad VMware’s syslog implementation really is…
Pretty cool that it can skip the timestamp, but that is yet another syslog RFC violation. Agreed that the syslog RFC violations are surprising.
I was just surfing the web looking into this issue because it seemed very odd, and since we have not patched in a while, I was wondering whether or not this had been resolved recently.
Hey Alex, it has been fixed in 6.0.0b — see the KB from Ravi
Did you ever figure out what’s up with all of these “[LikewiseGetDomainJoinInfo:355] QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):” messages in the syslog? I’ve upgraded to ESXi 6.0 and am redirecting to SolarWinds NPM 11.5.3. I’m seeing tons of these in our syslog under the error category. I’ve patched up and rebooted the host(s) to version 6.0.0 build 3620759 and still see the errors being generated. I did rebuild my vCenter and noticed VMware had a KB article stating if you didn’t put the host into maint. mode first before changing vCenters you’ll get these errors. However, even putting the host into maint. mode and rebooting/patching doesn’t resolve the errors being generated.
Hey Cody, I did not. Actually, I am only using local accounts in the home lab, so it is weird that Likewise is even in play.
Any further insight you could share about the “[LikewiseGetDomainJoinInfo:355] QueryInformation(): ERROR_FILE_NOT_FOUND (2/0):” messages that we are receiving by the thousands?
That message is safe to ignore. Likewise is used for authentication. I believe you can resolve the log message by joining a domain. If you do not plan to do this then you should just ignore the log message. More information here: https://communities.vmware.com/thread/547358.