Log Insight 3.0 Agents: Parser Examples

Now that you know all about the Log Insight 3.0 agent parsers, it is time for a quiz! Read on to learn more.
The quiz is straightforward: I will provide a log example and you tell me which parser to use. Bonus points if you can provide the entire agent configuration. Ready?

Test 1

22-09-2015 15:37:36,109 UTC INFO  [AgentDaemonMain] [ProductPluginManager@847] Loading plugin: system-plugin.jar (../../bundles/agent-x86-64-linux-6.1.0/pdk/plugins)
22-09-2015 15:37:36,197 UTC INFO  [AgentDaemonMain] [SystemPlugin@222] [getPlugin] type='autoinventory' info='AIX' res=org.hyperic.hq.plugin.system.SigarPlatformDetector@ba5d57f
22-09-2015 15:37:36,309 UTC INFO  [AgentDaemonMain] [AutoinventoryCommandsServer@185] Autoinventory Commands Server started up

The above events mix several fields and formats. At first glance, starting with the CLF base parser makes sense. Let’s break down the pieces of the events:

  • Timestamp: Unfortunately, the timestamp is in a non-standard format. While it can be extracted using a custom format, the default %t will not work. The timestamp could also just be ignored with “%i %i %i”
  • Severity: Easy with CLF base parser, but do note the double spaces or tab after
  • Class: Easy with CLF base parser
  • Thread: Easy with CLF base parser, but do note the threadid after the @ sign
  • Message: Some events could be parsed with KVP

Upon looking at the entire log message, it appears that none of the KVP pieces are valuable. As such, I would recommend the following configuration:

format=%i %i %i %{severity}i  [%{class}i] [%{thread}i@%M

Note the above example parsing section ignores the timestamp. In addition, the thread is properly extracted so that duplicate threads — with or without duplicate thread IDs — are properly grouped.
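For the bonus points, a complete agent configuration built around that parsing section might look like the following sketch. The section name, directory, and include pattern are hypothetical; adjust them to wherever these agent logs actually live:

```ini
; Hypothetical section name and file paths -- adjust to your environment
[filelog|hyperic-agent]
directory=/opt/hyperic/hq-agent/log
include=agent.log*
parser=hyperic_parser

[parser|hyperic_parser]
base_parser=clf
format=%i %i %i %{severity}i  [%{class}i] [%{thread}i@%M
```

Note the double space after the severity field, which matches the double space in the events.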

Test 2

{"nodeToken":"589a426e-64ed-4dbb-a8a2-19a7214c2898","system":{"systemStartTimestamp":1444070678904},"cpu":{"usage":3.172104520363772,"timestamp":1445782931589},"memory":{"usedMem":14696329216,"usedSwap":297488384,"timestamp":1445782931590},"disk":{"dataLocation":"/storage/core/loginsight/cidata","mount":"/storage/core","device":"/dev/mapper/data-core","used":120817180672,"available":129980788736,"total":264219529216,"usage":0.49},"iops":{"readsInRate":0.0,"writesOutRate":2.29,"dataReadRate":0.0,"dataWriteRate":15906.133333333333,"meanReadSize":0.0,"meanWriteSize":6945.909752547307,"timestamp":1445782931590},"ingestion":{"dropped":0,"droppedForPreservingData":0,"syslogReceived":3608716205,"importReceived":4688273006,"truncated":0,"meanIngestRate":2107.352715042263,"recentIngestRate":2.964393875E-314,"lastDropTimestamp":0,"lastIngestTimestamp":1444859241425},"gc":{"PS Scavenge":{"frequency":8,"averageTime":16.25,"timestamp":1445782931590},"PS MarkSweep":{"frequency":0,"averageTime":0.0,"timestamp":1445782931590}},"activeQueries":{}}

The above entry is in JSON. There is no JSON agent parser at this time, and as explained earlier, existing parsers should not be used for formats they were not intended for. As such, the above message should not be parsed. Do not attempt to use the CLF parser for JSON events. Note that the above event can still be collected and sent to Log Insight; it just should not be parsed client-side.
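To collect such events without parsing them client-side, simply omit the parser option from the filelog section. A minimal sketch, with hypothetical directory and include values:

```ini
; Collect the JSON stats events without client-side parsing
; (hypothetical paths; with no parser option, events are forwarded as-is)
[filelog|li-stats]
directory=/storage/var/loginsight
include=stats.log*
```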

Test 3

tag1:test    foo:bar    test:key    long:long long long message    long-long-long-label:short

You may not be familiar with the above format, but it is called Labeled Tab-separated Values, or LTSV. It is similar to KVP, but the separator is a colon and the pairs are delimited by tabs. This is another example of a format that the agent parsers do not support today, so the event should not be parsed, but it can still be collected and sent to Log Insight.

Test 4

2015-09-30 14:52:39,903 | DEBUG | consoleproxy | ABaseInitialServerTransfer | Initiating a destination connection to on port 902 |

Given the above event you might think:

  • This event should not be parsed because it is not in a standard format (actually, it is!)
  • This event can be parsed with CLF (it could, but this is not recommended)
  • This event looks a lot like CSV (right!)

The above event is basically CSV, but with pipes instead of commas. The CSV parser supports custom delimiters and should be leveraged. You do need to be careful with the trailing pipe, though.
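A sketch of a CSV-based parser for this event, assuming the CSV base parser accepts a custom delimiter option and that a trailing empty field name accounts for the trailing pipe (the section name and field names are illustrative):

```ini
; Illustrative CSV parser with a pipe delimiter
; The trailing pipe produces an empty final field, hence the trailing comma
[parser|proxy_csv]
base_parser=csv
delimiter=|
fields=timestamp,severity,thread,class,message,
```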


Alternatively, the above message could be parsed with CLF:

format=%t | %{severity}i | %{thread}i | %{class}i | %M

However, what if the next event looked like this:

2015-09-30 14:52:39,903 | DEBUG | test proxy | ABaseInitialServerTransfer | Initiating a destination connection to on port 902 |

Now the thread contains a space and the CLF parser can no longer be used. Whether or not this second event ever appears, the above format is value-separated, and as such it is not meant for the CLF parser.

Test 5

OK, last one. The kicker here is that the request method and the request URL need to be extracted from events that contain such information.

[UTC:2015-09-30 23:55:00 Local:2015-09-30 16:55] [Test]: Thread-Id: 20 - context=""  localContext="" (21) PUT roles/my/test/domain
[UTC:2015-09-30 23:55:00 Local:2015-09-30 16:55] [Test]: Thread-Id: 23 - context=""  localContext="" (21) Response: Created 0:00.067

The CLF parser is capable of handling all of the parsing; however, not all events contain the additional information that needs to be parsed. As such, multiple parsers are needed:

format=[UTC:%t Local:%i %i] [%{severity}i]: Thread-Id: %{thread}i - context="%{context}i"  localContext="%{localContext}i" %M
format=(%{transaction}i) %{request_method}i %{request_url}i
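Putting it together, a complete configuration might wire both parsers to the same filelog section. This sketch assumes the agent accepts a comma-separated list of parser names, applied in order; the section names and paths are hypothetical:

```ini
; Hypothetical section names and paths
[filelog|test-app]
directory=/var/log/test
include=test.log*
; assumes multiple parsers can be listed and are applied in order
parser=test_clf,test_request

[parser|test_clf]
base_parser=clf
format=[UTC:%t Local:%i %i] [%{severity}i]: Thread-Id: %{thread}i - context="%{context}i"  localContext="%{localContext}i" %M

[parser|test_request]
base_parser=clf
format=(%{transaction}i) %{request_method}i %{request_url}i
```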

You may look at the above parsing and realize that the KVP parser could be of use. This is true, but given that the format of the event is consistent and known, and the CLF parser is already being used, the KVP parser is not required. I know I said that the proper parser should be leveraged, but in the above example the format of the message is known, so either CLF or KVP would be a “proper” parser.


Parsing events can be fun and can be challenging. The biggest problem is that while there are some event format standards, like the syslog RFC, and structured formats, like LTSV, no one follows the same standard, and for complex applications it is common to see several different formats on the same system or even within the same file. The good news is that for every application you can determine the agent configuration once and then re-use it. In my next two posts, I will explain some re-use functionality that has been added to Log Insight 3.0.
How did you do on the quiz?

© 2015, Steve Flanders. All rights reserved.
