Continuing with the regex theme this week, I would like to cover a corner case in regular expression matching that is worth being aware of. The example involves a single event that spans multiple lines (that is, it contains newline characters) and the use of the .* regex.
2014-01-30 14:38:12,032 [Thread-57] [iaas-proxy] ERROR com.vmware.vcac.iaas.service.impl.CatalogRequestServiceImpl.failed:384 – Exception during request callback with id a739e6e6-d9fd-411d-a6a7-dd3b4d42a276 for item 8dfc5e04-d6a1-4dbc-91c7-b4418fc0c632. Error Message: [Error code: 42100 ] – [Error Msg: Infrastructure service provider error]
at java.lang.Thread.run(Unknown Source)
To highlight the gotcha, I will use the above event in Log Insight in an attempt to extract a field.
Extracting a field using one line
Let’s say I want to extract the exception reason from the end of the event. To do so, I may define the extracted field as follows in Log Insight:
Extracting a field using multilines
While the above definition works, it may not match what I want. For example, what if I want to know the exception reason only for ApiClientException? To do this, I could add more context to my extracted field as follows:
In the second example, you can see that none of the event is highlighted. Looking at the regex you may wonder why. The only change was the addition of:
ApiClientException.*
to the pre-context, which should match ApiClientException followed by anything, right? Well, anything except for newline characters. It turns out that the period (.) by default means any character except a newline. Given that the example log message is a multiline message, this is a problem: the .* stops at the end of the first line and never reaches the text on the following lines.
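The same behavior is easy to reproduce outside of Log Insight. Below is a minimal sketch in Python, whose re module also treats the period as "anything except a newline" by default. The event string and the pattern are hypothetical, simplified stand-ins for the real log message and the extracted field definition, not what Log Insight generates internally.

```python
import re

# Hypothetical two-line event: a simplified stand-in for the real log message.
event = (
    "ERROR ApiClientException during request callback\n"
    "[Error Msg: Infrastructure service provider error]"
)

# Pre-context "ApiClientException.*" followed by the value to capture.
# By default "." does not match "\n", so ".*" cannot cross onto line two.
pattern = re.compile(r"ApiClientException.*Error Msg: ([^\]]+)")

match = pattern.search(event)
print(match)  # None: the pattern never reaches the second line
```

Nothing matches because the text the pattern is looking for sits on the line after the newline.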
To resolve the issue, replace the period with a construct such as:
[\d\D]
which means any digit or non-digit (including newlines). With the above modification, the extracted field works as expected:
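To see the fix in action outside of Log Insight, here is the same kind of sketch in Python with the period swapped for a digit-or-non-digit character class. As before, the event string and the pattern are hypothetical stand-ins for illustration only.

```python
import re

# Hypothetical two-line event: a simplified stand-in for the real log message.
event = (
    "ERROR ApiClientException during request callback\n"
    "[Error Msg: Infrastructure service provider error]"
)

# "[\d\D]" matches any digit or non-digit, which includes "\n",
# so "[\d\D]*" can span the line break that ".*" could not.
pattern_fixed = re.compile(r"ApiClientException[\d\D]*Error Msg: ([^\]]+)")

match_fixed = pattern_fixed.search(event)
reason = match_fixed.group(1) if match_fixed else None
print(reason)  # Infrastructure service provider error
```

The capture group stops at the closing bracket, so only the exception reason is returned.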
While this is a corner case, since multiline messages are less common and not well handled in syslog, it can cause a lot of frustration at first. I hope this helps!
© 2014, Steve Flanders. All rights reserved.