Log Insight 3.3: Query API

A great new addition to Log Insight 3.3 is the introduction of a query API. While the initial documentation for the feature has not been posted yet, it is in progress and should be available soon. In the meantime, I have included the latest information so you can start leveraging the API today. Read on to learn more!

Credit

First and foremost, shout out to Nick Kushmerick for putting together this documentation — this post is all him!

Background

The Log Insight Query API allows end-users to programmatically query Log Insight to retrieve events, and aggregations over events. The API:

  • Exposes simple event and aggregated-event queries as HTTP GETs:
    1. Events
      GET /api/v1/events/constraint1/constraint2/…?param1&param2&...
    2. Aggregated events
      GET /api/v1/aggregated-events/constraint1/constraint2/…?param1&param2&...
  • Allows structured queries against both static fields, and dynamic fields defined in content packs.
  • Offers several standard aggregation functions (COUNT, UCOUNT, AVG, MIN, MAX, SUM, STDEV, VARIANCE, SAMPLE) on both static fields, and dynamic fields defined in content packs.
  • Allows aggregating events by time into fixed-width bins.
  • Defaults to simple, fast queries:
    1. Events: up to 100 events from the last minute, with a 30-second timeout
    2. Aggregated events: as above, with 5-second time bins and the COUNT aggregation function

NOTE: The query API today has the same limits as the UI in terms of returned results.

Authentication

The Log Insight Query API requires authentication, and Log Insight denies requests from non-authorized users.  Specifically, the Query API requires authentication by a user with at least the “User” role. Before invoking the Query API, your client must first authenticate and obtain a session id by POSTing to /api/v1/sessions, and then send the session id with the special header X-LI-Session-Id in subsequent requests to the Query API.  For example:

$ curl -sk -X POST -H 'Content-Type: application/json' --data "{\"username\":\"________\", \"password\":\"_______\"}" https://_________/api/v1/sessions
{"userId":"8a30e36a-d525-48bb-aaa8-9deec8ba21f1","sessionId":"2pj5iYL....nQB6c=","ttl":1800}
$ curl -sk -H 'X-LI-Session-Id: 2pj5iYL....nQB6c=' https://_________/api/v1/events
{"complete":true, … }

More information about the Authentication API can be found in my previous post here.
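
If you would rather script the two steps than use curl, here is a minimal Python sketch using only the standard library. It is not an official client: the function names and the loginsight.example.com host are placeholders of mine, and the functions only build the requests rather than send them. The session header is the X-LI-Session-Id header described above.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Build (but do not send) the POST to /api/v1/sessions."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def session_id_from_response(raw_body):
    """Extract sessionId from the JSON body returned by /api/v1/sessions."""
    return json.loads(raw_body)["sessionId"]


def build_query_request(base_url, session_id, path="/api/v1/events"):
    """Build (but do not send) a Query API GET carrying the session id."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"X-LI-Session-Id": session_id},
        method="GET",
    )
```

To actually send a request, pass it to urllib.request.urlopen. The curl example uses -k to skip certificate verification; urlopen verifies certificates by default, so a self-signed appliance would need an ssl context of your own.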

Specification

GET /api/v1/events/path&query

  • URL path and query – see details below
  • Request payload: none
  • Response:
    • Success:
      200 OK

      • Payload:
        {
            "events": [ event1, event2, … ],
            "complete": {true|false}
        }

      Where

      • ‘complete’ indicates whether the query result was fully computed before the timeout expired (true), or partial results are returned because the timeout expired (false).
      • Event:
        {
            "text": "original event text",
            "timestamp": 1234567890,
            "fields": [ field1, field2, … ]
        }
      • Field: there are two formats for a field, depending on whether its value (a) does not exist in the event itself, or (b) is a substring of the original event:
        (a)

        {
            "name": "myfield",
            "content": content
        }

        (b)

        {
            "name": "myfield",
            "startPosition": 47,
            "length": 18
        }
      • Content: a number (123.45) or “quoted string”
    • Failure:
      • 401 Unauthorized: the request is not authenticated or the user does not have the “User” role, or the simple query API is disabled
      • 400 Bad Request: the constraints are invalid (e.g. an invalid operator).
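
To show how the two field formats fit together, here is a small Python sketch that parses an events response and resolves every field to a value. The function names are mine, and I am assuming that startPosition is a 0-based character offset into the event text; verify that against your own data before relying on it.

```python
import json


def field_value(event, field):
    """Return a field's value using whichever of the two formats is present."""
    if "content" in field:
        # Format (a): the value is carried directly in the field.
        return field["content"]
    # Format (b): the value is a substring of the original event text.
    # Assumption: startPosition is a 0-based character offset into "text".
    start = field["startPosition"]
    return event["text"][start:start + field["length"]]


def parse_events(raw_body):
    """Return (events, complete), with each event's fields as a name -> value dict."""
    payload = json.loads(raw_body)
    events = []
    for event in payload.get("events", []):
        fields = {f["name"]: field_value(event, f) for f in event.get("fields", [])}
        events.append({
            "timestamp": event["timestamp"],
            "text": event["text"],
            "fields": fields,
        })
    return events, payload["complete"]
```

A caller should check the returned complete flag: when it is false, the events are a partial result because the timeout expired.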

GET /api/v1/aggregated-events/path&query

  • URL path and query – see details below
  • Request payload: none

Response:

  • Success:
    200 OK

    • Payload:
      {
          "bins": [ bin1, bin2, … ],
          "complete": {true|false}
      }

    Where

    • ‘complete’ indicates whether the query result was fully computed before the timeout expired (true), or partial results are returned because the timeout expired (false).
    • Bin:
      {
          "min-timestamp": 1234567000,
          "max-timestamp": 1234567999,
          "value": value
      }
    • Value: a number (123.45) for numeric aggregation functions, or an event for the SAMPLE aggregation function.
  • Failure:
    • 401 Unauthorized: the request is not authenticated or the user does not have the “User” role, or the simple query API is disabled
    • 400 Bad Request: the constraints are invalid (e.g. an invalid operator)
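
The bins payload is simple to consume. As a rough sketch (the function names are my own), the following Python turns a response body into a list of tuples and totals the values. The total only makes sense for numeric aggregation functions such as COUNT; with SAMPLE each value is an event rather than a number.

```python
import json


def parse_bins(raw_body):
    """Return ([(min_timestamp, max_timestamp, value), ...], complete)."""
    payload = json.loads(raw_body)
    bins = [
        (b["min-timestamp"], b["max-timestamp"], b["value"])
        for b in payload.get("bins", [])
    ]
    return bins, payload["complete"]


def total_count(bins):
    """Sum the values across bins (numeric aggregation functions only)."""
    return sum(value for _, _, value in bins)
```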

Constraints in the URL path and query: constraint1/constraint2/…?key1=value1&key2=value2&…

  • URI path after /api/v1/events = zero or more constraints separated by “/”, optionally followed by “?” and then one or more key=value pairs separated by “&”
  • Constraint = one of…
    • “field/operator value”
      • Field
        • The text or timestamp magic fields
        • Any static field
        • A field defined in a content pack, referenced with the syntax content_pack_namespace:field_name (e.g. com.vmware.vsphere:vmw_user or com.lenovo.xclarity:lenovo_lxca_class).
      • Operator
        • Numeric operators
          • EQ (=), NE (!=), LE (<=), LT (<), GE (>=), GT (>)
        • String operators:
          • CONTAINS and NOT_CONTAINS
          • MATCHES_REGEX (=~) and NOT_MATCHES_REGEX (!=~)
        • Whitespace is optional with the terse form but mandatory with the verbose form.
        • There are no explicit STARTS_WITH or NOT_STARTS_WITH operators, but the same effect can be achieved with a trailing *; for example, text/CONTAINS foo* retrieves events containing “foo”, “foobar”, “foobaz”, etc.
      • Value
        • Must be numeric for numeric operators.
    • field/EXISTS
  • Phrases:
    • text/CONTAINS foo bar retrieves events that contain the phrase “foo bar” (perhaps separated by punctuation).
    • text/CONTAINS bar foo  retrieves events that contain the phrase in the opposite order.
    • text/CONTAINS foo/text/CONTAINS bar retrieves events that contain foo or bar in either order but not necessarily both.
    • text/=~foo.*bar/text/=~bar.*foo retrieves events that contain both foo and bar in either order.
  • key=value pairs
    • limit=10: maximum number of events to retrieve (limit must be at most 20,000 for event queries and 2,000 for aggregation queries)
    • timeout=60000: number of milliseconds to wait for a response; if the exact result is not available in time, the response will be a partial result with “complete=false”
  • Default URI path: timestamp/>T?limit=100&timeout=30000
    where T = 1 minute ago
  • Everything must be URL-encoded. For example, for “/api/v1/events/foo/> 10” the actual URL must be “/api/v1/events/foo/%3E%2010” or “/api/v1/events/foo/%3E+10”
  • Detailed example:
    GET /api/v1/events/text/foobar/filepath/!bifbuz/build_number/> 12345/text/=~[A-Z]*/java_class/a/java_class/b
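
Getting the URL encoding right by hand is error-prone, so here is a minimal Python sketch that assembles an events URL from constraints and key=value pairs. The helper name and its arguments are my own invention, not part of the API. It emits the %20 form for spaces and leaves ":" unencoded in field names so that content pack fields stay readable.

```python
from urllib.parse import quote, urlencode


def build_events_url(constraints, params=None, endpoint="/api/v1/events"):
    """Build a Query API path from (field, "operator value") pairs.

    constraints: e.g. [("foo", "> 10"), ("hostname", "EXISTS")]
    params: optional dict of key=value pairs such as limit and timeout
    """
    parts = []
    for field, condition in constraints:
        # Keep ":" readable in content pack field names.
        parts.append(quote(field, safe=":"))
        # Encode everything else, including spaces, ">" and "/".
        parts.append(quote(condition, safe=""))
    url = endpoint + "".join("/" + part for part in parts)
    if params:
        url += "?" + urlencode(params)
    return url
```

For example, build_events_url([("foo", "> 10")]) produces /api/v1/events/foo/%3E%2010, the encoded form shown above.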


AND/OR/NOT and duplicated field/operator combinations

Arbitrary AND/OR/NOT constraint trees cannot be expressed with the Query API today. For complex queries, the client may need to submit multiple requests and merge the results on the client side. In general, constraints are ANDed: text/CONTAINS foo/size/>10 retrieves events that both contain “foo” and have a size field greater than 10. However, if there is more than one constraint for a given field and operator, those constraints are ORed. For example, text/CONTAINS foo/text/CONTAINS bar/size/>10 retrieves events with a size field greater than 10 that contain either “foo” or “bar”. Arbitrary negation is not supported. However, there is a negated version of each operator: for example, CONTAINS and NOT_CONTAINS, LT and GE, etc. This is the same behavior as the UI.
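
One way to keep these rules straight is to model them on the client. The Python sketch below is my own illustration of the grouping logic, not how Log Insight evaluates queries: constraints that share a field and operator are ORed, and the resulting groups are ANDed. The operator table is deliberately tiny, and CONTAINS is modeled as a plain substring test, which is an assumption that is simpler than the server's matching.

```python
from collections import defaultdict

# Client-side stand-ins for a few operators (assumption: CONTAINS is a
# plain substring test here, unlike the server's phrase matching).
OPERATORS = {
    "CONTAINS": lambda actual, wanted: wanted in str(actual),
    "NOT_CONTAINS": lambda actual, wanted: wanted not in str(actual),
    ">": lambda actual, wanted: float(actual) > float(wanted),
    "<": lambda actual, wanted: float(actual) < float(wanted),
}


def matches(event, constraints):
    """Check a dict-like event against (field, operator, value) triples."""
    groups = defaultdict(list)
    for field, operator, value in constraints:
        groups[(field, operator)].append(value)
    # Values sharing a field and operator are ORed; the groups are ANDed.
    return all(
        field in event
        and any(OPERATORS[operator](event[field], value) for value in values)
        for (field, operator), values in groups.items()
    )
```

With the constraints from the example above, an event needs a size greater than 10 plus either “foo” or “bar” in its text to match.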

URL syntax: /api/v1/aggregated-events/constraint1/constraint2/…?key1=value1&key2=value2&…

Same options as for /api/v1/events, with three additional key=value options:

  • bin-width=2000: width in milliseconds of the time-range bins (default 5 seconds)
  • aggregation-function=AVG: the aggregation function to apply, one of:
    • COUNT: aggregate by counting the events in each bin (this is the default)
    • SAMPLE: aggregate by returning an arbitrary event from each bin
    • UCOUNT, AVG, MIN, MAX, SUM, STDEV, VARIANCE: aggregate events using the given aggregation function on the field specified by aggregation-field
  • aggregation-field=size: the field to be aggregated. Not permitted for COUNT or SAMPLE; mandatory for all other aggregation functions.
  • Example:
    /api/v1/aggregated-events?bin-width=1000&aggregation-function=UCOUNT&aggregation-field=appname
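
The rule about when aggregation-field is allowed is easy to trip over, so here is a hedged Python sketch that enforces it on the client before the request is sent. The helper and its argument names are hypothetical; only the three query parameters come from the API.

```python
from urllib.parse import urlencode

NO_FIELD_FUNCTIONS = {"COUNT", "SAMPLE"}
FIELD_FUNCTIONS = {"UCOUNT", "AVG", "MIN", "MAX", "SUM", "STDEV", "VARIANCE"}


def build_aggregated_url(function="COUNT", field=None, bin_width_ms=None):
    """Build an aggregated-events URL, validating the function/field pairing."""
    if function in NO_FIELD_FUNCTIONS:
        if field is not None:
            raise ValueError("aggregation-field is not permitted for " + function)
    elif function in FIELD_FUNCTIONS:
        if field is None:
            raise ValueError("aggregation-field is mandatory for " + function)
    else:
        raise ValueError("unknown aggregation function: " + function)

    params = []
    if bin_width_ms is not None:
        params.append(("bin-width", bin_width_ms))
    params.append(("aggregation-function", function))
    if field is not None:
        params.append(("aggregation-field", field))
    return "/api/v1/aggregated-events?" + urlencode(params)
```

Calling build_aggregated_url("UCOUNT", "appname", 1000) reproduces the example URL above.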


Java Example

To tie this all together, how about an example Java class you can leverage? Good news, one is available here!

Summary

As you can see, the query API provides another means to get search results out of Log Insight. While any queries that can be done in Log Insight should be done in Log Insight, the query API allows you to manipulate, store, and display the events any way you wish. Have feedback on the new query API? Be sure to post on https://loginsight.vmware.com!

© 2016, Steve Flanders. All rights reserved.

3 comments on “Log Insight 3.3: Query API”

A quick preview of some Query API enhancements that you’ll see in the next Log Insight Tech Preview:
* Custom sort criteria (e.g. most-to-least vs least-to-most recent for time, or largest-to-smallest vs smallest-to-largest for numerical aggregation functions) using the new order-by-direction parameter;
* Arbitrary “GROUP BY” expressions using the new aggregation-function and aggregation-field parameters;
* Relative time ranges (e.g. “last 10 minutes”) rather than absolute timestamps;
* Unordered matching rather than strict phrase matching with a new HAS operator (e.g. “text/HAS foo bar” matches “blah bar blah foo blah”, whereas “text/CONTAINS foo bar” wouldn’t match).
In short, the Query API will soon match (and in some ways exceed) the flexibility/expressiveness of queries you can construct via the Log Insight UI.

Hi Steve,
Great article. This helps a lot 🙂
From this article https://sflanders.net/2013/12/19/query-building-log-insight-fields/ I understood different types of fields in vRLI and how to query.
1) Static Field:
/api/v1/events/hostname/CONTAINS+my_host/timestamp/GT+08262017
2) Content Pack Field:
/api/v1/events/com.vmware.vsphere:vmw_esx_shell_command/CONTAINS+grep/timestamp/GT+08262017
3) User fields (aka custom fields)
Could you please guide me on how to query using custom field?

Hey Venkat — Thanks for the comment. Extracted fields are not supported via the API today.
