To conclude the series on Log Insight system architecture, I would like to talk about the life of an event. This post needed to come last so that the relevant system architecture could be discussed first.
Life of an Event
- Event is generated on a device (outside of Log Insight)
- Event is picked up and sent to Log Insight (inside and/or outside Log Insight)
  - Log Insight agent using the ingestion API or syslog
  - Third-party agent such as rsyslog, syslog-ng or log4j using syslog
  - Custom code writing to the ingestion API (e.g. log4j appender)
  - Custom code writing to syslog (e.g. log4j appender)
- Event is received by Log Insight
  - If the integrated load balancer (ILB) is used, the Layer 4 load balancer directs the event to a single node, which is responsible for processing it
  - Event is declined: the client handles the decline (UDP drops the event, TCP relies on protocol settings, CFAPI uses a disk-backed queue)
  - Event is accepted and the client is notified
- Event goes through the Log Insight ingestion pipeline
  - Keyword index is created/updated; the index is stored in a proprietary format on local disk
  - Machine learning clusters the event; the clustering is stored in Cassandra
  - Event is stored in compressed format on disk in a bucket; the compressed event is stored in a proprietary format on local disk
- Event is available for queries
  - Keyword and glob queries are matched against the keyword index
  - Regex queries are matched against the compressed events
- Event ages as new events come in
  - Each event is stored in a single on-disk bucket
  - Buckets are not replicated across Log Insight nodes today, so if you lose a node you lose the data on that node
  - A bucket can be a maximum of 1GB in size
  - When a bucket reaches 1GB it is sealed
  - A sealed bucket is immutable: you can read from it, but can no longer write to it
  - Buckets are retained until /storage/core has less than 3% free space, then deleted oldest first (FIFO)
  - Once a bucket is sealed it is marked to be archived
  - Once a sealed bucket is archived it is marked as archived
  - This means an event may be retained locally and in the archive at the same time
- Other information
  - Once an event is deleted locally it can no longer be queried unless it is imported back from the archive using the CLI
  - Once all events for a machine learning cluster are deleted from Log Insight, the cluster is removed from Cassandra
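The aging steps above can be sketched as a small toy model. This is only an illustration under stated assumptions, not Log Insight code: the names `Bucket`, `NodeStore`, `ingest`, `archive_sealed` and `delete_oldest` are hypothetical, buckets live in memory rather than on disk, and a bucket is sealed when the next event would push it past the size limit.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_BUCKET_BYTES = 1024 ** 3  # 1GB bucket limit, as stated in the post


@dataclass
class Bucket:
    """Hypothetical in-memory stand-in for an on-disk bucket."""
    events: list = field(default_factory=list)
    size: int = 0
    sealed: bool = False
    archived: bool = False


class NodeStore:
    """Illustrative model of one node's buckets; not Log Insight code."""

    def __init__(self, max_bucket_bytes=MAX_BUCKET_BYTES):
        self.max_bucket_bytes = max_bucket_bytes
        self.buckets = deque()  # oldest bucket on the left

    def ingest(self, event: bytes) -> None:
        # Open a new bucket if there is none or the newest one is sealed.
        if not self.buckets or self.buckets[-1].sealed:
            self.buckets.append(Bucket())
        current = self.buckets[-1]
        # Assumption: seal when the next event would exceed the limit.
        if current.events and current.size + len(event) > self.max_bucket_bytes:
            current.sealed = True
            current = Bucket()
            self.buckets.append(current)
        # Each event lands in exactly one bucket.
        current.events.append(event)
        current.size += len(event)

    def archive_sealed(self) -> int:
        """Mark sealed, not-yet-archived buckets as archived; return how many."""
        count = 0
        for bucket in self.buckets:
            if bucket.sealed and not bucket.archived:
                bucket.archived = True  # the local copy is kept
                count += 1
        return count

    def delete_oldest(self) -> Bucket:
        """FIFO deletion: drop the oldest bucket from local storage."""
        return self.buckets.popleft()
```

In this model, archiving a bucket does not remove it from `buckets`, which mirrors the point that an event can exist locally and in the archive at the same time. Only `delete_oldest` removes local data.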
Summary
To highlight the most important aspects of an event’s lifecycle:
- Log Insight automatically rebalances all incoming events fairly across nodes in the cluster (i.e. even if a node is explicitly sent an event it may not be the node to ingest the event)
- An event’s metadata is stored in a proprietary format on a single Log Insight node and not in a database
- There is no way to determine which node an event was ingested on
- Events are stored locally in buckets that can grow up to 1GB in size
- Buckets are not replicated across nodes today
- Once a bucket gets to 1GB it is sealed
- Only once a bucket is sealed can it be archived
- Once it is archived it is marked as archived
- This means an event can exist locally on a node as well as in the archive
- Buckets are deleted in a FIFO model
- All buckets are stored on the /storage/core partition, and Log Insight deletes the oldest buckets when less than 3% of the partition’s space is free
The last bullet is one of the most important things to note: a near-full /storage/core partition is completely normal and expected. The partition should never reach 100% because Log Insight manages it. This also means that users should not store their own data on that partition, as doing so may interfere with Log Insight’s ability to clean up old data.
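To make the 3% rule concrete, here is a minimal sketch of the arithmetic. It is an assumption about how such a check could be written, not how Log Insight implements it: `free_percent` and `buckets_to_delete` are hypothetical helpers, and the sketch assumes deletion stops as soon as free space is back at the threshold.

```python
import shutil

FREE_THRESHOLD_PERCENT = 3  # retention threshold described in the post


def free_percent(path: str) -> float:
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def buckets_to_delete(bucket_sizes, total_bytes, free_bytes,
                      threshold_percent=FREE_THRESHOLD_PERCENT):
    """Count the oldest buckets to delete until free space reaches the threshold.

    bucket_sizes is ordered oldest first, so deletion is FIFO.
    """
    # Bytes that must be freed (integer math avoids rounding surprises).
    needed = total_bytes * threshold_percent // 100 - free_bytes
    count = 0
    for size in bucket_sizes:
        if needed <= 0:
            break
        needed -= size
        count += 1
    return count
```

For example, `buckets_to_delete([10, 10, 10], total_bytes=1000, free_bytes=15)` returns 2: the threshold is 30 bytes free, so 15 more bytes are needed, and the two oldest buckets cover that. On a healthy node, `free_percent("/storage/core")` hovering just above 3 would be the expected steady state rather than a warning sign.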
© 2015, Steve Flanders. All rights reserved.