Log Insight Cluster: Data Not Balanced Across Nodes

Log Insight 2.0 introduced a scale-out feature. The best practice when using scale-out is to place an external load balancer in front of the cluster and send all ingestion traffic (that is, syslog and ingestion API traffic) to the load balancer rather than directly to a node. This practice helps balance data across nodes and provides high availability for ingestion. In this post, I discuss why data may not be balanced across a Log Insight cluster, whether or not a load balancer is used.
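
To make the best practice concrete, here is a minimal sketch of a client that sends a syslog message to the load balancer instead of to an individual node. The hostname `loginsight-vip.example.com` and the helper names `format_syslog` and `send_syslog` are hypothetical, chosen only for illustration. The message format is a simplified syslog line (the timestamp is omitted), and UDP port 514 is assumed as the listening port.

```python
import socket

# Hypothetical address of the load balancer in front of the cluster (an assumption,
# not a value from this post). Clients point here, never at a single node.
LOAD_BALANCER_HOST = "loginsight-vip.example.com"
SYSLOG_PORT = 514  # conventional syslog port; assumed for this sketch


def format_syslog(message: str, hostname: str, app: str = "demo",
                  facility: int = 1, severity: int = 6) -> bytes:
    """Build a simplified syslog line (timestamp omitted for brevity)."""
    # PRI is facility * 8 + severity; facility 1 (user) and severity 6 (info) give 14.
    priority = facility * 8 + severity
    return f"<{priority}>{hostname} {app}: {message}".encode("utf-8")


def send_syslog(payload: bytes, host: str = LOAD_BALANCER_HOST,
                port: int = SYSLOG_PORT) -> None:
    """Send one UDP datagram to the load balancer address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_syslog(format_syslog("service started", "web01"))` would send a single event to the load balancer, which then decides which node receives it. Keeping the destination in one constant means clients never need to know individual node addresses.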

Log Insight: From Standalone to Cluster

If you started with Log Insight 1.x, or with an environment that a single Log Insight node could handle, then you are likely running a standalone node. After upgrading to Log Insight 2.x, moving from a standalone instance to a cluster may become necessary when ingestion grows beyond what a single node can handle or when business requirements change (e.g. requiring ingestion HA). Building a cluster is very straightforward, even when starting from a standalone node that is already in use, but there are several considerations to take into account. In this post, I walk through these considerations as well as the process of performing the transformation.

