Did you know Log Insight has the notion of a master node and a leader node? Do you know the differences between the roles? Read on to learn more!
The first node in a Log Insight cluster is always the master. All other nodes are known as workers. The master node is also a worker node. The notion of a worker does not really matter, but as you may have guessed, the notion of a master does matter. I often get asked, how do you make another node in the cluster the master? The answer is you don’t. The first node is always the master. If the first node goes offline, the master role does not move. If you permanently lose the first node, there is no way to promote a worker node to the master role.
What is the impact of the master node going down? While the role of the master node has changed over time, in all currently supported versions of Log Insight the master node's only role is serving the /admin/cluster page and all the features of that page. In short, if the master node is down, you cannot access the /admin/cluster page, which means you cannot perform cluster operations such as entering maintenance mode, changing VIPs, or upgrading the cluster. Everything previously configured on the /admin/cluster page continues to work while the master node is down, but changes cannot be made.
What if you permanently lose the master node? Well, hopefully, you have a backup you can restore, but in the worst case, you can deploy a brand new node with the same version, same IP address, and same FQDN as the old master node. The node should automatically join the cluster shortly after booting.
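Before deploying a replacement, it can help to compare the three properties above side by side. Here is a minimal sketch in Python, assuming you have recorded the old master's details somewhere; the `NodeIdentity` class, the `identity_mismatches` helper, and any values you feed them are hypothetical and are not part of Log Insight.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeIdentity:
    """The three properties the replacement master must share with the old one."""
    version: str
    ip_address: str
    fqdn: str


def identity_mismatches(old: NodeIdentity, new: NodeIdentity) -> List[str]:
    """Return the names of the fields that differ; an empty list means a match."""
    mismatches = []
    if old.version.strip() != new.version.strip():
        mismatches.append("version")
    if old.ip_address.strip() != new.ip_address.strip():
        mismatches.append("ip_address")
    # DNS names are case-insensitive, and a trailing dot is only the root label.
    old_fqdn = old.fqdn.strip().rstrip(".").lower()
    new_fqdn = new.fqdn.strip().rstrip(".").lower()
    if old_fqdn != new_fqdn:
        mismatches.append("fqdn")
    return mismatches
```

If the returned list is not empty, fix the listed properties on the new node before booting it.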
Once you have a Log Insight cluster, a leader election occurs, and one of the nodes becomes the leader. Any node, including the master node, can become the leader. Only one leader will exist at a time. The leader role serves various purposes on the backend, but on the front-end, it simply serves as the node that hosts the VIPs. All configured VIPs are served on the leader, which means all ingestion and query traffic comes into the leader node. The traffic is then L4 load-balanced by the leader node to worker nodes in the cluster. In addition, it is the leader node that connects to configured integrations (e.g., vCenter and vROps).
What is the impact of the leader node going down? During the downtime, configured VIPs may be unavailable, and leader-specific operations will fail. The good news is that Log Insight detects when a leader is unavailable and performs a leader election, assuming the cluster still has quorum. When the election is over, a new leader is announced, and it takes over all the leader role responsibilities (e.g., VIP failover).
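To make the quorum condition concrete, here is a small Python sketch. It assumes quorum means a strict majority of the cluster's nodes can still reach each other, which is the usual definition but is an assumption here and not a description of Log Insight's actual election logic. The `has_quorum` function is hypothetical.

```python
def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """Return True if the reachable nodes form a strict majority of the cluster.

    Assumption: quorum == more than half of all nodes. This illustrates the
    general idea only; it is not Log Insight's implementation.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("reachable_nodes must be between 0 and total_nodes")
    return reachable_nodes > total_nodes // 2
```

Under this assumption, a three-node cluster that loses its leader still has two of three nodes and can elect a new one, while a cluster split exactly in half cannot.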
How do you determine which node is the leader? Visit the /admin/cluster page. Is it possible to fail over the leader? Sure, but this should never be necessary. Of course, a reboot of the leader node will do it, but you can also just put the node in maintenance mode on the /admin/cluster page.
All Log Insight nodes are worker nodes. The master node and leader node also perform worker responsibilities. When using the integrated load balancing functionality of Log Insight, the load is distributed across workers via L4 load balancing. Since workers perform all Log Insight actions, it is critical that all nodes have the same access to external systems. See the security guide for details.
Does only the leader node need access to the archive destination if configured? No, all nodes need access. Does event forwarding originate from the VIP (aka leader node)? No, from all worker nodes. Does only the leader node need access to vCenter/vROps? No, all worker nodes need access.
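Since every node needs the same access, one way to verify it is to run the same connectivity check from each node. Below is a minimal Python sketch of such a check. The host names and ports in `TARGETS` are hypothetical placeholders, so substitute your own archive destination, forwarding destination, and vCenter/vROps endpoints, and check the security guide for the real port list. A TCP connect only proves the port is reachable; it does not prove that credentials or mounts work.

```python
import socket
from typing import Callable, Iterable, List, Tuple

Target = Tuple[str, int]

# Hypothetical placeholders: replace with your own hosts and ports.
TARGETS: List[Target] = [
    ("vcenter.example.com", 443),
    ("vrops.example.com", 443),
    ("archive.example.com", 2049),
]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def unreachable(
    targets: Iterable[Target],
    probe: Callable[[str, int], bool] = can_connect,
) -> List[Target]:
    """Return the targets this node cannot reach, in the order given."""
    return [(host, port) for host, port in targets if not probe(host, port)]
```

Running `unreachable(TARGETS)` on every node in the cluster should return an empty list each time. If one node reports a target the others do not, that node's access differs from the rest.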
© 2018 – 2021, Steve Flanders. All rights reserved.