Did you know Log Insight has the notion of a master node and a leader node? Do you know the differences between the roles? Read on to learn more!
The first node in a Log Insight cluster is always the master. All other nodes are known as workers. The master node is also a worker node. The notion of worker does not really matter, but as you may have guessed the notion of master does matter. One question I often get asked is, how do you make another node in the cluster the master? The answer is you don’t. The first node is always the master. If the first node goes offline, the master role does not move. If you permanently lose the first node, there is no way to promote a worker node to the master role.
What is the impact if the master node goes down? While the role of the master node has changed over time, in all currently supported versions of Log Insight the only role the master node plays is serving the /admin/cluster page and all the features of that page. In short, if the master node is down, you cannot access the /admin/cluster page, which means you cannot perform cluster operations such as entering maintenance mode, changing VIPs or upgrading the cluster. Everything previously configured on the /admin/cluster page continues to work while the master node is down, but changes cannot be made.
What if you permanently lose the master node? Well, hopefully you have a backup you can restore, but in the worst case you can deploy a brand new node with the same version, same IP address and same FQDN as the old master node. The node should automatically join the cluster shortly after booting.
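Since the replacement has to match the lost master on version, IP address and FQDN, it can help to compare those three values before deploying. The following is only a minimal sketch of such a pre-flight check: the `NodeIdentity` class, the `identity_mismatches` function and all example values are hypothetical and are not part of Log Insight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeIdentity:
    """The three properties a replacement master has to share with the old one."""
    version: str
    ip: str
    fqdn: str


def identity_mismatches(old: NodeIdentity, new: NodeIdentity) -> list:
    """Return the names of the fields that differ between the two nodes."""
    mismatches = []
    if old.version != new.version:
        mismatches.append("version")
    if old.ip != new.ip:
        mismatches.append("ip")
    # DNS names are case-insensitive and may carry a trailing dot.
    if old.fqdn.rstrip(".").lower() != new.fqdn.rstrip(".").lower():
        mismatches.append("fqdn")
    return mismatches


# Made-up example values, for illustration only.
lost_master = NodeIdentity("4.6.0", "192.0.2.10", "li-master.example.com")
replacement = NodeIdentity("4.6.0", "192.0.2.10", "LI-Master.example.com.")
```

An empty list from `identity_mismatches(lost_master, replacement)` means the planned replacement matches on all three properties; any returned field name is something to fix before deploying.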
Once you have a Log Insight cluster, a leader election takes place and one of the nodes becomes the leader. Any node, including the master node, can become the leader. Only one leader will exist at a time. The leader role serves a variety of purposes on the backend, but on the front-end it simply serves as the node which hosts the VIPs. All configured VIPs are served on the leader, which means all ingestion and query traffic comes into the leader node. The traffic is then L4 load balanced by the leader node to worker nodes in the cluster. In addition, it is the leader node that connects to configured integrations (e.g. vCenter and vROps).
What is the impact if the leader node goes down? During the downtime, configured VIPs may be unavailable and leader-specific operations will fail. The good news is that Log Insight detects when a leader is unavailable and performs a leader election, assuming the cluster still has quorum. When the election is over, a new leader is announced, which takes over all the leader role responsibilities (e.g. VIP failover).
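To make the difference between the two roles concrete, here is a toy model of the behavior described so far. It is a sketch under stated assumptions, not how Log Insight is implemented: it assumes quorum means a strict majority of the cluster's nodes, and it picks the new leader arbitrarily (the alphabetically first online node), because the real election algorithm is not covered here. The `Cluster` class and the node names are hypothetical.

```python
class Cluster:
    """Toy model: the master role is fixed, the leader role can move."""

    def __init__(self, nodes):
        self.nodes = list(nodes)   # deployment order; the first node is the master
        self.online = set(nodes)
        self.leader = None
        self.elect_leader()

    @property
    def master(self):
        # The master is always the first node, whether it is online or not.
        return self.nodes[0]

    def has_quorum(self):
        # ASSUMPTION: quorum is a strict majority of all cluster nodes.
        return len(self.online) > len(self.nodes) // 2

    def elect_leader(self):
        # Keep the current leader while it is still online.
        if self.leader in self.online:
            return self.leader
        # ASSUMPTION: min() is an arbitrary stand-in for the real election.
        self.leader = min(self.online) if self.has_quorum() else None
        return self.leader

    def fail(self, node):
        """Take a node offline and re-run the election if needed."""
        self.online.discard(node)
        return self.elect_leader()


cluster = Cluster(["node1", "node2", "node3"])
```

In this model, calling `cluster.fail("node1")` moves the leader role to another node while `cluster.master` stays `"node1"`. Failing a second node leaves one of three nodes online, so there is no quorum and no leader.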
How do you determine which node is the leader? Visit the /admin/cluster page. Is it possible to fail over the leader? Sure, but this should never be necessary. Of course, a reboot of the leader node will do it, but you can also just put the node in maintenance mode on the /admin/cluster page.
All Log Insight nodes are worker nodes. The master node and leader node also perform worker responsibilities. When using the integrated load balancing functionality of Log Insight, load is distributed across workers via L4 load balancing. Since workers perform all Log Insight actions, it is critical that all nodes have the same access to external systems. See the security guide for details.
Does only the leader node need access to the archive destination if configured? No, all nodes need access. Does event forwarding originate from the VIP (aka leader node)? No, from all worker nodes. Does only the leader node need access to vCenter/vROps? No, all worker nodes need access.
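Because every node needs this access, one way to verify it is to run the same reachability check from each node in turn. The snippet below is a minimal sketch using only the Python standard library. The host names and ports in `TARGETS` are placeholders I made up for illustration; take the real destinations and ports from your own configuration and the security guide. A successful TCP connection only shows that the port is reachable, not that authentication or the integration itself works.

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def unreachable(targets, probe=tcp_probe):
    """Return the names of the targets the probe could not reach."""
    return [name for name, (host, port) in targets.items()
            if not probe(host, port)]


# PLACEHOLDER destinations and ports; replace with your own environment.
TARGETS = {
    "vcenter": ("vcenter.example.com", 443),
    "vrops": ("vrops.example.com", 443),
    "archive": ("nfs.example.com", 2049),
    "forwarding": ("li-remote.example.com", 9000),
}
```

Running `unreachable(TARGETS)` on every node and comparing the results shows which node, if any, lacks access that the others have. The `probe` parameter exists so the check can be swapped out, for example with a stub when testing.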
© 2018, Steve Flanders. All rights reserved.