Did you know Log Insight has the notion of a master node and a leader node? Do you know the differences between the roles? Read on to learn more!
Master Node
The first node in a Log Insight cluster is always the master. All other nodes are known as workers. The master node is also a worker node. The notion of a worker does not really matter, but as you may have guessed, the notion of a master does. I often get asked: how do you make another node in the cluster the master? The answer is you don’t. The first node is always the master. If the first node goes offline, the master role does not move, and if you permanently lose the first node, there is no way to promote a worker node to the master role.
What is the impact of the master node going down? While the role of the master node has changed over time, as of all supported versions of Log Insight, the only role the master node plays is serving the /admin/cluster page and all the features of that page. In short, if the master node is down, you cannot access the /admin/cluster page, which means you cannot perform cluster operations such as entering maintenance mode, changing VIPs, or upgrading the cluster. Everything previously configured on the /admin/cluster page continues to work while the master node is down; you just cannot make changes.
What if you permanently lose the master node? Well, hopefully, you have a backup you can restore, but in the worst case, you can deploy a brand new node with the same version, same IP address, and same FQDN as the old master node. The node should automatically join the cluster shortly after booting.
Leader Node
Once you have a Log Insight cluster, a leader election occurs, and one of the nodes becomes the leader. Any node, including the master node, can become the leader. Only one leader will exist at a time. The leader role serves various purposes on the backend, but on the front-end, it simply serves as the node that hosts the VIPs. All configured VIPs are served on the leader, which means all ingestion and query traffic comes into the leader node. The traffic is then L4 load-balanced by the leader node to worker nodes in the cluster. In addition, it is the leader node that connects to configured integrations (e.g., vCenter and vROps).
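To make the leader's front-end role concrete, here is a minimal sketch of connection-level (L4) balancing. The worker names and the round-robin policy are illustrative assumptions for this sketch, not Log Insight's actual algorithm; the point is only that the leader accepts every inbound connection on the VIP and hands each one to some worker.

```python
from itertools import cycle

# Hypothetical cluster: every node, leader included, is also a worker.
workers = ["node1", "node2", "node3"]

# Assumed policy: simple round-robin over the workers.
rr = cycle(workers)

def assign_connection():
    """Pick the worker that will handle the next inbound VIP connection."""
    return next(rr)

# Six inbound connections on the VIP get spread across the workers.
assignments = [assign_connection() for _ in range(6)]
print(assignments)
```

Because the balancing is L4, a single long-lived syslog connection sticks to one worker for its lifetime; distribution happens per connection, not per event.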
What is the impact of the leader node going down? During the downtime, configured VIPs may be unavailable, and leader-specific operations will fail. The good news is that Log Insight detects when a leader is unavailable and performs a leader election, assuming the cluster still has quorum. When the election is over, a new leader is announced, which handles all the leader role responsibilities (e.g., VIP failover).
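The post does not spell out what quorum means here; the conventional rule, assumed in this sketch, is that a strict majority of the cluster's nodes must still be reachable for an election to proceed:

```python
def has_quorum(total_nodes, reachable_nodes):
    """Conventional majority quorum: strictly more than half the nodes are up."""
    return reachable_nodes > total_nodes // 2

# In a 3-node cluster, losing the leader leaves 2 of 3 nodes: quorum holds,
# so a new leader can be elected and the VIPs fail over.
print(has_quorum(3, 2))  # True

# Losing two nodes leaves 1 of 3: no quorum, so no new election.
print(has_quorum(3, 1))  # False
```

This is also why a minimum supported cluster size of three makes sense: a two-node cluster cannot survive the loss of either node and still hold a majority.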
How do you determine which node is the leader? Visit the /admin/cluster page. Is it possible to fail over the leader? Sure, but this should never be necessary. Of course, a reboot of the leader node will do it, but you can also just put the node in maintenance mode on the /admin/cluster page.
Worker Nodes
All Log Insight nodes are worker nodes. The master node and leader node also perform worker responsibilities. When using the integrated load balancing functionality of Log Insight, the load is distributed across workers via L4 load balancing. Since workers perform all Log Insight actions, it is critical that all nodes have the same access to external systems — see the security guide.
Does only the leader node need access to the archive destination if configured? No, all nodes need access. Does event forwarding originate from the VIP (aka leader node)? No, from all worker nodes. Does only the leader node need access to vCenter/vROps? No, all worker nodes need access.
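Since every worker, not just the leader, must reach the archive destination and the configured integrations, a quick TCP reachability check run from each node is a practical way to verify this. This is a hypothetical sketch: the connect-test approach and the localhost listener (a stand-in for a real archive or vCenter endpoint) are illustrative, not a Log Insight feature.

```python
import socket

def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demonstrate against a throwaway localhost listener standing in for
# an archive/NFS or vCenter endpoint.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

print(reachable("127.0.0.1", port))  # True
listener.close()
```

Running a check like this from every node in the cluster quickly surfaces the common misconfiguration where a firewall rule was opened for one node but not the rest.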
© 2018 – 2021, Steve Flanders. All rights reserved.
Thanks for the explanation, Steve.
I suppose the leader is the node with (ILB) after the IP address on the /admin/cluster/ page?
It may be worth mentioning that the leader node VM shows the ILB IP address as a second network interface/IP address at the VM level, in addition to its own.
This can have an impact on IP address inventory systems if the ILB address is a separate inventory entry there (like a “Service IP”): it will show up as a “duplicate” IP address on the leader node in such a system.
Good point — let me update the post.
What would happen if the master node, a.k.a. the Web UI node, goes down and doesn’t come back up? I know we should have a backup of the VM, and I assume if we restore the backup everything might be fine, BUT what if a backup is not available? Would rebuilding the node and rejoining it fix the master node issue?
Disregard, I read this: “What if you permanently lose the master node? Well, hopefully you have a backup you can restore, but in worst case you can deploy a brand new node with the same version, same IP address and same FQDN as the old master node. The node should automatically join the cluster on shortly after booting.”
Regarding this, Steve: I attempted to bring down the master by deleting and rebuilding it, but it did not automatically rejoin, and the admin page kept crashing. Currently using the new Log Insight 4.8. I also wonder whether the backup would work.
Hey John — Try manually copying the config XML from /storage/core/loginsight/config to the new master and restarting it.