The question of whether Log Insight supports deployment on vSphere stretched clusters has come up multiple times in the last few months. As such, I figured I would cover it in this post. Read on to learn more!
vSphere Stretched Clusters
A vSphere stretched cluster is a cluster that has ESXi hosts in two different locations (typically different data centers). The requirement is that the latency between the hosts cannot exceed 10ms Round-Trip Time (RTT). I am not going to cover stretched clusters in-depth in this post, but you can read more about them here.
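To make the latency requirement concrete, here is a minimal sketch that compares a set of measured round-trip times against the 10ms limit. The function name and the sample values are hypothetical, and how you collect the samples (for example with a ping tool between hosts in the two locations) is up to you.

```python
# Limit quoted in the paragraph above: RTT between hosts cannot exceed 10ms.
STRETCHED_CLUSTER_MAX_RTT_MS = 10.0


def exceeds_rtt_limit(rtt_samples_ms, limit_ms=STRETCHED_CLUSTER_MAX_RTT_MS):
    """Return True if any measured RTT sample is above the limit.

    rtt_samples_ms is a list of round-trip times in milliseconds,
    gathered however you like (hypothetical input for this sketch).
    """
    if not rtt_samples_ms:
        raise ValueError("need at least one RTT sample")
    # Use the worst sample: a single slow measurement is worth a closer look.
    return max(rtt_samples_ms) > limit_ms


if __name__ == "__main__":
    samples = [2.1, 3.4, 9.8]  # made-up example values
    print("over limit:", exceeds_rtt_limit(samples))
```

Checking the worst sample rather than the average is a deliberately conservative choice for this sketch.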
Why Stretched Clusters?
The question becomes: why would you want a stretched cluster? Several reasons come to mind, including:
- Workload mobility
- Cross-site automated load balancing
- Enhanced downtime avoidance
- Disaster avoidance (not disaster recovery)
Log Insight Cluster Requirements
As I have covered in the past, Log Insight requires that all nodes be in the same data center and on the same L2 network. The cluster requirements, and the reasons behind them, include:
- Minimum of three nodes
- ILB requires all nodes be on the same L2 network
- Cassandra requires minimal latency between nodes (no official number published)
- Query performance will be impacted by the slowest node (e.g. latency)
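As a rough illustration of the first two items in the list above, the sketch below checks a planned set of node addresses for the three-node minimum and for membership in a single subnet. Treating "same IP subnet" as a stand-in for "same L2 network" is an assumption made for this example, and the function name and addresses are hypothetical. It is not something Log Insight itself provides.

```python
import ipaddress

MIN_NODES = 3  # minimum cluster size from the list above


def check_cluster_requirements(node_ips, subnet):
    """Return a list of human-readable problems (empty if none found).

    Assumption: all nodes sharing one IP subnet is used here as a
    proxy for "same L2 network"; verify the real network layout yourself.
    """
    problems = []
    if len(node_ips) < MIN_NODES:
        problems.append(
            f"need at least {MIN_NODES} nodes, found {len(node_ips)}"
        )
    network = ipaddress.ip_network(subnet)
    for ip in node_ips:
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{ip} is outside {network}")
    return problems


if __name__ == "__main__":
    # Made-up addresses for illustration only.
    print(check_cluster_requirements(
        ["10.0.0.11", "10.0.0.12", "10.0.1.13"], "10.0.0.0/24"
    ))
```

The latency-related items are left out on purpose, since no official latency number is published for them.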
Log Insight and vSphere Stretched Clusters
The short answer is that Log Insight is not officially supported on a stretched cluster today. The longer answer is that you can run Log Insight on a stretched cluster, and if you configure it according to best practices (as highlighted in the link at the beginning of this post), the configuration should work in the majority of situations, though you may experience issues depending on the type of failure.
The primary stretched cluster best practice to follow is to define sites and keep all Log Insight nodes local to a single site, which is absolutely the recommendation for Log Insight as well. In addition, the recommendation is to configure DRS “should” (not “must”) rules to ensure Log Insight nodes stay in a single site during normal operations. This second recommendation is the one that can lead to problems: a “should” rule is a preference rather than a guarantee, so nodes can end up in different sites, for example after a failure. If the Log Insight cluster gets split across two data centers then you may experience:
- Availability issues — depending on how long it takes for nodes to be recovered, the quantity of nodes impacted and the order in which they come up
- Query performance issues — depending on the latency and load on Log Insight
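One way to notice the split described above is to compare where each node is running against your site definitions. The following is a minimal sketch under stated assumptions: the two input mappings (node to host, host to site) are hypothetical and would have to be exported from your own inventory, and none of the names below come from Log Insight or vSphere.

```python
from collections import defaultdict


def nodes_by_site(node_to_host, host_to_site):
    """Group Log Insight node names by the site of the host they run on."""
    sites = defaultdict(list)
    for node, host in node_to_host.items():
        sites[host_to_site[host]].append(node)
    return dict(sites)


def is_split(node_to_host, host_to_site):
    """Return True if the nodes are spread over more than one site."""
    return len(nodes_by_site(node_to_host, host_to_site)) > 1


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    host_to_site = {"esx-a1": "site-a", "esx-a2": "site-a", "esx-b1": "site-b"}
    node_to_host = {"li-1": "esx-a1", "li-2": "esx-a2", "li-3": "esx-b1"}
    print(nodes_by_site(node_to_host, host_to_site))
    print("split:", is_split(node_to_host, host_to_site))
```

A check like this only reports a split. Moving the nodes back to a single site is still left to DRS or to you.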
Any split should be temporary, but it could still cause issues. Arguably, having the entire cluster fail over would be better than splitting the cluster across two sites. The question then becomes: why do you want to put Log Insight on a stretched cluster? If you are looking for disaster recovery, you should use the reference architectures instead.
You can run Log Insight on a vSphere stretched cluster, though it is not officially supported today. If you are going to do it, you should configure sites as well as DRS “should” rules to keep all Log Insight nodes in the same site. If you want a fully supported configuration, refer to the Log Insight reference architectures.
© 2017, Steve Flanders. All rights reserved.