Let’s Talk About Best Practices

So I am sure you have heard the term “best practices” before. The phrase literally means the best way to practice/configure/administrate/operate/etc. something. The problem with best practices is that what is best for one is often not best for all. To complicate matters, declared best practices often depend on various factors, including environment, requirements, and expertise. People have debated the use of the term “best practices,” and many have suggested the term “recommended practices” instead. The connotation of “recommended” is that it is the de facto standard in the absence of context. In other words, a recommended practice is the recommended way to configure/administrate/operate/etc. something in most cases, or in cases where no additional information/requirements/etc. are given.

You may be asking why I am bringing this up…

The reason is that whether you call it best/recommended/standard/default/generic/common/preferred/etc., the point remains the same: there are multiple ways to perform an action, but only one can be selected at any given time. In general, two types of people care about the action decided upon:

  • Administrators – need to know how to configure their environment and how others are configuring their environment
  • Managers – need to know why an administrator configured the system the way they did (think outage situation)

In my opinion, the problem with best practices is that they are often used as a safety net. What I mean by this is that people often configure an environment based on so-called best practices without understanding why those practices exist (if they even do) and whether they apply to the given use case. If an issue arises later, the best practice is used as a scapegoat to cover the mistake of applying it blindly in the first place.

Example

The above point is best illustrated with an example. I recently encountered an environment where two 250GB thin-provisioned NFS datastores from a single array were presented to two ESXi hosts in a cluster. The hosts contained a handful of VMs, which could easily fit on a single 500GB datastore. As such, I asked why two 250GB datastores were presented instead of one larger datastore. I expected to hear something about growth or performance, but the answer was that VMware’s best practice for vSphere 5.0 and above was to have two datastores per HA-enabled cluster.

For those familiar with the HA changes in vSphere 5.0, you will recall that datastore heartbeating was added as an additional check. By default, two datastores are selected to validate storage connectivity. If only a single datastore is present, vCenter Server throws a warning stating that two datastores are expected. Two datastores are desired for redundancy: just like with management interfaces, it is best to have at least two paths to check and validate an issue. In a perfect world, each path would be unique in some way. For management interfaces, this means it would be ideal if each interface connected to a different switch. In the case of datastores, it would be ideal if there were at least two datastores connected to two different storage devices.

So back to the example: the architecture was such that only a single array would ever be connected to the cluster. In addition, the number of VMs in the cluster was statically defined. Given this scenario, does it make sense to have two 250GB datastores exposed to the cluster? Let’s flip the question: what potential issues could arise from exposing additional capacity to a cluster? For one, if you are using thin-provisioned datastores, then without proper monitoring you could run into issues where one datastore fills up more quickly than the other. To get around this, you could ensure that proper monitoring is in place, expose larger datastores to the cluster, or not use thin-provisioned datastores.
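To make the monitoring point concrete, here is a minimal sketch of the kind of capacity check I have in mind. This is my own illustration using pyVmomi (not something taken from the environment above), and the vCenter address, credentials, and 80% threshold are placeholder values you would adjust:

# Minimal datastore capacity-monitoring sketch (assumes pyVmomi is installed).
# vCenter address, credentials, and the threshold are placeholder values.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

THRESHOLD = 0.80  # warn when a datastore is more than 80% full (arbitrary)

# Lab-only connection: certificate verification is skipped for brevity.
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='changeme',
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in view.view:
        s = ds.summary  # capacity and freeSpace are reported in bytes
        used_ratio = (s.capacity - s.freeSpace) / s.capacity
        if used_ratio > THRESHOLD:
            print(f"WARNING: {s.name} is {used_ratio:.0%} full "
                  f"({s.freeSpace / 2**30:.1f} GiB free of "
                  f"{s.capacity / 2**30:.1f} GiB)")
finally:
    Disconnect(si)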

Exposing multiple datastores to a cluster does offer some additional flexibility. For one, you may be able to load balance VMs better or isolate VMs on a per-datastore basis (think SIOC or Storage DRS). Additional datastores also open up the possibility of using different paths for storage traffic. Sounds great, right? But who manages this additional flexibility? Are you sure it is configured properly initially? What about ongoing configuration?

A final comment on this example: checking for two datastores is a default setting. Default settings are often left unchanged unless absolutely necessary, as changing them can cause side effects that may not be realized. In the case of two datastores per cluster, an HA advanced setting can be put in place to disable the warning about only a single heartbeat datastore being found. Note that while two management interfaces are also a best practice, vCenter Server does not alert if a host has only a single management interface.
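For reference, the advanced option in question is das.ignoreInsufficientHbDatastore. Below is a minimal sketch of setting it programmatically with pyVmomi; the vCenter address, credentials, and cluster name are placeholders, and the same option can of course be added by hand under the cluster’s HA advanced options in the client:

# Sketch: suppress the "insufficient heartbeat datastores" warning by setting
# the HA advanced option das.ignoreInsufficientHbDatastore on a cluster.
# vCenter address, credentials, and the cluster name are placeholder values.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='changeme',
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == 'MyCluster')

    # Reconfigure only the HA (das) options; modify=True merges this change
    # with the existing cluster configuration instead of replacing it.
    option = vim.option.OptionValue(key='das.ignoreInsufficientHbDatastore',
                                    value='true')
    spec = vim.cluster.ConfigSpecEx(
        dasConfig=vim.cluster.DasConfigInfo(option=[option]))
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
finally:
    Disconnect(si)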

Summary

In the above (simple) example, you can see the complexity involved in deciding how to architect an environment. So what is the “best practice,” and how should the environment be configured? The answer to the first question is that if you have multiple types of storage devices, the default setting should be kept, as it provides the ability to detect failure scenarios properly. The answer to the second question is: it depends. It depends on how the environment is used, who manages the storage, who monitors the environment, and how good the operating procedures are. Assuming this is a mature company/environment/department/etc., the chosen configuration should be fine. If not, it may make more sense to consider alternatives. The point of the above example is to show that a default setting does not necessarily constitute a best practice.

To prevent issues in your environment, I highly recommend that when you are presented with a best practice, you ask why. Why is it a best practice? Why is it applicable to your environment? Why does it need to be applied now? Why is (or isn’t) it a default setting? Why, why, why? At the end of the day, it is better to be a leader and ask why than to be a follower and have to explain to your executive team where the so-called best practice came from and why it did not work.

In summary, default configurations make the decision for you and should work in a variety of cases. Best practices suggest how something should be configured based on feedback and issues others have experienced. How you decide to configure your environment is your decision. In my opinion, you should be presented with all of the configuration options along with the pros and cons of each. In addition, recommendations should be provided for when each configuration option is applicable. Finally, a rule of thumb should be provided, along with a detailed explanation of why it should be chosen.

IMPORTANT: I am not advocating against following defaults or best practices. Instead, I challenge you to understand the best practices you are implementing before you implement them.
