EMC FLARE <32 RAID 6 Implementation

Over the last few weeks I was engaged in an issue where all VMs backed by an NFS datastore in an environment experienced anywhere from several seconds to over two minutes of latency (i.e., an inability to write to the filesystem) at approximately the same time. The write delay was so significant that it resulted in VMs crashing.

At a high level, the configuration consisted of ESXi hosts with NFS datastores connected to a pair of VNXs. The VNXs were populated with SATA drives and, per EMC best practices, were configured as RAID 6. In addition, FAST VP was enabled, which requires the use of storage pools instead of RAID groups. It should be noted that storage pools are currently stated as the best practice configuration, as they ease administration and configuration and allow for more advanced options like FAST VP. The network between the devices converged on a pair of core switches.

Based on the symptoms, it seemed logical to rule out ESXi, as the hosts were spread over different switches and different vCenter Servers and were running different versions of ESXi. Since both storage arrays were impacted at approximately the same time, it also seemed logical to rule out the storage arrays. This left the network, and specifically the core switches. The problem with that theory was that the core switches had been running stably with the same software version and configuration for some time.

So what was going on?


DAE and LCC connectivity

EMC storage, DAE failures, and vertical striping

EMC’s best practice for creating storage pools and RAID groups on mid- to low-end storage arrays (e.g. CLARiiON or VNX) has always been to create them using disks from a single Disk Array Enclosure (DAE). This configuration is sometimes referred to as horizontal striping. You may be wondering why this is the case, since a complete DAE failure would result in data unavailability for all pools/groups on the DAE. As such, you may be tempted to create pools/groups across multiple DAEs, which is sometimes referred to as vertical striping. To understand the reasons behind the best practice, you first need to understand the physical configuration of EMC mid- to low-end storage arrays and then some DAE failure scenarios.
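The tempting argument for vertical striping can be made concrete with a small sketch. It is only a disk-counting model under stated assumptions: the layouts, the helper names, and the 6+2 group size are hypothetical, and the only fact relied on is that RAID 6 tolerates the loss of two disks. It deliberately ignores the physical configuration and the DAE failure scenarios mentioned above, so it shows why vertical striping looks attractive, not whether it is the right choice.

```python
from collections import Counter

# Hypothetical model: a RAID group is a list of (dae_id, slot) disk positions.

def disks_per_dae(group):
    """Count how many of the group's disks sit in each DAE."""
    return Counter(dae for dae, _slot in group)

def survives_dae_failure(group, failed_dae, parity_disks):
    """True if losing every disk in failed_dae stays within the parity budget.

    This counts disks only; it does not model how an array actually
    reacts to a DAE going offline.
    """
    return disks_per_dae(group)[failed_dae] <= parity_disks

# Illustrative RAID 6 (6+2) group, which tolerates two lost disks.
horizontal = [(0, slot) for slot in range(8)]                     # all 8 disks in DAE 0
vertical = [(dae, slot) for dae in range(4) for slot in range(2)]  # 2 disks in each of 4 DAEs

print(survives_dae_failure(horizontal, failed_dae=0, parity_disks=2))  # False
print(survives_dae_failure(vertical, failed_dae=0, parity_disks=2))    # True
```

On disk counts alone, the vertically striped group stays online after a DAE failure while the horizontally striped one does not, which is exactly the intuition the paragraph above sets out to examine.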


My take on what we’re doing with Atmos

One of the big news stories over the last couple of weeks has been the announcement that EMC’s Atmos Online service offering will no longer be sold commercially. As you can imagine, this announcement is of great significance to Atmos Online customers, but it should not come as a surprise. As Chad over at Virtual Geek pointed out, Atmos Online was always intended to be a proof of concept and nothing more. He posted a good article about the changes entitled: Understanding what we’re doing with Atmos. In this article he highlights his views on the recent news. I would like to make a couple of comments regarding Chad’s article. Please be advised that the views below are mine and do not reflect EMC’s stance on the topics.
