SAN Storage Architecture Simplified, Forever?

In recent times I have been watching the practice of storage architecture and operations change in its most fundamental ways. While none of the product announcements or products themselves are new, EMC is really changing the game. The days of the mystic art of calculating IOPS, read/write ratios, spindle performance and RAID types are starting to go the way of the Dodo bird. With the introduction of Enterprise Flash Drives (EFD), block-level automated migration of data (based on performance trending), thin volumes and disk pools, things are bound to change. None of these individual items is new to the market, but they have never really been combined in a way that delivers this much value to the industry.

Before I get into the effect, I feel it's pertinent to give credit where it's due. There are instances of each of these individual technologies having appeared in the past, but I will concentrate on their successful implementations in the recent past.

Thin Volumes

I won't get into the debate of "who was first," but this is a feature we've seen in the market for a good while. The debate over FC LUN-based thin provisioning has been between 3Par and DataCore. Of course, we've seen EMC doing file-system thin provisioning on their NAS boxes for a good while too.

In essence, thin provisioning allows us to present a logical capacity separate from the actual physical capacity. In short, it allows an administrator to allocate a volume at the size the user requests while committing only a fraction of the real capacity. This gets us past the hurdle of having to purchase capacity that often won't get used. The double edge of the sword: it's not uncommon to be over-subscribed (have more logical capacity allocated than what is physically available) and have to cross your fingers that the volumes never fill to their full logical capacity.
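The oversubscription arithmetic is simple enough to sketch. The following is a minimal illustration with made-up volume sizes (none of these figures come from a real array):

```python
# Sketch of thin-provisioning oversubscription arithmetic.
# All capacities (GB) are hypothetical examples.
physical_capacity_gb = 10_000          # raw usable capacity in the pool

# Each volume: (logical size presented to the host, blocks actually consumed)
volumes = [
    (2_000, 400),
    (5_000, 1_500),
    (8_000, 2_200),
]

logical_allocated = sum(logical for logical, _ in volumes)
physically_consumed = sum(used for _, used in volumes)

# Ratio above 1.0 means the pool is over-subscribed
oversubscription_ratio = logical_allocated / physical_capacity_gb
headroom_gb = physical_capacity_gb - physically_consumed

print(f"Logical allocated: {logical_allocated} GB")
print(f"Oversubscription ratio: {oversubscription_ratio:.2f}x")
print(f"Physical headroom: {headroom_gb} GB")
```

Here 15 TB of logical capacity sits on a 10 TB pool (1.5x over-subscribed), which is exactly the finger-crossing situation described above.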

The other challenge that ensues: in traditional architecture, we allocate the number of disks based on performance. With the automated nature of thin volume allocation, we lose control of how the data is physically laid out. Depending on the vendor's implementation, there are tricks to semi-guarantee performance, but still not 100%.

In-Array Automated Tiering at Block Level

For many moons, we have been tiering file-level data using various flavors of HSM tools: EMC has had Rainfinity FMA and DiskXtender, Symantec has Enterprise Vault, et cetera. The tools are numerous and have been out there for a long time. EMC started to aggressively chase this rabbit with the marketing of "Information Lifecycle Management" (ILM) and had an entire ecosystem to support those principles. They later went to market aggressively touting "Archive before Backup," which made sense as well.

Of course, this was all well and good for the file-level stuff, but what about block-level data? At the time, a little-known player, Compellent, began popping its head up (recently acquired by Dell). Compellent's claim to fame was the idea that data should be automatically and transparently moved around different tiers of the storage array. The only downfall when they originally came to market: we were limited to traditional spinning-disk tiers. The differentiation between tiers was not that great (180/120/80 IOPS), so the value within the array was diminished. Their success was also limited because they were the newest kid on the block in a saturated market where players had come and gone quickly and the competition was stiff.

More recently though, EMC came into the market with their "FAST" feature set and has continued to evolve it. The first release was a bit on the crude side, as it could only tier an entire volume at a time. Not too long after came the sub-LUN version (v2), where FAST really began to be a product that could deliver value.
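The core of sub-LUN tiering can be sketched in a few lines: rank extents by observed activity and promote the hottest ones until the fast tier is full. This is a toy illustration only; the extent IDs, IOPS figures, and tier capacity are invented, and it says nothing about how FAST actually implements its trending:

```python
# Toy sketch of sub-LUN automated tiering: promote the hottest extents
# to the fastest tier until its capacity runs out. All names and
# numbers here are illustrative assumptions, not vendor behavior.
extents = [
    {"id": "e1", "iops": 1800},
    {"id": "e2", "iops": 45},
    {"id": "e3", "iops": 900},
    {"id": "e4", "iops": 10},
]

efd_slots = 2   # how many extents fit on the EFD (Tier 0) layer

# Rank extents by observed IOPS (a stand-in for performance trending)
ranked = sorted(extents, key=lambda e: e["iops"], reverse=True)

placement = {}
for i, extent in enumerate(ranked):
    placement[extent["id"]] = "EFD" if i < efd_slots else "FC/SATA"

print(placement)
```

The point of the whole-volume (v1) vs sub-LUN (v2) distinction is visible here: in v1 every extent of a volume would have had to land on the same tier, while v2 lets the two hot extents ride on flash and the cold ones sit on cheap disk.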

Enterprise Flash Drives

This was one of the most fun innovations to come to market in a while. EMC took the reins and launched EFDs across their entire product lineup, marketing their use as Tier 0. At first, the regular FUD started flying around from competitors who weren't in position. The fun part is that now, nearly every major vendor supports EFD in their arrays.

The primary benefit of EFD is its sheer performance. 2000+ IOPS per drive really created a true divergence in the performance tiers. This of course comes with a cost, and EFD must be used sparingly. That said, as with any new technology, the price is coming down. Using EFD as the sole storage for a single volume is a very expensive proposition, but it is warranted where one is sensitive to the number of drives otherwise required (roughly 1 EFD : 11 15K drives), such as in a Symmetrix VMAX array where each engine comes with a lofty price tag.
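That 1 : 11 drive ratio falls straight out of the IOPS figures. A back-of-envelope sketch, using the rough per-drive numbers from this post (an assumed 20,000 IOPS workload target):

```python
# Back-of-envelope drive-count comparison for an IOPS target, using
# the rough figures from the post: ~2000 IOPS per EFD vs ~180 IOPS
# per 15K spindle. The 20,000 IOPS target is an assumed example.
import math

target_iops = 20_000
iops_per_efd = 2_000
iops_per_15k = 180

efd_needed = math.ceil(target_iops / iops_per_efd)
fc_needed = math.ceil(target_iops / iops_per_15k)

print(f"EFDs needed: {efd_needed}")
print(f"15K drives needed: {fc_needed}")
```

Ten EFDs versus over a hundred 15K spindles for the same workload is why the drive-count argument can win even at EFD prices, especially on an array where adding spindles means adding engines.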

Disk Pools

The idea was most recently prevalent in the HP EVA line of products. The overarching approach was to create a single group of drives from which we provision storage, and the administrator fundamentally forgets about how exactly the LUN is striped across the volume. This is in stark contrast to products such as the Symmetrix of old, where each individual drive is "split" into evenly divided chunks (aka hypers) and the volumes were laid out from there.

The beauty of the EVA (and we won't get into its architectural shortcomings here) was that, from a storage provisioning and administration point of view, you simply had a large group of disks from which you created LUNs. Under the hood, the storage array would stripe the LUN across the entire set of disks. This really made for "fire and forget" storage provisioning. There are drawbacks to this approach, but overall the principle favored simplicity over control.
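The "stripe across everything" idea is easy to picture as round-robin chunk placement. A minimal sketch, with an assumed chunk size, disk count, and LUN size (real arrays use far more sophisticated layouts):

```python
# Minimal sketch of pool-wide striping: a LUN's chunks are spread
# round-robin across every disk in the pool, so no single spindle
# carries the whole volume. All sizes here are assumptions.
num_disks = 8
chunk_size_mb = 64
lun_size_mb = 1024

chunks = lun_size_mb // chunk_size_mb
layout = {disk: [] for disk in range(num_disks)}
for chunk in range(chunks):
    layout[chunk % num_disks].append(chunk)

for disk, held in layout.items():
    print(f"disk {disk}: chunks {held}")
```

Every disk ends up holding an equal slice of the LUN, which is what lets the administrator stop thinking about individual spindles: the workload is smeared across the whole pool by construction.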

In Array Data Mobility

The ability to move data between groups of disks non-disruptively and with ease has allowed us to retrofit our architectures and layouts for many years. In essence it allowed volumes to be transparently reprovisioned based on new workloads. EMC introduced this to us back in the FLARE 16 days (or was it the release after?) on the CLARiiON line of products. The idea was that we could move data within the array without interrupting host requests. Later, IBM and HDS took this a step further by allowing inter-array data mobility through storage virtualization.

How All of This Changes the Game

Most recently, we are seeing the combination of all of these capabilities create a genuinely new architecture paradigm. The days of simply "carving out disks" are going away, and the job of the storage administrator/architect is becoming that of a capacity manager.

In essence, storage administrators and architects now provision groups of capacity and performance and leave the heavy work of managing the day-to-day workload to the array itself. What we have seen VMware do to computing (with over-subscription, VMotion and pooling of resources), storage architects now see in their environments. Compound EMC's FAST with the use of thin provisioning, and the old IT proverb "do more with less" really becomes a truth. Where in the past we saw sprawls of Fibre Channel disks with the odd smattering of SATA, we now see large combinations of EFD, FC and SATA all in the same array, each tier growing depending on the performance needs of the day.

This doesn't necessarily mean that the voodoo magic of the storage SME is debunked (or, as my wife likes to say, "purple smoke and mirrors"), as there is still FCoE vs FC, zoning, specific product and tooling expertise, et cetera. It just means that the way we manage storage and the way we think of our arrays changes. The days of calculating RAID types and front-end vs back-end IOPS, or examining the usage of a single disk, are becoming less and less common. I do miss those days, but at the same time, this allows me and my fellow storage professionals to manage ever-sprawling amounts of data with less effort, and to do way less math.
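For readers who never lived through it, this is the sort of math being retired: turning host (front-end) IOPS into disk (back-end) IOPS using the classic RAID write penalties. The workload figures below are an assumed example; the penalty values are the standard textbook ones:

```python
# The fading art: front-end to back-end IOPS conversion.
# Back-end IOPS = reads + (writes * RAID write penalty).
# Penalties are the classic textbook values; the 5000 IOPS workload
# and 70/30 read/write mix are assumed examples.
write_penalty = {"RAID1": 2, "RAID5": 4, "RAID6": 6}

host_iops = 5_000
read_ratio = 0.7

reads = host_iops * read_ratio
writes = host_iops * (1 - read_ratio)

for raid, penalty in write_penalty.items():
    backend = reads + writes * penalty
    print(f"{raid}: {backend:.0f} back-end IOPS")
```

The same 5,000 host IOPS lands anywhere from 6,500 to 12,500 disk IOPS depending on RAID type, which is exactly why spindle counts used to be agonized over, and why automated tiering and pooling make the exercise far less common.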

Here I speak primarily of EMC's suite, but the honest truth is that this is so compelling (especially in my world) that I feel it's just a matter of time before every major vendor supports these features. This is where the great divide will start: the storage leaders vs the followers vs the wannabes. Our environments aren't shrinking, the demands aren't stopping, the motto has changed from "do more with less" to "do more with nothing," and the pool of strong resources is dwindling. As Gartner has been saying, 47% of their respondents report data growth as their largest challenge. These tools and capabilities are key to my ability to spend time with my family, rather than setting up a cot in my office and being a slave to my Excel spreadsheets.



About ericgrav
Senior technologist specializing in information management, with dabblings in cloud computing
