June 25, 2015 Leave a comment
For the last couple of years I’ve been writing a series of blog posts introducing the concepts of flash-memory and solid state storage to those who aren’t part of the storage industry. I’ve covered storage fundamentals, some of what I consider to be the enduring myths of storage, a section of unashamed disk-bashing and then a lengthy set of articles about NAND flash itself.
Now it’s time to talk about all flash arrays. But first, a warning.
Although I work for a flash array vendor, I have attempted to keep my posts educational and relatively unbiased. That’s pretty tricky when talking about the flash media, but it’s next to impossible when talking about arrays themselves. So from here on this is all just my opinion – you can form your own and disagree with me if you choose – there’s a comment box below. But please be up front if you work for a vendor yourself.
All Flash Array Definition(s)
It is surprisingly hard to find a common definition of the All Flash Array (or AFA), but one thing that everyone appears to agree on is the shared nature of AFAs – they are network-attached, shared storage (i.e. SAN or NAS). After that, things get tricky.
IDC, in its 2015 paper Worldwide Flash Storage Solutions in the Datacenter Taxonomy, divides network-attached flash storage into All Flash Arrays (AFAs) and Hybrid Flash Arrays (HFAs). It further divides AFAs into categories based on their use of custom flash modules (CFMs) and solid state disks (SSDs), while HFAs are divided into categories of mixed (where both disks and flash are used) and all-flash (using CFMs or SSDs but with no disk media present).
Did you make it through that last paragraph? Perhaps, like me, you find the HFA category “all-flash” confusingly named given the top-level category of “all flash arrays”? Then let’s go and see what Gartner says.
Gartner doesn’t even get as far as using the term AFA, preferring the term Solid State Array (or SSA). I once asked Gartner’s Joe Unsworth about this (I met him in the kitchen at a party – he was considerably more sober than I) and he explained that the SSA term is designed to cope with any future NAND-flash replacement technology, rather than restricting itself to flash-based arrays… which seems reasonable enough, but it does not appear to have caught on outside of Gartner.
The big catch with Gartner’s SSA definition is that, to qualify, any potential SSA product must be positioned and marketed “with specific model numbers, which cannot be used as, upgraded or converted to general-purpose or hybrid storage arrays“. In other words, if you can put a disk in it, you won’t see it on the Gartner SSA magic quadrant – a decision which has drawn criticism from industry commentators for the way it arbitrarily divides the marketplace (with a response from Gartner here).
The All Flash Array Definition at flashdba.com
So that’s IDC and Gartner covered; now I’m going to give my definition of the AFA market sector. I may not be as popular or as powerful as IDC or Gartner but hey, this is my website and I make the rules.
In my humble opinion, an AFA should be defined as follows:
An all flash array is a shared storage array in which all of the persistent storage media comprises of flash memory.
Yep, if it’s got a disk in it, it’s not an AFA.
This then leads us to consider three categories of AFA:
The hybrid AFA is the poor man’s flash array. It’s performance can best be described as “disk plus” and it is extremely likely to descend from a product which is available in all-disk, mixed (disk+SSD) or all-SSD configurations. Put simply, a hybrid AFA is a disk array in which the disks have been swapped out for SSDs. There are many of these products out there (EMC’s VNX-F and HP’s all-flash 3PAR StoreServ spring to mind) – and often the vendors are at pains to distance themselves from this definition. But the truth lies in the architecture: a hybrid AFA may contain flash in the form of SSDs, but it is fundamentally and inescapably architected for disk. I will discuss this in more detail in a future article.
The next category covers all-flash arrays that have been architected with flash in mind but which only use flash in the form of solid state drives (SSDs). A typical SSD-based AFA consists of two controllers (usually Intel x86-based servers) and one or more shelves of SSDs – examples would be EMC’s XtremIO, Pure Storage, Kaminario and SolidFire. Since these SSDs are usually sourced from a third party vendor – as indeed are the servers – the majority of the intellectual property of an SSD-based array concerns the software running on the controllers. In other words, for the majority of SSD-based array vendors the secret sauce is all software. What’s more, that software generally doesn’t cover the tricky management of the flash media, since that task is offloaded to the SSD vendor’s firmware. And from a purely go-to-market position (imagine you were founding a company that made one of these arrays), this approach is the fastest.
The final category is the ground-up designed AFA – one that is architected and built from the ground up to use raw flash media straight from the NAND flash fabricators. There are, at the time of writing, only two vendors in the industry who offer this type of array: Violin Memory (my employer) and IBM with its FlashSystem. A ground-up array implements many of its features in hardware and also takes a holistic approach to managing the NAND flash media, because it is able to orchestrate the behaviour of the flash across the entire array (whereas SSDs are essentially isolated packages of flash). So in contrast with the SSD-based approach, the ground-up array has a much larger proportion of it’s intellectual property in its hardware.
Why are there only two ground-up AFAs on the market? Well, mainly because it takes a lot longer to create this sort of product: Violin is ten years old this year, while IBM acquired the RamSan product from Texas Memory Systems who had been around since 1987. In comparison, the remaining AFA companies are mostly under six years old. It also requires hardware engineering with NAND flash knowledge, usually coupled to a relationship with a NAND flash foundry (Violin, for example, has a strategic alliance with Toshiba – the inventor of NAND flash).
Which Is Best?
Ahh well that’s the question, isn’t it? Which architecture will win the day, or will something else replace them all? So that’s what we’ll be looking at next… starting with the Hybrid Array. And while I don’t want to give away too much too soon <spoiler alert>, in my book the hybrid array is an absolute stinker.