Playing The Data Reduction Lottery

Picture courtesy of Capsun Poe


Storage for DBAs: Do you want to sell your house? Or your car? Let’s go with the car – just indulge me on this one. You have a car, which you weren’t especially planning on selling, but I’m making you an offer you can’t refuse. I’m offering you one million dollars, so how can you say no?

The only thing is, when we come to make the trade I turn up not with a suitcase full of cash but a single Mega Millions lottery ticket. How would you feel about that? You may well feel aggrieved that I am offering you something which cost me just $1 but my response is this: it has an effective value of well over $1m. Does that work for you?

Blurred Lines

The thing is, this happens all the time in product marketing and we just put up with it. Oracle’s new Exadata Database Machine X4-2 has 44.8TB of raw flash in a full rack configuration, yet the datasheet states it has an effective flash capacity of 448TB. Excuse me? Let’s read the small print to find out what this means: apparently this is “the size of the data files that can often be stored in Exadata and be accessed at the speed of flash memory“. No guarantees then, you just might get that, if you’re lucky. I thought datasheets were supposed to be about facts?

Meanwhile, back in storageland, a look at some of the datasheets from various flash array vendors throws up a similar practice. One vendor shows the following flash capacity figures for their array:

  • 2.75 – 11 TBs raw capacity
  • 5 – 50 TBs effective capacity

In my last two posts I covered deduplication and data compression as part of an overall data reduction strategy in storage. To recap, I gave my opinion that dedupe has no place with databases (although it has major benefits in workloads such as VDI) while data compression has benefits but is not necessarily best implemented at the storage level.

Here’s the thing. Your database vendor’s software has options that allow you to perform data reduction. You can also buy host-level software to do this. And of course, you can buy storage products that do this too. So which is best? It probably depends on which vendor you ask (i.e. database, host-level or storage), since each one is chasing revenue for that option – and in some storage vendor cases the data reduction is “always on”, which means you get it whether you want it or not (and whether you want to pay for it or not). But what you should know is this: your friendly flash storage vendor has the most to gain or lose when it comes to data reduction software.

Lies, Damned Lies and Capacities

When you purchase storage, you invariably buy it at a value based on price per usable capacity, most commonly using the unit of dollars per GB. This is simply a convenient way of comparing the price of competing products which may otherwise have different capacities: if a storage array costs $X and gives you Y GB of usable capacity, then the price in $/GB (dollars per gig) is therefore X/Y.

Now this practice originally developed when buying disk arrays – and there are some arguments to be made that $/GB carries less significance with flash… but everyone does it. Even if you aren’t doing it, chances are somebody in your purchasing department is. And even though it may not be the best way to compare two different products, you can bet that the vendor whose product has the lowest $/GB price will be the one looking most comfortable when it comes to decision day.

But what if there was a way to massage those figures? Each vendor wants to beat the competition, so they start to say things like, “Hey, what about if you use our storage compression features? On average our customers see a 10x reduction in data. This means the usable capacity is actually 10Y!“. Wouldn’t you know it? The price per gig (which is now X/10Y) just came down by 90%!
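To see how dramatic that massage is, here’s a quick sketch of the arithmetic in Python – the price and capacities are entirely made up for illustration:

```python
# Illustrative figures only: a hypothetical $1m array with 10TB usable
price_dollars = 1_000_000
usable_gb = 10 * 1024
claimed_reduction = 10   # the vendor's "average" data reduction ratio

cost_per_gb_usable = price_dollars / usable_gb
cost_per_gb_effective = price_dollars / (usable_gb * claimed_reduction)

print(f"$/GB (usable):    ${cost_per_gb_usable:.2f}")     # ~$97.66
print(f"$/GB (effective): ${cost_per_gb_effective:.2f}")  # ~$9.77, if you win the lottery
```

Same box, same invoice – but the “effective” number is ten times more attractive, provided your data actually reduces by 10x.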

The First Rule of Compression

You all know this, but I’m going to say it anyway. Different sets of data result in different levels of compression (and deduplication). It’s obvious. Yet in the sterile environment of datasheets and TCO calculations it often gets overlooked. So let me spell it out once and for all:

The first rule of compression is that the compression ratio is entirely dependent on the data being compressed.

Thus if you are buying or selling a product that uses compression, deduplication or any other form of data reduction, you cannot make any guarantees. Sure you can talk about “average compression ratios”, but what does that mean? Is there really such a thing as the average dataset?
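If you want to see the first rule in action, the sketch below uses Python’s zlib to compress two equally-sized datasets. The algorithm and the data are stand-ins I’ve picked for illustration, but the point holds for any compression scheme:

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size."""
    return len(data) / len(zlib.compress(data))

repetitive = b"ORDERED" * 150_000   # highly repetitive: compresses brilliantly
random_ish = os.urandom(1_050_000)  # random data: barely compresses at all

print(f"Repetitive data: {compression_ratio(repetitive):6.1f}x")
print(f"Random data:     {compression_ratio(random_ish):6.2f}x")  # ~1.0x
```

Two datasets of identical size, two wildly different ratios. Which one does the datasheet’s “average” describe?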

Conclusion: Know What You Are Paying For

It’s a very simple message: when you buy a flash array (or indeed any storage array) be sure to understand the capacity values you are buying and paying for. Dollar per GB values are only relevant with usable capacities, not so-called effective or logical capacities. Also, don’t get too hung up on raw capacity values, since they won’t help you when you run out of usable space.

Definitions are important. Without them, nothing we talk about is … well, definite. So here are mine:

  • Raw capacity: the total amount of flash physically installed in the array.
  • Usable capacity: the capacity actually available for your data once formatting, RAID and system overheads are taken out – this is the figure to use for $/GB.
  • Effective capacity: the usable capacity multiplied by an assumed data reduction ratio – in other words, a number that depends entirely on your data.


The Most Expensive CPUs You Own


Storage for DBAs: Take a look in your data centre at all those humming boxes and flashing lights. Ignore the storage and networking gear for now and just concentrate on the servers. You probably have many different models, with different types and numbers of CPUs and DRAM inside. My question is, which CPUs are the most expensive? Almost without exception, the answer will be the CPUs inside your database servers…

In the last couple of posts I talked about the real cost of enterprise database software in general and Oracle RAC in particular. The point I was making was that database software, which is traditionally licensed by the CPU core, is expensive in comparison to the cost of the hardware on which it runs. But since the hardware fundamentally affects the performance – and therefore value for money – of the software, it’s important to make the right choices when building a database system. And yes, predictably, I believe that this means using flash memory instead of disk – but don’t worry, that’s not the main message behind this post.

Lawn Mower Tax

Think of any consumer item which comes in multiple sizes and price brackets. I don’t know, let’s say a lawn mower. To simplify, let’s assume you can buy three different types of mower: small ($250), medium ($500) and large ($1000). The small one is cheaper but less powerful, so it takes longer to cut your grass, while the large one is the most expensive but requires the shortest amount of time. Which would you pick?

There’s no right answer because it depends on your requirements. But let’s introduce an unexpected complication into the mix: lawn mower tax. The government, in their wisdom, imposes a $50,000 tax on the purchase of any new lawn mower regardless of size. You still need a mower so you are forced to pay the tax, but is your choice influenced? The chances are you would buy the larger model, because a) the percentage difference in overall price is much less, and b) it avoids the risk of needing to upgrade in the future and having to pay the tax again. The $51,000 large mower represents better value for money than the two smaller models.

CPU Tax

You can think of database software in the same way. There are countless types of CPU available on the market right now: Intel, AMD, ARM, IBM Power, Oracle / Fujitsu SPARC, etc. Each vendor has many models and architectures, clock speeds and power ratings, yet they all share one important property: core count. And that core count is subject to the massive “CPU tax” that is the database software license. I’m sticking to the Oracle Database in this post but the same applies to Microsoft SQL Server (where licenses are core-based from SQL2012 onwards), Sybase and so on.

Picture Courtesy of 401(K) 2013 (Flickr)

Take a standard two-socket sixteen-core Intel Xeon-based server as an example: there are a multitude of CPU models fitting that description. Even if we restrict ourselves to the Sandy Bridge-EP range, Wikipedia shows there are 11 different models with 8 cores per socket. Yet not all CPUs are equal. Wouldn’t it make sense, given the massive cost associated with core-based licensing, to ensure you are using the processor which gives you the best performance, i.e. value for money, per license?

Performance Per Licensable Core

The problem of determining which CPUs provide the best value for money was one I struggled with for a while. Looking at benchmarks like SPECint and the datasheets from Intel and co, it’s hard not to be overwhelmed by data – and if I’m honest I probably don’t have the systems-level knowledge to interpret it accurately. Ironically, the solution came from someone who does have that knowledge, but who showed me that it isn’t required because there’s a much simpler way. More importantly, benchmarks like SPECint don’t take into account what we want these CPUs to do, which is to run the Oracle Database.

Kevin Closson‘s elegant and annoyingly simple solution was to use TPC benchmarks – specifically results from the transactional TPC-C benchmark run on Oracle databases, which are freely available here. All we need to do is download the spreadsheet, filter out the non-Oracle workloads and then divide the value of tpmC (the number of orders that can be fully processed per minute) by the number of CPU cores to get the performance per core.

Since this is an Oracle-specific calculation we also then need to multiply this by Oracle’s Processor Core Factor (see link on this page) to get the ultimate figure we need to know, the performance per license. Here’s my working copy of the spreadsheet, but I make no claims to its accuracy and will not keep this screenshot up-to-date. You should recalculate every time you want to make a judgement on which servers to use, it’s a very simple exercise.
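In code form, the calculation looks something like this – the benchmark rows below are invented for illustration, so plug in real values from the TPC spreadsheet and the current core factor table:

```python
# Hypothetical results: (system, tpmC, total CPU cores, Oracle core factor)
results = [
    ("System A, 2-socket x86",  1_200_000, 16, 0.5),
    ("System B, 4-socket RISC", 2_400_000, 64, 1.0),
]

for system, tpmc, cores, core_factor in results:
    licenses = cores * core_factor              # licensable cores
    print(f"{system}: {tpmc / licenses:>10,.0f} tpmC per license")
```

In this made-up example, System B posts twice the raw tpmC but System A delivers four times the throughput per license – and the license is where the money goes.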


Performance per licensable core (based on published TPC-C benchmark results using Oracle) – click to enlarge

The red column is the performance per licensable core, marked “Perf / license“. Hopefully it’s obvious that this is just a re-work of Kevin’s ideas, many of which he posted in this blog article, which I highly recommend reading. As such I can claim no credit, except for any mistakes.

The Flash Angle

Of course, this wouldn’t be a flashdba article without some mention of flash memory. As discussed above there are many different types and models of CPU, but there is one great leveller: CPUs are all equally good at doing nothing. If your processors are waiting on I/O then they are not working – and that has a direct negative effect on the value you are realising from them.

In the above chart, the last benchmark result (with the best value for performance per licensable core) is this one performed by Cisco. Now, I honestly didn’t engineer this article to work out this way, but it so happens that Cisco used a pair of Violin Memory 6616 flash memory arrays to achieve this result. (I’d almost* be happier if this had been a competitor’s flash array, because I don’t want this to look like an advert for my employer and therefore detract from my point…)

The point I’m aiming to make here is that it’s worth using the best-performing processors in order to see value for money from your database licenses. But to enable that, the processors need to be released from the chains of high-latency storage – and that, quite simply, means using flash.

* almost, but not quite

The Real Cost of Oracle RAC


Storage for DBAs: In my previous article (in this mini-series on database economics) I explained how to calculate the cost of a mid-range Oracle database system. My motive was a concern that many people working either directly or indirectly with database software are uninformed about just how expensive it is – particularly in comparison to the cost of hardware. And in this article I want to cover the great granddaddy of Oracle license costs: Oracle Real Application Clusters (RAC).

I also want to show you a little-known trick that can allow you to build a two-node fully active/active RAC cluster for a fraction of the price you would normally expect to pay.

But first, let’s talk about RAC…

Oracle RAC: High Availability For the Masses

There was a time, long ago, when big servers were very expensive. Many people ran Oracle on RISC-based UNIX systems, which had limited scalability in terms of the number of CPU cores and the maximum amount of physical memory. Oracle recognised this scalability issue and built a software solution for it, initially called Oracle Parallel Server (OPS). If you never used OPS in anger you should ask some of the grizzled, battle-scarred veterans who did how they fared against it, but at least in theory it allowed customers to scale out when scaling up wasn’t really possible.

However, things change – and nowhere more so than in IT. The days of big iron RISC systems seem long ago and nowadays (comparatively) cheap multicore x86 hardware is the norm. Scaling up to 80 cores in a server is not unusual, so the need for a software scalability solution is less pressing than it was. However, Oracle knows a thing or two about staying at the top, so OPS became Real Application Clusters and the scalability marketing message was overtaken by a new claim: high availability. Yes, Oracle RAC allows you to run one database across multiple nodes, so if you look at it the right way that’s increasing system availability.

Of course, if you look at it another way (as I do), increasing the number of nodes actually increases the likelihood that one of them will fail. Plus, adding a whole raft of cluster functionality such as cache coherence, cluster filesystems and cluster ready services is just adding complexity, which is the enemy of availability. Yet everyone in the RAC game lives with the same shared deception: that losing a whole node does not count as a service outage. Sure, you get a whole load of users kicked off. Ok, so you have to bounce a whole set of application servers. But hey, technically it wasn’t a full outage so the SLAs weren’t affected. Er… ok… I think I’ve made my thoughts clear on this before.

Oracle RAC: The Expensive Way

There are two reasons why RAC can be expensive – or, to put it another way, two dimensions. The price goes up as the license cost increases, but it also goes up in multiples as the architecture scales out to multiple nodes.

In general, RAC is a feature of Oracle Enterprise Edition – in fact looking at the prices on the Oracle Store as I write this it’s the joint-most-expensive option (along with Oracle OLAP) priced at $23k per core (list)… If you consider that the Enterprise Edition license is $47.5k per core then that’s nearly half as much again. Don’t forget that Oracle’s core multiplication factor table determines that we need to multiply these costs by 0.5 for Intel Xeon processors, which is what I’m using in this example (see the first article in this series if you don’t know what this means).

Oracle RAC price assumptions

Let’s state some assumptions for this imaginary Oracle RAC cluster we are building. It will have 4 nodes (16 cores per node) and 20TB of usable disk storage. We’ll also assume that in buying the licenses we got a 60% discount. We’re looking at the three-year price and, as always, the maintenance costs us 22% of the net license cost. I’m including the Oracle Diagnostics Pack ($5k per core) in the license cost too – surely nobody can cope without it these days?
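For anyone who wants to check my working, here’s a rough sketch of the license arithmetic. It reproduces the Oracle portion of the figures below, though your discounts and support rates will obviously vary:

```python
# Assumptions as stated above; all prices are list, in dollars
nodes, cores_per_node, core_factor = 4, 16, 0.5
ee_license, rac_option, diag_pack = 47_500, 23_000, 5_000
discount, support_rate, years = 0.60, 0.22, 3

licenses = nodes * cores_per_node * core_factor                  # 32 licenses
net_license = licenses * (ee_license + rac_option + diag_pack) * (1 - discount)
support = net_license * support_rate * years

print(f"Net license cost: ${net_license:,.0f}")            # ~$966k
print(f"3-year support:   ${support:,.0f}")                # ~$638k
print(f"Database vendor:  ${net_license + support:,.0f}")  # ~$1.6m
```

Note how the node count and the core count are straight multipliers on everything else: double either one and the database vendor’s slice doubles with it.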

Oracle RAC price breakdown

The total cost over three years, just for hardware, software and support (i.e. discounting TCO-type calculations like power, cooling, etc) is now up at $1.8m. That’s a serious amount of money! But what I find really interesting is the proportion that goes to the database vendor compared to the proportion that is spent on hardware:

Oracle RAC 4-node price breakdown

The storage (which I naturally have an interest in) is just 8% of the total cost, while the database vendor’s products and support services comprise 89% of the total cost. This is where database consolidation starts to make sense (more databases on the same hardware means better value for money from the core-based licenses). It’s also where flash memory storage makes sense, because it allows a far better return on this massive investment: firstly by unleashing applications to run at the speed of memory, and secondly by unlocking (expensive) CPUs which are otherwise stuck waiting on I/O from slow disk storage systems.

Oracle RAC: The Inexpensive Way

But wait, I promised you an alternative to the costly system above. What is it? The answer can be found buried deep within Oracle’s Software Investment Guide (page 11 of the current published version) where we find the following information: from Oracle 10g onwards, Oracle Standard Edition includes the Real Application Clusters Option provided customers use Oracle Clusterware and ASM. Since Standard Edition is limited to a maximum of 4 CPU sockets (not cores!!) this effectively means a two-node system using two-socket servers.

That’s still an amazing revelation – it’s basically RAC (with certain caveats) for free! With the right choice of high-end CPU, a two-socket server can deliver massive performance. Let’s have a look at the cost of a two-node RAC system running on Standard Edition using the same assumptions from above [massive thanks to Doug (see comments below) for pointing out my mistake – now corrected – that Standard Edition is licensed by the socket not the core and that the core multiplication factor therefore does not apply]:

Oracle SE RAC price breakdown

Now many people will think, “Hang on, I can’t cope without Enterprise Edition” … but for this level of saving, isn’t it worth giving that some closer analysis? The real bonus here is that, in paying for licenses by the socket, you can achieve a massive benefit by using the fastest processors with the largest number of cores – without paying any penalty.
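Here’s the same rough arithmetic for the Standard Edition cluster. Note that the $17,500 per-socket figure is my assumption based on the Oracle Store at the time of writing, so check the current price list before relying on it:

```python
# Two nodes x two sockets = 4 socket licenses; core count is irrelevant
sockets = 4
se_list_price = 17_500   # ASSUMED Standard Edition list price per socket
discount, support_rate, years = 0.60, 0.22, 3

net_license = sockets * se_list_price * (1 - discount)   # $28,000
total = net_license * (1 + support_rate * years)         # ~$46.5k
print(f"SE RAC licenses plus 3-year support: ${total:,.0f}")
# A 12-core socket costs exactly the same to license as a 4-core socket here
```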

Oracle SE RAC 2-node price breakdown

The price of Standard Edition RAC is 87% lower than the price of our previous configuration. (If you were to compare a like-for-like scenario, where a two-node Enterprise Edition RAC system moved to two-node Standard Edition RAC, the saving would instead be $702.5k, or 76%.)

Conclusion

Everything here is just speculation, based on the information available from Oracle at the time of writing. You should not construe my remarks as guarantees or facts, but instead do your own research and talk to your local database vendor’s representatives.

The point of writing this article is that technical people don’t always have a handle on price, because in some organisations they don’t always need to. But when the technical design has such a dramatic effect on the price, I think we all ought to be looking at the bigger picture and taking the time to work out the implications of our choices.

Software, as they say in Redwood Shores, doesn’t come cheap…

The Real Cost of Enterprise Database Software

Storage for DBAs: The strange thing about enterprise databases is that the people who design, manage and support them are often disassociated from the people who pay the bills. In fact, that’s not unusual in enterprise IT, particularly in larger organisations where purchasing departments are often at opposite ends of the org chart to operations and engineering staff.

I know this doesn’t apply to everyone but I spent many years working in development, operations and consultancy roles without ever having to think about the cost of an Oracle license. It just wasn’t part of my remit. I knew software was expensive, so I occasionally felt guilty when I absolutely insisted that we needed the Enterprise Edition licenses instead of Standard Edition (did we really, or was I just thinking of my CV?) but ultimately my job was to justify the purchase rather than explain the cost.

On the off chance that there are people like me out there who are still a little bit in the dark about pricing, I’m going to use this post to describe the basic price breakdown of a database environment. I also have a semi-hidden agenda for this, which is to demonstrate the surprisingly small proportion of the total cost that comprises the storage system. If you happen to be designing a database environment and you (or your management) think the cost of high-end storage is prohibitive, just keep in mind how little it affects the overall three-year cost in comparison to the benefits it brings.

Pricing a Mid-Range Oracle Database

Let’s take a simple mid-range database environment as our starting point. None of your expensive Oracle RAC licenses, just Enterprise Edition and one or two options running on a two-socket server.

At the moment, on the Oracle Store, a perpetual license for Enterprise Edition is retailing at $47,500 per processor. We’ll deal with the whole per processor thing in a minute. Keep in mind that this is the list price as well. Discounts are never guaranteed, but since this is a purely hypothetical system I’m going to apply a hypothetical 60% discount to the end product later on.

Mid-range database price assumptions

I said one or two options, so I’m going to pick the Partitioning option for this example – but you could easily choose Advanced Compression, Active Data Guard, Spatial or Real Application Testing as they are all currently priced at $11,500 per processor (with the license term being perpetual – if you don’t know the difference between this and named user then I recommend reading this). For the second option I’ll pick one of the cheaper packs… none of us can function without the wait interface anymore, so let’s buy the Tuning Pack for $5,000 per processor.

The Processor Core Factor

I guess we’d better discuss this whole processor thing now. Oracle uses per core licensing which means each CPU core needs a license, as opposed to per socket which requires one license per physical chip in the server. This is normal practice these days since not all sockets are equal – different chips can have anything from one to ten or more cores in them, making socket-based licensing a challenge for software vendors. Sybase is licensed by the core, as is Microsoft SQL Server from SQL 2012. However, not all cores are equal either… meaning that different types of architecture have to be priced according to their ability.

The solution, in Oracle’s case, is the Oracle Processor Core Factor, which determines a multiplier to be applied to each processor type in order to calculate the number of licenses required. (At the time of writing the latest table is here but always check for an updated version.) So if you have a server with two sockets containing Intel Xeon E5-2690 processors (each of which has eight cores, giving a total of sixteen) you would multiply this by Oracle’s core factor of 0.5 meaning you need a total of 16 x 0.5 = 8 licenses. That’s eight licenses for Enterprise Edition, eight licenses for Partitioning and eight licenses for the Tuning Pack.

Mid-range price breakdown

What else do we need? Well there’s the server cost, obviously. A mid-range Xeon-based system isn’t going to be much more than $16,000. Let’s also add the Oracle Linux operating system (one throat to choke!) for which Premier Support is currently listing at $6,897 for three years per system. We’ll need Oracle’s support and maintenance of all these products too – traditionally Oracle sells support at 22% of the net license cost (i.e. what you paid rather than the list price), per year. As with everything in this post, the price / percentage isn’t guaranteed (speak to Oracle if you want a quote) but it’s good enough for this rough sketch.

Finally, we need some storage. Since I’m actually describing from memory an existing environment I’ve worked on in the past, I’m going to use a legacy mid-range disk array priced at $7 per GB – and I want 10TB of usable storage. It’s got some SSD in it and some DRAM cache but obviously it’s still leagues apart from an enterprise flash array.

Price Breakdown

That’s everything. I’m not going to bother with a proper TCO analysis, so these are just the costs of hardware, software and support. If you’ve read this far, your peripheral vision will already have taken in the graph below, so I can’t ask you to take a guess… but think about your preconceptions. Of the total price, how much did you think the storage was going to be? And how much of the total did you think would go to the database vendor?

Mid-range database price breakdown

The storage is just 17% of the total, while the database vendor gets a whopping 80%. That’s four-fifths… and they don’t even have to deal with the logistics of shipping and installing a hardware product!
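Those proportions are easy to sanity-check. Here’s the back-of-an-envelope version of the whole calculation – approximate, and entirely dependent on the discount you negotiate:

```python
# Mid-range system: 2 sockets x 8 cores, Intel core factor 0.5
licenses = 16 * 0.5                                        # 8 licenses
list_per_license = 47_500 + 11_500 + 5_000                 # EE + Partitioning + Tuning Pack
net_license = licenses * list_per_license * (1 - 0.60)     # ~$205k after discount
oracle_support = net_license * 0.22 * 3                    # ~$135k over three years
server, oracle_linux = 16_000, 6_897                       # hardware + OS support (3yr)
storage = 10 * 1024 * 7                                    # 10TB usable at $7/GB ~= $72k

total = net_license + oracle_support + server + oracle_linux + storage
print(f"Three-year total: ${total:,.0f}")                  # ~$430k
print(f"Database vendor share: {(net_license + oracle_support + oracle_linux) / total:.0%}")
```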

Still, the total price is “only” $430k, so it’s not in the millions of dollars – plus you might be able to negotiate a better discount. But ask yourself this: what would happen if you added Oracle Real Application Clusters (currently listing at $23,000 per processor) to the mix? You’d need to add a whole set of additional nodes too. The price just went through the roof. What about if you used a big 80-core NUMA server, thereby increasing the license cost by a factor of five (16 cores to 80)? Kerching!

Performance and Cost are Interdependent

There are two points I want to make here. One is that the cost of storage is often relatively small in terms of the total cost. If a large amount of money is being spent on licensing the environment, it makes sense to ensure that the storage enables better performance, i.e. results in a better return on investment.

The second point is more subtle – but even more important. Look at the price calculations above and think about how important the number of CPU cores is. It makes a massive difference to the overall cost, right? So if that’s the case, how important do you think it is that you use the best CPUs? If CPU type A gives significantly better performance than CPU type B, it’s imperative that you use the former because the (license-related) cost of adding more CPU is prohibitive.

Yet many environments are held back by CPUs that are stuck waiting on I/O. This is bad news for end users and applications, bad news for batch jobs and backups. But most of all, this is terrible news for data centre economics, because those CPUs are much, much more expensive than the price you pay to put them in the server.

There is more to come on this subject…