The Strategic Platform for ALL Database Workloads

I was invited to Microsoft HQ in the UK yesterday to be a speaker at one of their launch events for SQL Server 2012. It’s the second of these events I’ve appeared at, and it finally made me realise I need to change something about this blog.

Until now I have resisted making any critical remarks about the Oracle Exadata product here, other than quoting the facts as part of my History of Exadata series. I’m going to change that by offering my own opinions on the product and on Oracle’s strategy for selling it.

Before I do that I should establish my credentials and declare any bias I may have. For a number of years, until very recently in fact, I was an employee of Oracle Corporation in the UK, where I worked in Advanced Customer Services. I began working with Exadata upon the release of the “v2” Sun Oracle Database Machine, and by the time of the “X2” I was the UK Team Lead for Exadata. I personally installed and supported Exadata machines in the UK and trained a number of the current Exadata engineers in ACS (although that wasn’t exactly difficult, as all of the ACS engineers I know are excellent). I also used to train the sales and delivery management communities on Exadata using my trademarked “coloured balls” presentation (you had to be there).

I now work for Violin Memory, a company that (to a degree) competes with Oracle Exadata. Exadata is a database appliance, whilst Violin Memory make flash memory arrays… so on the face of it that doesn’t sound like direct competition. But I’ll let you into a little secret: Exadata isn’t really a database appliance at all – it’s an application acceleration product. That’s what it does: it takes the applications businesses rely on and makes them run faster. And in fact that’s exactly what Violin is too – an application acceleration product that just happens to look like a storage array.

So now that we have everything out in the open, I’m going to talk about my issue with Exadata – and you can read this keeping in mind that everything I say is tainted by the fact that I have an interest in making Violin products look better than Oracle’s. I can’t help that; I’m not going to quit my exciting new job just to gain some journalistic integrity…

There are a number of critiques of Exadata out there on the web, ranging from technical discussions (the best of which are Kevin Closson’s Critical Analysis videos) to stories about the endless #PatchMadness from Exadata DBAs on Twitter. My main issue is much more fundamental:

Oracle now say that Exadata is the strategic database platform for ALL database workloads. This was not always the case. If you read my History of Exadata piece you will see that when the original v1 HP Oracle Database Machine was released, “Exadata” was the name of the storage servers. And those storage servers were, in Oracle’s own words, “Designed for Oracle Data Warehouses”.

Upon the release of the v2 Sun Oracle Database Machine there came an epiphany at Oracle: the realisation that flash technology was essential for performance (don’t forget I’m biased). This was great news for Violin, as back in those days (this was 2009) flash was still an emerging technology. However, the Sun F20 Accelerator cards that were added to the v2 were (in my biased opinion) pretty old tech, and Oracle was only able to use them as a read cache. That didn’t stop Oracle’s marketing department (never one to hold back on a bold claim) from declaring the v2 “The First Database Machine For OLTP”.

We are now on the X2 model (really only a minor upgrade in CPU and RAM from the v2) and Oracle has added Database Consolidation to the list of things that Exadata does. And of course a new bold claim has appeared: “Exadata is Oracle’s strategic database platform for ALL database workloads”. Sure, the X2 comes in two models, the X2-2 and the X2-8, but they don’t actually differ in the features you get above a normal Oracle database… you still get the same Exadata storage, Hybrid Columnar Compression and Exadata Flash Cache features regardless of the model.

So what’s my problem with this? Well, first of all, let’s think about what a workload is. Essentially you can define the workload of a database by the behaviour of its users. There are two main types of workload in the database world: OnLine Transaction Processing (OLTP) and Data Warehousing (DW). OLTP systems tend to have highly transactional workloads, with many users concurrently querying and changing small amounts of data. Conversely, DW systems tend to have a smaller number of power users who query vast amounts of data, performing sorts and aggregations. OLTP systems experience huge amounts of change throughout their working period (e.g. 9am-5pm for a national system, 24×7 for a global system), whereas DW systems tend to remain relatively static except during ETL windows, when massive amounts of data are loaded or changed.

In fact, you can pretty much picture any workload as fitting somewhere on a scale between these two extremes: pure OLTP at one end, pure DW at the other, and everything else somewhere in between.

Of course, this is a sweeping generalisation. In practice no system is purely OLTP or purely DW. Some systems have windows during which different types of workload occur. Consolidation systems make things even more complicated because you can have multiple concurrent workloads taking place.

There’s a point to all this though. Take a random selection of real life databases and look at their workloads. If you agree with my OLTP <> DW scale above then you will see that they all fit in different places. Maybe you don’t agree with it though and you think there are actually many more dimensions to consider… no matter. What we should all be able to agree on is this:

In the real world, different databases have different workloads.

And if we can agree on that then perhaps we can also agree on this:

Different workloads will have different requirements.

That’s simple logic. And to extend that simple logic just one more step:

One design cannot possibly be optimal for many different requirements.

And that’s my problem with Oracle’s strategy around selling Exadata. We all know that it was originally designed as a data warehousing solution. Although I defer to Kevin’s knowledge about the drawbacks of an asymmetric shared-nothing MPP design, I always thought Exadata was an excellent DW product and something that (at the time) seemed like an evolutionary step forward (although I now believe that flash memory arrays are a revolutionary step forward which makes that evolution obsolete – remember, I’m biased). But it simply cannot be the best solution for everything, because that doesn’t make sense. You don’t need to be technical to get that; you don’t even need to be in IT.

Let’s say I wanted to drive from town A to town B as fast as I can. I’d choose a Ferrari, right? That’s my OLTP requirement. Now let’s say I wanted to tow a caravan from A to B; I’d need a 4×4 or something with serious towing ability – definitely not a Ferrari. There’s my DW requirement. Now I need to transport 100 people from A to B. I guess I’d need a coach. That’s my Database Consolidation requirement. There is no single solution which is optimal for all requirements – only a set of solutions which are better at some and worse at others.

A final note on this subject. The Microsoft event at which I spoke was about Redmond’s new set of database appliances: the Database Consolidation Appliance, the Parallel Data Warehouse, and the Business Decision Appliance. Microsoft have been lagging behind Oracle in the world of appliances, but I believe they have made a wise choice here in offering multiple solutions based on customer workload. And they are not the only ones to think this way. Look at this document from Bloor comparing IBM with Exadata:

“Oracle’s view of these two sets of requirements is that a single solution, Oracle Exadata, is ideal to cover both of them; even though, in our view (and we don’t think Oracle would disagree), the demands of the two environments are very different. IBM’s attitude, by way of contrast, is that you need a different focus for each of these areas and thus it offers the IBM pureScale Application System for OLTP environments and IBM Smart Analytics Systems for data warehousing.”

Now… no matter how biased you think I am… maybe it’s time to consider whether this strategy of Oracle’s really makes sense?

Exadata Re-Racking Service

I’ve heard from a few sources now that Oracle is offering a new Exadata re-racking service for quarter and half racks. The idea, as I understand it, is that if you have your own rack equipment in your data centre and don’t want to use the rack that Exadata comes preinstalled in, you can pay an extra fee for Oracle’s Advanced Customer Services engineers (a fine bunch of people, I must say!) to re-rack it. It appears that the machine is delivered to your data centre, where ACS will disassemble it and reassemble it in your rack.

There appear to be some caveats, such as a pre-installation survey to check that your rack kit is suitable, and a ban on putting anything else in the same rack. Also, since the service is not available for the full rack, I presume it would preclude you from upgrading to a full machine in the future – at least not without relocating the kit, which I guess means downtime. I must stress that I don’t have the exact details, so talk to your friendly local Exadata sales rep if you want to know more.

What I will say is that, in all my time at Oracle, the rule that customers could not re-rack the Exadata component servers was one of the few set in stone. Many customers asked, but all were told no. So what’s changed?

If you ask Oracle, I am sure they would say that they are “listening to customer demand” and being “flexible”. On the other hand, surely some will see this as a simple case of abandoning a principle in order to increase the attraction of Exadata and win more sales.

I’d love to know what happens to the empty Exadata rack once the kit has been moved. I’ll start checking to see if they appear on eBay…

SLOB testing on Violin and Exadata

I love SLOB, the Silly Little Oracle Benchmark introduced to me by Kevin Closson in his blog.

I love it because it’s so simple to set up and use. Benchmarking tools such as Hammerora have their place of course, but let’s say you’ve just got your hands on an Exadata X2-8 machine and want to see what level of physical I/O it can drive… what’s the quickest way to do that?

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
exadataX2-8.vmem Linux x86 64-bit                  128    64       8    1009.40

Anyone who knows their Exadata configuration details will spot that this is one of the older X2-8s, as it “only” has eight-core Beckton processors instead of the ten-core Westmeres buzzing away in today’s boxes. But for the purposes of generating physical I/O this shouldn’t be a major problem.

Running with a small buffer cache recycle pool and calling SLOB with 256 readers (and zero writers) gives:

Load Profile              Per Second
~~~~~~~~~~~~         ---------------
  Physical reads:          138,010.5

So that’s 138k read IOPS at an 8k database block size. Not bad, eh? I tried numerous values for readers and 256 gave me the best result.
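For anyone who fancies trying something similar, here’s a rough sketch of how such a run goes with the original SLOB kit. Everything below is illustrative rather than my exact configuration: the recycle pool size, tablespace name and user count are examples, and I’m assuming the classic setup.sh/runit.sh syntax (writers first, then readers) and schema layout (user1…userN, each owning a table cf1) – check the README of whichever version you download.

# shrink the recycle pool so that almost every read misses the
# buffer cache and becomes a physical I/O (size is illustrative)
echo "alter system set db_recycle_cache_size = 64M scope=spfile;" | sqlplus / as sysdba
# ...then bounce the instance for the new pool size to take effect

# create the SLOB schemas: ./setup.sh <tablespace> <number of users>
./setup.sh SLOBTS 256

# point each SLOB table at the tiny recycle pool
for i in $(seq 1 256); do
  echo "alter table user${i}.cf1 storage (buffer_pool recycle);"
done | sqlplus / as sysdba

# drive the load: ./runit.sh <writers> <readers>
# an AWR report covering the run is generated at the end
./runit.sh 0 256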

Now let’s try it on the Violin 3000 series flash memory array I have here in the lab. I don’t have anything like the monster Sun Fire X4800 servers in the X2-8, with their 1TB of RAM and their 14 InfiniBand-connected storage cells. All I have is a Supermicro server with two quad-core E5530 Gainestown processors and under 100GB of RAM:

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
oel57            Linux x86 64-bit                   16     8       2      11.74

You can probably guess from the hostname that I’ve installed Oracle Linux 5 Update 7. I’m also running the Oracle Unbreakable Enterprise Kernel (v1), with Oracle Database and Grid Infrastructure 11.2.0.3, in order to take advantage of the raw performance of Violin LUNs on ASM. For each of the 8x100GB LUNs I have set the IO scheduler to use noop, as described in the installation cookbook.
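In case you haven’t seen the cookbook, setting the scheduler is trivial – the device names below are examples, and on a real system you’d persist the change across reboots (e.g. via /etc/rc.local or the elevator= kernel boot parameter) rather than echoing it by hand:

# set the noop I/O scheduler on each Violin LUN (device names are examples)
for dev in sdb sdc sdd sde sdf sdg sdh sdi; do
  echo noop > /sys/block/${dev}/queue/scheduler
done

# verify - the active scheduler is the one shown in square brackets
cat /sys/block/sdb/queue/scheduler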

So let’s see what happens when we run SLOB with the same small buffer cache recycle pool and 16 readers (zero writers):

Load Profile              Per Second
~~~~~~~~~~~~         ---------------
  Physical reads:          159,183.9

That’s 159k read IOPS at an 8k database block size. I’m getting almost exactly 20k IOPS per core, which funnily enough is what Kevin told me to expect as a rough limit.

The thing is, my Supermicro has four dual-port 8Gb fibre-channel cards in it, but only two of them have connections to the Violin array I’m testing here. The other two are connected to an identical 3000 series array, so maybe I should present another 8 LUNs from that and add them to my ASM diskgroup.
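Growing the diskgroup is simple enough once the new LUNs are visible to the host – the diskgroup name and disk path below are illustrative, not my actual configuration:

# as the Grid Infrastructure owner; diskgroup name and path are examples
echo "alter diskgroup DATA add disk '/dev/mapper/violin2_lun*' rebalance power 11;" | sqlplus / as sysasm

With the second array’s LUNs in the diskgroup and the rebalance complete, let’s see what happens when I rerun SLOB with the same 16 readers / 0 writers: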

Load Profile              Per Second
~~~~~~~~~~~~         ---------------
  Physical reads:          236,486.7

Again this is an 8k blocksize so I’ve achieved 236k read IOPS. That’s nearly 30k IOPS per core!

I haven’t run this set of tests as a marketing exercise, or even as an attempt to make Violin look good. I was genuinely interested in seeing how the two configurations compared – and I’m blown away by the performance of the Violin Memory arrays. I should probably spend some more time investigating these 3000 arrays to see whether I can better that value, but like a kid with a new toy I have one eye on the single 6000 series array which has just arrived in the lab. I wonder what I can get that to deliver with SLOB?

The History of Exadata

I’ve been working on a timeline for the history of Exadata, starting with the HP Oracle Database Machine and working through to the X2 series.

It’s interesting to see how Oracle’s presentation of the product has changed over time, particularly the marketing messages.

Also, if you didn’t know better, you would probably think that Engineered Systems were something Oracle had been planning for years. But the original plan for the Oracle Database Machine was to allow multiple vendors and ports of the storage software – basically an open architecture.

Things have changed a lot since then…

Oracle minimises the Exadata minimal pack

As of Exadata Storage Software version 11.2.3.1, released in March 2012, the “minimal pack” has been deprecated. This is the component of the storage server software patch which is actually applied to the database servers in order to bring them up to the same image version.

Those who have been patching Exadata for a while may remember the days when the database servers were patched using the ironically-named “convenience pack”. At some point in 2011 that was renamed the minimal pack. Now it is gone entirely, replaced by a yum channel on the Unbreakable Linux Network (ULN).

There appears to be a channel per software version, e.g. exadata_dbserver_11.2.3.1.0_x86_64_base.

In a way that sounds like a better solution – but it does of course mean some logistical changes if you are going to do it the way Oracle suggests. For a start, the database servers will need direct network access to the ULN repositories; failing that, you will need to create your own mirror repositories somewhere on the internal network and point the Exadata machines at those.
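If you do go down the mirror route, pointing a database server at it is just a standard yum repository definition – the hostname and path below are entirely hypothetical:

# /etc/yum.repos.d/exadata_dbserver.repo (illustrative example)
[exadata_dbserver_11.2.3.1.0_x86_64_base]
name=Exadata database server 11.2.3.1.0 base channel (local mirror)
baseurl=http://yum-mirror.example.com/exadata_dbserver_11.2.3.1.0_x86_64_base/
enabled=1
gpgcheck=1
# gpgcheck=1 assumes you have imported Oracle's GPG signing key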

One thing which isn’t made explicitly clear in the patch readme for 11.2.3.1 is that it will update the kernel on the X2-2 to 2.6.18-274… meaning your database servers are effectively moving from Oracle Linux 5 Update 6 to Update 7. The X2-8, on the other hand, updates to 2.6.32-300.
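A quick way to see where a database server stands before and after patching is to compare the running kernel with the Exadata image version – imageinfo is present on the database servers as well as the cells, and the expected values below are indicative rather than definitive:

# running kernel - expect 2.6.18-274.x on a freshly patched X2-2
uname -r

# Exadata image version on the database server
imageinfo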

It’s also interesting to note that Oracle is still persisting with the 2.6.18 Red Hat-compatible kernel on the X2-2 database servers, despite the 2.6.32-based Oracle Unbreakable Enterprise Kernel (UEK) having been available since late 2010. In fact there’s even a UEKv2 out now.

Another thing I notice is that those customers who were brave enough to run their Exadata database servers on Solaris 11 Express have now been served a desupport notice and have six months to upgrade to Solaris 11 proper. It’s not a drastically difficult upgrade to perform, but I’m surprised by that six-month limit; it seems a little unfair considering the one-year grace period customers usually get with database patchsets.

Analysing Oracle Exadata

I’ve been working on a document recently which describes Exadata and examines its strengths and weaknesses. I have uploaded a number of the sections of the document as pages to this site – they are listed as tabs under the Oracle Exadata menu here.