Performance: It’s All About Balance…

Storage For DBAs: Everyone wants their stuff to go faster. Whether it’s your laptop, tablet, phone, database or application… performance is one of the most desirable characteristics of any system. If your system isn’t fast enough, you start dreaming of more. Maybe you try and tune what you already have, or maybe you upgrade to something better: you buy a phone with a faster processor, or stick an SSD in your laptop… or uninstall Windows 🙂

When it comes to databases, I often find people considering the same set of options for boosting performance (usually in this order): half-heartedly tuning the database, adding more DRAM, *properly* tuning the database, adding or upgrading CPUs, then finally tuning the application. It amazes me how much time, money and effort are often spent trying to avoid getting the application developers to write their code properly, but that’s a subject for another blog.

The point of this blog is the following statement: to achieve the best performance on any system it is important that all of its resources are balanced.

Let’s think about the basic resources that comprise a computer system such as a database server:


  • CPU – the processor, i.e. the thing that actually does the work. Every process pretty much exists to take some input, get on CPU, perform some calculations and produce some output. It’s no exaggeration to call this the heart of the system.
  • Network – communications with the outside world, whether it be the users, the application servers or other databases.
  • Memory – Dynamic Random Access Memory (DRAM) provides a store for data.
  • Storage – for example disk or flash; provides a store for data.

You’ll notice I’ve been a bit disingenuous by describing Memory and Storage the same way, but I want to make a point: both Memory and Storage are there to store data. Why have two different resources for what is essentially the same purpose?

The answer, which you obviously already know, is that DRAM is volatile (i.e. continuous power is required to maintain the stored information, otherwise it is lost) while Storage is persistent (i.e. the stored information remains in place until it is actively changed or removed).

When you think about it like that, the Storage resource has a big advantage over the Memory resource, because the data you are storing is safe from unexpected power loss. So why do we have the DRAM? What does it bring to the party? And why do I keep asking you questions you already know the answer to?

Ok I’ll get to the point, which is this: DRAM is used to drive up CPU utilisation.

The Long Walk

The CPU interacts with the Memory and Storage resources by sending or requesting data. Each request takes a certain amount of time – and that time can vary depending on factors such as the amount of data and whether the resource is busy. But let’s ignore all that for now and just consider the minimum possible time taken to send or receive that data: the latency. CPUs have clock cycles, which you can consider a metronome keeping the beat to which everything else must dance. That’s a gross simplification which may make some people wince (read here if you want to know why), but I’m going to stick with it for the sake of clarity.

Let’s consider a 2GHz processor – by no means the fastest clock speed available today. The 2GHz indicates that the clock oscillates 2 billion times per second. That means one oscillation every half a nanosecond, which is such a tiny amount of time that we can’t really comprehend it, so instead I’m going to translate it into the act of walking, where each single pace is a clock cycle. With each step taken, an instruction can be executed, so:

One CPU Cycle = Walking 1 Pace

The current generation of DRAM is DDR3, which has latencies of around 10 nanoseconds. So now, while walking along, if you want to access data in DRAM you incur a penalty of 20 paces during which you potentially cannot do anything else.

Accessing DRAM = Walking 20 Paces

Now let’s consider storage – and in particular, our old friend the disk drive. I frequently see horrible latency problems with disk arrays (I guess it goes with the job) but I’ll be kind here and choose a latency of 5 milliseconds, which on a relatively busy system wouldn’t be too bad. 5 milliseconds is of course 5 million nanoseconds, which in our analogy is 10 million steps. According to the American College of Sports Medicine, the average walker takes around 2,000 steps per mile. So now, walking along and making an I/O request to disk incurs a penalty of 10,000,000 steps or 5,000 miles. Or, to put it another way:

Accessing Disk = Walking from London to San Francisco

Take a minute to consider the impact. Previously you were able to execute an instruction every step, but now you need to walk a fifth of the way around the planet before you can continue working. That’s going to impact your ability to get stuff done.

Maybe you think 5 milliseconds is high for disk latency (or maybe you think anyone walking from London to San Francisco might face some ocean-based issues) but you can see that the numbers easily translate: every millisecond of latency is equivalent to walking one thousand miles.
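If you want to check the arithmetic yourself, here is a minimal Python sketch of the conversion, using the same assumptions as above (a 2GHz clock, so one pace per 0.5 nanosecond cycle, and 2,000 steps per mile):

# Convert device latency into "paces" and "miles" for a 2GHz CPU,
# where one clock cycle (0.5ns) equals one walking pace
CYCLE_NS = 0.5          # one clock cycle at 2GHz
STEPS_PER_MILE = 2000   # American College of Sports Medicine average

def latency_to_paces(latency_ns):
    """Number of clock cycles (paces) spent waiting on one access."""
    return latency_ns / CYCLE_NS

for name, latency_ns in [("DRAM", 10), ("Disk", 5_000_000)]:
    paces = latency_to_paces(latency_ns)
    print(f"{name}: {paces:,.0f} paces = {paces / STEPS_PER_MILE:,.0f} miles")

# Output:
# DRAM: 20 paces = 0 miles
# Disk: 10,000,000 paces = 5,000 miles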

Don’t forget what that means back in the real world: it translates to your processor sitting there not doing anything because it’s waiting on I/O. Increasing the speed of that processor only increases the amount of work it’s unable to do during that wait time. If you didn’t have DRAM as a “temporary” store for data, how would you ever manage to do any work? No wonder In-Memory technologies are so popular these days.

Moore’s Law Isn’t Helping

It’s often stated or implied that Moore’s Law is bringing us faster processors every couple of years, when in fact the original statement was about the doubling of the number of transistors on an integrated circuit. But the underlying point remains that processor performance is increasing all the time. Looking at the four resources we outlined above, you could say that DRAM technologies are progressing in a similar way, while network protocols are getting faster too (10Gb Ethernet is commonplace, Infiniband is increasingly prevalent and 40Gb or 100Gb Ethernet is not far away).

On the other hand, disk performance has been almost static for years. According to this manual from Seagate, the performance of CPUs increased 2,000,000x between 1987 and 2004 yet the performance of hard disk drives increased only 11x. That’s hardly surprising – how many years ago did the 15k RPM disk drive come out? We’re still waiting for something faster but the manufacturers have hit the limits of physics. The idea of helium-filled drives has been floated (sorry, couldn’t resist) and indeed they could be on the shelves soon, but if you ask me the whole concept is so up-in-the-air (sorry, I really can’t help it) that I have serious doubts whether it will actually take off (ok I promise that’s the last one).

The consequence of Moore’s Law is that the imbalance between disk storage and the other resources such as CPU is getting worse all the time. If you have performance issues caused by this imbalance – and then move to a newer, faster server with more processing power… the imbalance will only get worse.

The Silicon Data Centre

Disk, as a consequence of its mechanical nature, cannot keep up with silicon as the number of transistors on a processor doubles every two years. Well as the saying goes, if you can’t beat them, join them. So why not put your persistent data store on silicon?

This is the basis of the argument for moving to flash memory: it’s silicon-based. The actual technology most vendors are using is NAND flash, but that’s not massively important and technologies will come and go. The important point is to get storage onto the graph of Moore’s Law. Going back to the walking analogy above, an I/O to flash memory takes in the region of 200 microseconds, i.e. 200 thousand nanoseconds. That’s 25 times faster than our 5 millisecond disk example, but it still represents walking 400,000 paces or 200 miles. Unlike disk, though, the performance is getting better. And by moving storage to silicon we also pick up many other benefits such as reduced power consumption, space and cooling requirements. And most importantly, we restore some balance to your server infrastructure.
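Plugging the flash figure into the same Python sketch from earlier:

paces = latency_to_paces(200_000)   # 200 microseconds in nanoseconds
print(f"Flash: {paces:,.0f} paces = {paces / STEPS_PER_MILE:,.0f} miles")
# Flash: 400,000 paces = 200 miles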

Think about it. You have to admit that, as an argument, it’s pretty well balanced.

Footnote: Yes I know that by representing CPU clock cycles as instructions I am contributing to the Megahertz Myth. Sorry about that. Also, I strongly advise reading this article in the NoCOUG journal which makes some great points about DRAM and CPU utilisation. My favourite quote is, “Idle processors do not speed up database processing!” which is so obvious and yet so often overlooked.

New Blog Series: Storage For DBAs

When I joined Violin I suddenly realised that I was going to be in a minority… a DBA in a world of storage people. DBAs don’t always think about storage – in fact I think it’s fair to say that many DBAs would prefer to keep storage at arm’s length. It’s just disk, right?

Of course all that changed back in 2004 when Oracle released 10g along with the new feature Automatic Storage Management. I confess that when I first heard about ASM I thought “Naah, I won’t be using that”. I couldn’t have been more wrong… before I knew it Oracle had me travelling all round Britain delivering the “ASM Roadshow” to DBA teams who listened with all the interest of a dog that’s been shown a card trick*. [On one particular low point I found myself trying so hard to explain RAID to a group of stoic, unblinking Welshmen that I suffered some sort of mental breakdown and had to go and stand in the car park for a minute while my co-presenter leapt to my assistance. This was not a good day.]

These days DBAs are becoming more and more involved in storage, which means they have to spend more time talking to their Storage Administrator cousins (unless they have Exadata of course, in which case the DBAs are the storage administrators).

During my time working with Exadata at Oracle I began to learn more and more about disk concepts such as Mean Time Between Failure and Annualised Failure Rates… but when I joined Violin I was plunged into a murky new world of terminology such as IOPS, bandwidth, latency and various other terms that probably should have been more familiar to me but weren’t.

I quickly came to the conclusion that people in the database world and people in the storage industry speak different languages. So now that I’ve been working at Violin for a year I thought it was time to try and bridge the divide and offer some translations of what all these crazy storage people are really talking about.

I’ll start with the basics and then get progressively more detailed until either a) I no longer know what I’m talking about, or b) nobody is reading anymore. So that should see me through until about the middle of next week then…

* I confess I stole this line from Bill Hicks. But it describes the scene perfectly and is funnier than anything I could think of…

Using SLOB to Test Physical I/O

For some time now I’ve been using the Silly Little Oracle Benchmark (SLOB) tool to drive physical I/O against the various high performance flash memory arrays I have in my lab (one of the benefits of working for Violin Memory is a lab stuffed with flash arrays!)

I wrote a number of little scripts and tools to automate this process and always intended to publish some of them to the community. However, time flies and I never seem to get around to finishing them off or making them presentable. I’ve come to the conclusion now that this will probably never happen, so I’ve published them in all their nasty, badly-written and uncommented glory. I can only apologise in advance.

You can find them here.

Inserting Formatted Code into WordPress

A lot of people I know use WordPress.com, including me for this blog. One of the common complaints I hear is that it’s not easy to insert formatted text such as source code or, in my case, snippets from Oracle AWR Reports.

I don’t have a problem with this, because I have developed a way of doing it which is simple and relatively quick. I know that there are probably various plugins or other tools for doing this, but I want a simple method that works with the vanilla version of WordPress. One that allows me to do things like this:

Load Profile              Per Second    Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~         ---------------    --------------- ---------- ----------
      DB Time(s):              197.6                2.8       0.00       0.70
       DB CPU(s):               18.8                0.3       0.00       0.07
       Redo size:    1,477,126,876.3       20,568,059.6
   Logical reads:          896,951.0           12,489.5
   Block changes:          672,039.3            9,357.7
  Physical reads:           15,529.0              216.2
 Physical writes:          166,099.8            2,312.8
      User calls:              282.4                3.9
          Parses:              379.2                5.3
     Hard parses:               25.9                0.4
W/A MB processed:                0.2                0.0
          Logons:                0.0                0.0
        Executes:           71,325.0              993.2
       Rollbacks:                0.0                0.0
    Transactions:               71.8

Maybe you already have a way of doing this, in which case good luck to you. But for anyone who might find it useful, here’s the way I do it…

Step 1

I’m assuming that, like all sane people, you are editing in the “Visual” mode (see tab at the top of the edit box). In the place where you want the formatted text to appear, insert a place marker. Something simple that you can easily find later on – ideally a single string that you can highlight with one double-click of the mouse. I’m going to use “XXX”. It should be a single line on its own. Make sure you insert some more text on the following line, otherwise you will have some trouble later on getting back to normal Paragraph mode; if you don’t know what you want to say in the next line (because you haven’t written that part yet) just insert any word – even a single letter will do.

When you are ready to insert the formatted text, double click on the place marker to highlight it and then change it from “Paragraph” to “Preformatted” where the red arrow is on this picture:

[Screenshot: text-formatting1]

Step 2

When I’m displaying AWR Reports, source code etc I prefer to indent it as I feel it looks better. You pay a slight price because often these code blocks can be quite long (horizontally), so by losing an inch or so of margin on the left you run a greater risk of the horizontal scroll bar appearing… I’m ok with that, but consider this step optional: Click on the indent button:

[Screenshot: text-formatting2]

Step 3

Now you need to switch out of “Visual” mode and into WordPress’s dreaded “Text” mode. Click the tab at the top right of the edit box to go into “Text” and then scroll down to look for your easy-to-spot XXX place marker. Double-click with the mouse to highlight it:

[Screenshot: text-formatting3]

Step 4

Cut and paste the formatted text into the place where the XXX place marker was. It will retain the formatting, as long as you make sure everything you want to display is included between the “pre” and “/pre” markers:

[Screenshot: text-formatting4]

One caveat I need to add here is that if your cut-and-pasted text contains either the greater than or less than characters (i.e. < and >) you will find they get interpreted as HTML and disappear. To fix this you will need to wait until Step 5 and then manually replace them in the text.
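Alternatively – and this is just a suggestion of mine, not part of the method above – you can escape those characters before pasting, so there is nothing to fix afterwards. A few lines of Python will do it (the filename is only an example):

import html

# Read the snippet you want to post
with open("awr_snippet.txt") as f:
    raw = f.read()

# html.escape converts < and > into &lt; and &gt; so the WordPress
# "Text" editor no longer mistakes them for HTML tags
print("<pre>" + html.escape(raw) + "</pre>")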

Step 5

Switch back to “Visual” mode using the tab at the top right of the edit box. Everything should look exactly as you inserted it:

[Screenshot: text-formatting5]

Replace any < or > characters that went missing at this point. If the inserted text was something important like a SQL or shell script it’s probably worth cutting and pasting it back into a file and running a diff between it and the original.

Step 6?

There doesn’t have to be a step 6 – in the example I used at the top of this page there wasn’t. But if you now want to play around with parts of your inserted text, such as by changing the colour of sections or using highlights, just do it in the “Visual” mode and it will automatically insert the correct tags without losing your formatting. Mess around, see if you can break it. So far for me it’s never failed.

Conclusion

So that’s my method. There may be easier ways, there may be plugins which make things look prettier, but I’m happy with this. If it works for you too, then consider it my pleasure to pass this info on – and if you find another WordPress user looking down in the dumps, why not pass it on to them too?

Update

WordPress has the ability to format source code, a feature which I now use for any shell scripts or SQL scripts I post here. One “language” option is plain text, but I still don’t think AWR reports look as good using this source code method as they do using my set of steps above. Personal tastes vary though, so check it out and see what you think.

AWR Generator


As part of my role at Violin I spend a lot of time profiling customers’ databases to see how their performance varies over time. The easiest way to do this (since I often don’t have remote access) is to ask for lots of AWR reports. One single report covering a large span of time is useless, because all peaks and troughs are averaged out into a meaningless hum of noise, so I always ask for one report per snapshot period (usually an hour) covering many hours or days. And I always ask for text instead of HTML because then I can process them automatically.

That’s all well and good, but generating a hundred AWR reports is a laborious and mind-numbingly dull task. So to make things easier I’ve written a SQL script to do it. I know there are many other scripts out there to do this, but none of them met the criteria I needed – mainly that they were SQL not shell (for portability) and that they didn’t create temporary objects (such as directories).

If it is of use to anyone then I offer it up here:

https://flashdba.com/database/useful-scripts/awr-generator/
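For anyone curious about the underlying mechanism rather than the script itself, the heavy lifting is done by Oracle’s DBMS_WORKLOAD_REPOSITORY package. Here is a rough Python sketch of the general idea – not my script, and the connection details and snapshot range are placeholders – producing one text report per snapshot interval:

import oracledb  # pip install oracledb; connection details are placeholders

conn = oracledb.connect(user="system", password="change_me", dsn="dbhost/orcl")
cur = conn.cursor()

# Identify the database and instance we are reporting on
cur.execute("select dbid from v$database")
dbid, = cur.fetchone()
cur.execute("select instance_number from v$instance")
inst, = cur.fetchone()

# One report per consecutive pair of snapshots in the chosen range
begin_snap, end_snap = 100, 124   # placeholder snapshot IDs
for snap in range(begin_snap, end_snap):
    cur.execute("""select output from table(
                     dbms_workload_repository.awr_report_text(
                       :dbid, :inst, :bid, :eid))""",
                dbid=dbid, inst=inst, bid=snap, eid=snap + 1)
    with open(f"awrrpt_{inst}_{snap}_{snap + 1}.txt", "w") as f:
        for line, in cur:
            f.write((line or "") + "\n")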

Likewise if you manage to break it, please let me know! Thanks to Paul for confirming that it works on RAC and Windows systems (you know you love testing my SQL…)

Engineered Systems – An Alternative View

Have you seen the press recently? Or passed through an airport and seen the massive billboards advertising IT companies? I have – and I’ve learnt something from them: Engineered Systems are the best thing ever. I also know this because I read it on the Oracle website… and on the IBM website, although IBM likes to call them different names like “Workload Optimized Systems”. HP has its Converged Infrastructure, which is what Engineered Systems look like if you don’t make software. And even Microsoft, that notoriously hardware-free zone where software exists in a utopia unconstrained by nuts and bolts, has a SQL Server Appliance solution which it built with HP.

[I’m going to argue about this for a while, because that’s what I do. There is a summary section further down if you are pressed for time]

So clearly Engineered Systems are the future. Why? Well let’s have a look at the benefits:

Pre-Integration

It doesn’t make sense to buy all of the components of a solution and then integrate them yourself, stumbling across all sorts of issues and compatibility problems, when you can buy the complete solution from a single vendor. Integrating the solution yourself is the best of breed approach, something which seems to have fallen out of favour with marketing people in the IT industry. The Engineered Systems solution is pre-integrated, i.e. it’s already been assembled, tested and validated. It works. Other customers are using it. There is safety in the herd.

Optimization

In Oracle Marketing’s parlance, “Hardware and software, engineered to work together”. If the same vendor makes everything in the stack then there are more opportunities to optimize the design, the code, the integration… assumptions no longer need to be made, so the best possible performance can be squeezed out of the complete package.

Faster Deployment

Well… it’s already been built, right? See the Pre-Integration section above and think about all that time saved: you just need to wheel it in, connect up the power and turn it on. Simples.

Of course this isn’t completely the case if you also have to change the way your entire support organisation works in order to support the incoming technology, perhaps by retraining whole groups of operations staff and creating an entirely new specialised role to manage your new purchase. In fact, you could argue that the initial adoption of a technology like Exadata is so disruptive that it is much more complicated and resource-draining than building those best of breed solutions your teams have been integrating for decades. But once you’ve retrained all your staff, changed all your procedures, amended your security guidelines (so the DataBase Machine Administrator has access to all areas) and fended off the poachers (DBMAs get paid more than DBAs) you are undoubtedly in the perfect position to start benefiting from that faster deployment. Well done you.

And then there’s the migration from your existing platform, where (to continue with Exadata as an example) you have to upgrade your database to 11.2, migrate to Linux, convert to ASM, potentially change the endianness of your data and perhaps strip out some application hints in order to take advantage of features like Smart Scan. That work will probably take many times longer than the time saved by the pre-integration…

Single-Vendor Benefits

The great thing about having one vendor is that it simplifies the procurement process and makes support easier too – the infamous “One Throat To Choke” cliché.

Marketing Overdrive

If you believe the hype, the engineered system is the future of I.T. and anyone foolish enough to ignore this “new” concept is going to be left behind. So many of the vendors are pushing hard on that message, but of course there is one particular company with an ultra-aggressive marketing department who stands out above the rest: the one that bet the farm on the idea. Let’s have a look at an example of their marketing material:

Video hosted by YouTube under Standard Terms of Service. Content owner: Oracle Corporation

Now this is all very well, but I have an issue with Engineered Systems in general and this video in particular. Oracle says that if you want a car you do not go and buy all the different parts from multiple, disparate vendors and then set about putting them together yourself. Leaving aside the fact that some brave / crazy people do just that, let’s take a second to consider this. It’s certainly true that most people do not buy their cars in part form and then integrate them, but there is an important difference between cars and the components of Oracle’s Engineered Systems range: variety.

If we pick a typical motor vehicle manufacturer such as Ford or BMW, how many ranges of vehicle do they sell? Compact, family, sports, SUV, luxury, van, truck… then in each range there are many models, each model comes in many variants with a huge list of options that can be added or taken away. Why is there such a massive variety in the car industry? Because choice and flexibility are key – people have different requirements and will choose the product most suitable to their needs.

Looking at Oracle’s engineered systems range, there are six appliances – of which three are designed to run databases: the Exadata Database Machine, the SuperCluster and the ODA. So let’s consider Exadata: it comes in two variants, the X3-2 and the X3-8. The storage for both is identical: a full rack contains 14x Exadata storage servers each with a standard configuration of CPUs, memory, flash cards and hard disk drives. You can choose between high performance or high capacity disk drives but everything else is static (and the choice of disk type affects the whole rack, not just the individual server). What else can you change? Not a lot really – you can upgrade the DRAM in the database servers and choose between Linux or Solaris, but other than that the only option is the size of the rack.

The Exadata X3-2 comes in four possible rack sizes: eighth, quarter, half and full; the X3-8 comes only as a full rack. These rack sizes take into account both the database servers and the storage servers, meaning the balance of storage to compute power is fixed. This is a critical point to understand, because this ratio of compute to storage will vary for each different real-world database. Not only that, but it will vary through time as data volumes grow and usage patterns change. In fact, it might even vary through temporal changes such as holiday periods, weekends or simply just the end of the day when users log off and batch jobs kick in.

Flexibility

And there’s the problem with the appliance-based solution. By definition it cannot be as flexible as the bespoke alternative. Sure I don’t want to construct my own car, but I don’t need to because there are so many options and varieties on the market. If the only pre-integrated cars available were the compact, the van and the truck I might be more tempted to test out my car-building skills. To continue using Exadata as the example, it is possible to increase storage capacity independent of the database node compute capacity by purchasing a storage expansion rack, but this is not simply storage; it’s another set of servers each containing two CPU sockets, DRAM, flash cards, an operating system and software, hard disks… and of course a requirement to purchase more Exadata licenses. You cannot properly describe this as flexibility if, as you increase the capacity of one resource, you lose control of many other resources. In the car example, what if every time I wanted to add some horsepower to the engine I was also forced to add another row of seats? It would be ridiculous.

Summary: Two Sides To Every Coin

Engineered Systems are a design choice. Like all choices they have pros and cons. There are alternatives – and those alternatives also have pros and cons. For me, the Engineered System is one end of a sliding scale where hardware and software are tightly integrated. This brings benefits in terms of deployment time and performance optimization, but at the expense of flexibility and with potential vendor lock-in. The opposite end of that same scale is the Software Defined Data Centre (SDDC), where hardware and software are completely independent: hardware is nothing more than a flexible resource which can be added or removed, controlled and managed, aggregated and pooled… The properties and characteristics of the hardware matter, but the vendor does not. In this concept, data centres will simply contain elastic resources such as compute, storage and networking – which is really just an extension of the cloud paradigm that everyone has been banging on about for some time now.

It’s going to be interesting to see how the engineered system concept evolves: whether it will adapt to embrace ideas such as the SDDC or whether your large, monolithic engineered system will simply become another tombstone in the corner of your data centre. It’s hard to say, but whatever you do I recommend a healthy dose of scepticism when you read the marketing brochure…

New Installation Cookbook

Short post to mention that I’ve added another installation cookbook to the set published here. This one falls into the Advanced Cookbook section and covers installation on Oracle Linux 6.3 with Oracle 11.2.0.3 single-instance and 4k ASM, paying special attention to the configuration of UDEV and the multipathing software.

The blog posts haven’t been coming thick and fast recently as I have been concentrating on Violin’s (excellent) end of year but I hope to resume soon. I have one more piece to publish concerning subjects like Exadata and VMware, then a new blog series on “Storage for DBAs” to mark the combined anniversaries of my joining Violin and starting this blog.

In the meantime I’d like to recommend this short but very interesting blog series on Exadata Hybrid Columnar Compression over at ofirm.wordpress.com – part one starts here

The Standard Tech Industry Sales Pitch

The tech industry is full of people, companies and organisations that want your attention, your custom and your money. There are so many of them out there it’s mind-boggling – so how does one go about standing out from the crowd? The winners are the ones that can differentiate themselves from the rest – the ones that grab your attention and keep hold of it from first contact through sales pitch and on to sale. But the funny thing is that the more people (and companies) try to stand out, the more they often sound the same. I guess we’re not all that different after all…


I work in a sales organisation in the tech industry, so I not only get to see a lot of marketing material and sales pitches but I also have to write and deliver them. I am in no way claiming to be better than the rest here, but since it’s the start of a new year and everyone is gearing up to win new business, let’s have a look at the standard tech industry pitch and see just how similar everyone’s messages are. I thought it would be more interesting than just adding my voice to the chorus of 2013 predictions…

Let’s say you are a company which makes some sort of data-related product: software to access, analyse or consume data; hardware to store, accelerate or process it. Maybe you make cloud-enabled big data in-memory analytical engines for social-networking in the mobile era. It doesn’t matter – the rules are always the same. Here’s the template to which you must conform, with all the necessary stock phrases and bullet points:

1/ Paint a picture of a new era in which existing tech cannot deliver.

Key phrase: “We live in a world where…”

Bullet points:

..unprecedented volumes of data, exponential growth, Moore’s Law
..mobile, social, Nexus of Forces, the Internet of Things
..big data, business intelligence, analytics
..heightened customer expectations, real-time data
..performance, acceleration, innovation

It always helps in this section if you can reference some sort of independent research to back up your theory, preferably from the likes of Gartner or IDC. For example, Gartner says Big Data will drive $34 billion of IT spending in 2013.

2/ Describe the purgatory in which customers are currently trapped.

Key phrase: “CIOs are being asked to do more with less”

Bullet points:

..economic pressures
..tightened budgets
..restricted operational expenditure
..limited investment but increasing demands
..aging and complex infrastructure
..legacy, legacy, legacy
..silos, sprawl, management overhead

3/ (Optional) Why other vendors and methods cannot deliver.

Key phrase: “Legacy approaches are not working”

Bullet points:

..lacking innovation
..unable to cope with modern demands
..wrong direction

You might even want to run some adverts criticising the opposition and showing off how much better you are… although to be fair that’s not standard practice unless you are a certain database company.

4/ Tada! We have the solution and we can now solve all of your problems.

Key phrase: “An innovative new way of thinking”

Bullet points:

..performance, increased agility, lower costs
..better return on investment, lower total cost of ownership
..leverage existing investments, increase utilisation
..reduce overheads, management costs, deployment times
..cloud-enabled, mobile, social, big data, real-time, in-memory

5/ Nobody else can do this, so don’t even waste time looking.

Key phrase: “Our unique product / service / solution”

Bullet points:

..broad portfolio for your unique requirements
..best of breed, turnkey, converged infrastructure/systems
..unified management
..pre-configured, pre-integrated, workload-optimized
..one throat to choke

That’s it. Stick to this recipe and you should be able to merge in nicely with everyone else who is trying to give the same message. And just for fun, here’s a perfect example of someone following the above script…

A happy and prosperous 2013 to you all.

Database Workload Theory


In the scientific world, theoretical physicists postulate theories and ideas, for example the Higgs Boson. After this, experimental physicists design and implement experiments, such as the Large Hadron Collider, to prove or disprove these theories. In this post I’m going to try and do the same thing with databases, except on a smaller budget, with less glamour and zero chance of winning a Nobel prize. On the plus side though, my power bills will be a lot lower.

That last paragraph was really just a grandiose way of saying that I have an idea, but haven’t yet thought of a way to prove it. I’m open to suggestions, feedback and data which prove or disprove it… but for now let’s just look at the theory.

Visualising Database Server I/O Workload

If you look at a database server running a real life workload, you will generally see a pattern in the behaviour of the I/O. If you plot a graph of the two extremes of purely sequential I/O and purely random I/O, most workloads will fit somewhere along this sliding scale:

[Image: the sequential-to-random I/O sliding scale]

Now of course workloads change all the time, so this is an approximation or average, but it makes sense. After all, we do this in the world of storage, because the storage requirements of a highly random workload are very different from those of a highly sequential one.

What I am going to do now is plot a graph with this as the horizontal axis. The vertical axis will be a logarithmic scale representing the storage footprint used by the database server, i.e. the amount of space used. I can then plot different database server workloads on the graph to see where they fall.

But first, two clarifications. I am at pains to say “database server” instead of “database” because in many environments there are multiple database instances generating I/O on the same server. What we are interested in here is how the storage system is being driven, not how each individual database is behaving. Remember this point and I’ll come back to it soon. The other clarification is regarding workload – because many systems have different windows where I/O patterns change. The classic (and very common) example is the OLTP database where users log off at the end of the day and then batch jobs are run. Let’s plot the OLTP and batch workloads as separate points on our graph.
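As an aside, if you want a rough idea of where your own database server sits on the horizontal axis, the instance statistics offer a crude approximation. Here is a hedged Python sketch (the v$sysstat statistic names are genuine, but treating every single-block read as “random” is a simplification, and the connection details are placeholders):

import oracledb  # connection details are placeholders

conn = oracledb.connect(user="system", password="change_me", dsn="dbhost/orcl")
cur = conn.cursor()

# Total read requests vs multi-block read requests since instance startup
cur.execute("""select name, value from v$sysstat
               where name in ('physical read total IO requests',
                              'physical read total multi block requests')""")
stats = dict(cur.fetchall())

total = stats['physical read total IO requests']
multi = stats['physical read total multi block requests']

# Multi-block reads are a reasonable proxy for sequential I/O;
# everything else we treat (crudely) as random
random_pct = 100.0 * (total - multi) / total
print(f"~{random_pct:.0f}% of read requests were single-block (random-ish)")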

Here’s what I expect to see:

[Graph: database server workloads plotted by I/O randomness against storage footprint]

There are data points in various places but a correlation is visible which I’ve highlighted with the blue line. Unfortunately this line is nothing new or exciting, it’s just a graphical representation of the fact that large databases tend to perform lots of sequential I/O whereas small databases tend to perform lots of random I/O.

Why is that? Well because in most cases large databases tend to be data warehouses, decision support systems, business intelligence or analytics systems… places where data is bulk loaded through ETL jobs and then scanned to create summary information or spot trends and patterns. Full table scans are the order of the day, hence sequential I/O. On the other hand, smaller databases with lots of random I/O tend to be OLTP-based, highly transactional systems running CRM, ERM or e-Commerce platforms, for example.

Still, it’s a start – and we can visualise this by dividing the graph up into quadrants and calling them zones, like this:

[Graph: the I/O workload graph divided into quadrants, with DW and OLTP zones labelled]

This is only an approximation, but it does help with visualising the type of I/O workload generated by database servers. However, there are two more quadrants looking conspicuously un-labelled, so let’s now turn our attention to them.

Database Consolidation I/O Workload

The bottom left quadrant is not very exciting, because small database systems which generate highly-sequential workloads are rare. I have worked on one or two, but none that I ever felt should actually have been designed to work that way. (One was an indexing system which got scrapped and replaced with Lucene, the other I am still not sure actually existed or if it was just a bad dream that I once had…)

The top right quadrant is much more interesting, because this is the world of database consolidation. I said I would come back to the idea that we are interested not in the workload of the database but of the database server.  The reason for this is that as more databases are run on the same server and storage infrastructure, the I/O will usually become increasingly random. If you think about multiple sets of disparate users working on completely different applications and databases, you realise that it quickly becomes impossible to predict any pattern in the behaviour of the I/O. We already know this from the world of VDI, where increasing the number of seats results in an increasingly random I/O requirement.

The top right quadrant requires lots of random I/O and yet is large in capacity. Let’s label it the consolidation zone on our graph:

[Graph: the I/O workload graph with the consolidation zone labelled]

We now have a graphical representation of three broad areas of I/O workload. If we believe in the trend of database consolidation, as described by the likes of Gartner and IDC, then over time the dots in the DW and OLTP zones will migrate to the consolidation zone. I have already blogged my thoughts on the benefits of database consolidation, bringing with it increased agility and massive savings in operational costs (especially Oracle licenses) – and many of the customers I have been speaking to both at Violin and in my previous role are already on this journey, even if some are still in the planning stages. I therefore expect to see this quadrant become increasingly populated with workloads, particularly as flash storage technologies take away the barriers to entry.

I/O Workload Zone Requirements

The final step in this process is to look at the generic requirements of each of our three workload zones.

[Graph: the three I/O workload zones and their requirements]

The data warehouse zone is relatively straightforward, because what these systems need more than anything is bandwidth. Also known as throughput, this is the ability of the storage to pump large volumes of data in and out. There is competition here, because whilst flash memory systems can offer excellent throughput, so can disk systems. So can Exadata of course, it’s what it was designed for. Mind you, flash should enable a lower operational cost, but this isn’t a sales pitch so let’s move on to the next zone.

The OLTP zone is all about latency. To run a highly-transactional system and get good performance and end-user experience, you need consistently low latency. This is where flash memory excels – and disk sucks. We all (hopefully) know why – disk simply cannot overcome the seek time and rotational latency inherent in its design.

The consolidation zone however is particularly interesting, because it has a subtly different set of requirements. For consolidation you need two things: the ability to offer sustained high levels of IOPS, plus predictable latency. Obviously when I say that I mean predictably low, because predictably high latency isn’t going to cut it (after all, that’s what disk systems deliver). If you are running multiple, disparate applications and databases on the same infrastructure (as is the case with consolidation) it is crucial that each does not affect the performance of the other. One system cannot be allowed to impact the others if it misbehaves.

Now obviously disk isn’t in with a hope here – highly random I/O driving massive and sustained levels of IOPS is the worst nightmare for a disk system. For flash it’s a different story – but it’s not plain sailing. Not every flash vendor can truly sustain their performance levels or keep their latency spike-free. Additionally, not every flash vendor has the full set of enterprise features which allow their products to become a complete tier of storage in a consolidation environment.

As database consolidation increases – and in fact accelerates with the continued rise of virtualisation – these are going to be the requirements which truly differentiate the winners from the contenders in the flash market.

It’s going to be fun…

Disclaimer

These are my thoughts and ideas – I’m not claiming them as facts. The data here is not real – it is my attempt at visualising my opinions based on experience and interaction with customers. I’m quite happy to argue my points and concede them in the face of contrary evidence. Of course I’d prefer to substantiate them with proof, but until I (or someone else) can devise a way of doing that, this is all I have. Feel free to add your voice one way or the other… and yes, I am aware that I suck at graphics.