Cloud DBA: The Next Generation of Database Administrator?

Don’t drop the ball…

In the previous post, I ranted about (sorry, discussed) the evolution of the DBA role, looking at how many additional functions the database administrator has inherited over the years: code fixer, virtualisation tamer, Linux / Windows juggler, reluctant storage administrator, application server hater, firewall botherer and all round fixer of any product badged as Oracle.

But the real change I am interested in comes as a result of databases moving into the cloud. Because this exposes the DBA to ownership of a new problem: cost. Specifically, ongoing operational costs – or Opex. It is my belief that this is in fact A New Thing – and New Things are not to be trusted. Sure, in the on prem world, DBAs were involved in decisions concerning capital expenditure (Capex) like the scoping of database servers, the calculation of how many database licenses were needed, the justification of additional license options (e.g. Enterprise Edition instead of Standard Edition). But in most cases, those decisions were made by a collective and then signed off by the business.

My Public Cloud Bill Just Arrived…

Cloud is different. Everything you do in the public cloud costs money. You want to spin up an instance? Kerching. You want to use some SSD storage? Kerching! You want to download copies of your data to an on prem location? Egress charges ahoy… KERCHING!

Bills, Bills, Bills…

Decisions taken by DBAs in the normal course of their day jobs can now have a significant effect on the next invoice from the cloud vendor. Do you remember in the early days of cell phones, if you used your phone a lot you were never entirely sure what the bill would look like at the end of the month? Could be a little more than usual, could be so massive you need a loan from the World Bank. Sometimes, the cloud has a similar feel.

Most cloud vendors have remarkably complex pricing structures (some say this complexity is deliberate!) and this has in fact spawned a whole industry of experts (“cloud economists”) who can help customers understand and reduce their cloud costs, often using the two step principle of 1) turn stuff off, and 2) negotiate harder for discounts.

Into this new minefield steps that brave warrior, the DBA. Often charged with the apparently simple task of “move that database into the cloud”, not only must a new technical language be learned (e.g. “it’s not a VM in the cloud, it’s an instance”) and a new set of TLAs be absorbed (“In my AWS VPC, I use EC2, EBS, S3 and ZXP”)… but also a new understanding must be gained of what each checkbox and pulldown option does to the operating cost.

Another Plate To Spin

It’s a whole new area of expertise to take on – and it’s complex. What’s more, it’s subtly different between cloud vendors – and even if you only use one cloud, it’s subject to change over time. Usually in the direction of more expensive.

Here’s a simple example: provisioning an instance. You are a DBA (congrats!) and you need to migrate your on prem database into, say, Amazon Web Services. You first of all need to configure a Linux instance and some disks. There are many different ways of doing this – including templates, infrastructure-as-code and so on – but let’s do it in the GUI for fun. First, you’ll need some compute power, so let’s provision some from the Elastic Compute Cloud (EC2). Which type shall we choose?

If you are new to this, there are a lot of options. I mean, really a lot. Let me see now, there are categories of General Purpose, Compute Optimized, Memory Optimized, Accelerated Computing, or Storage Optimized. These are just the categories… each one of which contains many types, each of which contains many options! But “General Purpose” sounds kinda normal, so let’s choose that. Now you need to choose the instance type:

Amazon Web Services – Elastic Compute Cloud choices for General Purpose instance types

Amazon Web Services – EC2 M5 Large instance types

If we go for instance type of M5, we are told that “This family provides a balance of compute, memory, and network resources, and is a good choice for many applications”. Cool, so now you have to pick the instance size:

This screenshot only shows a fraction of the total choices, with each config of vCPUs and Memory replicated again in the m5d.* range (adds NVMe SSD storage), plus some further options around bare metal. It is a labyrinthine set of options to consider.

If you haven’t undertaken the myriad training courses for this cloud vendor, how do you know which instance size to choose? Well, maybe the same way that you specced up the config of your on prem database servers before… right? Except most DBAs didn’t do that: they were allocated servers without really playing a part in their procurement. But my real point here is that the choice you make directly affects the ongoing monthly cost. And there are more choices to make! After all, you are going to need some storage from Elastic Block Store on which to place your database:
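To put a rough number on that, here is a minimal Python sketch of how an hourly on-demand rate turns into a monthly figure for an always-on database server. The hourly rates are placeholders I have invented for illustration, not real AWS prices, and 730 hours per month is just the usual approximation (8,760 hours in a year divided by 12). The AWS Calculator is the place to get actual numbers.

```python
# Rough monthly cost of an instance that runs all month.
# NOTE: the hourly rates below are made-up placeholders, NOT real AWS prices.
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12 months

# Hypothetical on-demand hourly rates in USD, keyed by instance size.
HYPOTHETICAL_HOURLY_RATE = {
    "m5.large": 0.10,
    "m5.xlarge": 0.20,
    "m5.2xlarge": 0.40,
}


def monthly_cost(instance_size: str, hours: float = HOURS_PER_MONTH) -> float:
    """Return the estimated monthly cost in USD for one instance."""
    return round(HYPOTHETICAL_HOURLY_RATE[instance_size] * hours, 2)


if __name__ == "__main__":
    for size in HYPOTHETICAL_HOURLY_RATE:
        print(f"{size}: ${monthly_cost(size):,.2f} per month")
```

Going up one size in the pulldown doubles the placeholder rate here, so one casual click can double the line item on the invoice.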

Amazon Web Services – Elastic Block Store volume types

Amazon recommends one of two different options for “I/O-intensive NoSQL and relational databases” plus a third for data warehouses. I’ll tell you right now, if your database is even mildly transactional, you will want to use io1 or io2. Whatever you choose, it will have an effect on the monthly cost – you can see this by checking it out on the AWS Calculator.
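As I understand the billing model, a general purpose volume such as gp2 is charged per GB-month, while a provisioned IOPS volume such as io1 is charged per GB-month plus a separate charge for every IOPS you provision. The Python sketch below shows how that second term can dominate the bill. The `VolumePricing` class and all the prices in it are hypothetical placeholders, not real AWS prices.

```python
from dataclasses import dataclass


@dataclass
class VolumePricing:
    """Hypothetical price list for one EBS volume type (USD)."""
    per_gb_month: float
    per_iops_month: float = 0.0  # only charged on provisioned IOPS types


# NOTE: placeholder prices for illustration only, NOT real AWS prices.
HYPOTHETICAL_PRICING = {
    "gp2": VolumePricing(per_gb_month=0.10),
    "io1": VolumePricing(per_gb_month=0.12, per_iops_month=0.05),
}


def monthly_storage_cost(volume_type: str, size_gb: int,
                         provisioned_iops: int = 0) -> float:
    """Estimate the monthly cost of one volume: capacity charge plus IOPS charge."""
    pricing = HYPOTHETICAL_PRICING[volume_type]
    cost = (size_gb * pricing.per_gb_month
            + provisioned_iops * pricing.per_iops_month)
    return round(cost, 2)


if __name__ == "__main__":
    print("gp2, 500 GB:", monthly_storage_cost("gp2", 500))
    print("io1, 500 GB, 10,000 IOPS:", monthly_storage_cost("io1", 500, 10_000))
```

With these made-up prices, the capacity charge for the two volumes is similar, but the provisioned IOPS charge makes the io1 volume roughly ten times more expensive per month.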

And you know what we didn’t even cover at the start? The region – the geographical location in which this instance runs – also changes the cost, sometimes significantly. Pricing for European regions is often surprisingly higher than for regions in the US.
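A quick way to see the regional effect is to express one region's rate as a percentage premium over another. This is a small Python sketch under loud assumptions: the region codes are real AWS region names, but the hourly rates are invented placeholders, so the premium it prints is not a real figure.

```python
# Hypothetical hourly rates (USD) for the SAME instance size in two regions.
# NOTE: these numbers are placeholders, NOT real AWS prices.
HYPOTHETICAL_RATE_BY_REGION = {
    "us-east-1": 0.10,
    "eu-west-1": 0.11,
}


def regional_premium_percent(region: str, baseline: str = "us-east-1") -> float:
    """Return how much more (in percent) a region costs than the baseline region."""
    rates = HYPOTHETICAL_RATE_BY_REGION
    premium = (rates[region] - rates[baseline]) / rates[baseline] * 100
    return round(premium, 1)


if __name__ == "__main__":
    print(f"eu-west-1 premium: {regional_premium_percent('eu-west-1')}%")
```

The same percentage applies to every hour the instance runs, which is why a small-looking difference in the rate card matters on a server that never gets switched off.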

Why This Matters (TL;DR)

What I am trying to show here is that, in the course of provisioning databases in the cloud, DBAs are having to make complicated choices which not only affect the performance of their databases but also the ongoing cost. In fact, it’s a balancing act: performance and cost are two sides of the same coin. Amazon Web Services, in the example above, offers a huge and dazzling array of options which offer different trade offs for these two dimensions. That’s not a bad thing by the way – I am not criticising AWS for giving us a choice – but it’s bewildering to the uninitiated.
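One simple way to think about the balancing act is: decide the minimum performance you need, then pick the cheapest option that meets it. Here is a minimal Python sketch of that idea. The vCPU and memory figures follow the published M5 sizes, but the hourly rates are hypothetical placeholders and the `cheapest_fit` helper is my own illustration, not anything AWS provides.

```python
from typing import NamedTuple, Optional


class InstanceOption(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int
    hourly_rate: float  # hypothetical USD rate, NOT a real AWS price


# vCPU / memory follow the published M5 sizes; the rates are placeholders.
OPTIONS = [
    InstanceOption("m5.large", 2, 8, 0.10),
    InstanceOption("m5.xlarge", 4, 16, 0.20),
    InstanceOption("m5.2xlarge", 8, 32, 0.40),
    InstanceOption("m5.4xlarge", 16, 64, 0.80),
]


def cheapest_fit(min_vcpus: int, min_memory_gib: int) -> Optional[InstanceOption]:
    """Return the cheapest option meeting both minimums, or None if nothing fits."""
    candidates = [
        option for option in OPTIONS
        if option.vcpus >= min_vcpus and option.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda option: option.hourly_rate, default=None)


if __name__ == "__main__":
    # A database needing 4 vCPUs and 24 GiB of memory.
    print(cheapest_fit(4, 24))
```

Note that the memory requirement, not the CPU requirement, forces the jump to the larger size in this example. That is exactly the kind of detail that changes the bill without anyone noticing.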

What’s more, if you put a database in Microsoft Azure, or Google Cloud Platform, or Oracle Cloud Infrastructure, or Alibaba Cloud or … I can’t think of any other clouds … then be prepared for the fact that everything changes again.

It’s time for DBAs to learn to juggle with yet another ball.


One Response to Cloud DBA: The Next Generation of Database Administrator?

  1. Hemant K Chitale says:

    Yes, Cloud Provisioning and Billing are complicated.
