History of Exadata
The story of Exadata’s development is an interesting one, particularly if you combine it with a look at the way in which it has been marketed by Oracle…
It began back in the first decade of the millennium with an internal project called SAGE, an acronym for Storage Appliance for Grid Environments. This was to be an open hardware stack solution, i.e. with no proprietary hardware. The competition was Teradata and Netezza (now owned by IBM), who relied much more on defined or proprietary hardware solutions, so Oracle’s aim was to compete through a) openness, and b) Moore’s law. The latter meant that as commodity hardware improved with technological advances, the SAGE software running on it would benefit from these advances whilst the competition struggled to test and release newer appliance-based solutions. The idea of openness is also important, because it could be argued that much of Oracle’s success in becoming the dominant relational database vendor over the past few decades came from the policy of offering a wide variety of ports and platform choices.
So now let’s have a look at what happened when SAGE first went to market.
September 2008: Exadata v1
At Oracle OpenWorld in San Francisco, Oracle Corporation announced the “HP Oracle Database Machine”, a combined hardware and software stack running Oracle’s 11g Release 1 database software on top of HP commodity hardware.
The crucial ingredient to this package was the “Exadata Storage Servers”, described by Oracle as “the First-Ever Smart Storage Designed for Oracle Data Warehouses”. A full rack system contained 8 database nodes, 14 Exadata storage servers (or “cells”) and an internal private Infiniband network.
Comment: There are two interesting things to note about this. Firstly, the intelligent storage was designed specifically for data warehousing purposes rather than, say, OLTP. Secondly, the “Exadata” name only applied to the storage servers and their software. This is important because, as Kevin Closson has pointed out in the past, Exadata was originally intended to be an open hardware stack and ports were to be created for HPUX and potentially other platforms. The idea of “Engineered Systems” really did not seem to exist at the time.
September 2009: Exadata v2
One year later at the following Oracle OpenWorld – and after Oracle had announced its intended acquisition of Sun Microsystems – the Sun Oracle Database Machine (V2) was introduced, replacing the HP hardware with hardware from Sun and upgrading the database software to 11g Release 2 (a significant upgrade). In addition, the Exadata storage servers were modified to include 4x 96GB Sun SLC Flash cards per server, giving a total of 5.3TB of “Exadata Smart Flash Cache”. Oracle remarketed the product as “The First Database Machine for OLTP”:
Comment: At this point the idea of an open hardware stack had clearly been dropped, since Oracle had now entered the hardware industry through the Sun acquisition. Obviously the HP kit had all been replaced by Sun kit, with the addition of the flash cards and a software upgrade from Oracle 11.1 to 11.2. So how justifiable was the claim that the V2 was now a “Database Machine for OLTP”? Given the fact that the storage software was “designed for Oracle data warehouses” it appeared that the basis of the whole OLTP claim was the new Exadata Smart Flash Cache residing on the Sun F20 flash cards. However, we know from Kevin Closson (who was, at the time, Performance Architect for Exadata within the Oracle Product Development organisation) that in fact the flash cards were added in order to increase the data flow capabilities of the data warehouse design. To quote Kevin:
Exadata Smart Flash Cache was originally seen as a way to improve data flow for Data Warehousing. It’s quite simple. Current-generation Exadata Storage Servers have five PCI slots over which database I/O can flow — but only 12 hard disk drives. The 12 hard disk drives are attached via a single OEM’ed LSI PCI controller. Therefore, a quarter-rack Exadata Database Machine — without Exadata Smart Flash Cache — is limited to roughly 5.5 GB/s of storage scan bandwidth with the high-performance, low-capacity SAS hard disk drives. The story changes, however, when four more PCI slots are populated and filled with cached table and index data. With all five slots actively flowing data, the storage scan rate is increased to about 16 GB/s for a quarter-rack. It was never about OLTP or ERP, because these use cases need to scale up writes along with reads. A platform that cannot scale up writes along with reads is not a good OLTP platform — much less the World’s First OLTP Machine.
This brings to light a disconnect within Oracle – the Development organisation was still refining the design around a data warehousing workload whilst the Marketing organisation attempted to widen the scope of the product by claiming OLTP benefits. Furthermore, it suggests that the core design values did not change: Exadata remained a solution designed for data warehousing, but one that was now to be presented as an OLTP solution in spite of this.
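The figures in Kevin's quote can be roughly reconstructed with back-of-the-envelope arithmetic. A quarter rack contains three storage cells; the per-cell throughput numbers below are back-calculated from his quoted totals, not published specifications:

```python
# Rough arithmetic behind the quoted quarter-rack scan rates.
# Per-cell figures are derived from the totals, not measured values.

cells_per_quarter_rack = 3        # a quarter rack has 3 Exadata storage cells

disk_only_total = 5.5             # GB/s scan rate with hard disks only
per_cell_disk = disk_only_total / cells_per_quarter_rack   # ~1.8 GB/s per LSI controller

all_slots_total = 16.0            # GB/s with all five PCI slots flowing data
per_cell_all = all_slots_total / cells_per_quarter_rack    # ~5.3 GB/s per cell

print(round(per_cell_disk, 1), round(per_cell_all, 1))
```

In other words, a single disk controller per cell caps each cell at under 2 GB/s; populating the remaining PCI slots with flash roughly triples the data flow per cell, which is exactly the data warehousing motivation Kevin describes.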
September 2010: Exadata X2
One more year and one more Oracle Openworld later, Oracle announced the release of the Oracle Exadata Database Machine (X2). The new machine was now available in two versions, the X2-2 and the X2-8. Again the marketing messages about the scope of Exadata had grown:
In addition to the data warehousing focus (for which Exadata was originally designed and architected) and the OLTP focus (via the addition of flash cards) a new claim appeared regarding Exadata as the “best consolidation platform”.
In fact, to complete the picture of Oracle’s new push to make Exadata the default choice, the new marketing message became “Exadata is Oracle’s strategic database platform for ALL database workloads” (for an example see slide 4 of this slide deck).
The X2 version of Exadata remained current for two years, with some minor increases to the server specs during that period. One criticism levelled at the design by Oracle’s detractors was that CPU and storage could not be scaled separately – to buy more storage a customer also had to buy more Exadata servers, adding to the processing capacity (and potentially to the licensing cost) whether it was required or not. To counter this problem, Oracle introduced the Exadata Storage Expansion Rack, an identically-sized rack of Exadata storage servers (available in the usual quarter, half and full configurations) with all of the necessary Infiniband networking required to connect to Exadata.
Comment: The biggest criticism faced by the X2 model during its sale was the perceived lack of OLTP performance. The Exadata Smart Flash Cache was a “write through” cache where all database writes had to be written to disk before completion, substantially reducing the system’s random write capabilities. A new feature called Exadata Smart Flash Logging aimed to improve the performance of redo log writes by directing writes to both flash and disk, but this only had a positive effect on a small percentage of redo writes.
September 2012: Exadata X3
Two years after the release of the X2 came a new generation of Exadata machine, the “Exadata X3 Database In-Memory Machine”. Available, as before, in -2 and -8 versions, the X3 was mainly an upgrade of the Sun Fire servers used for the database and storage grids. With the exception of the X3-8 database servers, the Intel Xeon processors were all upgraded to the later Sandy Bridge architecture, with DRAM also being boosted and the flash cards upgraded to 4x 400GB of MLC flash per storage cell. No changes were made to the storage server disks or the Infiniband network, which remained active/passive QDR.
A notable new claim for the X3 press release was the implementation of “a mass memory hierarchy that automatically moves all active data into Flash and RAM memory, while keeping less active data on low-cost disks”. This appears to be the main substance behind Oracle’s new claim that the Exadata X3 was a “Database In-Memory Machine”. For many months prior to the X3 launch SAP had been criticising Oracle’s Exalytics product and promoting their own HANA database as a high-performance “In-Memory” solution. Oracle’s counter to this criticism therefore came in the form of portraying the X3 as an In-Memory database machine, despite the fact that SAP and Oracle had different ideas about the description of “memory”.
Also released during the X3 launch was a new version of the Exadata storage software allowing for a “write back” cache, i.e. writes from the database could now be acknowledged once they had been written to flash instead of waiting for completion at disk. In effect the flash began to act as another level of buffer cache, with Oracle claiming that this gave the X3 “20 times more capacity for database writes”, whilst previous X2 and V2 models could also benefit from the same software enhancement, bringing an improvement of 10 times.
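The behavioural difference between the original write-through cache and the new write-back cache can be illustrated with a small sketch. This is a conceptual model only, not Oracle's implementation, and the latency constants are illustrative rather than measured Exadata figures:

```python
# Conceptual sketch of write-through vs write-back flash caching.
# Not Oracle's implementation; latencies are illustrative only.

FLASH_WRITE_MS = 0.1   # assumed flash write latency
DISK_WRITE_MS = 5.0    # assumed hard disk write latency

class FlashCache:
    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.flash = {}        # block id -> data currently held in flash
        self.dirty = set()     # blocks in flash not yet persisted to disk
        self.disk = {}

    def write(self, block: int, data: bytes) -> float:
        """Write a block; return latency until the write is acknowledged."""
        self.flash[block] = data
        if self.write_back:
            self.dirty.add(block)        # destaged to disk asynchronously
            return FLASH_WRITE_MS        # acknowledged once flash has it
        self.disk[block] = data          # write-through: disk completes first
        return FLASH_WRITE_MS + DISK_WRITE_MS

    def destage(self):
        """Background task: flush dirty blocks from flash down to disk."""
        for block in list(self.dirty):
            self.disk[block] = self.flash[block]
            self.dirty.discard(block)

wt = FlashCache(write_back=False)
wb = FlashCache(write_back=True)
print(wt.write(1, b"row"), wb.write(1, b"row"))  # prints: 5.1 0.1
```

The sketch shows why write-back helps random-write workloads: the acknowledgement latency drops to the flash write time, with disk persistence deferred to a background destage. It also shows the flip side – until `destage()` runs, the only current copy of a dirty block lives in flash, which is why the cache layer must protect that data.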
Comment: It seems that with the decision to brand Exadata as an In-Memory solution Oracle has a dilemma. The Exadata X3 database servers can only contain a maximum of 256GB of DRAM, yet server manufacturers such as Fujitsu, IBM and Cisco UCS supply similar two-socket Sandy Bridge-based servers which can contain 3x or more DRAM. SAP’s HANA in-memory database runs purely in DRAM on one or more database servers, scaling horizontally when the limit of DRAM in a single server has been reached. But for Oracle it does not make commercial sense to place the whole dataset in memory on the database server, since this could negate the need for customers to purchase many of the licensable products such as RAC and the Exadata Storage Server licenses. Oracle’s solution has therefore been to extend the concept of “memory” over the Infiniband network and onto the storage servers, which contain DRAM, flash and disk. By conceptually describing the remote DRAM and flash on the storage servers as “memory”, Oracle is able to claim that the X3 is an “In-Memory” product and yet retain the storage servers as an essential part of the solution. The technical downside to this is the IPC and Infiniband overhead required to manage this remote “memory”, which is significant in comparison to the zero overhead required to manage local DRAM, such as in the HANA solution.
December 2013: Exadata X4
In a break with tradition which suggested delivery problems, the X4 generation of Exadata was released three months after OpenWorld 2013. For this release, Oracle dispensed with the somewhat disingenuous “In-Memory” name and returned to the simple title of “Exadata Database Machine X4”. The X4-2 model shipped with improved Intel Xeon processors (upgraded to Ivy Bridge) and more DRAM, as well as double the flash capacity of its predecessor. For the first time since the original HP Oracle Database Machine (v1) the high performance disks increased in size, growing from 600GB to 1.2TB. In order to achieve this doubling of raw disk capacity, the drives had to be changed from 15k RPM models to slower 10k RPM spindles. The high capacity drive option saw a 33% increase from 3TB to 4TB models. The internal Infiniband network finally changed from using active/passive bonding on each server and storage cell to a fully active configuration. No new X4-8 version was announced at the time, due to Intel not yet having released the required Ivy Bridge EX processors, but the X3-8 model was updated to ship with X4 storage cells. The X4-8 finally arrived in July 2014 and featured a 50% increase in (licensable) database CPU cores, up to 240 for the full rack.
As with all Engineered System product releases from Oracle, the claims made in the press release give an insight into Oracle’s strategy moving forward:
In the case of the X4, while the usual claims about data warehousing and OLTP benefits remain, the new phrase which dominates all marketing material is database as a service. This of course could be seen as another term for database consolidation (a phrase frequently mentioned in the X2 release but now apparently fallen out of favour), but perhaps with stronger connotations of cloud strategy. It is notable that Oracle claims the X4 is “optimized” for OLTP, database as a service and data warehousing, because this appears to cover every possible workload.
Another interesting marketing claim relates to the newly-increased flash capacity of the X4-2:
In this statement, Oracle suggests that the 44TB of raw flash in a full rack is effectively 88TB of “logical flash cache capacity” owing to a new flash cache compression feature of the Exadata Smart Flash Cache software, for which the Advanced Compression Option must be licensed. The X4-2 datasheet takes this a step further by claiming that a full rack has an “effective flash cache capacity” of 448TB, an increase of 10x. This kind of inflated claim seems very dangerous, given that it is based on data reduction techniques which offer no guarantee and require an additional option to be licensed. At the very least, customers may misunderstand the capacity of any machine they buy. Perhaps the worst example of this vague language can be found on slide 21 of this Oracle presentation, in which the term raw is used to describe flash capacity after compression – yet clearly compressed data can only be stored in usable flash and not raw:
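The relationship between the quoted figures is simple arithmetic; the compression and reduction ratios below are Oracle's marketing assumptions, not guaranteed outcomes:

```python
# Back-of-the-envelope check of the X4-2 flash capacity claims.
# The 2x compression ratio is Oracle's marketing assumption, not a guarantee.

raw_flash_tb = 44              # physical flash in a full X4-2 rack
cache_compression = 2          # assumed flash cache compression ratio
logical_tb = raw_flash_tb * cache_compression   # the "logical" capacity claim

effective_tb = 448             # datasheet "effective flash cache capacity"
implied_data_reduction = effective_tb / raw_flash_tb

print(logical_tb)                          # prints: 88
print(round(implied_data_reduction, 1))    # prints: 10.2
```

Working backwards from the datasheet, the 448TB "effective" figure implies that customer data shrinks by roughly 10x before it reaches flash – a ratio that depends entirely on how compressible the data happens to be.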
Comment: Reading between the lines of the X4 marketing, there is an important story to be seen, which is Oracle’s movement of primary data into flash – almost by stealth. The amount of flash contained in an X4-2 model is now so large that Oracle has been forced to swap the fastest 15k RPM drives for 33% slower 10k RPM drives in order to increase their capacity. This simply became unavoidable; otherwise the amount of so-called logical flash cache would have been similar to the raw disk capacity – and considerably larger than the usable disk capacity. Even Oracle’s product management VPs have admitted this, making the comment that since entire databases are “commonly sitting in flash all the time”, the disk-based back end is now “used to store colder, inactive data”. Another telling quote was that in the X4, “disk is the new tape”. This is an extraordinary statement considering the effective price of Exadata disk storage (including storage licenses and maintenance), but it is backed up by the fact that in Exadata, flash is always used as cache. To simplify that statement: data in cache always has a corresponding copy residing on persistent disk – even if it is not up to date.
So Exadata’s journey is now complete; from an open-platform storage product designed specifically for data warehousing to a closed-platform vendor-specific database “in memory” appliance aimed at (and allegedly optimised for) all database workloads. The question is… does this actually make sense? Is it realistic for Oracle to claim that Exadata is the best-fit for all of the different types of database workload? If Oracle sees Exadata as the only platform for database customers, how can customers be sure that they are buying a solution which properly fits their requirements? And indeed, is it actually a linguistically correct statement to claim that a product is optimised for every possible workload?
At the end of the day, the best product is not always the most successful – sometimes the most heavily-marketed product will win. From a technical perspective though, the segregation of the database engine into standard compute and storage-level compute with runtime offload seems like a complex solution… and history tells us that simple solutions tend to win the day. Only time will tell…