November 16, 2012
You might be tempted to think that In-Memory technologies and flash are concepts which have no common ground. After all, if you can run everything in memory, why worry about the performance of your storage? However, the truth is very different: In-Memory needs flash to reach its true potential. Here I will discuss why, and look at how flash memory systems can both enable In-Memory technologies and alleviate some of the need for them.
Note: This is an article I wrote for a different publication recently. The brief was to discuss at a high level the concepts of In-Memory Computing. It doesn’t delve into the level of technical detail I would usually use here – and the article is more Violin marketing-orientated than those I would usually publish on my personal blog, so consider yourself warned… but In-Memory is an interesting subject so I believe the concepts are worth posting about.
In-Memory Computing (IMC) is a high-level term used to describe a number of techniques where data is processed in computer memory in order to achieve better performance. Examples of IMC include In-Memory Databases (which I’ve written about previously), In-Memory Analytics and In-Memory Application Servers, all of which have been named by Gartner as technologies which are being increasingly adopted throughout the enterprise.
To understand why these trends are so significant, consider the volume of data being consumed by enterprises today: in addition to traditional application data, companies have an increasing exposure to – and demand for – data from Gartner’s “Nexus of Forces”: mobile, social, cloud and big data. As more and more data becomes available, competitive advantages can be won or lost through the ability to serve customers, process metrics, analyze trends and compute results. The time taken to convert source data to business-valuable output is the single most important differentiator, with the ultimate (and in my view unattainable – but that’s the subject for another blog post) goal being output that is delivered in real-time.
But with data volumes increasing exponentially, the goal of performance must also be delivered with a solution which is highly scalable. The control of costs is equally important – a competitive advantage can only be gained if the solution adds more value than it subtracts through its total cost of ownership.
How does In-Memory Computing Deliver Faster Performance?
The basic premise of In-Memory Computing is that processing data held in memory is faster than processing data which must be read from storage. To understand what this means, first consider the basic elements in any computer system: CPU (Central Processing Unit), Memory, Storage and Networking. The CPU is responsible for carrying out instructions, whilst memory and storage are locations where data can be stored and retrieved. Along similar lines, networking devices allow for data to be sent to or received from remote destinations.
Memory is used as a volatile location for storing data, meaning that the data only remains in this location while power is supplied to the memory module. Storage, in contrast, is used as a persistent location for storing data i.e. once written data will remain even if power is interrupted. The question of why these two differing locations are used together in a computer system is the single most important factor to understand about In-Memory Computing: memory is used to drive up processor utilization.
Modern CPUs can perform many billions of instructions per second. However, if data must be written to or retrieved from traditional (i.e. disk) storage, this results in a delay known as a “wait”. A modern disk storage system performs an input/output (I/O) operation in a time measured in milliseconds. While this may not initially seem long, when considered from the perspective of the CPU clock cycle, where operations are measured in nanoseconds or less, it is clear that time spent waiting on storage will have a significant negative impact on the total time required to complete a task. In effect, the CPU is unable to continue working on the task at hand until the storage system completes the I/O, potentially resulting in periods of inactivity for the CPU. If the CPU is forced to spend time waiting rather than working, then the efficiency of the CPU is reduced.
Unlike disk storage, which is based on mechanical rotating magnetic disks, memory consists of semiconductor electronics with no moving parts – and for this reason access times are orders of magnitude faster. Modern computer systems use Dynamic Random Access Memory (DRAM) to store volatile copies of data in a location where they can be accessed with wait times of approximately 100 nanoseconds. The simple conclusion is therefore that memory allows CPUs to spend less time waiting and more time working, which can be considered as an increase in CPU efficiency.
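To put rough numbers on that gap, here is a small back-of-the-envelope sketch. The 3 GHz clock speed, the 5 millisecond disk I/O and the 1 microsecond of useful work per access are my own illustrative assumptions rather than measurements; only the 100 nanosecond DRAM figure comes from the text above.

```python
# Illustrative figures only; real hardware varies widely.
CPU_CLOCK_HZ = 3_000_000_000    # assumption: a 3 GHz core
DISK_IO_SECONDS = 5e-3          # assumption: one disk I/O takes 5 ms
DRAM_ACCESS_SECONDS = 100e-9    # ~100 ns DRAM access, as quoted above


def cycles_spent_waiting(wait_seconds, clock_hz=CPU_CLOCK_HZ):
    """Clock cycles that elapse while the CPU waits for one access."""
    return round(wait_seconds * clock_hz)


def cpu_efficiency(work_seconds, wait_seconds):
    """Fraction of elapsed time spent working rather than waiting."""
    return work_seconds / (work_seconds + wait_seconds)


disk_cycles = cycles_spent_waiting(DISK_IO_SECONDS)
dram_cycles = cycles_spent_waiting(DRAM_ACCESS_SECONDS)

if __name__ == "__main__":
    print(f"Cycles elapsed per disk I/O:    {disk_cycles:,}")
    print(f"Cycles elapsed per DRAM access: {dram_cycles:,}")
    # Assumption: 1 microsecond of useful work between accesses.
    print(f"Efficiency with disk waits: {cpu_efficiency(1e-6, DISK_IO_SECONDS):.4%}")
    print(f"Efficiency with DRAM waits: {cpu_efficiency(1e-6, DRAM_ACCESS_SECONDS):.4%}")
```

Under these assumptions a single disk wait costs around 15 million clock cycles, against roughly 300 for a DRAM access. The model is deliberately naive (it ignores caching, concurrency and the operating system scheduling other work during a wait), but it shows why the wait time dominates.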
In-Memory Computing techniques seek to extract the maximum advantage out of this conclusion by increasing the efficiency of the CPU to its limit. By removing waits for storage where possible, the CPU can execute instructions and complete tasks with the minimum of time spent waiting on I/O.
While IMC technologies can offer significant performance gains through this efficient use of CPU, the obvious drawback is that data is entirely contained in volatile memory, leading to the potential for data loss in the event of an interruption to power. Two solutions exist to this problem: the acceptance that all data can be lost, or the addition of a “persistence layer” where all data changes must be recorded in order that data may be reconstructed in the event of an outage. Since only the latter option guarantees business continuity, the reality of most IMC systems is that data must still be written to storage, limiting the potential gains and introducing additional complexity as high availability and disaster recovery solutions are added.
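As a rough illustration of what a persistence layer involves, the sketch below keeps its data in an ordinary in-memory dictionary but appends every change to a log file before applying it, then replays that log on start-up. The class name LoggedStore and the one-JSON-record-per-line log format are hypothetical choices of mine and do not describe how any particular IMC product works.

```python
import json
import os


class LoggedStore:
    """In-memory key-value store with an append-only change log (hypothetical sketch)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}      # the volatile, in-memory copy
        self._replay()      # reconstruct state after an outage
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory data by re-applying every logged change."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        """Record the change on storage first, then apply it in memory."""
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # the CPU waits here for the storage layer
        self.data[key] = value

    def get(self, key):
        """Reads are served from memory and never touch storage."""
        return self.data.get(key)

    def close(self):
        self._log.close()
```

Reads never leave memory, but every put has to wait for the fsync call to return, so the latency of the storage underneath the log sets the pace for every change. A real implementation would also need to handle partially written records, log truncation and checkpoints, none of which are attempted here.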
What are the Barriers to Success with In-Memory Computing?
The main barriers to success in IMC are the maturity of IMC technologies, the cost of adoption and the performance impact associated with adding a persistence layer on storage. Gartner reports that IMC-enabling application infrastructure is still relatively expensive, while additional factors such as the complexity of design and implementation, as well as the new challenges associated with high availability and disaster recovery, are limiting adoption. Another significant challenge is the misperception from users that data stored using an In-Memory technology is not safe due to the volatility of DRAM. It must also be considered that as many IMC products are new to the market, many popular BI and data-manipulation tools are yet to add support for their use.
However, as IMC products mature and the demand for performance and scalability increases, Gartner expects the continuing success of the NAND flash industry to be a significant factor in the adoption of IMC as a mainstream solution, with flash memory allowing customers to build IMC systems that are more affordable and have a greater impact.
NAND Flash Allows for New Possibilities
The introduction of NAND flash memory as a storage medium has caused a revolution in the storage industry and is now allowing for new opportunities to be considered in realms such as database and analytics. NAND flash is a persistent form of semiconductor memory which combines the speed of memory with the persistence capabilities of traditional storage. By offering speeds which are orders of magnitude faster than traditional disk systems, Violin Memory flash memory arrays allow for new possibilities. Here are just two examples:
First of all, In-Memory Computing technologies such as In-Memory Databases no longer need to be held back by the performance of the persistence layer. By providing sustained ultra-low-latency storage, Violin Memory helps customers achieve previously unattainable levels of CPU efficiency when using In-Memory Computing.
Secondly, for customers who are reluctant to adopt In-Memory Computing technologies for their business-critical applications, the opportunity now exists to remove the storage bottleneck which initiated the original drive to adopt In-Memory techniques. If IMC is the concept of storing entire sets of data in memory to achieve higher processor utilization, it can be considered equally beneficial to retain the data on the storage layer if that storage can now perform at the speed of flash memory. Violin Memory flash memory arrays are able to harness the full potential of NAND flash memory and allow users of existing non-IMC technologies to experience the same performance benefits without the cost, risk and disruption of adopting an entirely new approach.